Last Updated on 2024-10-26 by Clay
Introduction
Hydra is an open-source Python framework that simplifies the development of research code and other complex applications. Hydra composes a hierarchical configuration dynamically at run time from configuration files, and lets us override any part of it from the command line.
The main features listed in the official documentation are as follows:
- Hierarchical configuration composable from multiple sources
- Configuration can be specified or overridden from the command line
- Dynamic command-line tab completion
- Run applications locally or launch them to run remotely
- Run multiple jobs with different arguments with a single command
Currently, Hydra is at version 1.2, supporting Linux, macOS, and Windows (as of 2022-12-09).
Quick start
Installation
We can install Hydra using the pip package manager.
pip3 install hydra-core --upgrade
Basic example
Below is a basic example of Hydra in use. The actual program to be executed is app.py, and conf is the directory where our configuration files are stored (in later examples, we will have many configuration files).
Directory Structure
hydra_sample/
├── app.py
└── conf
    └── config.yaml
If you want to test Hydra, you can create an identical structure.
conf/config.yaml
This is a YAML configuration file with some settings for the website.
web:
  web_name: clay-atlas.com
  account: clay
  password: secret
app.py
In the program we want to execute, the main logic must be contained within a function. The function should be decorated with Hydra's decorator @hydra.main(), which is given the configuration directory (config_path) and the configuration file name (config_name).
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(version_base=None, config_path="conf", config_name="config")
def my_app(cfg: DictConfig) -> None:
    # Print the composed configuration as YAML
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    my_app()
Execution Result
$ python3 app.py
web:
  web_name: clay-atlas.com
  account: clay
  password: secret
If we want to modify parameters from the configuration file, we can pass the desired values directly on the command line. For example, here I have changed account and password.
python3 app.py web.account=test web.password=1234
web:
  web_name: clay-atlas.com
  account: test
  password: 1234
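To make the override semantics concrete, here is a minimal plain-Python sketch of what a dotted override such as web.account=test does to a nested configuration. It is only a model and not Hydra's implementation: the function name apply_override is hypothetical, and the value parsing is reduced to integers and strings.

```python
import copy


def apply_override(cfg: dict, override: str) -> dict:
    """Return a copy of cfg with one 'a.b.c=value' override applied."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    result = copy.deepcopy(cfg)
    node = result
    for name in parents:
        node = node[name]  # KeyError if the path is not in the config
    if leaf not in node:
        # A plain override may only change a key that already exists
        raise KeyError(f"{dotted_key} is not in the config")
    # Simplification: only integers are converted, everything else stays a string
    node[leaf] = int(raw_value) if raw_value.isdigit() else raw_value
    return result


config = {
    "web": {"web_name": "clay-atlas.com", "account": "clay", "password": "secret"}
}
for arg in ["web.account=test", "web.password=1234"]:
    config = apply_override(config, arg)

print(config["web"]["account"])   # test
print(config["web"]["password"])  # 1234
```

The dotted key walks down the hierarchy one level per segment, which is why web.account reaches the account entry under web.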
Multiple Configuration Files
(As you can see, the structure has become more complex!)
hydra_sample/
├── app.py
└── conf
    ├── config.yaml
    └── db
        ├── web1.yaml
        └── web2.yaml
conf/config.yaml
First, the top-level configuration file only specifies one parameter: which configuration file under db to load.
Here, we set web1 as the default.
defaults:
- db: web1
conf/db/web1.yaml
In web1, we configure Google’s website, account, and password.
web_name: www.google.com
account: google
password: 1234
conf/db/web2.yaml
In web2, we do the same as web1, except we use my personal website, account, and password instead.
web_name: clay-atlas.com
account: clay
password: secret
RUN
Next is the execution part. app.py remains unchanged, simply outputting the parameters.
$ python3 app.py
db:
  web_name: www.google.com
  account: google
  password: 1234
We can see that the default parameters are those configured for Google. If we want to switch to a different configuration, we can simply specify db=web2 on the command line.
$ python3 app.py db=web2
Output:
db:
  web_name: clay-atlas.com
  account: clay
  password: secret
Besides setting db to use a specific configuration file, we can directly specify parameters under db as well.
$ python3 app.py db=web2 db.password=9999
Outputs:
db:
  web_name: clay-atlas.com
  account: clay
  password: 9999
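The two kinds of arguments used above (db=web2 selects a file, db.password=9999 changes a value inside it) can be modelled with a short plain-Python sketch. This is an illustration under stated assumptions, not how Hydra is implemented: DB_OPTIONS is an in-memory stand-in for conf/db/web1.yaml and conf/db/web2.yaml, and compose_db is a hypothetical helper.

```python
import copy

# In-memory stand-ins for conf/db/web1.yaml and conf/db/web2.yaml
DB_OPTIONS = {
    "web1": {"web_name": "www.google.com", "account": "google", "password": 1234},
    "web2": {"web_name": "clay-atlas.com", "account": "clay", "password": "secret"},
}


def compose_db(args: list, default: str = "web1") -> dict:
    """Pick one option from the db group, then apply value overrides to it."""
    choice = default
    value_overrides = []
    for arg in args:
        key, value = arg.split("=", 1)
        if key == "db":
            choice = value  # group selection: which file to load
        elif key.startswith("db."):
            value_overrides.append((key[len("db."):], value))
        else:
            raise ValueError(f"unsupported argument in this sketch: {arg}")

    db = copy.deepcopy(DB_OPTIONS[choice])
    # Value overrides are applied after the option has been chosen
    for field, value in value_overrides:
        if field not in db:
            raise KeyError(f"db.{field} is not in the config")
        db[field] = int(value) if value.isdigit() else value
    return {"db": db}


print(compose_db([])["db"]["account"])                                 # google
print(compose_db(["db=web2"])["db"]["account"])                        # clay
print(compose_db(["db=web2", "db.password=9999"])["db"]["password"])   # 9999
```

The sketch chooses the file first and edits values afterwards, so the selected option acts as a base that individual overrides then adjust.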
The above is a basic usage of Hydra, including how to write configuration files and modify parameters from the command line.
Adding Parameters During Execution with "+"
In the following example, no configuration file is specified, so Hydra creates an empty cfg object and passes it to the function decorated with @hydra.main().
We can add parameters at runtime by prefixing them with + on the command line.
This differs from a plain key=value override, which only changes a parameter that already exists: the + prefix adds a parameter that is not present in the configuration.
from omegaconf import DictConfig, OmegaConf
import hydra

@hydra.main(version_base=None)
def my_app(cfg: DictConfig) -> None:
    # cfg starts empty; it only contains what is added on the command line
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    my_app()
Execution:
python3 app.py +arg1=111 +arg2=222
Outputs:
arg1: 111
arg2: 222
Specifying a Configuration File
We can specify which configuration file to load by passing config_path and config_name to hydra.main().
- config_path: The directory where the configuration file is stored, relative to the Python file that declares @hydra.main()
- config_name: The name of the configuration file (the .yaml extension can be omitted)
from omegaconf import DictConfig, OmegaConf
import hydra

@hydra.main(version_base=None, config_path=".", config_name="config")
def my_app(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    my_app()
In this example, the configuration file being loaded is config.yaml, located in the same directory as app.py (config_path is resolved relative to the Python file, not the shell's working directory).
We can also add parameters using +.
python3 app.py +arg3=333
If a parameter already exists in the configuration file, we can directly override it.
python3 app.py arg2=333
Use ++ to add a parameter if it doesn’t exist, or override it if it does.
python3 app.py ++arg4=444
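The three forms (plain, +, and ++) differ only in how they treat a key that is already present or missing. The plain-Python sketch below models that rule on a flat dictionary. It is an illustration rather than Hydra's code: apply_cli_arg is a hypothetical helper, values are kept as strings, and the starting config (arg1 and arg2) is an assumed content for config.yaml.

```python
def apply_cli_arg(cfg: dict, arg: str) -> dict:
    """Return a copy of cfg after applying one 'key=value', '+key=value' or '++key=value'."""
    key, value = arg.split("=", 1)
    result = dict(cfg)
    if key.startswith("++"):
        # Add the key if it is missing, override it if it exists
        result[key[2:]] = value
    elif key.startswith("+"):
        name = key[1:]
        if name in result:
            raise ValueError(f"'{name}' already exists; drop the + or use ++")
        result[name] = value
    else:
        if key not in result:
            raise KeyError(f"'{key}' is not in the config; use +{key}=... to add it")
        result[key] = value
    return result


# Assumed content of config.yaml for this illustration
config = {"arg1": "111", "arg2": "222"}

print(apply_cli_arg(config, "+arg3=333"))   # adds arg3
print(apply_cli_arg(config, "arg2=333"))    # overrides the existing arg2
print(apply_cli_arg(config, "++arg4=444"))  # adds arg4 because it is missing
print(apply_cli_arg(config, "++arg2=999"))  # overrides arg2 because it exists
```

In this model, ++ is the only form that never fails, which makes it convenient in scripts where we do not know in advance whether the key is defined.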
Accessing Variables Using Attributes and Dictionaries
from omegaconf import DictConfig, OmegaConf
import hydra

@hydra.main(version_base=None, config_path="conf", config_name="config")
def my_app(cfg: DictConfig) -> None:
    print(cfg.node.arg1)        # attribute-style access
    print(cfg["node"]["arg1"])  # dictionary-style access

if __name__ == "__main__":
    my_app()
Setting Default Configuration Combinations
With the defaults list from the multiple-configuration example (db: web1), running the application without arguments prints the following:
python3 app.py
Output:
db:
  web_name: www.google.com
  account: google
  password: 1234
If we do not need db at all, we can remove it from the command line with the ~ prefix.
python3 app.py ~db
Output:
{}
If Our Default Configuration Includes a Fixed Combination and a Custom Parameter
Suppose db should use almost everything from the web1 configuration, but account needs to be changed to BERT.
In such a case, do we need to create another configuration file, web3, just for this single parameter?
Actually, we can add _self_ to the defaults list after db: web1 and set the custom parameter directly in config.yaml. Because _self_ comes last, the values in config.yaml override those loaded from web1.
config.yaml
defaults:
  - db: web1
  - _self_
db:
  account: BERT
Then run:
python3 app.py
Output:
db:
  web_name: www.google.com
  account: BERT
  password: 1234
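The effect of the position of _self_ can be pictured as a sequence of deep merges in which later entries win. The following plain-Python sketch models that idea. It is not Hydra's implementation: deep_merge and compose are hypothetical helpers, and sources is an in-memory stand-in for conf/db/web1.yaml and for the body of config.yaml.

```python
import copy


def deep_merge(base: dict, extra: dict) -> dict:
    """Merge extra into a copy of base; values from extra win on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def compose(defaults_order: list, sources: dict) -> dict:
    """Merge the sources in the order given by the defaults list."""
    cfg = {}
    for name in defaults_order:
        cfg = deep_merge(cfg, sources[name])
    return cfg


sources = {
    # Stand-in for conf/db/web1.yaml, placed under the db key
    "db/web1": {"db": {"web_name": "www.google.com", "account": "google", "password": 1234}},
    # Stand-in for the values written in config.yaml itself
    "_self_": {"db": {"account": "BERT"}},
}

self_last = compose(["db/web1", "_self_"], sources)
self_first = compose(["_self_", "db/web1"], sources)

print(self_last["db"]["account"])   # BERT
print(self_first["db"]["account"])  # google
```

With _self_ listed last, the account value from config.yaml replaces the one from web1 while web_name and password are kept. Listing _self_ first would let web1 overwrite it instead.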
This covers the Hydra features I use most often. I may add other useful features as I run into them in practice.