
Notes on Hydra: A Configuration Management Package

Last Updated on 2024-10-26 by Clay

Introduction

Hydra is an open-source Python framework designed to simplify the development of research code and other complex applications. Its key feature is the ability to compose a hierarchical configuration dynamically from several configuration files and to override any part of it from the command line.

The main features listed in the official documentation are as follows:

  • Hierarchical configuration composable from multiple sources
  • Configuration can be specified or overridden from the command line
  • Dynamic command-line tab completion
  • Run applications locally or launch them to run remotely
  • Run multiple jobs with different arguments with a single command

At the time of writing (2022-12-09), the latest Hydra release was 1.2, which supports Linux, macOS, and Windows.


Quick start

Installation

We can install Hydra using the pip package manager.

pip3 install hydra-core --upgrade



Basic example

Below is a basic example of Hydra in use. The actual program to be executed is app.py, and conf is the directory where our configuration files are stored (in later examples, we will have many configuration files).


Directory Structure

hydra_sample/
├── app.py
└── conf
    └── config.yaml

If you want to test Hydra, you can create an identical structure.


conf/config.yaml

This is a YAML configuration file with some settings for the website.

web:
  web_name: clay-atlas.com
  account: clay
  password: secret


app.py

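In the program we want to execute, the main logic must live inside a function decorated with @hydra.main(). Here the decorator receives config_path (the directory that holds the configuration files, relative to app.py) and config_name (the name of the configuration file; the .yaml extension can be omitted). version_base=None tells Hydra to use the defaults of the installed version. Hydra loads the configuration and passes it to the function as cfg.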

import hydra
from omegaconf import DictConfig, OmegaConf


@hydra.main(version_base=None, config_path="conf", config_name="config")
def my_app(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    my_app()


Execution Result

$ python3 app.py
web:
  web_name: clay-atlas.com
  account: clay
  password: secret


If we want to modify the parameters in the configuration file, we can pass in the desired values directly via the command line. For example, here I have changed account and password.

$ python3 app.py web.account=test web.password=1234
web:
  web_name: clay-atlas.com
  account: test
  password: 1234
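
To make the override syntax more concrete, here is a minimal pure-Python sketch of how a dotted key.path=value argument could be applied to a nested dictionary. It is not Hydra's implementation: apply_override is a hypothetical helper written for this note, and it keeps every value as a string, whereas Hydra has its own rules for parsing value types.

```python
from copy import deepcopy


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of `config` with one `a.b.c=value` override applied.

    Hypothetical helper for illustration only; Hydra's real override
    grammar is richer (typed values, lists, prefixes such as + and ~).
    """
    dotted_key, _, value = override.partition("=")
    result = deepcopy(config)
    node = result
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        node = node[name]  # raises KeyError if an intermediate key is missing
    if leaf not in node:
        raise KeyError(f"{dotted_key!r} is not in the config")
    node[leaf] = value  # values stay strings in this sketch
    return result


# Same structure as conf/config.yaml in the basic example.
config = {"web": {"web_name": "clay-atlas.com", "account": "clay", "password": "secret"}}
for arg in ["web.account=test", "web.password=1234"]:
    config = apply_override(config, arg)
print(config)
```

The sketch walks down the nested keys and replaces only the leaf, which is why web_name is left untouched in the output above.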


Multiple Configuration Files

With multiple configuration files, the directory structure becomes a little more complex:

hydra_sample/
├── app.py
└── conf
    ├── config.yaml
    └── db
        ├── web1.yaml
        └── web2.yaml


conf/config.yaml

First, the top-level configuration file only specifies one parameter: which configuration file under db to load.

Here, we set web1 as the default.

defaults:
  - db: web1



conf/db/web1.yaml

In web1, we configure Google’s website, account, and password.

web_name: www.google.com
account: google
password: 1234



conf/db/web2.yaml

In web2, we do the same as web1, except we use my personal website, account, and password instead.

web_name: clay-atlas.com
account: clay
password: secret



Execution

Next is the execution part. app.py remains unchanged, simply outputting the parameters.

$ python3 app.py 
db:
  web_name: www.google.com
  account: google
  password: 1234



We can see that the default parameters are those configured for Google. If we want to switch to a different configuration, we can simply specify db=web2 in the command line.

$ python3 app.py db=web2


Output:

db:
  web_name: clay-atlas.com
  account: clay
  password: secret
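
The idea behind a configuration group such as db can be sketched in plain Python: every file under conf/db is one option of the group, and the chosen option is placed under the db key. The dictionaries below are in-memory stand-ins for the YAML files, and compose is a hypothetical function for illustration, not part of Hydra's API.

```python
# In-memory stand-ins for conf/db/web1.yaml and conf/db/web2.yaml.
CONFIG_GROUPS = {
    "db": {
        "web1": {"web_name": "www.google.com", "account": "google", "password": 1234},
        "web2": {"web_name": "clay-atlas.com", "account": "clay", "password": "secret"},
    }
}

# Mirrors the `defaults` list in conf/config.yaml.
DEFAULTS = {"db": "web1"}


def compose(group_choices: dict) -> dict:
    """Build a config by placing each chosen option under its group name."""
    selected = {**DEFAULTS, **group_choices}  # command-line choices win
    composed = {}
    for group, option in selected.items():
        options = CONFIG_GROUPS[group]
        if option not in options:
            raise ValueError(f"unknown option {option!r} for group {group!r}")
        composed[group] = dict(options[option])
    return composed


print(compose({}))              # falls back to the default, web1
print(compose({"db": "web2"}))  # similar in spirit to passing db=web2
```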


Besides setting db to use a specific configuration file, we can directly specify parameters under db as well.

$ python3 app.py db=web2 db.password=9999


Output:

db:
  web_name: clay-atlas.com
  account: clay
  password: 9999


The above is a basic usage of Hydra, including how to write configuration files and modify parameters from the command line.


Adding Parameters During Execution with "+"

In the following example, hydra.main() is given no config_path or config_name, so Hydra passes an empty cfg object to the decorated function.
We can then add parameters at runtime by prefixing them with + on the command line.

This differs from a plain key=value override, which only changes a parameter that already exists in the configuration; + adds a parameter that is not yet present.

from omegaconf import DictConfig, OmegaConf
import hydra


@hydra.main(version_base=None)
def my_app(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    my_app()


Execution:

python3 app.py +arg1=111 +arg2=222


Output:

arg1: 111
arg2: 222

Specifying a Configuration File

We can specify the path of the configuration file to load by passing config_path and config_name to hydra.main().

  • config_path: The directory that contains the configuration files, relative to the Python file that declares @hydra.main()
  • config_name: The name of the configuration file (the .yaml extension can be omitted)

from omegaconf import DictConfig, OmegaConf
import hydra


@hydra.main(version_base=None, config_path=".", config_name="config")
def my_app(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    my_app()


In this example, Hydra loads config.yaml from the same directory as app.py, because a relative config_path is resolved against the file that declares @hydra.main() rather than the shell's working directory.

We can also add parameters using +.

python3 app.py +arg3=333


If a parameter already exists in the configuration file, we can directly override it.

python3 app.py arg2=333


Use ++ to add a parameter if it doesn’t exist, or override it if it does.

python3 app.py ++arg4=444
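
The three forms (plain, +, and ++) can be summarized in a small pure-Python sketch. It assumes, as I understand Hydra's behavior, that a plain override requires the key to exist, that + requires the key to be absent, and that ++ accepts either case. apply_arg and the contents of base are hypothetical, the sketch handles top-level keys only, and the error messages are its own rather than Hydra's.

```python
def apply_arg(config: dict, arg: str) -> dict:
    """Apply one `key=value`, `+key=value`, or `++key=value` argument.

    Illustrative only: top-level keys, string values, and error messages
    that belong to this sketch rather than to Hydra.
    """
    result = dict(config)
    key, _, value = arg.partition("=")
    if key.startswith("++"):
        result[key[2:]] = value  # add or override, whichever applies
    elif key.startswith("+"):
        name = key[1:]
        if name in result:
            raise KeyError(f"{name!r} already exists; use {name}=... or ++{name}=...")
        result[name] = value
    else:
        if key not in result:
            raise KeyError(f"{key!r} is not in the config; use +{key}=... to add it")
        result[key] = value
    return result


base = {"arg1": "111", "arg2": "222"}  # hypothetical contents of config.yaml
print(apply_arg(base, "+arg3=333"))   # adds a new key
print(apply_arg(base, "arg2=333"))    # overrides an existing key
print(apply_arg(base, "++arg4=444"))  # adds, because arg4 is absent
```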

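Accessing Variables Using Attributes and Dictionaries

Values in cfg can be read either as attributes or with dictionary-style keys. The example below assumes that the configuration file defines arg1 under node.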

from omegaconf import DictConfig, OmegaConf
import hydra


@hydra.main(version_base=None, config_path="conf", config_name="config")
def my_app(cfg: DictConfig) -> None:
    print(cfg.node.arg1)
    print(cfg["node"]["arg1"])


if __name__ == "__main__":
    my_app()
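
For intuition about why both access styles return the same value, here is a toy dictionary subclass that supports attribute access. It is only an illustration of the idea: AttrDict is a hypothetical class written for this note, it is not how OmegaConf's DictConfig is implemented, and the config contents are made up.

```python
class AttrDict(dict):
    """A dict whose keys can also be read as attributes (illustration only)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        value = super().__getitem__(key)
        # Wrap nested plain dicts so chained access keeps working.
        if isinstance(value, dict) and not isinstance(value, AttrDict):
            return AttrDict(value)
        return value


cfg = AttrDict({"node": {"arg1": 111}})  # hypothetical config contents
print(cfg.node.arg1)        # attribute style
print(cfg["node"]["arg1"])  # dictionary style
```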


Setting Default Configuration Combinations

With the defaults list from the multiple-configuration example (db: web1), running the program without arguments prints the web1 settings:

python3 app.py


Output:

db:
  web_name: www.google.com
  account: google
  password: 1234


If we do not need the db group at all, we can remove it from the defaults with the ~ prefix on the command line.

python3 app.py ~db


Output:

{}

Combining a Default Configuration with a Custom Parameter

Suppose db should use almost everything from the web1 configuration, but account needs to be changed to BERT.

Do we need to create another configuration file, web3, just for this single parameter?

No. We can add _self_ to the defaults list and set the custom parameter directly in config.yaml. _self_ marks the position at which the values written in config.yaml itself are merged; placing it after db: web1 lets those values override the ones loaded from web1.

config.yaml

defaults:
  - db: web1
  - _self_

db:
  account: BERT


Then run:

python3 app.py


Output:

db:
  web_name: www.google.com
  account: BERT
  password: 1234
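
The effect of where _self_ sits can be sketched as an ordered deep merge in which later entries override earlier ones. This is a simplified model rather than Hydra's code: deep_merge and compose are hypothetical helpers, and the dictionaries stand in for web1.yaml and for the keys written in config.yaml.

```python
def deep_merge(base: dict, update: dict) -> dict:
    """Recursively merge `update` into a copy of `base`; `update` wins on conflicts."""
    merged = dict(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


web1 = {"web_name": "www.google.com", "account": "google", "password": 1234}
own_values = {"db": {"account": "BERT"}}  # the keys written in config.yaml itself


def compose(order: list) -> dict:
    """Merge the entries in defaults-list order; later entries override earlier ones."""
    sources = {"db: web1": {"db": web1}, "_self_": own_values}
    config = {}
    for entry in order:
        config = deep_merge(config, sources[entry])
    return config


print(compose(["db: web1", "_self_"]))  # account ends up as BERT
print(compose(["_self_", "db: web1"]))  # account stays google
```

In this model, listing _self_ last gives the values in config.yaml the final word, which matches the output shown above.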


This covers the commonly used features of Hydra. If there are additional useful features, they may become apparent as I encounter practical cases.

