YACS is a very lightweight (the core implementation is no more than 250 lines) yet sufficiently powerful Python configuration utility that requires no third-party packages. I have used it in several of my machine-learning and deep-learning projects, and it has worked well and reliably.
Since YACS is so simple, we recommend just copying the single yacs.py file to your project. That is it. No tedious package installation is needed.
If you wish to load configurations from, or dump them to, a yaml file, PyYAML is required.
For a regular-scale project, developers usually use a configuration file to define some default behaviors of their programs.
Take a machine learning project as an example: the configuration file defines the mode in which the experiment runs, which model is used for the task, and how the data is organized:
# default_config.yaml
mode: train
model:
  backbone: vgg19
data:
  source: dir/to/data/*.jpg
  batch_size: 32
YACS uses a Config object to implement all necessary interaction with, and manipulation of, the configurations.
Let's start by loading these configurations from the yaml and printing them:
from yacs import Config
cfg = Config('default_config.yaml')
cfg.print()
In the terminal, it shows something like this:
mode: train
model:
  backbone: vgg19
data:
  source: dir/to/data/*.jpg
  batch_size: 32
Config is actually a subclass of the built-in dict, so we can access its attributes by key or, more compactly, in a dotted-dict way (recommended):
mode = cfg['mode'] # 'train'
mode = cfg.mode # 'train'
For inputs with nested structures, Config objects are created recursively, so you can reach nested attributes by chaining the dots:
bs = cfg.data.batch_size # 32
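The dotted access described above can be imitated with a few lines of standard-library Python. The following is only a minimal sketch under stated assumptions: DotDict is a hypothetical name used for illustration, and this is not the actual YACS implementation.

```python
class DotDict(dict):
    """Hypothetical stand-in for Config: a dict with attribute-style access."""

    def __init__(self, mapping=None):
        super().__init__()
        for key, value in (mapping or {}).items():
            # Wrap nested dicts so that dotted access works at every level.
            self[key] = DotDict(value) if isinstance(value, dict) else value

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # so built-in dict methods such as keys() keep working.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


cfg = DotDict({
    'mode': 'train',
    'model': {'backbone': 'vgg19'},
    'data': {'source': 'dir/to/data/*.jpg', 'batch_size': 32},
})

print(cfg['mode'])          # train
print(cfg.mode)             # train
print(cfg.data.batch_size)  # 32
```

Because the wrapper is still a dict, key-based access and attribute-based access read the same underlying storage.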
For safety, attributes cannot be modified or deleted by default:
cfg.data.batch_size = 512 # AttributeError: attempted to modify an immutable Config
Instead, one has to use the unfreeze() context manager to make any modification:
with cfg.unfreeze():
    cfg.data.batch_size = 512
Similarly, to add a new attribute or a child object to the current Config object:
with cfg.unfreeze():
    cfg.training = Config({'optimizer': 'Adam'})
cfg.print()
mode: train
model:
  backbone: vgg19
data:
  source: dir/to/data/*.jpg
  batch_size: 512
training:
  optimizer: Adam
Here, by typing cfg.training = Config({'optimizer': 'Adam'}), we instantiate a temporary Config object from a dict and add it as the training attribute of cfg.
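To illustrate the freeze-by-default pattern in isolation, here is a minimal sketch built on contextlib.contextmanager. The FrozenNamespace class is a hypothetical, single-level example and is not how YACS itself is written; it only shows that a context manager can lift the write protection temporarily and restore it afterwards.

```python
from contextlib import contextmanager


class FrozenNamespace:
    """Hypothetical read-only attribute container, not the YACS Config class."""

    def __init__(self, **attrs):
        # Bypass our own __setattr__ while the object is being built.
        object.__setattr__(self, '_frozen', False)
        for name, value in attrs.items():
            setattr(self, name, value)
        object.__setattr__(self, '_frozen', True)

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError('attempted to modify an immutable object')
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if self._frozen:
            raise AttributeError('attempted to modify an immutable object')
        object.__delattr__(self, name)

    @contextmanager
    def unfreeze(self):
        object.__setattr__(self, '_frozen', False)
        try:
            yield self
        finally:
            # Re-freeze even if the body of the with-block raised.
            object.__setattr__(self, '_frozen', True)


data = FrozenNamespace(batch_size=32)

try:
    data.batch_size = 512
except AttributeError as err:
    print(err)  # attempted to modify an immutable object

with data.unfreeze():
    data.batch_size = 512

print(data.batch_size)  # 512
```

Restoring the flag in a finally clause means an exception inside the with-block cannot leave the object writable by accident.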
In a machine learning project, hyper-parameters and other settings vary from one training or inference run to the next, so in addition to the default configurations, developers often need a temporary config file that overrides parts of the defaults.
Assume we are now using another yaml to store these user-specific configurations:
# user_config.yaml
model:
  backbone: resnet50
data:
  batch_size: 128
Now we can use merge() to merge these temporary configurations into the default ones, overriding the duplicates while keeping the others unchanged:
cfg = Config('default_config.yaml')
cfg.merge('user_config.yaml')
cfg.print()
mode: train
model:
  backbone: resnet50
data:
  source: dir/to/data/*.jpg
  batch_size: 128
If the user-specific configurations contain attributes that are not in cfg, use exclusive=False to state explicitly that you wish to add new attributes:
cfg.merge('user_config.yaml', exclusive=False)
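The override-while-keeping-the-rest behaviour is a recursive dictionary merge. The sketch below works on plain dicts and is not the YACS source; the merge function is a hypothetical helper, and it assumes that an unknown key is rejected with an error when exclusive is true.

```python
def merge(base, override, exclusive=True):
    """Recursively merge `override` into `base` in place and return `base`.

    Hypothetical helper on plain dicts; `exclusive` imitates the flag
    described above (assumption: unknown keys raise when it is true).
    """
    for key, value in override.items():
        if key not in base:
            if exclusive:
                raise KeyError(f'unknown key: {key!r}')
            base[key] = value
        elif isinstance(base[key], dict) and isinstance(value, dict):
            # Both sides are nested: descend instead of replacing the subtree.
            merge(base[key], value, exclusive)
        else:
            base[key] = value
    return base


defaults = {
    'mode': 'train',
    'model': {'backbone': 'vgg19'},
    'data': {'source': 'dir/to/data/*.jpg', 'batch_size': 32},
}
user = {'model': {'backbone': 'resnet50'}, 'data': {'batch_size': 128}}

merge(defaults, user)
print(defaults['model']['backbone'])  # resnet50
print(defaults['data'])  # {'source': 'dir/to/data/*.jpg', 'batch_size': 128}

merge(defaults, {'seed': 0}, exclusive=False)  # a new key is accepted
print(defaults['seed'])  # 0
```

Descending into nested dicts rather than assigning them wholesale is what keeps data.source intact when only data.batch_size is overridden.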
Let's see another example. Assume there is an optimizer attribute in the default yaml, to which we assign three child attributes: optimizer_name, lr, and momentum:
# sgd.yaml
mode: train
optimizer:
  optimizer_name: SGD
  lr: 0.01
  momentum: 0.9
Now in one experiment, we wish to replace SGD with Adam optimizer, so we can create a temporary yaml and merge it like this:
# adam.yaml
optimizer:
  optimizer_name: Adam
  lr: 1.0E-5
cfg = Config('sgd.yaml')
cfg.merge('adam.yaml')
cfg.print()
mode: train
optimizer:
  optimizer_name: Adam
  lr: 1e-05
  momentum: 0.9
Note that by default, non-conflicting attributes are kept unchanged after merging, so in this case the cfg.optimizer.momentum attribute survives the merge. This is not our intention, because Adam does not require a momentum parameter.
When we would like to completely replace an attribute (cfg.optimizer here) and all its child attributes, use keep_existed_attr=False to make cfg neater:
cfg = Config('sgd.yaml')
adam_cfg = Config('adam.yaml')
cfg.optimizer.merge(adam_cfg.optimizer, keep_existed_attr=False)
cfg.print()
mode: train
optimizer:
  optimizer_name: Adam
  lr: 1e-05
Now cfg.optimizer.momentum is gone because we explicitly asked not to keep the old, non-conflicting attributes.
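The difference between the two merge modes can be reproduced on a single flat dict. This is a rough sketch, not the library's code: merge_node and its keep_existing parameter are hypothetical names that only mimic the effect of keep_existed_attr described above.

```python
def merge_node(base, override, keep_existing=True):
    """Merge one flat dict into another; hypothetical stand-in for one config node."""
    if not keep_existing:
        # Drop every key that the override does not mention.
        for key in list(base):
            if key not in override:
                del base[key]
    base.update(override)
    return base


sgd = {'optimizer_name': 'SGD', 'lr': 0.01, 'momentum': 0.9}
adam = {'optimizer_name': 'Adam', 'lr': 1.0e-5}

kept = merge_node(dict(sgd), adam)
print(kept)
# {'optimizer_name': 'Adam', 'lr': 1e-05, 'momentum': 0.9}

replaced = merge_node(dict(sgd), adam, keep_existing=False)
print(replaced)
# {'optimizer_name': 'Adam', 'lr': 1e-05}
```

Each call works on dict(sgd), a shallow copy, so the original sgd settings stay available for comparison.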
One appealing feature of Hydra is that it allows users to control their programs' running options in the terminal, with the help of the argparse package.
YACS also allows initializing or merging configurations from the command line.
Config's to_parser() method offers a way to automatically generate an argparse.ArgumentParser object, whose arguments are converted from the key-attribute pairs. For a nested attribute, the keys from each level of the hierarchy are concatenated into a single argument name, with . as the separator.
Let's use default_config.yaml as an example again. Instead of explicitly creating an argument parser such as
import argparse


def create_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('--mode', type=str, default='train')
    parser.add_argument('--model.backbone', type=str, default='vgg19')
    parser.add_argument('--data.source', type=str, default='dir/to/data/*.jpg')
    parser.add_argument('--data.batch_size', type=int, default=32)
    return parser
we do this in an easier way:
cfg = Config('default_config.yaml')
parser = cfg.to_parser()
Then you can place this parser at the entry point of your program to accept arguments from the terminal, and merge the parsed arguments (stored in an argparse.Namespace object) back into cfg:
# main.py
from yacs import Config


def main(cfg):
    cfg.print()
    # core program ...


if __name__ == '__main__':
    cfg = Config('default_config.yaml')
    parser = cfg.to_parser()
    cfg.merge(parser.parse_args())  # merge from an argparse.Namespace object
    main(cfg)
Finally, we run main.py in the terminal with some extra arguments:
$ python main.py --model.backbone resnet101 --data.batch_size 1024
and get the following result:
mode: train
model:
  backbone: resnet101
data:
  source: dir/to/data/*.jpg
  batch_size: 1024
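For readers curious how such a round trip might work, the sketch below builds a parser from a nested dict and writes the parsed values back, using only argparse. The helper names flatten, build_parser, and apply_namespace are hypothetical, and the code is an illustration of the idea rather than the to_parser() implementation.

```python
import argparse


def flatten(nested, prefix=''):
    """Yield (dotted_key, value) pairs for every leaf of a nested dict."""
    for key, value in nested.items():
        dotted = f'{prefix}{key}'
        if isinstance(value, dict):
            yield from flatten(value, prefix=f'{dotted}.')
        else:
            yield dotted, value


def build_parser(nested):
    """Create one --dotted.key option per leaf, typed after its default value."""
    parser = argparse.ArgumentParser()
    for dotted, value in flatten(nested):
        # Caveat: type(value) is a simplification that suits str, int and
        # float defaults; bool or None defaults would need special handling.
        parser.add_argument(f'--{dotted}', type=type(value), default=value)
    return parser


def apply_namespace(nested, namespace):
    """Write parsed values back into the nested dict, splitting keys on dots."""
    for dotted, value in vars(namespace).items():
        *parents, leaf = dotted.split('.')
        node = nested
        for name in parents:
            node = node[name]
        node[leaf] = value
    return nested


defaults = {
    'mode': 'train',
    'model': {'backbone': 'vgg19'},
    'data': {'source': 'dir/to/data/*.jpg', 'batch_size': 32},
}

parser = build_parser(defaults)
# An explicit list stands in for the terminal arguments shown above.
args = parser.parse_args(
    ['--model.backbone', 'resnet101', '--data.batch_size', '1024'])
apply_namespace(defaults, args)

print(defaults['model']['backbone'])  # resnet101
print(defaults['data']['batch_size'])  # 1024
```

argparse keeps the dots when it derives the destination name from an option such as --model.backbone, which is why the namespace keys can be split on . to find the right nested node.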
Config provides the following methods to dump or convert your configurations to other data types:
- dump(yaml_path) dumps the configurations into a yaml file;
- copy() creates a deep copy of the current Config object;
- to_dict() converts a Config object into a regular nested dict;
- string() converts a Config object into a string with pretty format.
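As a rough illustration of what deep copying and pretty formatting involve, here is a standard-library sketch on plain dicts. The to_text helper is hypothetical and produces a simple indented layout similar to the printouts above; it does not reproduce the exact output of string() or dump().

```python
import copy


def to_text(nested, indent=0):
    """Render a nested dict as indented 'key: value' lines (hypothetical helper)."""
    lines = []
    pad = ' ' * indent
    for key, value in nested.items():
        if isinstance(value, dict):
            lines.append(f'{pad}{key}:')
            lines.append(to_text(value, indent + 2))
        else:
            lines.append(f'{pad}{key}: {value}')
    return '\n'.join(lines)


cfg = {'mode': 'train', 'data': {'batch_size': 32}}

# A deep copy duplicates nested levels too, so edits do not leak back.
snapshot = copy.deepcopy(cfg)
snapshot['data']['batch_size'] = 1

print(cfg['data']['batch_size'])  # 32
print(to_text(cfg))
```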
See the examples directory for more practical usage.
YACS borrows part of its design from rbgirshick's yacs and OmegaConf.
Copyright 2021 Qiu Jueqin.
Licensed under MIT.