- Author: Fantine Huot
After cloning the repository, make sure to run the following commands to initialize and update the submodules.
git submodule init
git submodule update
- TensorFlow
- bin: Scripts to run jobs.
- config: Configuration files.
- log: Log files.
- trainer: Machine learning model trainer.
This repository provides a parameterized, modular framework for creating and running ML jobs.
To train a machine learning model, use the following command:
bin/train.sh model_config dataset [label]
- model_config: Name of the ML model configuration to use. This should correspond to a configuration file named config/model_config.sh.
- dataset: Dataset identifier. Check the variables datapath, train_file, and eval_file in bin/train.sh to ensure that this maps to the correct input data.
- label: Optional label to add to the job name.
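As a rough illustration of what checking the dataset mapping involves, the sketch below shows one way a dataset identifier could be mapped to datapath, train_file, and eval_file. It is not the actual contents of bin/train.sh: the function name set_dataset_paths, the identifier example_small, and all paths are hypothetical.

```shell
# Hypothetical sketch, NOT the real bin/train.sh.
# Maps a dataset identifier to the three variables named above.
set_dataset_paths() {
  dataset="$1"
  case "$dataset" in
    example_small)
      # Hypothetical location and file names.
      datapath="data/example_small"
      train_file="${datapath}/train.tfrecord"
      eval_file="${datapath}/eval.tfrecord"
      ;;
    *)
      # Fail loudly instead of training on the wrong data.
      echo "Unknown dataset: ${dataset}" >&2
      return 1
      ;;
  esac
}

set_dataset_paths example_small
echo "train: ${train_file}"
echo "eval:  ${eval_file}"
```

Rejecting unknown identifiers is a design choice in this sketch: it surfaces a mistyped dataset name before a job is launched.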
Parameters for an ML job can be set by creating a corresponding configuration
file: config/your_model_config.sh.
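A configuration file of this kind might look like the sketch below. Only the model argument is named in this README; every other variable, and the way the values are assembled into trainer flags, is an assumption made for illustration. Look at the existing files in config for the parameters the framework actually reads.

```shell
# config/your_model_config.sh (hypothetical sketch)
# Only `model` is mentioned in this README; the rest is illustrative.
model="your_model"       # name of the model under trainer/model
batch_size=32            # hypothetical hyperparameter
learning_rate=0.001      # hypothetical hyperparameter
train_steps=10000        # hypothetical training length

# Hypothetical: how a launcher script could turn these values into flags.
trainer_args="--model=${model} --batch_size=${batch_size}"
trainer_args="${trainer_args} --learning_rate=${learning_rate}"
trainer_args="${trainer_args} --train_steps=${train_steps}"
echo "${trainer_args}"
```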
- Create a new your_model.py file inside the trainer/model folder. Look at other models inside the folder for examples.
- Reference your new model in trainer/model/__init__.py.
- Set the model argument to your new model's name in your model configuration file config/your_model_config.sh.
The hyperparameters are tuned using Bayesian optimization.
To tune the hyperparameters for a machine learning model, use the following command:
bin/tunehp.sh model_config dataset
- model_config: Name of the ML model configuration to use. This should correspond to a configuration file named config/model_config.sh.
- dataset: Dataset identifier. Check the variables datapath, train_file, and eval_file in bin/train.sh to ensure that this maps to the correct input data.
You can define the domain to explore for hyperparameter tuning by creating a
corresponding configuration file: config/your_model_config_hptuning.yaml.
Look at other hyperparameter tuning configuration files for examples.