This document describes a recommended structure for organizing machine learning projects. The repository is split into core directories for data, experiments, shell scripts, and source code. The source directory holds the model code, with preset modules for different network architectures, and each experiment saves its own configuration files in its own directory. Shell scripts drive training, inference, and experiment management. This structure aims to make projects easier to manage and reproduce, and to make new architectures easy to test.
2. If you…
֎ have a hard time managing a bunch of experiments
֎ always forget the exact configuration used in a specific experiment
֎ are annoyed at having to copy all the code just to test a new architecture
3. Repository organization (core)
data/ Training data
exp/ Experimental inputs and outputs
scripts/ Shell scripts for training, testing, etc.
src/ Source code
LICENSE.txt License file
README.md Readme file
requirement.txt Requirements file
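The core layout above can be scaffolded in a few lines. A minimal sketch (directory and file names follow the listing; the project name is a placeholder):

```python
from pathlib import Path

def scaffold(root: str) -> None:
    """Create the core repository layout: data/, exp/, scripts/, src/,
    plus the top-level license, readme, and requirements files."""
    base = Path(root)
    for d in ("data", "exp", "scripts", "src"):
        (base / d).mkdir(parents=True, exist_ok=True)
    for f in ("LICENSE.txt", "README.md", "requirement.txt"):
        (base / f).touch()

scaffold("my_project")
```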
5. Source code (src/)
musegan/ Model source code
__init__.py File to make it a package
inference.py Script for inference
interpolation.py Script for interpolation
process_data.py Script for data preprocessing
train.py Script for training
7. Model source code (src/musegan/)
presets/ Preset network architectures
__init__.py File to make it a package
config.py System configuration file
data.py Data loader
default_config.yaml Default configurations
default_params.yaml Default parameters
io_utils.py I/O utilities
losses.py Loss functions
metrics.py Metrics
model.py Main model class
utils.py Utilities
Define your model in model.py, the main model class.
The preset network architectures live under presets/, one module per architecture:
presets/
    __init__.py
    generator/
        __init__.py
        default.py
        ablated.py
    discriminator/
        __init__.py
        default.py
        ablated.py
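One way such presets can be made drop-in is to map a name (e.g., from params.yaml) to a builder function, so adding an architecture means adding one module and one registry entry. A hypothetical sketch (these builders are stand-ins, not the actual MuseGAN code):

```python
# Hypothetical preset registry: each preset module exposes a builder;
# new architectures are added without touching the rest of the code.

def default_generator(latent_dim):
    """Stand-in for presets/generator/default.py."""
    return {"arch": "default", "latent_dim": latent_dim}

def ablated_generator(latent_dim):
    """Stand-in for presets/generator/ablated.py."""
    return {"arch": "ablated", "latent_dim": latent_dim}

GENERATOR_PRESETS = {
    "default": default_generator,
    "ablated": ablated_generator,
}

def build_generator(name, **kwargs):
    """Look up a preset by the name given in the parameter file."""
    return GENERATOR_PRESETS[name](**kwargs)

net = build_generator("ablated", latent_dim=128)
```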
For data.py, the data loader, high-level APIs such as tf.data or torch.utils.data are recommended for flexibility.
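The map-style interface that torch.utils.data.Dataset expects is just __len__ and __getitem__; a DataLoader then handles batching, shuffling, and prefetching. The sketch below illustrates that contract in plain Python (so it runs without torch installed); the class and data are illustrative only:

```python
# Minimal illustration of the map-style dataset interface that
# torch.utils.data.Dataset expects: __len__ and __getitem__.

class ToyDataset:
    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]

def batches(dataset, batch_size):
    """What a DataLoader does at its core: yield fixed-size batches."""
    n = len(dataset)
    for start in range(0, n, batch_size):
        yield [dataset[i] for i in range(start, min(start + batch_size, n))]

ds = ToyDataset(list(range(10)))
all_batches = list(batches(ds, 4))
```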
default_params.yaml defines the model; default_config.yaml defines the settings for training/inference. Both are copied to the experiment directory when setting up a new experiment, so each experiment keeps its own editable copy.
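The payoff of copying the defaults is that local edits override them without touching the source tree. The merge itself is trivial; shown here with plain dicts (in practice both sides would come from yaml.safe_load, and the parameter names below are made up for illustration):

```python
def load_settings(defaults, overrides):
    """Defaults updated by the experiment's local copy (shallow merge)."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged

# Hypothetical contents of default_params.yaml and exp/<name>/params.yaml.
DEFAULT_PARAMS = {"latent_dim": 128, "beat_resolution": 12}
exp_params = {"latent_dim": 64}  # the one value this experiment changes

params = load_settings(DEFAULT_PARAMS, exp_params)
```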
config.py (for developers only) defines rarely-changed configuration variables, for example the logging level/format and the prefetch/buffer size for the data loader.
The Python scripts at the top of src/ (process_data.py, train.py, inference.py, and interpolation.py) are meant to be called by the shell scripts in scripts/.
All the experimental outputs (logs, samples, and checkpoints) are saved under exp/.
16. Shell scripts (scripts/)
download_data.sh Download training data
download_models.sh Download pretrained models
process_data.sh Process the data
rerun_exp.sh Rerun the experiment
run_exp.sh Run the experiment
run_inference.sh Run the inference
run_interpolation.sh Run the interpolation
run_train.sh Run the training
setup_exp.sh Set up the experiment
17. Experimenting
֎ Create an experiment directory (under exp/)
One experiment compares n different settings
Examples: default (the default setting), net_archs, streams, binary_neurons
֎ Set up the experiment items (run setup_exp.sh)
Run it n times if you intend to compare n different settings in this experiment
Each experiment item (i.e., one setting) has its own directory
֎ Modify the configuration and parameters for each experiment item
Edit config.yaml and params.yaml in each experiment item's directory
֎ Run the experiment (run run_exp.sh for each experiment item)
18. Set up an experiment (scripts/setup_exp.sh)
֎ Given
[exp_name]
[exp_note]
֎ Do
Create an experiment directory named [exp_name] under exp/
Copy the default configuration and parameter files to the experiment directory
Write [exp_note] to a text file in the experiment directory
Examples: 'Compare different network architectures' and 'Compare different types of binary neurons'
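The shell script itself is not shown in the slides; as a sketch, the same steps in Python (assuming the default YAML files live in src/musegan/, as in the listing earlier, and hypothetical file names for the copies and the note):

```python
import shutil
from pathlib import Path

def setup_exp(exp_name, exp_note, repo_root="."):
    """Sketch of setup_exp.sh: create exp/<exp_name>/ and seed it
    with the default configuration and parameter files."""
    root = Path(repo_root)
    exp_dir = root / "exp" / exp_name
    exp_dir.mkdir(parents=True, exist_ok=True)
    # Copy the defaults so the experiment owns an editable local copy.
    defaults = root / "src" / "musegan"
    shutil.copy(defaults / "default_config.yaml", exp_dir / "config.yaml")
    shutil.copy(defaults / "default_params.yaml", exp_dir / "params.yaml")
    # Record what this experiment is about.
    (exp_dir / "note.txt").write_text(exp_note)
    return exp_dir
```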
19. Run an experiment (scripts/run_exp.sh)
֎ Given
[exp_dir]
[gpu_num]
֎ Do
Automatically search for config.yaml and params.yaml in [exp_dir]
Run the scripts in a specific order; for example, a typical GAN experiment might look like
run_train.sh
run_inference.sh
run_interpolation.sh
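The same logic can be sketched in Python: verify the experiment directory holds its settings, then build the stage commands in order (each would be handed to subprocess.run; the argument convention shown is an assumption, not taken from the actual script):

```python
from pathlib import Path

# Fixed stage order for a typical GAN experiment, per the slide above.
STAGES = ["run_train.sh", "run_inference.sh", "run_interpolation.sh"]

def plan_exp(exp_dir, gpu_num=0):
    """Sketch of run_exp.sh: return the commands it would execute,
    in order, after checking that the experiment's settings exist."""
    exp_dir = Path(exp_dir)
    for required in ("config.yaml", "params.yaml"):
        if not (exp_dir / required).is_file():
            raise FileNotFoundError(exp_dir / required)
    return [["scripts/" + s, str(exp_dir), str(gpu_num)] for s in STAGES]
```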
rerun_exp.sh removes the outputs but keeps the configuration and parameter files, then runs the experiment again.
The training data and pretrained models are typically large and thus hosted somewhere else; download_data.sh and download_models.sh fetch them.
The data-processing script then processes the downloaded data for training.
The data/ and exp/ directories should be listed in .gitignore.
LICENSE.txt is required for others to use your code; see https://choosealicense.com/ to choose a proper open source license.
requirement.txt lists the dependencies, for reproducibility.
26. Repository organization (complete)
data/ Training data
docs/ Website contents
exp/ Experimental inputs and outputs
scripts/ Shell scripts for training, testing, etc.
src/ Source code
LICENSE.txt License file
Pipfile Dependency files
Pipfile.lock Dependency files
README.md Readme file
requirement.txt Requirements file
Using pipenv for packaging is recommended; it produces the Pipfile and Pipfile.lock dependency files.
docs/ holds the website contents; GitHub Pages is recommended for simplicity.
29. Benefits
֎ Easy to manage lots of experiments
Each experiment has its own directory
Configuration and parameters used in each experiment are saved
Configuration and parameters are loaded locally (no need to modify the source code)
֎ Easy to examine new network architectures
Simply add a new preset to the preset directory
No need to modify other source code
30. Thank you for your attention! See an example project using this template: MuseGAN