Optimizing Models
Optimizing models looks much like Training Models in code. The difference is that we now let the Engine experiment with different model and training configurations to find the best model for our system.
Optimization Script
We will optimize a model using the following code. You can paste it into your editor and then follow the explanation below:
from onyxengine.modeling import (
    ModelSimulatorConfig,
    State,
    MLPOptConfig,
    RNNOptConfig,
    TransformerOptConfig,
    AdamWOptConfig,
    SGDOptConfig,
    CosineDecayWithWarmupOptConfig,
    CosineAnnealingWarmRestartsOptConfig,
    OptimizationConfig
)
import onyxengine as onyx
# Model sim config (used across all trials)
sim_config = ModelSimulatorConfig(
    outputs=['acceleration'],
    states=[
        State(name='velocity', relation='derivative', parent='acceleration'),
        State(name='position', relation='derivative', parent='velocity'),
    ],
    controls=['brake_input'],
    dt=0.0025
)
# Model optimization configs
mlp_opt = MLPOptConfig(
    sim_config=sim_config,
    num_inputs=sim_config.num_inputs,
    num_outputs=sim_config.num_outputs,
    sequence_length={"select": [1, 2, 4, 5, 6, 8, 10]},
    hidden_layers={"range": [2, 4, 1]},
    hidden_size={"select": [12, 24, 32, 64, 128]},
    activation={"select": ['relu', 'tanh']},
    dropout={"range": [0.0, 0.4, 0.1]},
    bias=True
)
rnn_opt = RNNOptConfig(
    sim_config=sim_config,
    num_inputs=sim_config.num_inputs,
    num_outputs=sim_config.num_outputs,
    rnn_type={"select": ['RNN', 'LSTM', 'GRU']},
    sequence_length={"select": [1, 2, 4, 5, 6, 8, 10, 12, 14, 15]},
    hidden_layers={"range": [2, 4, 1]},
    hidden_size={"select": [12, 24, 32, 64, 128]},
    dropout={"range": [0.0, 0.4, 0.1]},
    bias=True
)
transformer_opt = TransformerOptConfig(
    sim_config=sim_config,
    num_inputs=sim_config.num_inputs,
    num_outputs=sim_config.num_outputs,
    sequence_length={"select": [1, 2, 4, 5, 6, 8, 10, 12, 14, 15]},
    n_layer={"range": [2, 4, 1]},
    n_head={"range": [2, 10, 2]},
    n_embd={"select": [12, 24, 32, 64, 128]},
    dropout={"range": [0.0, 0.4, 0.1]},
    bias=True
)
# Optimizer configs
adamw_opt = AdamWOptConfig(
    lr={"select": [1e-5, 5e-5, 1e-4, 3e-4, 5e-4, 8e-4, 1e-3, 5e-3, 1e-2]},
    weight_decay={"select": [1e-4, 1e-3, 1e-2, 1e-1]}
)
sgd_opt = SGDOptConfig(
    lr={"select": [1e-5, 5e-5, 1e-4, 3e-4, 5e-4, 8e-4, 1e-3, 5e-3, 1e-2]},
    weight_decay={"select": [1e-4, 1e-3, 1e-2, 1e-1]},
    momentum={"select": [0, 0.8, 0.9, 0.95, 0.99]}
)
# Learning rate scheduler configs
cos_decay_opt = CosineDecayWithWarmupOptConfig(
    max_lr={"select": [1e-4, 3e-4, 5e-4, 8e-4, 1e-3, 3e-3, 5e-3]},
    min_lr={"select": [1e-6, 5e-6, 1e-5, 3e-5, 5e-5, 8e-5, 1e-4]},
    warmup_iters={"select": [50, 100, 200, 400, 800]},
    decay_iters={"select": [500, 1000, 2000, 4000, 8000]}
)
cos_anneal_opt = CosineAnnealingWarmRestartsOptConfig(
    T_0={"select": [200, 500, 1000, 2000, 5000, 10000]},
    T_mult={"select": [1, 2, 3]},
    eta_min={"select": [1e-6, 5e-6, 1e-5, 3e-5, 5e-5, 8e-5, 1e-4, 3e-4]}
)
# Optimization config
opt_config = OptimizationConfig(
    training_iters=2000,
    train_batch_size=512,
    test_dataset_size=500,
    checkpoint_type='single_step',
    opt_models=[mlp_opt, rnn_opt, transformer_opt],
    opt_optimizers=[adamw_opt, sgd_opt],
    opt_lr_schedulers=[None, cos_decay_opt, cos_anneal_opt],  # None means no scheduler, i.e. a constant learning rate
    num_trials=5
)
# Execute model optimization
onyx.optimize_model(
    model_name='brake_model_optimized',
    model_sim_config=sim_config,
    dataset_name='brake_train_data',
    optimization_config=opt_config,
)
OptConfigs
Every Onyx model, model weight optimizer, and learning rate scheduler has its own “OptConfig”, which allows you to specify the search spaces for its hyperparameters.
Search spaces can be defined with the following:
A value - Constrain the Engine to one value for all optimization trials.
mlp_opt = MLPOptConfig(
    hidden_layers=2,
    hidden_size=32,
    activation='relu',
    dropout=0.2,
    bias=True
)
“select” - A selection of values the Engine can choose from for each trial.
mlp_opt = MLPOptConfig(
    hidden_layers={"select": [2, 3, 4]},
    hidden_size={"select": [12, 24, 32, 64, 128]},
    activation={"select": ['relu', 'tanh']},
    dropout={"select": [0.0, 0.1, 0.2, 0.3, 0.4]},
    bias={"select": [True, False]}
)
“range” - A range of values the Engine can choose from for each trial. Ranges must be defined as a list of three values: [start, end, step].
mlp_opt = MLPOptConfig(
    hidden_layers={"range": [2, 4, 1]},
    hidden_size={"range": [12, 128, 16]},
    activation='relu',  # Ranges not supported for strings
    dropout={"range": [0.0, 0.4, 0.1]},
    bias=True  # Ranges not supported for booleans
)
All of the above - You can mix these methods to define the search space for each hyperparameter.
mlp_opt = MLPOptConfig(
    hidden_layers={"range": [2, 5, 1]},
    hidden_size={"select": [12, 24, 32, 64, 128]},
    activation={"select": ['relu', 'tanh']},
    dropout={"range": [0.0, 0.4, 0.1]},
    bias=True
)
Once you’ve defined your OptConfigs, you can pass them to the complete OptimizationConfig:
# Optimization config
opt_config = OptimizationConfig(
    training_iters=2000,
    train_batch_size=512,
    test_dataset_size=500,
    checkpoint_type='single_step',
    opt_models=[mlp_opt, rnn_opt, transformer_opt],
    opt_optimizers=[adamw_opt, sgd_opt],
    opt_lr_schedulers=[None, cos_decay_opt, cos_anneal_opt],  # None means no scheduler, i.e. a constant learning rate
    num_trials=5
)
The TrainingConfig parameters (training_iters, checkpoint_type, etc.) define the training process used for each trial.
The num_trials parameter specifies how many trials the Engine will run.
In general, the default values our team has selected for OptConfigs tend to be good starting points for model optimization.
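For example, a minimal sketch that sets only the required simulator fields and leaves every hyperparameter search space at its default (assuming MLPOptConfig provides defaults for the fields not listed here):
# Sketch: rely on the default search spaces for every hyperparameter.
# Assumes MLPOptConfig defines defaults for the fields not listed below.
mlp_opt_defaults = MLPOptConfig(
    sim_config=sim_config,
    num_inputs=sim_config.num_inputs,
    num_outputs=sim_config.num_outputs,
)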
optimize_model()
We are now ready to optimize our model in the Engine:
# Execute model optimization
onyx.optimize_model(
    model_name='brake_model_optimized',
    model_sim_config=sim_config,
    dataset_name='brake_train_data',
    optimization_config=opt_config,
)
You can monitor the optimization process via the Engine Platform. The trials of the optimization run are stored as separate versions of the model in the Engine. To download a specific model version, use the version parameter of the onyx.load_model() function:
onyx.load_model('brake_model_optimized', version=1)
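For example, a minimal sketch that pulls down a few trial versions for comparison (the version numbers are illustrative, and the attributes your downloaded model exposes may differ):
# Sketch: download a few optimization trial versions for comparison.
# The version numbers are illustrative; check the Engine Platform for the versions that exist.
for version in [1, 2, 3]:
    model = onyx.load_model('brake_model_optimized', version=version)
    print(version, model)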
Now that we have optimized our model, we can deploy it for simulation in Simulating Models.