Before optimizing models in Onyx Engine, start with the Training Models tutorial to learn about model inputs/outputs, configurations, and loading trained models. Optimizing models works just like training individual models, except that it trains many models at once, in parallel, on Onyx’s cloud training infrastructure.

Complete Optimization Example

Here’s a full optimization script that you can use as a starting point:
from onyxengine import Onyx
from onyxengine.modeling import (
    Output,
    Input,
    OptimizationConfig,
    MLPOptConfig,
    RNNOptConfig,
    TransformerOptConfig,
    AdamWOptConfig,
    SGDOptConfig,
    CosineDecayWithWarmupOptConfig,
)

# Initialize the client
onyx = Onyx()

# Define model outputs and inputs
outputs = [
    Output(name='acceleration'),
]
inputs = [
    Input(name='velocity', parent='acceleration', relation='derivative'),
    Input(name='position', parent='velocity', relation='derivative'),
    Input(name='control_input'),
]

# Model optimization configs
mlp_opt = MLPOptConfig(
    outputs=outputs,
    inputs=inputs,
    dt=0.0025,
    sequence_length={"range": [1, 10, 1]},
    hidden_layers={"select": [2, 3, 4, 5]},
    hidden_size={"select": [32, 64, 128]},
    activation={"select": ['relu', 'gelu', 'tanh']},
    dropout={"range": [0.0, 0.2, 0.1]},
    bias=True
)
rnn_opt = RNNOptConfig(
    outputs=outputs,
    inputs=inputs,
    dt=0.0025,
    rnn_type={"select": ['LSTM']},
    sequence_length={"range": [10, 30, 2]},
    hidden_layers={"select": [2, 3, 4, 5]},
    hidden_size={"select": [32, 64, 128]},
    dropout={"range": [0.0, 0.2, 0.1]},
    bias=True
)
transformer_opt = TransformerOptConfig(
    outputs=outputs,
    inputs=inputs,
    dt=0.0025,
    sequence_length={"range": [10, 40, 2]},
    n_layer={"range": [2, 4, 1]},
    n_head={"range": [2, 8, 2]},
    n_embd={"select": [24, 32, 64, 128]},
    dropout={"range": [0.0, 0.2, 0.1]},
    bias=True
)
    
# Optimizer config
adamw_opt = AdamWOptConfig(
    lr={"select": [5e-5, 1e-4, 3e-4, 5e-4, 8e-4]},
    weight_decay={"select": [1e-4, 1e-3, 1e-2]}
)

# Learning rate scheduler config
cos_decay_opt = CosineDecayWithWarmupOptConfig(
    max_lr={"select": [8e-5, 1e-4, 3e-4, 8e-4, 1e-3, 5e-3]},
    min_lr={"select": [1e-6, 5e-6, 1e-5, 3e-5, 5e-5]},
    warmup_iters={"select": [50, 100, 200, 400, 800]},
    decay_iters={"select": [500, 1000, 2000, 4000, 8000]}
)

# Optimization config
opt_config = OptimizationConfig(
    training_iters=3000,
    train_batch_size=1024,
    test_dataset_size=500,
    checkpoint_type='multi_step',
    opt_models=[mlp_opt, rnn_opt, transformer_opt],
    opt_optimizers=[adamw_opt],
    opt_lr_schedulers=[cos_decay_opt],
    num_trials=20
)

# Execute model optimization
onyx.optimize_model(
    model_name='example_model',
    dataset_name='example_data',
    optimization_config=opt_config,
)

Defining Search Spaces

Search spaces define the possible values a given hyperparameter can take in an individual optimization trial. There are three ways to define a search space:
Type     Syntax                            Description
Fixed    param=2                           Lock a parameter to a single value
Select   param={"select": [2, 4, 6, 8]}    Choose from a discrete list of values
Range    param={"range": [2, 5, 1]}        Choose from a range of [start, end, step]

Fixed Values

Lock a parameter to a single value:
mlp_opt = MLPOptConfig(
    hidden_layers=3,      # Always use 3 layers
    activation='relu',    # Always use relu
    bias=True,            # Always set bias to True
    ...
)

Select (Discrete Options)

Choose from a list of values:
mlp_opt = MLPOptConfig(
    hidden_layers={"select": [2, 3, 4]}, # Choose from 2, 3, or 4 layers
    hidden_size={"select": [32, 64, 128, 256]}, # Choose from 32, 64, 128, or 256 hidden units
    activation={"select": ['relu', 'gelu', 'tanh']}, # Choose from relu, gelu, or tanh
    ...
)

Range (Numeric Intervals)

Specify a range with [start, end, step]:
mlp_opt = MLPOptConfig(
    hidden_layers={"range": [2, 5, 1]},     # Choose from 2, 3, 4, or 5 layers
    dropout={"range": [0.0, 0.4, 0.1]},     # Choose from 0.0, 0.1, 0.2, 0.3, or 0.4 dropout rate
    ...
)
Range is only supported for numeric parameters. Use select for strings and booleans.

Model Optimization Configs

Model optimization configs (e.g. MLPOptConfig, RNNOptConfig, TransformerOptConfig) are identical to their regular model config counterparts, but allow parameter search spaces. They use the same model inputs and outputs as regular model configs, which makes it easy to jump between training individual models and optimizing models; see the Training Models tutorial for details on defining inputs/outputs. After optimization, you will still load the resulting models using their regular model configs; Opt configs are only used to configure the optimization.
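As a rough side-by-side, here is the same MLP written as a fixed config and as an optimization config. This is only a sketch: it assumes the regular config class is named MLPConfig, as covered in the Training Models tutorial, and it reuses the outputs/inputs defined above.
from onyxengine.modeling import MLPOptConfig
from onyxengine.modeling import MLPConfig  # Assumption: regular MLP config class from the Training Models tutorial

# Regular config: every hyperparameter is a single fixed value
mlp_config = MLPConfig(
    outputs=outputs,
    inputs=inputs,
    dt=0.0025,
    sequence_length=5,
    hidden_layers=3,
    hidden_size=64,
    activation='gelu',
    dropout=0.1,
    bias=True
)

# Opt config: any of those hyperparameters can instead be a search space,
# and fixed values are still allowed
mlp_opt = MLPOptConfig(
    outputs=outputs,
    inputs=inputs,
    dt=0.0025,
    sequence_length={"range": [1, 10, 1]},
    hidden_layers={"select": [2, 3, 4]},
    hidden_size={"select": [32, 64, 128]},
    activation='gelu',
    dropout=0.1,
    bias=True
)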

MLP (MLPOptConfig)

Best for systems with relatively simple dynamics or for fast inference:
from onyxengine.modeling import MLPOptConfig

mlp_opt = MLPOptConfig(
    outputs=outputs,
    inputs=inputs,
    dt=0.0025,
    sequence_length={"range": [1, 10, 1]},
    hidden_layers={"select": [2, 3, 4, 5]},
    hidden_size={"select": [32, 64, 128]},
    activation={"select": ['relu', 'gelu', 'tanh']},
    dropout={"range": [0.0, 0.2, 0.1]},
    bias=True
)

RNN (RNNOptConfig)

Better for systems with complex temporal dependencies:
from onyxengine.modeling import RNNOptConfig

rnn_opt = RNNOptConfig(
    outputs=outputs,
    inputs=inputs,
    dt=0.0025,
    rnn_type={"select": ['LSTM']},
    sequence_length={"range": [10, 30, 2]},
    hidden_layers={"select": [2, 3, 4, 5]},
    hidden_size={"select": [32, 64, 128]},
    dropout={"range": [0.0, 0.2, 0.1]},
    bias=True
)

Transformer (TransformerOptConfig)

Powerful for capturing long-range dependencies:
from onyxengine.modeling import TransformerOptConfig

transformer_opt = TransformerOptConfig(
    outputs=outputs,
    inputs=inputs,
    dt=0.0025,
    sequence_length={"range": [10, 40, 2]},
    n_layer={"range": [2, 4, 1]},
    n_head={"range": [2, 8, 2]},
    n_embd={"select": [24, 32, 64, 128]},
    dropout={"range": [0.0, 0.2, 0.1]},
    bias=True
)

Optimizer Configs

AdamW (AdamWOptConfig), recommended for most cases:
from onyxengine.modeling import AdamWOptConfig

adamw_opt = AdamWOptConfig(
    lr={"select": [5e-5, 1e-4, 3e-4, 5e-4, 8e-4]},
    weight_decay={"select": [1e-4, 1e-3, 1e-2]}
)
SGD (SGDOptConfig):
from onyxengine.modeling import SGDOptConfig

sgd_opt = SGDOptConfig(
    lr={"select": [5e-5, 1e-4, 3e-4, 5e-4, 8e-4]},
    weight_decay={"select": [1e-4, 1e-3, 1e-2]},
    momentum={"select": [0.9, 0.95, 0.99]}
)

Learning Rate Scheduler Configs

Cosine Decay with Warmup (CosineDecayWithWarmupOptConfig):
from onyxengine.modeling import CosineDecayWithWarmupOptConfig

cos_decay_opt = CosineDecayWithWarmupOptConfig(
    max_lr={"select": [8e-5, 1e-4, 3e-4, 8e-4, 1e-3, 5e-3]},
    min_lr={"select": [1e-6, 5e-6, 1e-5, 3e-5, 5e-5]},
    warmup_iters={"select": [50, 100, 200, 400, 800]},
    decay_iters={"select": [500, 1000, 2000, 4000, 8000]}
)
Cosine Annealing with Warm Restarts (CosineAnnealingWarmRestartsOptConfig):
from onyxengine.modeling import CosineAnnealingWarmRestartsOptConfig

cos_anneal_opt = CosineAnnealingWarmRestartsOptConfig(
    T_0={"select": [500, 1000, 2000]},
    T_mult={"select": [1, 2]},
    eta_min={"select": [1e-6, 1e-5, 1e-4]}
)
No Scheduler: Include None in your scheduler list to try training without a scheduler:
opt_config = OptimizationConfig(
    opt_lr_schedulers=[None, cos_decay_opt],  # Try both with and without scheduler
    ...
)

OptimizationConfig

The OptimizationConfig brings together your hyperparameter search spaces and the training settings shared by all trials. Note that for models, optimizers, and learning rate schedulers, you pass in lists of Opt configs. Each list is effectively a high-level Select search space, so you can optimize across different model architectures, optimizers, and learning rate schedulers. While you have lots of flexibility in defining search spaces, we recommend starting with a broad baseline like the example at the top of this page and then narrowing down to the most promising configurations.
from onyxengine.modeling import OptimizationConfig

opt_config = OptimizationConfig(
    # Training parameters (fixed for all trials)
    training_iters=3000,                 # Total training iterations per trial
    train_batch_size=1024,               # Batch size for all trials
    test_dataset_size=500,               # Samples for test visualization
    checkpoint_type='multi_step',        # Training checkpoint type

    # Search spaces
    opt_models=[mlp_opt, rnn_opt, transformer_opt],           # Model architecture search spaces
    opt_optimizers=[adamw_opt, sgd_opt],                      # Optimizer search spaces
    opt_lr_schedulers=[None, cos_decay_opt, cos_anneal_opt],  # Scheduler search spaces (include None for no scheduler)

    # Number of trials
    num_trials=20                        # Total number of configurations to sample
)
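
For a sense of scale, num_trials samples only a small number of points from the combined search space. As a back-of-the-envelope check (plain Python arithmetic, independent of the Onyx API), the example configs at the top of this page define roughly:
# Count the discrete combinations in each model search space
# (ranges include both endpoints, e.g. {"range": [1, 10, 1]} -> 10 values)
mlp_space = 10 * 4 * 3 * 3 * 3           # sequence_length, hidden_layers, hidden_size, activation, dropout
rnn_space = 1 * 11 * 4 * 3 * 3           # rnn_type, sequence_length, hidden_layers, hidden_size, dropout
transformer_space = 16 * 3 * 4 * 4 * 3   # sequence_length, n_layer, n_head, n_embd, dropout

model_space = mlp_space + rnn_space + transformer_space  # the model list acts as a high-level Select
optimizer_space = 5 * 3                  # AdamW: lr, weight_decay
scheduler_space = 6 * 5 * 5 * 5          # cosine decay: max_lr, min_lr, warmup_iters, decay_iters

total = model_space * optimizer_space * scheduler_space
print(total)       # tens of millions of possible configurations
print(20 / total)  # num_trials=20 covers only a tiny fraction of them
This is why we recommend starting with a broad baseline and then narrowing the search spaces toward the most promising configurations in later optimization runs.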

Running Optimization

onyx.optimize_model(
    model_name='optimized_model',         # Name for each of the trained models
    dataset_name='example_data',          # Name of the dataset to optimize on
    dataset_version_id=None,              # Optional: specific dataset version
    optimization_config=opt_config,
)
Each trial creates an individual version of the model. Monitor progress in the Engine Platform.

Loading Optimized Models

Each completed optimization trial results in a new trained model version. You can load these model versions the same way you load individually trained models; see the Training Models tutorial for more details on loading models, including offline mode and local caching.
from onyxengine import Onyx

onyx = Onyx()

# Load the latest model version
model = onyx.load_model('optimized_model')

# Load a specific model version
model = onyx.load_model('optimized_model', version_id='dcfec841-1748-47e2-b6c7-3c821cc69b4a')

# Check what configuration was used
print(model.config.model_dump_json(indent=2))

Next Steps

Simulating with Models

Deploy your optimized models for simulation