onyxengine

The primary API functions for interacting with the Engine.
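
All examples on this page call these functions through an onyx alias and use config classes (ModelSimulatorConfig, MLPConfig, OnyxDataset, and friends) without showing imports. A plausible setup is sketched below; the onyxengine.api alias follows directly from the function paths in this reference, but the other module paths are assumptions, so check your installation:

# Assumed setup for the examples on this page
import pandas as pd
import onyxengine.api as onyx  # matches the onyxengine.api.* paths in this reference
from onyxengine.data import OnyxDataset  # assumed module path
from onyxengine.modeling import ModelSimulatorConfig, State, MLP, MLPConfig  # assumed module path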

onyxengine.api.get_object_metadata(name: str, version_id: str | None = None) → dict

Get the metadata for an object in the Engine.

Parameters:
  • name (str) – The name of the object to get metadata for.

  • version_id (str, optional) – The version ID of the object; if None, the latest version is used. (Default is None)

Returns:

The metadata for the object, or None if the object does not exist.

Return type:

dict

Example:

# Get metadata for an Onyx object (dataset, model)
metadata = onyx.get_object_metadata('example_data')
print(metadata)

# Get metadata for a specific version
metadata = onyx.get_object_metadata('example_data', version_id='a05fb872-0a7d-4a68-b189-aeece143c7e4')
print(metadata)
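
Because the call returns None for a missing object, it's worth guarding the result (the object name below is a placeholder):

# get_object_metadata returns None if the object does not exist
metadata = onyx.get_object_metadata('nonexistent_object')
if metadata is None:
    print('Object not found')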

onyxengine.api.load_dataset(name: str, version_id: str | None = None) → OnyxDataset

Load a dataset from the Engine, using a locally cached copy when available or downloading it otherwise.

Parameters:
  • name (str) – The name of the dataset to load.

  • version_id (str, optional) – The version ID of the dataset to load; if None, the latest version is used. (Default is None)

Returns:

The loaded dataset.

Return type:

OnyxDataset

Example:

# Load the training dataset
train_dataset = onyx.load_dataset('example_train_data')
print(train_dataset.dataframe.head())
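
A specific version can be requested just as with the other loaders (the version ID below is a placeholder):

# Load a specific version of the dataset
train_dataset = onyx.load_dataset('example_train_data', version_id='a05fb872-0a7d-4a68-b189-aeece143c7e4')
print(train_dataset.dataframe.head())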

onyxengine.api.load_model(name: str, version_id: str | None = None) → Module

Load a model from the Engine, using a locally cached copy when available or downloading it otherwise.

Parameters:
  • name (str) – The name of the model to load.

  • version_id (str, optional) – The version ID of the model to load; if None, the latest version is used. (Default is None)

Returns:

The loaded Onyx model.

Return type:

torch.nn.Module

Example:

# Load our model
model = onyx.load_model('example_model')
print(model.config)

# Load a specific version of the model
model = onyx.load_model('example_model', version_id='a05fb872-0a7d-4a68-b189-aeece143c7e4')
print(model.config)
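
Because the loader returns a standard torch.nn.Module, the usual PyTorch inference pattern applies. The sketch below rests on assumptions: the (batch, sequence_length, num_inputs) input layout and the model.config.num_inputs attribute are guesses based on the config classes shown elsewhere on this page, not documented behavior.

# Hedged sketch: one forward pass with the loaded model
import torch

model = onyx.load_model('example_model')
model.eval()
with torch.no_grad():
    x = torch.zeros(1, 10, model.config.num_inputs)  # assumed input layout
    prediction = model(x)
print(prediction.shape)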

onyxengine.api.optimize_model(model_name: str = '', model_sim_config: ModelSimulatorConfig | None = None, dataset_name: str = '', dataset_version_id: str | None = None, optimization_config: OptimizationConfig | None = None)

Optimize a model on the Engine using a specified dataset, model simulator config, and optimization config. The optimization config defines the search space of hyperparameters to explore.
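
The search-space notation in the example below is my reading of the example itself, not confirmed elsewhere in this reference:

# Assumed semantics of the two search-space forms (inferred from the example below)
hidden_size_space = {"select": [12, 24, 32, 64, 128]}  # pick one value from a discrete set
dropout_space = {"range": [0.0, 0.4, 0.1]}             # sweep from low to high in steps of the third value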

Parameters:
  • model_name (str) – The name of the model to optimize. (Required)

  • model_sim_config (ModelSimulatorConfig) – The configuration for the model simulator. (Required)

  • dataset_name (str) – The name of the dataset to optimize on. (Required)

  • dataset_version_id (str, optional) – The version ID of the dataset to optimize on; if None, the latest version is used. (Default is None)

  • optimization_config (OptimizationConfig) – The configuration for the optimization process. (Required)

Example:

# Model sim config (used across all trials)
sim_config = ModelSimulatorConfig(
    outputs=['acceleration'],
    states=[
        State(name='velocity', relation='derivative', parent='acceleration'),
        State(name='position', relation='derivative', parent='velocity'),
    ],
    controls=['control_input'],
    dt=0.0025
)

# Model optimization configs
mlp_opt = MLPOptConfig(
    sim_config=sim_config,
    num_inputs=sim_config.num_inputs,
    num_outputs=sim_config.num_outputs,
    sequence_length={"select": [1, 2, 4, 5, 6, 8, 10]},
    hidden_layers={"range": [2, 4, 1]},
    hidden_size={"select": [12, 24, 32, 64, 128]},
    activation={"select": ['relu', 'tanh']},
    dropout={"range": [0.0, 0.4, 0.1]},
    bias=True
)
rnn_opt = RNNOptConfig(
    sim_config=sim_config,
    num_inputs=sim_config.num_inputs,
    num_outputs=sim_config.num_outputs,
    rnn_type={"select": ['RNN', 'LSTM', 'GRU']},
    sequence_length={"select": [1, 2, 4, 5, 6, 8, 10, 12, 14, 15]},
    hidden_layers={"range": [2, 4, 1]},
    hidden_size={"select": [12, 24, 32, 64, 128]},
    dropout={"range": [0.0, 0.4, 0.1]},
    bias=True
)
transformer_opt = TransformerOptConfig(
    sim_config=sim_config,
    num_inputs=sim_config.num_inputs,
    num_outputs=sim_config.num_outputs,
    sequence_length={"select": [1, 2, 4, 5, 6, 8, 10, 12, 14, 15]},
    n_layer={"range": [2, 4, 1]},
    n_head={"range": [2, 10, 2]},
    n_embd={"select": [12, 24, 32, 64, 128]},
    dropout={"range": [0.0, 0.4, 0.1]},
    bias=True
)

# Optimizer configs
adamw_opt = AdamWOptConfig(
    lr={"select": [1e-5, 5e-5, 1e-4, 3e-4, 5e-4, 8e-4, 1e-3, 5e-3, 1e-2]},
    weight_decay={"select": [1e-4, 1e-3, 1e-2, 1e-1]}
)
sgd_opt = SGDOptConfig(
    lr={"select": [1e-5, 5e-5, 1e-4, 3e-4, 5e-4, 8e-4, 1e-3, 5e-3, 1e-2]},
    weight_decay={"select": [1e-4, 1e-3, 1e-2, 1e-1]},
    momentum={"select": [0, 0.8, 0.9, 0.95, 0.99]}
)

# Learning rate scheduler configs
cos_decay_opt = CosineDecayWithWarmupOptConfig(
    max_lr={"select": [1e-4, 3e-4, 5e-4, 8e-4, 1e-3, 3e-3, 5e-3]},
    min_lr={"select": [1e-6, 5e-6, 1e-5, 3e-5, 5e-5, 8e-5, 1e-4]},
    warmup_iters={"select": [50, 100, 200, 400, 800]},
    decay_iters={"select": [500, 1000, 2000, 4000, 8000]}
)
cos_anneal_opt = CosineAnnealingWarmRestartsOptConfig(
    T_0={"select": [200, 500, 1000, 2000, 5000, 10000]},
    T_mult={"select": [1, 2, 3]},
    eta_min={"select": [1e-6, 5e-6, 1e-5, 3e-5, 5e-5, 8e-5, 1e-4, 3e-4]}
)

# Optimization config
opt_config = OptimizationConfig(
    training_iters=2000,
    train_batch_size=512,
    test_dataset_size=500,
    checkpoint_type='single_step',
    opt_models=[mlp_opt, rnn_opt, transformer_opt],
    opt_optimizers=[adamw_opt, sgd_opt],
    opt_lr_schedulers=[None, cos_decay_opt, cos_anneal_opt],
    num_trials=5
)

# Execute model optimization
onyx.optimize_model(
    model_name='example_model_optimized',
    model_sim_config=sim_config,
    dataset_name='example_train_data',
    optimization_config=opt_config,
)
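
This reference does not state where optimization results land; presumably (an assumption) the best trial's model is stored under model_name and can be fetched with load_model:

# Assumption: the optimized model is saved under the model_name given above
best_model = onyx.load_model('example_model_optimized')
print(best_model.config)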

onyxengine.api.save_dataset(name: str, dataset: OnyxDataset, source_datasets: List[Dict[str, str | None]] = [])

Save a dataset to the Engine.

Parameters:
  • name (str) – The name for the new dataset.

  • dataset (OnyxDataset) – The OnyxDataset object to save.

  • source_datasets (List[Dict[str, Optional[str]]]) – The source datasets used, as a list of dictionaries, e.g. [{'name': 'dataset_name', 'version_id': 'dataset_version'}]. If no version_id is provided, the latest version is used.

Example:

# Load data
raw_data = onyx.load_dataset('example_data')

# Pull out features for model training
train_data = pd.DataFrame()
train_data['acceleration_predicted'] = raw_data.dataframe['acceleration']
train_data['velocity'] = raw_data.dataframe['velocity']
train_data['position'] = raw_data.dataframe['position']
train_data['control_input'] = raw_data.dataframe['control_input']
train_data = train_data.dropna()

# Save training dataset
train_dataset = OnyxDataset(
    features=train_data.columns,
    dataframe=train_data,
    num_outputs=1,
    num_state=2,
    num_control=1,
    dt=0.0025
)
onyx.save_dataset(name='example_train_data', dataset=train_dataset, source_datasets=[{'name': 'example_data'}])
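
A quick round-trip with load_dataset confirms the save:

# Reload the dataset we just saved to verify the round trip
reloaded = onyx.load_dataset('example_train_data')
print(reloaded.dataframe.head())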

onyxengine.api.save_model(name: str, model: Module, source_datasets: List[Dict[str, str | None]] = [])

Save a model to the Engine. Generally, you won't need to call this function, since the Engine automatically saves the models it trains.

Parameters:
  • name (str) – The name for the new model.

  • model (torch.nn.Module) – The Onyx model to save.

  • source_datasets (List[Dict[str, Optional[str]]]) – The source datasets used, as a list of dictionaries, e.g. [{'name': 'dataset_name', 'version_id': 'dataset_version'}]. If no version_id is provided, the latest version is used.

Example:

# Create model configuration
sim_config = ModelSimulatorConfig(
    outputs=['acceleration'],
    states=[
        State(name='velocity', relation='derivative', parent='acceleration'),
        State(name='position', relation='derivative', parent='velocity'),
    ],
    controls=['control_input'],
    dt=0.0025
)
mlp_config = MLPConfig(
    sim_config=sim_config,
    num_inputs=sim_config.num_inputs,
    num_outputs=sim_config.num_outputs,
    hidden_layers=2,
    hidden_size=32,
    activation='relu',
    dropout=0.2,
    bias=True
)

# Create and save model
model = MLP(mlp_config)
onyx.save_model(name='example_model', model=model, source_datasets=[{'name': 'example_train_data'}])

onyxengine.api.train_model(model_name: str = '', model_config: MLPConfig | RNNConfig | TransformerConfig | None = None, dataset_name: str = '', dataset_version_id: str | None = None, training_config: TrainingConfig = TrainingConfig(training_iters=3000, train_batch_size=32, train_val_split_ratio=0.9, test_dataset_size=500, checkpoint_type='single_step', optimizer=AdamWConfig(name='adamw', lr=0.0003, weight_decay=0.01), lr_scheduler=None), monitor_training: bool = True)

Train a model on the Engine using a specified dataset, model config, and training config.

Parameters:
  • model_name (str) – The name of the model to train. (Required)

  • model_config (Union[MLPConfig, RNNConfig, TransformerConfig]) – The configuration for the model to train. (Required)

  • dataset_name (str) – The name of the dataset to train on. (Required)

  • dataset_version_id (str, optional) – The version ID of the dataset to train on; if None, the latest version is used. (Default is None)

  • training_config (TrainingConfig) – The configuration for the training process. (Default is TrainingConfig())

  • monitor_training (bool, optional) – Whether to monitor the training job. (Default is True)

Example:

# Model config
sim_config = ModelSimulatorConfig(
    outputs=['acceleration'],
    states=[
        State(name='velocity', relation='derivative', parent='acceleration'),
        State(name='position', relation='derivative', parent='velocity'),
    ],
    controls=['control_input'],
    dt=0.0025
)
model_config = MLPConfig(
    sim_config=sim_config,
    num_inputs=sim_config.num_inputs,
    num_outputs=sim_config.num_outputs,
    hidden_layers=2,
    hidden_size=64,
    activation='relu',
    dropout=0.2,
    bias=True
)

# Training config
training_config = TrainingConfig(
    training_iters=2000,
    train_batch_size=32,
    test_dataset_size=500,
    checkpoint_type='single_step',
    optimizer=AdamWConfig(lr=3e-4, weight_decay=1e-2),
    lr_scheduler=CosineDecayWithWarmupConfig(max_lr=3e-4, min_lr=3e-5, warmup_iters=200, decay_iters=1000)
)

# Execute training
onyx.train_model(
    model_name='example_model',
    model_config=model_config,
    dataset_name='example_train_data',
    training_config=training_config,
    monitor_training=True
)
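
Since the Engine automatically saves the models it trains (noted under save_model above), the trained model should then be loadable by name:

# Retrieve the trained model once the job completes
trained_model = onyx.load_model('example_model')
print(trained_model.config)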