The Onyx Engine platform provides a visual interface for configuring and running training jobs.

Starting a Training Job

To start training from a processed dataset:
  1. Navigate to your dataset in the System Overview
  2. Click “Train Model” from the dataset actions
  3. Configure the model and training parameters
  4. Start training

Feature Configuration

First, define which dataset columns are model outputs and which are inputs:
[Image: Feature configuration interface]

Setting Up Features

  1. Select Outputs: choose which columns the model should predict (typically derivatives such as acceleration)
  2. Select Inputs: choose columns to feed as model inputs (states and external inputs)
  3. Define Relations: for state inputs, specify parent features and relation types (derivative, delta, equal)
  4. Set Scaling: choose a normalization method for each feature
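The normalization methods available in step 4 aren’t enumerated on this page. As one common option, z-score (standard) scaling works as below; this is a sketch for intuition, not the platform’s implementation. The saved mean and std are what you would use to invert the transform at inference time.

```python
import numpy as np

def standardize(column: np.ndarray):
    """Z-score scaling: subtract the mean, divide by the standard deviation.

    Returns the scaled column plus (mean, std) for inverting the
    transform later.
    """
    mean = float(column.mean())
    std = float(column.std())
    scaled = (column - mean) / std
    return scaled, mean, std

velocity = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
scaled, mean, std = standardize(velocity)
# scaled now has mean ~0 and std ~1
```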

Example Configuration

For a system with acceleration, velocity, position, and control input:
| Feature       | Type   | Parent       | Relation   |
|---------------|--------|--------------|------------|
| acceleration  | Output | -            | -          |
| velocity      | Input  | acceleration | derivative |
| position      | Input  | velocity     | derivative |
| control_input | Input  | -            | -          |
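In this configuration the model predicts acceleration, and the derivative relations mean velocity and position can be recovered by integration during rollout. A minimal forward-Euler sketch of that chain; the step size, initial conditions, and integration scheme here are illustrative assumptions, not the platform’s exact method:

```python
def euler_rollout(accel, v0=0.0, p0=0.0, dt=0.1):
    """Integrate predicted acceleration into velocity and position
    with forward Euler, mirroring the derivative chain in the table."""
    v, p = v0, p0
    vs, ps = [], []
    for a in accel:
        v = v + a * dt   # velocity: time integral of acceleration
        p = p + v * dt   # position: time integral of velocity
        vs.append(v)
        ps.append(p)
    return vs, ps

# Constant 1 m/s^2 acceleration over three steps of dt = 0.1 s:
vs, ps = euler_rollout([1.0, 1.0, 1.0])
# vs approaches [0.1, 0.2, 0.3] (up to float rounding)
```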

Model Configuration

Choose your model architecture and hyperparameters:
[Image: Model configuration interface]

Architecture Selection

Select from:
  • MLP: Multi-Layer Perceptron (fastest, good baseline)
  • RNN: Recurrent Neural Network (LSTM/GRU for temporal patterns)
  • Transformer: Attention-based (complex dependencies)
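For intuition, the MLP option corresponds to a stack of fully connected layers with nonlinear activations. A minimal NumPy forward-pass sketch; the layer sizes, ReLU activation, and initialization are illustrative assumptions, not the platform’s internals:

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass of a plain MLP: ReLU hidden layers, linear output."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ W + b)   # ReLU activation
    return h @ weights[-1] + biases[-1]  # linear output layer

# Toy shapes: 4 inputs -> two hidden layers of 32 -> 1 output.
rng = np.random.default_rng(0)
dims = [4, 32, 32, 1]
weights = [rng.normal(size=(m, n)) * 0.1 for m, n in zip(dims[:-1], dims[1:])]
biases = [np.zeros(n) for n in dims[1:]]
y = mlp_forward(np.ones(4), weights, biases)
# y has shape (1,)
```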

Common Parameters

| Parameter       | Description          | Typical Range |
|-----------------|----------------------|---------------|
| Sequence Length | Input history window | 1-16          |
| Hidden Layers   | Network depth        | 2-4           |
| Hidden Size     | Neurons per layer    | 32-128        |
| Dropout         | Regularization       | 0.1-0.3       |
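Sequence Length controls how many past time steps form one model input. As a sketch of the idea (the platform’s actual batching logic is not documented here), a 1-D series can be sliced into overlapping history windows like this:

```python
import numpy as np

def make_windows(series, seq_len):
    """Slice a 1-D series into overlapping history windows of length seq_len."""
    n = len(series) - seq_len + 1
    return np.stack([series[i:i + seq_len] for i in range(n)])

windows = make_windows(np.arange(10), 4)
# windows.shape == (7, 4); windows[0] is [0, 1, 2, 3]
```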

Training Parameters

| Parameter           | Description          | Typical Value             |
|---------------------|----------------------|---------------------------|
| Training Iterations | Total training steps | 2000-5000                 |
| Batch Size          | Samples per batch    | 256-1024                  |
| Learning Rate       | Step size            | 3e-4                      |
| Checkpoint Type     | Optimization target  | single_step or multi_step |
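Taken together, a configuration drawn from the two tables above might be captured as a dictionary like the following. All field names here are hypothetical; the platform’s actual schema may differ:

```python
# Hypothetical field names -- the platform's real schema may differ.
training_config = {
    "architecture": "mlp",
    "sequence_length": 8,
    "hidden_layers": 3,
    "hidden_size": 64,
    "dropout": 0.2,
    "iterations": 3000,
    "batch_size": 512,
    "learning_rate": 3e-4,
    "checkpoint_type": "single_step",  # or "multi_step"
}
```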

Running Training

Once configured:
  1. Click “Start Training”
  2. The job queues and starts when resources are available
  3. Monitor progress in the Jobs tab

Monitoring Training

View real-time training progress:
[Image: Training jobs view]

Metrics Displayed

  • Training Loss: Error on training data (should decrease)
  • Validation Loss: Error on held-out data (watch for overfitting)
  • Learning Rate: Current LR if using a scheduler
  • Progress: Iterations completed / total

Understanding Loss Curves

Healthy fit:
  • Training loss steadily decreases
  • Validation loss follows training loss
  • Both converge to low values

Overfitting:
  • Training loss decreases while validation loss increases or plateaus
  • Fix: increase dropout, reduce model size, or add more data

Underfitting:
  • Both losses remain high; the model is not learning the dynamics
  • Fix: increase model size, check the feature configuration, verify data quality
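The three cases above can be turned into a rough programmatic check. This heuristic compares first and last loss values only; the thresholds and logic are illustrative assumptions, not a monitoring feature of the platform:

```python
def diagnose(train_losses, val_losses, tol=0.05):
    """Classify a loss-curve pair as healthy, overfitting, or underfitting."""
    t0, t1 = train_losses[0], train_losses[-1]
    v0, v1 = val_losses[0], val_losses[-1]
    if t1 > t0 * (1 - tol):
        return "underfitting"   # training loss barely moved
    if v1 > v0 * (1 - tol):
        return "overfitting"    # training loss falls, validation flat or rising
    return "healthy"            # both losses fell together
```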

Training Tips

  • Start simple: begin with an MLP and a single_step checkpoint; add complexity only if needed.
  • Check your data: review dataset statistics before training; outliers or incorrect time steps cause poor results.
  • Batch size: start with 256-512; larger batches train faster but may need a learning rate adjustment.
  • Watch validation loss: if it increases while training loss decreases, you’re overfitting.

Canceling Training

To stop a running job:
  1. Go to the Jobs tab
  2. Find your training job
  3. Click “Cancel”
Partial results may be saved depending on when you cancel.

Next Steps