Upload your hardware data through the Onyx Engine platform to prepare it for model training.

Upload Methods

You can upload data in three ways:
  1. Upload button: Click “Upload” from the System Overview or Table view
  2. Drag and drop: Drag files directly into the System Overview or Table
  3. SDK: Use the Onyx client’s save_dataset() method for programmatic uploads
Dataset upload interface

Supported Formats

| Format  | Extension  | Notes                                          |
| ------- | ---------- | ---------------------------------------------- |
| CSV     | `.csv`     | Most common; good for small to medium datasets |
| Parquet | `.parquet` | Efficient for large datasets                   |

Data Requirements

Structure

Your data should be a time series with:
  • Rows: Sequential timesteps
  • Columns: Features (states, outputs, inputs)
  • Consistent sampling: Regular time intervals between rows

Example Data

time,acceleration,velocity,position,control_input
0.000,0.12,0.0,0.0,0.5
0.010,0.15,0.0012,0.0,0.5
0.020,0.18,0.0027,0.00001,0.5
0.030,0.14,0.0041,0.00003,0.5
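One quick way to confirm the consistent-sampling requirement, and to read off the time step (dt) you will enter during upload, is to check the differences between successive timestamps. A minimal sketch using the example data above (it assumes the time column is named `time`, as in that example):

```python
import csv
import io

# The example data from above, inlined for a self-contained demo
raw = """time,acceleration,velocity,position,control_input
0.000,0.12,0.0,0.0,0.5
0.010,0.15,0.0012,0.0,0.5
0.020,0.18,0.0027,0.00001,0.5
0.030,0.14,0.0041,0.00003,0.5
"""

rows = list(csv.DictReader(io.StringIO(raw)))
times = [float(r["time"]) for r in rows]

# Differences between successive timestamps should all be equal
diffs = [round(b - a, 9) for a, b in zip(times, times[1:])]
assert len(set(diffs)) == 1, f"Irregular sampling: {sorted(set(diffs))}"

dt = diffs[0]
print(f"Consistent sampling with dt = {dt} s")
```

For a real file, replace the inlined string with `open('your_data.csv')`.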

Requirements Checklist

  • Rows are sequential timesteps with regular intervals between them
  • All feature columns contain numeric values
  • No missing (NaN) values
  • If uploading multiple files, all have identical columns in the same order

Upload Workflow

  1. Select Files: Click Upload or drag your data file into the platform
  2. Review Preview: Check the data preview to verify columns and format
  3. Configure: Set the dataset name and time step (dt)
  4. Upload: Click Upload to start processing

Processing

After upload, the platform processes your data:
Dataset processing view
Processing includes:
  • Format validation
  • Statistics calculation
  • Indexing for training
The dataset status shows:
  • Processing: Upload in progress
  • Active: Ready for training
  • Error: Issues detected (check format)

Data Preparation Tips

Collecting Data

  • Sampling rate: 50-400Hz works well for most hardware
  • Duration: Less than one hour of data is typically sufficient
  • Coverage: Include varied operating conditions
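For planning a recording session, the expected row count is simply sampling rate times duration. A quick back-of-the-envelope check (the specific numbers below are illustrative, not platform limits):

```python
# Estimated dataset size: rows = sampling rate (Hz) x duration (s)
sampling_rate_hz = 200    # within the suggested 50-400 Hz range
duration_s = 30 * 60      # 30 minutes of recording
n_rows = sampling_rate_hz * duration_s
print(n_rows)  # 360000 rows from well under an hour of data
```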

Cleaning Data

Before uploading:
import pandas as pd

# Load your data
df = pd.read_csv('raw_data.csv')

# Remove NaN values
df = df.dropna()

# Convert to float32 for efficiency
for col in df.columns:
    if df[col].dtype == 'float64':
        df[col] = df[col].astype('float32')

# Save cleaned data
df.to_csv('clean_data.csv', index=False)

Multiple Files

If your data is split across multiple files (e.g., separate episodes):
  1. Upload all files together
  2. The platform concatenates them vertically
  3. They become one continuous time series
Ensure all files have identical columns in the same order.
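Before uploading split episodes, you can verify the identical-columns rule and preview the vertical concatenation the platform performs. A minimal sketch using two in-memory files as stand-ins for your episode CSVs:

```python
import csv
import io

# Two episode files (stand-ins for e.g. episode1.csv, episode2.csv)
episode1 = "time,velocity,control_input\n0.00,0.0,0.5\n0.01,0.1,0.5\n"
episode2 = "time,velocity,control_input\n0.00,0.0,0.4\n0.01,0.2,0.4\n"

readers = [csv.reader(io.StringIO(f)) for f in (episode1, episode2)]
headers = [next(r) for r in readers]

# All files must have identical columns in the same order
assert all(h == headers[0] for h in headers), "Column mismatch across files"

# Vertical concatenation: header once, then all rows in file order
combined = [headers[0]]
for r in readers:
    combined.extend(r)

print(len(combined) - 1)  # number of data rows in the continuous series
```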

Viewing Dataset Details

After processing, click on the dataset to view:
Dataset details view
  • Features: List of columns
  • Statistics: Min, max, mean, std for each feature
  • Metadata: Time step, number of points, memory size
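The per-feature statistics shown here (min, max, mean, std) can be reproduced locally with the standard library, which is a handy sanity check on what the platform computed. A sketch for one feature column already parsed to floats:

```python
import statistics

# One feature column, e.g. the acceleration values from the example data
acceleration = [0.12, 0.15, 0.18, 0.14]

stats = {
    "min": min(acceleration),
    "max": max(acceleration),
    "mean": statistics.mean(acceleration),
    "std": statistics.stdev(acceleration),  # sample standard deviation
}
print(stats)
```

Note that `statistics.stdev` computes the sample standard deviation; whether the platform uses sample or population std is not stated here, so small discrepancies are possible.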

Using the Inspector

Quick-view any dataset by clicking its node in the System Overview:
Inspector panel
The Inspector shows:
  • Dataset name and version
  • Creation date
  • Feature list
  • Quick actions (view, delete)

Next Steps