
Machine Learning Methods in SindbadML

This page provides an overview of the machine learning methods available in SindbadML. It covers components such as activation functions, gradient methods, ML models, optimizers, and training methods, and explains how to extend them for experiments related to hybrid ML-physical modeling.

Extending SindbadML: How to Add New Components

This guide shows how to add new activation functions, gradient methods, ML models, optimizers, and training methods by following the conventions in the src/Types/MLTypes.jl and related files.


1. Adding a New Activation Function

Step 1: Define the Activation Type

In src/Types/MLTypes.jl, add a new struct subtype of ActivationType and export it:

```julia
export MyActivation

struct MyActivation <: ActivationType end
purpose(::Type{MyActivation}) = "Describe your activation function here"
```

Step 2: Implement the Activation Function

In lib/SindbadML/src/activationFunctions.jl, extend activationFunction:

```julia
function activationFunction(model_options, ::MyActivation)
    # Example: Swish activation
    swish(x) = x * Flux.sigmoid(x)
    return swish
end
```
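The function is then obtained by dispatching on an instance of the type. A minimal usage sketch; passing `nothing` for model_options assumes this activation ignores it:

```julia
act = activationFunction(nothing, MyActivation())  # `nothing` stands in for unused model_options
act(0.5)  # 0.5 * sigmoid(0.5) ≈ 0.311
```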

2. Adding a New Gradient Method

Step 1: Define the Gradient Type

In src/Types/MLTypes.jl, add and export your new gradient type:

```julia
export MyGradMethod

struct MyGradMethod <: MLGradType end
purpose(::Type{MyGradMethod}) = "Describe your gradient method"
```

Step 2: Implement the Gradient Logic

In lib/SindbadML/src/mlGradient.jl, extend gradientSite and/or gradientBatch!:

```julia
function gradientSite(::MyGradMethod, x_vals::AbstractArray, gradient_options::NamedTuple, loss_f::F) where {F}
    # Implement your gradient calculation here
    return my_gradient(x_vals, loss_f)
end
```
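As a concrete illustration, here is a minimal forward-difference sketch; the `:step` option name is an assumption for this example, not part of the package API:

```julia
function gradientSite(::MyGradMethod, x_vals::AbstractArray, gradient_options::NamedTuple, loss_f::F) where {F}
    h = get(gradient_options, :step, 1e-6)  # assumed option name
    f0 = loss_f(x_vals)
    grad = similar(x_vals)
    for i in eachindex(x_vals)
        x_perturbed = copy(x_vals)
        x_perturbed[i] += h
        grad[i] = (loss_f(x_perturbed) - f0) / h  # forward difference
    end
    return grad
end
```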

3. Adding a New ML Model

Step 1: Define the Model Type

In src/Types/MLTypes.jl, add and export your new model type:

```julia
export MyMLModel

struct MyMLModel <: MLModelType end
purpose(::Type{MyMLModel}) = "Describe your ML model"
```

Step 2: Implement the Model Constructor

In lib/SindbadML/src/mlModels.jl, extend mlModel:

```julia
function mlModel(info, n_features, ::MyMLModel)
    # Build and return your model
    return MyModelConstructor(n_features, ...)
end
```
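As a concrete sketch, a small Flux multilayer perceptron could be returned; the hidden width and output size below are assumptions for illustration:

```julia
using Flux

function mlModel(info, n_features, ::MyMLModel)
    n_hidden = 32  # assumed hidden-layer width
    n_out = 10     # assumed number of predicted parameters
    return Chain(
        Dense(n_features => n_hidden, relu),
        Dense(n_hidden => n_out, sigmoid),  # sigmoid bounds outputs to (0, 1)
    )
end
```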

4. Adding a New Optimizer

Step 1: Define the Optimizer Type

In src/Types/MLTypes.jl, add and export your optimizer type:

```julia
export MyOptimizer

struct MyOptimizer <: MLOptimizerType end
purpose(::Type{MyOptimizer}) = "Describe your optimizer"
```

Step 2: Implement the Optimizer Constructor

In lib/SindbadML/src/mlOptimizers.jl, extend mlOptimizer:

```julia
function mlOptimizer(optimizer_options, ::MyOptimizer)
    # Return an optimizer object
    return MyOptimizerConstructor(optimizer_options...)
end
```
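For example, wrapping Adam from Optimisers.jl; the `eta` field on optimizer_options is an assumed name:

```julia
using Optimisers

function mlOptimizer(optimizer_options, ::MyOptimizer)
    return Optimisers.Adam(optimizer_options.eta)  # assumed learning-rate field
end
```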

5. Adding a New Training Method

Step 1: Define the Training Type

In src/Types/MLTypes.jl, add and export your training type:

```julia
export MyTrainingMethod

struct MyTrainingMethod <: MLTrainingType end
purpose(::Type{MyTrainingMethod}) = "Describe your training method"
```

Step 2: Implement the Training Function

In lib/SindbadML/src/mlTrain.jl, extend trainML:

```julia
function trainML(hybrid_helpers, ::MyTrainingMethod)
    # Implement your training loop here
end
```
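The skeleton below sketches the general shape of an epoch loop with Optimisers.jl; the `optimizer`, `n_epochs`, and `gradient_f` fields on hybrid_helpers are hypothetical stand-ins for illustration, not the package's actual interface:

```julia
using Optimisers

function trainML(hybrid_helpers, ::MyTrainingMethod)
    flat = hybrid_helpers.flat                                   # flat parameter vector (see outputs below)
    opt_state = Optimisers.setup(hybrid_helpers.optimizer, flat) # hypothetical `optimizer` field
    for epoch in 1:hybrid_helpers.n_epochs                       # hypothetical `n_epochs` field
        grads = hybrid_helpers.gradient_f(flat)                  # hypothetical gradient helper
        opt_state, flat = Optimisers.update(opt_state, flat, grads)
        # evaluate losses, write the epoch checkpoint, etc.
    end
    return flat
end
```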

6. Register and Use Your New Types

  • Export your new types in MLTypes.jl.

  • Reference your new types in experiment or parameter JSON files (e.g., "activation_out": "my_activation").

  • Make sure your new types are imported where needed.
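Once exported, the purpose method makes each new type self-documenting from the REPL; a minimal sketch, assuming the types defined above are in scope:

```julia
julia> purpose(MyActivation)
"Describe your activation function here"
```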


Summary Table

| Component | Abstract Type | File(s) to Edit | Function to Extend |
|---|---|---|---|
| Activation | ActivationType | MLTypes.jl, activationFunctions.jl | activationFunction |
| Gradient Method | MLGradType | MLTypes.jl, mlGradient.jl | gradientSite, gradientBatch! |
| ML Model | MLModelType | MLTypes.jl, mlModels.jl | mlModel |
| Optimizer | MLOptimizerType | MLTypes.jl, mlOptimizers.jl | mlOptimizer |
| Training Method | MLTrainingType | MLTypes.jl, mlTrain.jl | trainML |

Tip: Always add a purpose(::Type{YourType}) method for documentation and introspection.

Tip: Export your new types for use in other modules.

For more examples, see the existing code in the referenced files.

SINDBAD Hybrid ML Experiment Outputs

This document describes the outputs generated by SINDBAD experiments, with a focus on hybrid and machine learning (ML) workflows and their outputs in JLD2 format. The outputs are produced during and after training and evaluation, and are essential for analysis, checkpointing, and reproducibility.

INFO

During a REPL run of trainML, all outputs are kept in memory in the hybrid_helpers NamedTuple and can be accessed directly, e.g., hybrid_helpers.loss_array.training. However, they are not saved to disk unless explicitly requested via the save_checkpoint option in the experiment configuration.


1. Checkpoint Files

During training, SINDBAD saves a checkpoint file at the end of each epoch (if checkpoint_path is specified). These files are typically stored in JLD2 format and named epoch_<epoch_number>.jld2.

Contents of Each Checkpoint File

Each checkpoint file contains the following variables (a saving sketch follows the list):

  • lower_bound: Lower bounds for each parameter (from the parameter table).

  • upper_bound: Upper bounds for each parameter.

  • parameter_names: Names of the parameters being learned.

  • parameter_table: The full parameter table used for training.

  • metadata_global: Global metadata from the experiment configuration (e.g., site info, variable names).

  • loss_array_training: Array of scalar loss values for the training set at the current epoch. Shape: (number of training sites, 1)

  • loss_array_validation: Array of scalar loss values for the validation set at the current epoch. Shape: (number of validation sites, 1)

  • loss_array_testing: Array of scalar loss values for the testing set at the current epoch. Shape: (number of testing sites, 1)

  • loss_array_components_training: Array of loss component vectors for the training set at the current epoch. Shape: (number of training sites, number of components, 1)

  • loss_array_components_validation: Array of loss component vectors for the validation set at the current epoch. Shape: (number of validation sites, number of components, 1)

  • loss_array_components_testing: Array of loss component vectors for the testing set at the current epoch. Shape: (number of testing sites, number of components, 1)

  • re: The function to reconstruct the model (e.g., the Flux neural network) from the flat parameter vector (for Flux/Optimisers).

  • flat: The current flat vector of model parameters (weights).
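For reference, a file with this layout could be written with JLD2's jldsave. This is a hedged sketch of the contents listed above, assuming all variables are in scope; it is not the exact code SINDBAD uses:

```julia
using JLD2

# variable names mirror the checkpoint contents listed above
jldsave("epoch_1.jld2";
    lower_bound, upper_bound, parameter_names, parameter_table,
    metadata_global,
    loss_array_training, loss_array_validation, loss_array_testing,
    loss_array_components_training, loss_array_components_validation,
    loss_array_components_testing,
    re, flat)
```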


2. Loss Arrays

Loss arrays are stored in memory during training and saved to checkpoint files. They track the evolution of the loss for each site and epoch; an access sketch follows the lists below.

  • loss_array.training: Matrix of scalar loss values for all training sites and epochs. Shape: (number of training sites, number of epochs)

  • loss_array.validation: Matrix of scalar loss values for all validation sites and epochs. Shape: (number of validation sites, number of epochs)

  • loss_array.testing: Matrix of scalar loss values for all testing sites and epochs. Shape: (number of testing sites, number of epochs)

Component-wise loss arrays:

  • loss_array_components.training: 3D array of loss components for training sites and epochs. Shape: (number of training sites, number of components, number of epochs)

  • loss_array_components.validation: 3D array of loss components for validation sites and epochs. Shape: (number of validation sites, number of components, number of epochs)

  • loss_array_components.testing: 3D array of loss components for testing sites and epochs. Shape: (number of testing sites, number of components, number of epochs)
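A minimal in-memory access sketch, using the hybrid_helpers NamedTuple described in the INFO note above:

```julia
using Statistics

losses = hybrid_helpers.loss_array.training  # (n_train_sites, n_epochs)
mean_loss = vec(mean(losses; dims=1))        # mean loss over sites, per epoch
best_epoch = argmin(mean_loss)               # epoch with the lowest mean training loss
```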


3. Model Parameters

  • flat: The current flat vector of all model parameters (weights).

  • re: The function to reconstruct the full model (e.g., Flux neural network) from the flat parameter vector.

These are used for resuming training, model evaluation, or exporting the trained model.
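These two follow the destructure pattern from Flux/Optimisers.jl. A minimal sketch of how flat and re relate, using a stand-in network:

```julia
using Flux

model = Chain(Dense(4 => 8, relu), Dense(8 => 2))  # stand-in network
flat, re = Flux.destructure(model)                 # flat weight vector + reconstructor
model_restored = re(flat)                          # rebuild the model from `flat`
```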


4. Metadata

  • parameter_table: The full parameter table, including bounds, names, and other metadata.

  • metadata_global: Experiment-wide metadata, such as site information, variable names, and configuration details.


5. Additional Outputs

Depending on the experiment and configuration, additional outputs may include:

  • Predictions: Model predictions for each site and time step (if saved).

  • Best Model: The parameters or checkpoint corresponding to the best validation loss (if tracked).

  • Diagnostics: Any additional diagnostic arrays or logs produced during training or evaluation.


Example: Accessing Output Variables

To load a checkpoint and access its contents in Julia:

```julia
using JLD2

data = jldopen("epoch_10.jld2", "r") do file
    Dict(name => read(file, name) for name in keys(file))
end

# Example: access the training loss stored in this checkpoint
training_loss = data["loss_array_training"]
```
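Given the stored re and flat, the trained network itself can be rebuilt from the same checkpoint:

```julia
model = data["re"](data["flat"])  # reconstruct the trained model from its flat weights
```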

Summary Table

| Variable Name | Description | Shape / Type |
|---|---|---|
| lower_bound | Lower bounds for parameters | Vector (n_params) |
| upper_bound | Upper bounds for parameters | Vector (n_params) |
| parameter_names | Names of parameters | Vector (n_params) |
| parameter_table | Full parameter table | Table / NamedTuple |
| metadata_global | Global experiment metadata | NamedTuple / Dict |
| loss_array_training | Training loss (scalar) for current epoch | Matrix (n_train_sites, 1) |
| loss_array_validation | Validation loss (scalar) for current epoch | Matrix (n_val_sites, 1) |
| loss_array_testing | Testing loss (scalar) for current epoch | Matrix (n_test_sites, 1) |
| loss_array_components_training | Training loss components for current epoch | Array (n_train_sites, n_comp, 1) |
| loss_array_components_validation | Validation loss components for current epoch | Array (n_val_sites, n_comp, 1) |
| loss_array_components_testing | Testing loss components for current epoch | Array (n_test_sites, n_comp, 1) |
| re | Function to reconstruct model from flat parameters | Function |
| flat | Flat vector of model parameters | Vector (n_weights) |