htjelsma/iris

Example Cubonacci repository - Iris

This is an example repository for the Cubonacci platform. This README describes the basic machine learning lifecycle flow and how the different components interact.

Machine learning lifecycle flow

Training data

A machine learning model needs data to train on. If Cubonacci already has a matching data snapshot, you can select it for training. If not, Cubonacci calls the .load_data() method of the DataLoader and caches the result inside the platform. Cubonacci inspects the returned data objects in depth and dynamically creates a schema from them. This schema is used to store the data efficiently, to set up the API schemas, and to validate data in later stages.
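
As a rough illustration of this step, the sketch below shows a data loader and a simple form of schema inference. Only the DataLoader name and its .load_data() method come from this README; the return format (a list of feature rows plus a list of targets), the sample values, and the infer_schema helper are assumptions made for the example and do not describe how Cubonacci inspects data internally.

```python
from typing import Any, Dict, List, Tuple


class DataLoader:
    """Hypothetical loader; only the .load_data() name is taken from the README."""

    def load_data(self) -> Tuple[List[Dict[str, Any]], List[str]]:
        # Assumed return format: feature rows as dicts plus a parallel list of targets.
        features = [
            {"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2},
            {"sepal_length": 7.0, "sepal_width": 3.2, "petal_length": 4.7, "petal_width": 1.4},
            {"sepal_length": 6.3, "sepal_width": 3.3, "petal_length": 6.0, "petal_width": 2.5},
        ]
        targets = ["setosa", "versicolor", "virginica"]
        return features, targets


def infer_schema(rows: List[Dict[str, Any]]) -> Dict[str, str]:
    """Illustrative helper: map each column to the type name of its values."""
    schema: Dict[str, str] = {}
    for row in rows:
        for column, value in row.items():
            type_name = type(value).__name__
            # Remember the first type seen and reject rows that disagree with it.
            if schema.setdefault(column, type_name) != type_name:
                raise TypeError(f"Column {column!r} mixes {schema[column]} and {type_name}")
    return schema


features, targets = DataLoader().load_data()
schema = infer_schema(features)
```

A schema derived this way can later be reused to check that incoming rows have the same columns and types as the training data.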

Model training

When Cubonacci needs to train a model, it instantiates the corresponding model object with the hyperparameters for that training session. These hyperparameters come either from the user interface or from an experiment run, in which Cubonacci determines which hyperparameter settings to try. The training data is then loaded and passed to the model's .fit() method.
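
The following sketch shows what such a model object could look like. The .fit() and .predict() method names come from this README; the KNearestNeighbors class, the n_neighbors hyperparameter, and the sample rows are hypothetical and chosen only to keep the example small. It is not the model shipped in this repository.

```python
import math
from collections import Counter


class KNearestNeighbors:
    """Hypothetical model; the hyperparameter name n_neighbors is an assumption."""

    def __init__(self, n_neighbors=3):
        # Hyperparameters arrive as constructor arguments.
        self.n_neighbors = n_neighbors

    def fit(self, features, targets):
        # A nearest-neighbour model only needs to memorise the training data.
        self._features = [list(row) for row in features]
        self._targets = list(targets)
        return self

    def predict(self, features):
        predictions = []
        for row in features:
            # Indices of the training rows closest to this row.
            nearest = sorted(
                range(len(self._features)),
                key=lambda i: math.dist(row, self._features[i]),
            )[: self.n_neighbors]
            votes = Counter(self._targets[i] for i in nearest)
            predictions.append(votes.most_common(1)[0][0])
        return predictions


train_features = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.4, 3.2, 4.5, 1.5],
]
train_targets = ["setosa", "setosa", "versicolor", "versicolor"]

# The settings would come from the user interface or from an experiment run.
hyperparameters = {"n_neighbors": 1}
model = KNearestNeighbors(**hyperparameters)
model.fit(train_features, train_targets)
```

Passing hyperparameters as keyword arguments keeps the model class independent of where the settings originate.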

Experiment run

During an experiment run, every model is trained on only part of the training data. The remaining data serves as a validation set for evaluating the trained model: the features of the validation set are passed to the model, and the resulting predictions are passed, together with the targets, to each of the metrics defined in the metrics folder. If the validation scheme involves training models multiple times on different parts of the data, this process is repeated and the metrics are averaged. Once the evaluation of a hyperparameter setting has completed, Cubonacci collects the metrics, uses them to steer the rest of the experiment run, and shows them in the user interface.
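
The evaluation loop described above can be sketched as a small cross-validation routine. Everything here is an assumption made for illustration: the evaluate_setting function, the interleaved fold assignment, the accuracy metric, and the NearestNeighborModel placeholder are not taken from Cubonacci or from this repository's metrics folder.

```python
import math
from statistics import mean


class NearestNeighborModel:
    """Placeholder model that predicts the target of the closest training row."""

    def fit(self, features, targets):
        self._features, self._targets = list(features), list(targets)
        return self

    def predict(self, features):
        return [
            self._targets[
                min(range(len(self._features)), key=lambda i: math.dist(row, self._features[i]))
            ]
            for row in features
        ]


def accuracy(targets, predictions):
    """Example metric: fraction of predictions that match the targets."""
    return sum(t == p for t, p in zip(targets, predictions)) / len(targets)


def evaluate_setting(model_factory, features, targets, metrics, n_folds=3):
    """Train on part of the data, score on the rest, and average over folds."""
    scores = {name: [] for name in metrics}
    for fold in range(n_folds):
        # Rows whose index falls in this fold are held out for validation.
        valid_idx = [i for i in range(len(features)) if i % n_folds == fold]
        train_idx = [i for i in range(len(features)) if i % n_folds != fold]
        model = model_factory()
        model.fit([features[i] for i in train_idx], [targets[i] for i in train_idx])
        predictions = model.predict([features[i] for i in valid_idx])
        valid_targets = [targets[i] for i in valid_idx]
        for name, metric in metrics.items():
            scores[name].append(metric(valid_targets, predictions))
    return {name: mean(values) for name, values in scores.items()}


features = [
    [5.1, 3.5, 1.4, 0.2], [7.0, 3.2, 4.7, 1.4], [4.9, 3.0, 1.4, 0.2],
    [6.4, 3.2, 4.5, 1.5], [4.7, 3.2, 1.3, 0.2], [6.9, 3.1, 4.9, 1.5],
]
targets = ["setosa", "versicolor", "setosa", "versicolor", "setosa", "versicolor"]

averaged_metrics = evaluate_setting(
    NearestNeighborModel, features, targets, metrics={"accuracy": accuracy}
)
```

A fresh model is created for every fold so that no information from the validation rows leaks into training.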

Full model

When the optimal settings have been found, or when the hyperparameter search was already concluded in a previous iteration, a model can be trained on the full training set. Instead of collecting metrics, Cubonacci saves the model for later use. It also inspects a number of predictions to determine the schema of the predictions, so that the platform can prepare to run the model in production.
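
A minimal sketch of this step, under stated assumptions, is shown below: a model is fitted on all available rows and a handful of predictions are inspected to derive a prediction schema. The infer_prediction_schema helper, the shape of the resulting schema, and the NearestNeighborModel placeholder are hypothetical. Persisting the model is left out because this README does not describe how Cubonacci stores it.

```python
import math


class NearestNeighborModel:
    """Placeholder model that predicts the target of the closest training row."""

    def fit(self, features, targets):
        self._features, self._targets = list(features), list(targets)
        return self

    def predict(self, features):
        return [
            self._targets[
                min(range(len(self._features)), key=lambda i: math.dist(row, self._features[i]))
            ]
            for row in features
        ]


def infer_prediction_schema(predictions):
    """Illustrative helper: require one consistent prediction type and report it."""
    type_names = {type(p).__name__ for p in predictions}
    if len(type_names) != 1:
        raise TypeError(f"Expected exactly one prediction type, got {sorted(type_names)}")
    return {"prediction_type": type_names.pop()}


full_features = [
    [5.1, 3.5, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.3, 3.3, 6.0, 2.5],
    [4.9, 3.0, 1.4, 0.2],
]
full_targets = ["setosa", "versicolor", "virginica", "setosa"]

# No rows are held out: the model is fitted on the full training set.
full_model = NearestNeighborModel().fit(full_features, full_targets)

# Look at a few predictions to learn what the model's output looks like.
sample_predictions = full_model.predict(full_features[:3])
prediction_schema = infer_prediction_schema(sample_predictions)
```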

API deployment

After a model has been trained on the full training set, it can be deployed as an API. Once deployed, an endpoint is available that can be called with JSON or gRPC. The process serving the model bases the API schemas on the previously generated schemas. Incoming calls are transformed into the same object format as the training set the model was trained on, after which .predict() is called on the resulting data. The predictions are then transformed back into the appropriate JSON or gRPC format.
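
The JSON side of this flow could look roughly like the sketch below. The request and response shapes ("instances" and "predictions"), the SCHEMA column list, the handle_json_request function, and the ThresholdModel stand-in are all assumptions for illustration; the actual Cubonacci endpoint format is not documented in this README, and gRPC is not covered here.

```python
import json

# Assumed column order of the training data, as a generated schema might record it.
SCHEMA = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


class ThresholdModel:
    """Stand-in for a trained model; only the .predict() name comes from the README."""

    def predict(self, features):
        # Toy rule on petal length (third column), not a real trained model.
        return ["setosa" if row[2] < 2.5 else "versicolor" for row in features]


def to_training_format(instance):
    """Validate one JSON object against the schema and order it like a training row."""
    row = []
    for column in SCHEMA:
        value = instance.get(column)
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"Field {column!r} must be a number")
        row.append(float(value))
    return row


def handle_json_request(body, model):
    """Decode the request, call .predict(), and encode the predictions as JSON."""
    payload = json.loads(body)
    features = [to_training_format(instance) for instance in payload["instances"]]
    predictions = model.predict(features)
    return json.dumps({"predictions": predictions})


request_body = json.dumps({
    "instances": [
        {"sepal_length": 5.1, "sepal_width": 3.5, "petal_length": 1.4, "petal_width": 0.2},
        {"sepal_length": 7.0, "sepal_width": 3.2, "petal_length": 4.7, "petal_width": 1.4},
    ]
})
response_body = handle_json_request(request_body, ThresholdModel())
```

Validating each field before calling .predict() means malformed requests fail with a clear error instead of reaching the model.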
