This is an example repository for the Cubonacci platform. This README describes the basic machine learning lifecycle and how the different components interact.
A machine learning model needs data to train on. If Cubonacci has a matching, existing data snapshot available, you can select it for training. If not, Cubonacci calls the .load_data() method on the DataLoader and caches the result inside the platform. By inspecting the returned data objects in depth, a schema is dynamically created. This schema is used for storing the data efficiently, setting up the API schemas, and validating data in later stages.
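The data-loading step above can be sketched as follows. This is a minimal illustration, not Cubonacci's actual interface: the class and the infer_schema helper are hypothetical, and only the .load_data() method name comes from the text.

```python
class DataLoader:
    """Hypothetical DataLoader sketch; Cubonacci calls .load_data() once
    and caches the returned objects inside the platform."""

    def load_data(self):
        # Illustrative in-memory dataset; a real loader would read from
        # object storage, a database, or an external API.
        features = [
            {"age": 34, "income": 52000.0},
            {"age": 27, "income": 48000.0},
        ]
        targets = [0, 1]
        return features, targets


def infer_schema(rows):
    # Mimics the dynamic schema inference: inspect the returned data
    # and record a type per field.
    return {key: type(value).__name__ for key, value in rows[0].items()}


loader = DataLoader()
X, y = loader.load_data()
schema = infer_schema(X)  # e.g. {'age': 'int', 'income': 'float'}
```

The inferred schema is what later stages reuse for storage, API definitions, and validation.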
When Cubonacci needs to train a model, it instantiates the corresponding object with the hyperparameters for that training session. These hyperparameters come either from an experiment run, where Cubonacci determines which settings to try, or from the user interface. The data is loaded and passed to the model's .fit() method.
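A model class following this contract might look like the sketch below. The class name, the specific hyperparameters, and the internals are illustrative assumptions; only the constructor-takes-hyperparameters pattern and the .fit()/.predict() method names come from the text.

```python
class ChurnModel:
    """Hypothetical model class; Cubonacci instantiates it with the
    hyperparameters chosen for this training session."""

    def __init__(self, learning_rate=0.1, n_estimators=100):
        self.learning_rate = learning_rate
        self.n_estimators = n_estimators
        self.fitted = False

    def fit(self, features, targets):
        # Real training logic would go here; this sketch only records
        # that training has happened.
        self.fitted = True
        return self

    def predict(self, features):
        # Placeholder prediction: always the majority class.
        return [0 for _ in features]


# Hyperparameters come from an experiment run or the user interface:
model = ChurnModel(learning_rate=0.05, n_estimators=200)
model.fit([{"age": 34, "income": 52000.0}], [0])
```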
During an experiment run, every model is trained on part of the training data. After training, the remaining data is used to evaluate the model's performance: the features of this validation set are passed to the trained model, and the resulting predictions are passed together with the targets to the metrics defined in the metrics
folder. With a validation scheme that trains models multiple times on different parts of the data, this process is repeated and the metrics are averaged. After an experiment with a given hyperparameter setting completes, Cubonacci collects the metrics, uses them to steer the rest of the experiment run, and shows them in the user interface.
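The evaluate-then-average step can be illustrated with a single metric. The accuracy function stands in for one of the user-defined metrics in the metrics folder; the fold data is made up for the example.

```python
def accuracy(predictions, targets):
    # One example metric: the fraction of correct predictions.
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# With a validation scheme that trains on multiple data splits, the
# metric is computed once per fold and the results are averaged:
fold_scores = [
    accuracy([0, 1, 1], [0, 1, 0]),  # fold 1
    accuracy([1, 1, 0], [1, 1, 0]),  # fold 2
]
mean_score = sum(fold_scores) / len(fold_scores)
```

The averaged score is what Cubonacci reports per hyperparameter setting.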
Once the optimal settings are found, or when the hyperparameter search was concluded in a previous iteration, a model can be trained on the full training set. Instead of collecting metrics, Cubonacci saves the model for later use and inspects a number of sample predictions to determine what the prediction schema looks like, so that the platform can prepare to run the model in production.
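Inferring the prediction schema from a few sample outputs might work roughly like this. The helper and the sample values are hypothetical; the text only states that Cubonacci inspects some predictions to derive the schema.

```python
def infer_prediction_schema(predictions):
    # Inspect a handful of sample predictions to determine the output
    # type before preparing the production deployment.
    sample = predictions[0]
    return {"type": type(sample).__name__}


sample_predictions = [0.87, 0.12, 0.45]  # e.g. predicted probabilities
prediction_schema = infer_prediction_schema(sample_predictions)
```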
After a model is trained on the full training set, it can be deployed as an API. When deployed, an endpoint is available that can be called with JSON or with gRPC. The process serving these models bases the API schemas on the previously generated data schemas. Incoming calls are transformed into the same object format as the data the model was trained on, after which .predict()
is called on the incoming data. The predictions are transformed back into the appropriate gRPC or JSON format.
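The JSON serving path can be sketched end to end under some assumptions: the handle_request function, the "instances"/"predictions" payload layout, and the validation logic are all hypothetical, standing in for the schema-based request handling the platform performs.

```python
import json


def handle_request(model_predict, payload, feature_schema):
    """Hypothetical serving sketch: validate a JSON payload against the
    stored data schema, call the model, and serialize the result."""
    rows = json.loads(payload)["instances"]
    # Validate each row against the schema derived at data-loading time.
    for row in rows:
        for field, type_name in feature_schema.items():
            if type(row[field]).__name__ != type_name:
                raise ValueError(f"field {field!r} expects {type_name}")
    # Rows are now in the same object format as the training data.
    predictions = model_predict(rows)
    return json.dumps({"predictions": predictions})


feature_schema = {"age": "int", "income": "float"}
request = json.dumps({"instances": [{"age": 34, "income": 52000.0}]})
response = handle_request(lambda rows: [0 for _ in rows], request, feature_schema)
```

A gRPC endpoint would follow the same transform-predict-transform pattern, with protobuf messages in place of JSON.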