Machine learning: TensorFlow project manages metadata for model training

Source: Heise.de added 11th Jan 2021


The team behind TensorFlow Extended (TFX) has published a project for storing metadata about the training of machine learning models. ML Metadata (MLMD) is intended to help version training runs and analyze how they relate to one another. The project comes with an API for storing and retrieving the metadata and can work with different databases.

TensorFlow Extended is a platform for creating and managing machine learning pipelines that bring models into production. The newly presented ML Metadata project is intended to help version and analyze the training of ML models, much as version control systems do for source code. The project is available both integrated into the TFX platform and as a standalone library.

Keeping a record of the training process

For each step of the training, ML Metadata stores information about the data set and the hyperparameters used. MLMD's client libraries monitor every step in the pipeline and manage the respective input and output artifacts.
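What such a record might contain can be sketched in a few lines of Python. The classes and method names below (`TrainingLog`, `add_artifact`, `record_step`) are hypothetical stand-ins chosen for illustration, not MLMD's client library; the sketch only shows how a step, its hyperparameters, and its input and output artifacts could be tied together.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Artifact:
    id: int
    type_name: str          # e.g. "DataSet" or "SavedModel"
    uri: str                # where the artifact is stored
    properties: dict = field(default_factory=dict)


@dataclass
class Execution:
    id: int
    step: str               # name of the pipeline step
    hyperparameters: dict
    inputs: list            # ids of the artifacts the step consumed
    outputs: list           # ids of the artifacts the step produced


class TrainingLog:
    """Hypothetical in-memory record of pipeline steps (not the MLMD API)."""

    def __init__(self):
        self._ids = count(1)
        self.artifacts = {}
        self.executions = {}

    def add_artifact(self, type_name, uri, **properties):
        artifact = Artifact(next(self._ids), type_name, uri, properties)
        self.artifacts[artifact.id] = artifact
        return artifact

    def record_step(self, step, inputs, outputs, **hyperparameters):
        execution = Execution(
            next(self._ids),
            step,
            hyperparameters,
            [a.id for a in inputs],
            [a.id for a in outputs],
        )
        self.executions[execution.id] = execution
        return execution


log = TrainingLog()
dataset = log.add_artifact("DataSet", "path/to/data", split="train")
model = log.add_artifact("SavedModel", "path/to/model")
run = log.record_step("Trainer", [dataset], [model], learning_rate=0.01, epochs=5)
```

Because each execution keeps only artifact ids, the same data set can be referenced by any number of training runs without being copied.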

A GUI helps to manage and analyze the data. The storage backend is interchangeable: among others, MySQL or the lightweight database library SQLite can be used. For testing, the latter can optionally run in “Fake Database” mode purely in memory, without storing any information on disk.
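The difference between a persistent and a purely in-memory store can be illustrated with Python's built-in `sqlite3` module. This is a minimal sketch under the assumption of a single hypothetical `artifact` table; it does not use MLMD's own connection configuration, and the helper names `open_store` and `put_artifact` are invented for the example.

```python
import sqlite3


def open_store(path=None):
    """Open a metadata store; with no path, data lives only in memory."""
    conn = sqlite3.connect(path if path is not None else ":memory:")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS artifact ("
        "id INTEGER PRIMARY KEY, type TEXT NOT NULL, uri TEXT NOT NULL)"
    )
    return conn


def put_artifact(conn, type_name, uri):
    """Insert one artifact row and return its generated id."""
    cur = conn.execute(
        "INSERT INTO artifact (type, uri) VALUES (?, ?)", (type_name, uri)
    )
    conn.commit()
    return cur.lastrowid


# In-memory store for tests: nothing is written to disk.
store = open_store()
artifact_id = put_artifact(store, "DataSet", "path/to/data")
rows = store.execute(
    "SELECT type, uri FROM artifact WHERE id = ?", (artifact_id,)
).fetchall()
```

Passing a file name to `open_store` instead would persist the same table on disk, which is the kind of swap an interchangeable backend allows without touching the calling code.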

MLMD stores information about each step in the data store.

(Image: TensorFlow Extended)

Artifacts at a glance

ML Metadata then makes it possible to analyze the training steps. Data scientists can filter artifacts by given criteria, for example to list all models that were trained on a specific data set. They can also load artifacts of the same type to compare experiments or training runs. For an individual artifact, all steps can be traced back in order to identify, among other things, which data were used to train a model.
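Both kinds of query described above, tracing an artifact back to its sources and listing the models trained on a given data set, amount to walking a graph of steps and artifacts. The following sketch assumes a hypothetical list of recorded steps with made-up artifact names; it illustrates the idea only and is not how MLMD implements or exposes these queries.

```python
from collections import namedtuple

# Hypothetical record of one pipeline step: its name plus input/output artifacts.
Step = namedtuple("Step", "name inputs outputs")

steps = [
    Step("ExampleGen", inputs=[], outputs=["dataset_v1"]),
    Step("Transform", inputs=["dataset_v1"], outputs=["features_v1"]),
    Step("Trainer", inputs=["features_v1"], outputs=["model_a"]),
    Step("Trainer", inputs=["features_v1"], outputs=["model_b"]),
    Step("Trainer", inputs=["dataset_v2"], outputs=["model_c"]),
]


def upstream(artifact, steps):
    """Return every artifact that contributed to `artifact`, directly or not."""
    seen = set()
    stack = [artifact]
    while stack:
        current = stack.pop()
        for step in steps:
            if current in step.outputs:
                for parent in step.inputs:
                    if parent not in seen:
                        seen.add(parent)
                        stack.append(parent)
    return seen


def models_trained_on(dataset, steps):
    """List outputs of Trainer steps whose lineage includes `dataset`."""
    models = [out for s in steps if s.name == "Trainer" for out in s.outputs]
    return sorted(m for m in models if dataset in upstream(m, steps))


lineage = upstream("model_a", steps)
models = models_trained_on("dataset_v1", steps)
```

Here `model_a` traces back through `features_v1` to `dataset_v1`, so it is reported as trained on that data set even though the training step never read it directly.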

MLMD comes with a low-level API to integrate the creation of metadata into the ML pipeline. In addition, the library can be directly integrated into the model training in order to generate automatic logs and to create a history of the training processes.
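One way to picture automatic logging inside the training code is a decorator that records every call to a training function. The sketch below is an assumption-laden illustration: `tracked`, `history`, and the placeholder `train` function are hypothetical and do not reflect how MLMD hooks into model training.

```python
import functools
import time

# Hypothetical in-process history of training runs.
history = []


def tracked(train_fn):
    """Record hyperparameters, result and duration of each training call."""

    @functools.wraps(train_fn)
    def wrapper(**hyperparameters):
        started = time.time()
        result = train_fn(**hyperparameters)
        history.append(
            {
                "step": train_fn.__name__,
                "hyperparameters": dict(hyperparameters),
                "result": result,
                "seconds": time.time() - started,
            }
        )
        return result

    return wrapper


@tracked
def train(learning_rate, epochs):
    # Placeholder for real training: returns a made-up "loss" value.
    return round(1.0 / (1 + learning_rate * epochs), 4)


train(learning_rate=0.1, epochs=10)
train(learning_rate=0.01, epochs=10)
```

After the two calls, `history` holds one entry per run, so runs with different hyperparameters can be compared later without any extra bookkeeping in the training function itself.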

Further details can be found on the TensorFlow blog. The project is currently at version 0.26.0, and the README of the GitHub repository explicitly states that incompatible changes may occur until the release of version 1.0. The TFX team has published a tutorial to help you get started.

(rme)

