Experiment tracking with MLflow

With MLflow Tracking you can organize all the experiments of your ML projects in a single platform. Read the article to learn more, and click the GitHub icon to access the repo.

MLflow is an open-source tool backed by Databricks. It works with several languages, such as R, Python and Java. It was designed to be multi-user and integrates easily with Apache Spark. It has 3 main components:

MLflow Tracking
MLflow Projects
MLflow Models

MLflow Projects is a format for packaging code so that experiment runs can be reproduced, and MLflow Models is a format for packaging models so that they can be deployed. From here on I will focus only on MLflow Tracking.

Wine dataset example

The Wine dataset was collected in the north region of Portugal. The goal is to predict the wine's quality based on its chemical features. [source]

Please check the GitHub icon for the full code and analysis.

MLflow step 1 - Setup + UI

Install mlflow from the console, then move to your desired directory and run mlflow ui.

pip install mlflow
mlflow ui

After starting the UI, open http://localhost:5000 (the default address) in your browser. You should see a screen like this:

MLflow step 2 - Run the first experiment

MLflow provides several functions to configure the session, such as:

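import mlflow
mlflow.set_tracking_uri('http://xx.xx.x.xx:5000')  # address of the tracking server
mlflow.set_experiment("Elastic Net")  # the experiment is created if it does not exist yet
mlflow.start_run(run_name='name')  # the first positional argument is run_id, so pass the name by keyword
mlflow.set_tag("mlflow.note.content", "Grid search")  # shown as the run's note in the UI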

After adding these calls you should see a similar structure in the UI.

MLflow tracks 3 main kinds of information: parameters, metrics and artifacts.

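mlflow.log_param('name', value)  # one hyperparameter; log_params(dict) logs several at once
mlflow.log_metric('name', value, step=step)  # one numeric metric; log_metrics(dict, step=step) logs several
mlflow.log_artifact('path/to/file')  # one local file; log_artifacts('path/to/dir') logs a whole folder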

Once the metrics are logged, you can plot them in the UI and compare runs side by side, which helps you choose the best model.

Tips

If your problem involves epochs, I advise you to use the step parameter, because it allows a better comparison between epochs. The same can be done with the folds in k-fold cross-validation.

import mlflow
mlflow.log_metric('name', variable, step=step)  # step is the epoch (or fold) index

To sum up

MLflow is a tool under active development, with great features. I currently use MLflow Tracking on a daily basis. I found this tool when I started to have a lot of experiments and felt lost among all the notebooks.

Tiago Cabo