Deploying Machine Learning Workloads Using MLflow

Let’s get straight to the point.

What is MLflow?

The MLflow project is designed to make it easy for machine learning practitioners to track experiments, manage projects, and deploy models.

It currently offers four components:

  • The MLflow Tracking component is an API and user interface for logging parameters, code versions, metrics, and output files when running machine learning code, and for visualizing the results afterwards. MLflow Tracking lets you log and query experiments using the Python, REST, R, and Java APIs.

  • An MLflow Project is a format for packaging data science code in a reusable and reproducible way, based primarily on conventions. In addition, the Projects component includes an API and command-line tools for running projects, which makes it possible to chain projects into workflows.

  • An MLflow Model is a standard format for packaging machine learning models so that they can be used by a variety of downstream tools, such as real-time serving through a REST API or batch inference on Apache Spark. The format defines a convention that lets you save a model in different “flavors” that different downstream tools can understand.

  • The MLflow Model Registry component is a centralized model store, set of APIs, and user interface for collaboratively managing the full lifecycle of an MLflow Model. It provides model lineage (which MLflow experiment and run produced the model), model versioning, stage transitions (for example, from staging to production), and annotations.

Installing MLflow is straightforward if you follow the documentation, so we will not spend time on it here.

When everything is installed, start the server with the following command:

nohup mlflow server --host \
                    --port 5000 \
                    --backend-store-uri file:///root/mlflow &

Let’s run the first MLflow example using the Tracking API.

The MLflow tracking API allows you to log metrics and artifacts (files) from your data processing code and view the execution history.

Here is a simple Python script:

import os
from random import random, randint

import mlflow
from mlflow import log_metric, log_param, log_artifacts



if __name__ == "__main__":

    log_param("hyperparam1", randint(0, 100))

    log_metric("accuracy", random())
    log_metric("accuracy", random() + 1)
    log_metric("accuracy", random() + 2)

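    # Create the output directory if it does not exist yet
    if not os.path.exists("outputs"):
        os.makedirs("outputs")
    with open("outputs/model.txt", "w") as f:
        f.write("hello world!")

    # Record everything in outputs/ as artifacts of the run
    log_artifacts("outputs")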


The script is also available from the repository, so that is what we will clone and work from:

cd ~
git clone
cd ~/katacoda-notebooks/06_mlflow_install/

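By default, wherever we run our program, the tracking API writes data to files in a local ./mlruns directory. Setting the environment variable MLFLOW_TRACKING_URI points it at a different location, such as the file:///root/mlflow store used by the server above.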
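
As a small illustration of how that variable can be formed, the sketch below builds the file:// URI for a store directory and exports it from Python. The /root/mlflow path is the one passed to the server earlier; setting the variable from inside the script (rather than in the shell) is only an assumption about how you launch your code, and the path conversion assumes a POSIX system.

```python
import os
from pathlib import Path

# Directory used as the backend store when the server was started.
store_dir = Path("/root/mlflow")

# as_uri() turns an absolute path into a file:// URI.
tracking_uri = store_dir.as_uri()

# MLflow picks the variable up from the environment,
# so it should be set before the script starts logging.
os.environ["MLFLOW_TRACKING_URI"] = tracking_uri

print(tracking_uri)
```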

MLflow training

Tracking our model training runs with MLflow is pretty easy.

We simply add an experiment tracker to our training script and run the model training as shown here:


### Framework imports ###


# Import mlflow and framework support

import mlflow
import mlflow.sklearn  # or mlflow.keras / mlflow.tensorflow, whichever matches your framework

# Setup Experiment Tracker




### Framework specific model training code ###


# Start the MLflow run
with mlflow.start_run():
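    # Fit the model; pass your framework's usual training arguments here
    history = model.fit(x_train, y_train,
                        batch_size=batch_size, verbose=1)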
    score = model.evaluate(x_test, y_test,
                       batch_size=batch_size, verbose=1)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])

The MLflow tracking APIs can log information about each training run, such as the hyperparameters used to train the model and the metrics used to evaluate the model. We can also serialize the model in a format that MLflow knows how to deploy.

Inside the with mlflow.start_run(): block we can add the logging calls:

    # Log Parameters
    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)

    # Log Metrics
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)
    mlflow.log_metric("mae", mae)

    # Log Model (use the module that matches your framework:
    # mlflow.keras, mlflow.tensorflow or mlflow.sklearn)
    mlflow.sklearn.log_model(lr, "model")

Every time we run the code, MLflow logs information about our experiments in the location specified in the environment variable MLFLOW_TRACKING_URI.
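
For reference, the three metrics logged above (rmse, mae, r2) can be computed along the following lines. This is a minimal sketch in plain Python rather than the sklearn.metrics helpers a real training script would more likely use; the function name and the sample values are made up for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for two equally long sequences of numbers."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)                   # root mean squared error
    mae = sum(abs(e) for e in errors) / n          # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination
    return rmse, mae, r2


# Made-up values, only to exercise the function.
rmse, mae, r2 = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(rmse, mae, r2)
```

The resulting numbers are what would be passed to mlflow.log_metric() in the snippet above.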

Now let’s see this in action and train models with Keras, TensorFlow, and scikit-learn.

We run a Keras example that trains and evaluates a simple MLP on the Reuters newswire topic classification task.

cd ~/katacoda-notebooks/07_mlflow_training

We also run the TensorFlow sample, which likewise trains and evaluates a simple MLP on the Reuters newswire topic classification task.

cd ~/katacoda-notebooks/07_mlflow_training

We run the scikit-learn example, which predicts wine quality with sklearn.linear_model.ElasticNet.

cd ~/katacoda-notebooks/07_mlflow_training

Let’s open the UI

Pipelines in MLflow

In MLflow, you can implement pipelines as multi-step workflows.

The mlflow.projects.run() API, in conjunction with mlflow.tracking, allows you to create multi-step workflows with separate projects (or entry points in the same project) as the individual steps. Every call to mlflow.projects.run() returns a run object that can be used with mlflow.tracking to determine when the run has ended and to get its output artifacts. These artifacts can then be passed to another step that accepts path or URI parameters. You can coordinate the entire workflow in a single Python program that reviews the results of each step and decides what to submit next using custom code.

In the following example, we will predict users’ movie ratings given their rating history of other movies (based on the MovieLens dataset).

This workflow has four steps:

  • Downloads the MovieLens dataset (a set of triplets of user ID, movie ID, and rating) in CSV format and places it in the artifact store.

  • Converts the MovieLens CSV file from the previous step to Parquet, removing unnecessary columns along the way. This reduces the size of the input data from 500MB to 49MB and provides columnar access to the data.

  • Runs alternating least squares (ALS) collaborative filtering on the Parquet version of MovieLens to estimate movieFactors and userFactors. This produces a relatively accurate estimator.

  • Trains a neural network on the original data augmented with the ALS movie and user factors; we hope this will improve on the ALS scores.

A single program performs these steps in order, passing the results of one step to the next.
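
The chaining pattern itself can be sketched without Spark or Keras. In the toy version below each step is a plain function that returns the location of its output, and the driver passes that location to the next step. In a real MLflow workflow each step would instead be a separate entry point launched through mlflow.projects.run(); the function names, the tiny made-up dataset, JSON in place of Parquet, and a per-movie mean in place of ALS are all stand-ins chosen for illustration.

```python
import csv
import json
import tempfile
from pathlib import Path


def load_raw_data(workdir):
    """Step 1: write a tiny stand-in for the ratings CSV and return its path."""
    path = workdir / "ratings.csv"
    rows = [("userId", "movieId", "rating", "timestamp"),
            (1, 10, 4.0, 111), (1, 20, 3.0, 112), (2, 10, 5.0, 113)]
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path


def etl_data(ratings_csv, workdir):
    """Step 2: drop the unused timestamp column and store the cleaned records."""
    with open(ratings_csv, newline="") as f:
        records = [{"userId": int(r["userId"]),
                    "movieId": int(r["movieId"]),
                    "rating": float(r["rating"])} for r in csv.DictReader(f)]
    path = workdir / "ratings.json"
    path.write_text(json.dumps(records))
    return path


def summarize(ratings_json):
    """Step 3 stand-in: mean rating per movie instead of a real ALS model."""
    ratings = {}
    for r in json.loads(Path(ratings_json).read_text()):
        ratings.setdefault(r["movieId"], []).append(r["rating"])
    return {movie: sum(v) / len(v) for movie, v in ratings.items()}


def run_workflow():
    """Driver: run the steps in order, feeding each output into the next step."""
    workdir = Path(tempfile.mkdtemp())
    raw = load_raw_data(workdir)
    clean = etl_data(raw, workdir)
    return summarize(clean)


result = run_workflow()
print(result)
```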

MLflow Tracking

MLflow Tracking is organized around the concept of runs, which are executions of some piece of data science code.

Each run records the following information:

  • Code version: the Git commit hash used for the run, if it was launched from an MLflow Project.

  • Start and end time: the start and end time of the run.

  • Source: the name of the file used to launch the run, or the project name and entry point if the run was launched from an MLflow Project.

  • Parameters: key-value input parameters of your choice. Both keys and values are strings.

  • Metrics: key-value metrics, where the value is numeric. Each metric can be updated at runtime (for example, to track how your model’s loss function converges), and MLflow records and allows you to visualize the full history of the metric.

  • Artifacts: output files in any format. For example, you can write images (like a PNG), models (like a pickled scikit-learn model), and data files (like a Parquet file) as artifacts.

Optionally, we can organize runs into experiments, which group together the runs for a specific task. We can create an experiment using the mlflow experiments command-line interface, the mlflow.create_experiment() function, or the corresponding REST API. The MLflow API and UI let you create and search for experiments.

Once our runs are recorded, we can query them using the tracking UI or the MLflow API.

Here is a simple Python example:

with mlflow.start_run():
    for epoch in range(0, 3):
        mlflow.log_metric(key="quality", value=2*epoch, step=epoch)

We run a multi-step Spark and Keras sample:

cd ~/katacoda-notebooks/08_mlflow_pipelines
cat MLproject

cd ~/katacoda-notebooks/08_mlflow_pipelines
MLFLOW_TRACKING_URI=file:///root/mlflow mlflow run --experiment-name spark-keras

Let’s look at it in the UI

Hyperparameter tuning

Hyperparameters are variables that control how the model is trained.

For example:

  • Learning rate.

  • The number of layers in the neural network.

  • The number of nodes in each layer.

Hyperparameter values are not learned. In other words, unlike node weights and other trained parameters, the model training process does not adjust the hyperparameter values.

Hyperparameter tuning is the process of optimizing hyperparameter values to maximize the predictive accuracy of the model.

There are many approaches to optimizing/tuning hyperparameters:

  • Grid search

  • Random search

  • Bayesian optimization

  • Gradient-based optimization

  • Evolutionary optimization

  • Population-based training
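
The first two approaches are simple enough to sketch in a few lines. The example below is illustrative only: validation_loss is a made-up objective standing in for an actual training run, and the search ranges are assumptions rather than values taken from any of the samples in this article.

```python
import itertools
import random


def validation_loss(learning_rate, momentum):
    """Toy objective standing in for a real training run (lower is better)."""
    return (learning_rate - 0.1) ** 2 + (momentum - 0.9) ** 2


def grid_search(learning_rates, momenta):
    """Evaluate every combination and return the best (loss, lr, momentum)."""
    trials = [(validation_loss(lr, m), lr, m)
              for lr, m in itertools.product(learning_rates, momenta)]
    return min(trials)


def random_search(n_trials, seed=0):
    """Sample lr log-uniformly in [1e-4, 1] and momentum uniformly in [0, 1]."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-4, 0)
        m = rng.uniform(0.0, 1.0)
        trials.append((validation_loss(lr, m), lr, m))
    return min(trials), trials


best_grid = grid_search([0.001, 0.01, 0.1], [0.0, 0.5, 0.9])
best_random, all_random = random_search(n_trials=20)
print(best_grid)
print(best_random)
```

Grid search only ever tries the listed values, while random search can land anywhere in the range, which is why it is often preferred when only a few of the hyperparameters really matter.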

Automatic hyperparameter tuning works by optimizing a target variable, also called a target metric, that you specify in the hyperparameter tuning job configuration.

A common metric is the accuracy of the model on a held-out validation set (validation accuracy). You also specify whether you want the hyperparameter tuning job to maximize or minimize the metric.

Hyperparameter tuning with MLflow

First, we need to decide what to track.

  • Hyperparameters: all of them vs. only those being tuned

  • Metric(s): training and validation, loss and target, multiple targets

  • Tags: provenance, simple metadata

  • Artifacts: serialized model, large metadata

At a high level, the best practice for organizing runs and tracking hyperparameter tuning is to mirror the structure of the tuning job itself.

In the following example, we will try to optimize the RMSE score of a Keras deep learning model for a wine quality dataset. The Keras model has two hyperparameters that we are trying to optimize: learning rate and momentum.

The input dataset is divided into three parts: training, validation, and test. The training set is used to fit the model, the validation set is used to select the best hyperparameter values, and the test set is used to estimate the expected performance and to verify that we have not overfitted to a particular combination of training and validation data.
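
A three-way split of this kind can be sketched as follows. The 60/20/20 proportions, the seed, and the function name are assumptions chosen for illustration; the article does not state which proportions the sample uses.

```python
import random


def train_valid_test_split(rows, valid_fraction=0.2, test_fraction=0.2, seed=42):
    """Shuffle the rows once and cut them into three disjoint parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_valid = int(len(shuffled) * valid_fraction)
    test = shuffled[:n_test]
    valid = shuffled[n_test:n_test + n_valid]
    train = shuffled[n_test + n_valid:]
    return train, valid, test


# 100 dummy row indices stand in for the wine quality records.
train, valid, test = train_valid_test_split(range(100))
print(len(train), len(valid), len(test))
```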

The RMSE on all three sets is logged with MLflow, and we can use the MLflow UI to check how the values differ between hyperparameter settings.

MLproject has 4 entry points:

  • train trains a simple deep learning model on the wine quality dataset from our tutorial. It has 2 tunable hyperparameters: learning rate and momentum. It also contains examples of how Keras callbacks can be used to integrate with MLflow.

  • random performs a simple random search over the parameter space.

  • gpyopt uses GPyOpt to optimize the hyperparameters of train. GPyOpt can run multiple MLflow runs in parallel if run with batch size > 1 and max_p > 1.

  • hyperopt uses Hyperopt to optimize the hyperparameters.

Let’s run this.

We run the hyperparameter tuning example:

cd ~/katacoda-notebooks/09_mlflow_tuning
cat MLproject

cd ~/katacoda-notebooks/09_mlflow_tuning
MLFLOW_TRACKING_URI=file:///root/mlflow mlflow run -e random --experiment-name random-search

Let’s take a look at the UI

MLflow Model Serving

MLflow can deploy models locally as REST API endpoints or use them to score files directly. Additionally, MLflow can package models as standalone Docker images with a REST API endpoint. The image can then be used to safely deploy the model to various environments such as Kubernetes.

We deploy an MLflow model locally or generate a Docker image using the command-line interface of the mlflow.models module.

The REST API server accepts the following data formats as POST input for the /invocations path:

  • JSON-serialized pandas DataFrames in split orientation. For example, data = pandas_df.to_json(orient="split"). This format is specified using the Content-Type request header value application/json or application/json; format=pandas-split.

  • JSON-serialized pandas DataFrames in records orientation. This format is not recommended because it does not guarantee that the order of the columns will be preserved. This format is specified using the Content-Type request header value application/json; format=pandas-records.

  • CSV serialized pandas DataFrames. For example, data = pandas_df.to_csv(). This format is specified using the request header value Content-Type text/csv.
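
To make the difference between the two JSON orientations concrete, the sketch below builds both payloads by hand with the standard json module instead of pandas. The column names and values are made up. Note that this follows the request format described in this article; MLflow 2.0 and later expect the payload wrapped under a key such as dataframe_split or dataframe_records, so check the documentation for the version you run.

```python
import json

columns = ["a", "b", "c"]
rows = [[1, 2, 3], [4, 5, 6]]

# Split orientation: column names listed once, then the rows as plain lists.
split_payload = {"columns": columns, "data": rows}

# Records orientation: one dict per row, column names repeated in each row.
records_payload = [dict(zip(columns, row)) for row in rows]

split_body = json.dumps(split_payload)
records_body = json.dumps(records_payload)

print(split_body)
print(records_body)
```

In the split form the column order is carried explicitly by the columns list, which is why it is the safer choice of the two.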

Here is an example call:

# split-oriented
curl http://localhost:5001/invocations -H 'Content-Type: application/json' -d '{
    "columns": ["a", "b", "c"],
    "data": [[1, 2, 3], [4, 5, 6]]
}'

# record-oriented (fine for vector rows, loses ordering for JSON records)
curl http://localhost:5001/invocations -H 'Content-Type: application/json; format=pandas-records' -d '[[1, 2, 3], [4, 5, 6]]'

The predict command accepts the same input formats. The format is specified as a command-line argument.

The mlflow models command-line interface provides the following commands:

  • serve deploys the model as a local REST API server.

  • build_docker packages a REST API endpoint serving the model as a Docker image.

  • predict uses the model to generate a prediction for a local CSV or JSON file.

Let’s look at the model we are going to deploy:

cd ~/katacoda-notebooks/10_mlflow_serving

We run the model deployment sample:

cd ~/katacoda-notebooks/10_mlflow_serving
MLFLOW_TRACKING_URI=file:///root/mlflow mlflow run --experiment-name serve-sample .

Deploy the model in the background:

MLFLOW_TRACKING_URI=file:///root/mlflow mlflow models serve --model-uri runs:/[PASTE-THE-RUN-ID-HERE]/model --port 5001 &

All that remains is to request a prediction:

curl -d '{"columns":[0],"index":[0,1],"data":[[1],[-1]]}' -H 'Content-Type: application/json' http://localhost:5001/invocations
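
The same call can be made from Python. The sketch below uses only the standard library and mirrors the curl command above (without the optional index field); the helper names are made up, and predict() is defined but not called here because it needs the model server from the previous step to be running on port 5001.

```python
import json
import urllib.request


def build_invocation_request(url, columns, data):
    """Create a POST request carrying a split-oriented JSON payload."""
    body = json.dumps({"columns": columns, "data": data}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(request):
    """Send the request and decode the JSON answer (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_invocation_request(
    "http://localhost:5001/invocations", columns=[0], data=[[1], [-1]]
)
print(request.get_method(), request.full_url)
```

With the server up, predict(request) would return the decoded response of the model.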

We learned how to deploy different machine learning workloads using MLflow.

The article was prepared in anticipation of the start of the MLOps course by OTUS.
