An industrial-scale brain or how to make a dream come true?

In the previous article, we looked at different types of neural networks and discussed what problems can be solved with their help. Now let's look at the problem of artificial intelligence from an organizational and technical point of view.

When working on complex projects, a team of developers and data scientists is usually involved, and questions arise immediately: how to manage the project, how to jointly develop and test a machine learning model, how to synchronize code and experiment results? After the ML model has been developed and optimized, it needs to be deployed to a production environment. All of these problems may seem less exciting than the machine learning task itself, but they are critical to the success of ML projects.

In this article, we will take a detailed look at the life cycle of an ML service from idea to development and implementation, as well as the tools and principles used at each stage.

Life cycle and participants of an ML project

Artificial intelligence projects may seem like a new world, but in practice they follow the standard stages of an IT project. At the stages related to machine learning, specialized tools are needed, and such tools are quite mature today. The diagram shows the main stages of an IT project, with the ML-specific ones highlighted in green:

  • Setting business goals. At this stage, the business goals of the project are defined.

  • Evaluation, analysis and preparation of data. This is the key ML-specific stage: its results affect the viability of the entire project, so it is placed at the very beginning, data scientists are involved in it immediately, and dedicated time is allocated to studying the data.

  • Project initiation, with a Go/No-Go decision, is also a standard project-management stage.

  • Formalization of requirements and acceptance criteria. At this stage, the following happens:

    • Developing functional and non-functional requirements and agreeing on them with the customer in the form of documents: technical specifications, design documentation, performance requirements, etc.

    • Determining the budget and necessary equipment.

  • The stage of developing and testing ML models and code is the most resource- and time-consuming.

  • If the results of the previous stages meet the business requirements, a decision is made to deploy the solution to production.

  • Next comes the operation, monitoring and updating of the solution and ML model.

At different stages of the project, machine learning specialists with different roles are involved. An ML project includes several key roles, which may be performed by different people or by the same person, and whose responsibilities may overlap:

Next, we will consider in detail the key stages of the ML project.

Evaluation, analysis and data preparation

In general, data analysis and preparation consists of the following stages:

Receiving data (Ingestion)

A good practice at this stage is to always keep the original data intact and experiment with copies of the data.
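
As an illustration of this practice, here is a minimal sketch in Python; the directory layout and file names are hypothetical. The raw file is stored once and never modified, and experiments always run on a working copy.

```python
import shutil
from pathlib import Path

RAW_DIR = Path("data/raw")        # original data, never modified (hypothetical layout)
WORK_DIR = Path("data/interim")   # working copies used for experiments

def ingest(source_file: Path) -> Path:
    """Store the original file untouched and return a working copy."""
    RAW_DIR.mkdir(parents=True, exist_ok=True)
    WORK_DIR.mkdir(parents=True, exist_ok=True)

    original = RAW_DIR / source_file.name
    if not original.exists():                 # never overwrite the raw copy
        shutil.copy2(source_file, original)

    working_copy = WORK_DIR / source_file.name
    shutil.copy2(original, working_copy)      # experiments run on this copy only
    return working_copy
```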

Exploration and Validation

At this stage, it is necessary to determine whether any additional data is needed or whether it is possible to move on.
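
A hedged sketch of what such a check might look like with pandas; the acceptance threshold is made up for illustration. It prints basic statistics and flags missing values and duplicates so that the team can decide whether more data is needed.

```python
import pandas as pd

def explore_and_validate(path: str) -> bool:
    """Print basic statistics and return True if the data looks usable."""
    df = pd.read_csv(path)

    print(df.describe(include="all"))      # quick statistical overview
    missing_share = df.isna().mean()        # share of missing values per column
    duplicates = df.duplicated().sum()

    print("Missing values per column:\n", missing_share)
    print("Duplicate rows:", duplicates)

    # Illustrative acceptance rule: no column may be more than 20% empty.
    return bool((missing_share < 0.2).all())
```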

Data Cleaning
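
A minimal cleaning sketch with pandas, assuming tabular data; the specific rules (column renaming, deduplication, median imputation) are only an example of typical operations at this stage.

```python
import pandas as pd

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the data frame."""
    df = df.copy()
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]  # uniform names
    df = df.drop_duplicates()
    df = df.dropna(how="all")                                # drop completely empty rows
    numeric = df.select_dtypes("number").columns
    df[numeric] = df[numeric].fillna(df[numeric].median())   # fill numeric gaps with medians
    return df
```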

After preparing the data, the stage of developing the code and ML model begins.

Development and testing of ML models

Development of ML models

Developing ML models usually consists of the following steps:

Model Training

Model Engineering

Model Evaluation & Testing

Model Packaging
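
To make these steps concrete, here is a compact end-to-end sketch with scikit-learn on a built-in dataset: training, evaluation on a hold-out set, and packaging of the model with joblib. The model choice and metric are illustrative only.

```python
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Model training: fit the model on a training split.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Model evaluation & testing: check quality on held-out data.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Hold-out accuracy: {accuracy:.3f}")

# Model packaging: serialize the trained model for later deployment.
joblib.dump(model, "model.joblib")
```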

Automation of the process of developing and testing ML models (MLOps)

To perform tasks within ML projects, specialized tools are needed; they make it possible to ensure reproducibility of results at all stages of the ML model life cycle.

To automate data analysis, pipeline-building tools are used; they allow you to version data and models and automatically run the necessary steps. An example of such a system is Data Version Control (DVC), often called "Git for data".
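
As an illustration, DVC's Python API can read a specific versioned revision of a data file straight from a repository; the repository URL, file path, and tag below are placeholders.

```python
import pandas as pd
import dvc.api

# Read the exact version of the dataset that was tagged "v1.0" in Git/DVC.
# Repository URL, file path and revision are placeholders for illustration.
with dvc.api.open(
    "data/train.csv",
    repo="https://github.com/example/ml-project",
    rev="v1.0",
) as f:
    df = pd.read_csv(f)

print(df.shape)
```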

To automate the development and deployment of ML models, the MLOps concept emerged. MLOps tools are used to store the results of ML experiments and model versions, and to test and deploy models. Several tools now implement the core MLOps functionality; one of the most popular is MLflow:

MLflow platform

Let's take a closer look at the MLflow platform. It consists of the following main components designed to work with ML projects:

  • MLflow Tracking – monitors and logs the model training process. Stores experimental results, configuration data, and model hyperparameters. Allows you to visualize metrics, compare results and select the best model option.

  • MLflow Projects – this module packages code, data, and all dependencies so that experiments can be reproduced on different platforms.

  • MLflow Models – allows you to save ML models in standard formats ("flavors") for deployment in various environments. The most common are the universal python_function flavor and framework-specific flavors such as sklearn, pytorch, and tensorflow.

MLflow Deployment Schemes

There are various schemes for deploying and using MLflow. Below is the most general one, with a dedicated MLflow server:

With this scheme, the process of developing and deploying an ML model looks like this:

  • The ML model is developed and tested on local hardware, which is integrated with the MLflow Tracking server (a logging sketch follows this list).

  • Source code and data for building the ML model are stored in Git.

  • Trained ML models are saved in the MLflow Model Registry.

  • MLflow Models packages the model into a virtual environment for local deployment or into a Docker container for deployment to cloud platforms and Kubernetes.

  • Using the MLflow deployment tools, the model is deployed to the production environment.
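
To make the scheme concrete, here is a hedged sketch of how a training run might be logged to a dedicated MLflow Tracking server and the resulting model registered; the server address, experiment name, and model name are placeholders.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

mlflow.set_tracking_uri("http://mlflow-server:5000")   # placeholder address of the MLflow server
mlflow.set_experiment("demo-classifier")

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

with mlflow.start_run():
    params = {"n_estimators": 200, "random_state": 42}
    model = RandomForestClassifier(**params).fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))

    mlflow.log_params(params)                 # hyperparameters
    mlflow.log_metric("accuracy", accuracy)   # experiment result
    mlflow.sklearn.log_model(                 # save and register the model
        model,
        artifact_path="model",
        registered_model_name="demo-classifier",
    )
```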

The running ML model is monitored using dedicated platforms, for example Evidently.

Deploying ML Models

Deployment schemes depending on the type of training and prediction

The following types of training and prediction are distinguished: training can be offline (batch, on historical data) or online (incremental, on streaming data), and prediction can be performed in batch mode or on demand (in real time).

Using these types you can construct the following matrix:

Its cells contain the names of the schemes that implement the required functionality:

The cells also list the patterns for embedding ML models into production systems that can be used to implement these schemes:

Next, I will briefly describe the schemes and patterns:

Forecast – the model is usually built on static data in the form of files, and prediction is also performed on static data. BI systems and data-science research work in this mode. This scheme is not intended for use in production systems.

Web service – the most popular scheme. The ML model is built on historical data, but the information for prediction is taken from the request in real time. Retraining the model on current data can be run periodically, or the request itself can trigger the training process on current data (batch run).

Online learning – the most dynamic scheme. It is typically used on streaming data, when the model must change constantly. Retraining may happen not on the production system itself but in parallel to it, in which case the name "incremental training" is more suitable. In such systems, there is a risk that incoming low-quality data will degrade the quality of the model.
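
A hedged sketch of such incremental training using scikit-learn's partial_fit, with a simulated data stream standing in for real streaming input.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

classes = np.array([0, 1])
model = SGDClassifier(random_state=42)
rng = np.random.default_rng(42)

# Simulated stream: each iteration is a small batch of freshly arrived data.
for step in range(100):
    X_batch = rng.normal(size=(32, 4))
    y_batch = (X_batch[:, 0] + X_batch[:, 1] > 0).astype(int)
    model.partial_fit(X_batch, y_batch, classes=classes)  # incremental update

print(model.predict(rng.normal(size=(5, 4))))
```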

Automated machine learning – in this scheme, training, optimization of results, and selection of the ML model happen automatically. Implementing such systems is often even more complex than online learning, because the user only provides data and the system selects the ML model itself. This scheme is typically offered by large AI providers such as Google or Microsoft.

Model Serving Patterns

Each training and prediction scheme can be implemented with different technical patterns:

Model-as-Service – the simplest pattern. The ML model runs as a service to which the application sends requests via a REST API.
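
A minimal sketch of this pattern using FastAPI as the REST layer; the model file name, endpoint, and port are illustrative, not taken from the article.

```python
import joblib
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")     # model packaged earlier (hypothetical file name)

class PredictionRequest(BaseModel):
    features: list[float]               # one feature vector per request

@app.post("/predict")
def predict(request: PredictionRequest) -> dict:
    prediction = model.predict(np.array([request.features]))[0]
    return {"prediction": int(prediction)}

# Run with (assuming this file is saved as service.py):
#   uvicorn service:app --host 0.0.0.0 --port 8000
```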

Model-as-Dependency – the most straightforward way to use an ML model. With this pattern, the model is embedded directly in the application.
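
A minimal sketch of this pattern, assuming the model was packaged earlier as model.joblib (a hypothetical file name): the application loads the model and calls it as an ordinary library dependency.

```python
import joblib
import numpy as np

class ScoringApplication:
    """Hypothetical application that bundles the ML model as an internal dependency."""

    def __init__(self, model_path: str = "model.joblib"):
        # The model file is shipped together with the application code.
        self.model = joblib.load(model_path)

    def score(self, features: list[float]) -> int:
        # Prediction is a local function call, no network hop involved.
        return int(self.model.predict(np.array([features]))[0])

# Usage (the feature vector must match what the model was trained on):
# app = ScoringApplication()
# print(app.score([...]))
```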

Precompute Serving – with this pattern, pre-computed predictions are used.
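
A minimal sketch of this pattern, using SQLite as a stand-in for whatever key-value or table store a real system would use (table and column names are made up): a batch job writes predictions ahead of time, and serving reduces to a key lookup.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect("predictions.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS predictions (customer_id TEXT PRIMARY KEY, score REAL)"
)

def store_batch(scores: dict) -> None:
    """Write precomputed predictions, e.g. from a nightly batch job."""
    conn.executemany(
        "INSERT OR REPLACE INTO predictions VALUES (?, ?)", scores.items()
    )
    conn.commit()

def serve(customer_id: str) -> Optional[float]:
    """Serving is a simple lookup, no model inference at request time."""
    row = conn.execute(
        "SELECT score FROM predictions WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None

store_batch({"c-001": 0.87, "c-002": 0.12})
print(serve("c-001"))
```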

Model-on-Demand – this pattern is similar to Model-as-Dependency, but it uses a message-broker architecture with two components:
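
A simplified sketch of the idea, using Python's in-process queues as a stand-in for a real message broker such as RabbitMQ or Kafka (the model here is a placeholder function): one component publishes prediction requests, and a worker consumes them, runs the model, and publishes the results.

```python
import queue
import threading

requests_q: queue.Queue = queue.Queue()   # stands in for the "requests" topic of a broker
results_q: queue.Queue = queue.Queue()    # stands in for the "results" topic

def fake_model(features):
    """Placeholder for a real ML model."""
    return sum(features)

def worker():
    while True:
        request_id, features = requests_q.get()
        results_q.put((request_id, fake_model(features)))
        requests_q.task_done()

threading.Thread(target=worker, daemon=True).start()

# The application publishes a request and later consumes the result.
requests_q.put(("req-1", [0.2, 0.5, 0.3]))
print(results_q.get())   # prints the (request_id, prediction) tuple
```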

Conclusion

In this article, we looked at the stages of implementing machine learning projects and noted that every stage matters for success. We saw that the world of artificial intelligence consists not only of data scientists, but also of ML engineers who bring ideas to life. We discussed in detail the stages of data analysis and preparation, the development and testing of ML models, and the schemes for deploying them to a production environment.

Without competent technical implementation and process organization, a great idea will remain just an abstraction!
