Who manages the development of ML models, and how: the experience of Freight One

Why MLOps is needed – very briefly

Developing machine learning models differs from developing traditional services. First, the quality of their output is affected by any change in the data. At Freight One we work in logistics and use ML to predict the volume of freight traffic between railway stations. If a model predicting car loadings was trained on data from one region and then receives data from another, the accuracy and relevance of its forecast will drop. For the model to keep working correctly, it needs to be updated regularly: you have to monitor the quality of the data and how well it matches the tasks being solved.

Second, developing ML models is an interdisciplinary activity: it involves several teams. First, data scientists run experiments and train the model, then developers integrate it with corporate systems, and DevOps engineers set up the environment. This way of working has become the classic MLOps setup and is generally familiar to IT specialists, but orchestrating the process still poses difficulties for managers.

MLOps helps solve this problem: it is a set of practices and tools for automating and managing the life cycle of machine learning systems. It speeds up model deployment and helps everyone involved, from executives to data scientists, work more efficiently. The goal of MLOps is to make the process of developing and deploying models more predictable, stable and scalable. Typically, MLOps helps teams automate routine data preparation tasks, manage model versions, and scale model training and deployment. We decided to discuss the topic, in part to make MLOps a little more accessible to managers.

“Mess” in the industry

Machine learning remains one of the hottest topics in IT. Almost every week, startups and corporations release new AI-based products in the hope of capturing a piece of a market valued at anywhere from $40 to $120 billion. But a spread in estimates of that magnitude is akin to reading tea leaves, and this inevitably affects the MLOps sphere: the hype around the technology often leads to inflated expectations about how easy it is to implement.

The MLOps field is relatively young, so every company wants to stake out its influence. Familiar solutions get new names as businesses introduce their own terms, trying to leave their mark on history. As a result, every new MLOps tool inevitably turns into some kind of "store" or "showcase": first there were model and feature stores, now there are showcases of metrics and benchmarks. As engineer Mikhail Eric notes in his article MLOps Is a Mess, companies are enthusiastically inventing synonyms for the familiar word "database."

Another problem is that the industry still has no established approaches to managing and orchestrating intra-team processes. Back in 2015, Google engineers noted that model development is only a small part of machine learning. The surrounding infrastructure is enormous: it includes tools for data collection and verification, feature extraction and monitoring, and covers dozens of other processes. In other words, the components of MLOps are widely known, but the boundaries between them are still taking shape.

Finally, there are disagreements even at the level of basic definitions. You can come across the opinion that MLOps is just another buzzword for data engineering. Yes, both areas involve collecting and processing data, but MLOps covers a much broader body of knowledge. Supporting infrastructure, working with models, scaling them, optimizing them for GPUs and setting up CI/CD processes all play an important role here. It turns out that MLOps requires more than just engineering skills.

Development of best practices

MLOps is still in a state of some chaos, but the first steps toward standardization are already being taken: companies publish their how-tos and share their experience. For example, the German consulting agency INNOQ (in particular, its Data and AI division) published a set of MLOps best practices. The organization proposes treating work with machine learning models as part of a CI/CD pipeline and identifies three key phases. The first is devoted to analyzing the data and the business problems that need to be solved with it. At this stage, the company suggests identifying the target audience and mapping out the further stages of development.

The second is the design phase, during which the data required for training the model is selected and the requirements are identified. These help design the architecture of the ML application and the testing framework. The third phase involves building a proof-of-concept model and preparing it for production. At this stage, the algorithm is refined until it produces stable results.

Major software vendors and cloud providers also publish checklists for preparing infrastructure and processes for the implementation of MLOps practices. Similar recommendations from community members can be found on GitHub. As a rule, they boil down to carrying out the practical implementation of MLOps in two directions:

  • Organizational. Developing a corporate culture, running training sessions and uniting DevOps engineers and data scientists into a single team with a clear division of responsibilities.

  • Technical. Implementing DevOps tools to automate the development, testing and integration of data sets, models and code.

In addition to checklists and guides, open source solutions for building an MLOps pipeline are freely available. One example is Awesome MLOps, a curated list of specialized tools. It contains not only comparative reviews but also literature. For example, managers may be interested in the recent O'Reilly publication "Implementing MLOps in the Enterprise". Its authors focus on a production-first approach: not so much on developing models as on building a CI/CD pipeline. Another example is the book "Reliable Machine Learning", which focuses on development team management and MLOps best practices.

How we work with MLOps

At Freight One we use machine learning to simplify the work of our employees. For example, our "Repair Optimizer" helps plan car repairs and select a depot with the optimal cost. For developers, we created the Python class MLExperimentManager, which helps automate data loading and model quality assessment. As the pool of intelligent tools grows, we need to monitor model creation closely to ensure the quality of our ML solutions, so we have prepared a generalized set of rules and recommendations for developing models. I will share the key points:

1. Development of an ML system is accompanied by documentation

In the project knowledge base, we have created templates for model and project cards, in which data scientists record:

  • Problem to be solved;

  • Solution approach and algorithm;

  • Quality assessment methodology;

  • Data used and their sources;

  • Tools used (for example, Airflow or MLflow).

This approach saves time when development is handed over between specialists: analysts, architects or the business.
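It can also be convenient to keep such a card in a machine-readable form next to the code, not only as a wiki page. Below is an illustrative sketch; the class and field names are assumptions made for this article, not a description of our actual internal format:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelCard:
    """Illustrative template mirroring the fields of a model/project card."""
    problem: str              # the business problem being solved
    approach: str             # solution approach and algorithm
    evaluation: str           # quality assessment methodology
    data_sources: List[str]   # data used and where it comes from
    tooling: List[str]        # e.g. ["Airflow", "MLflow"]
```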

2. The model interface is isolated from other systems

Microservices and Python packages turned out to be the architectural solutions best suited to deploying our models. This approach helped increase transparency in development, shortened the time needed to test hypotheses and reduced the load on other specialists, since "rolling out" models became the responsibility of data scientists.
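To illustrate what an isolated model interface can look like, here is a minimal sketch of a model wrapped in its own HTTP microservice. FastAPI, joblib and the endpoint shape are assumptions made for the example, not a description of our production stack:

```python
# Minimal sketch of a model-as-a-microservice; framework and artifact format
# are illustrative assumptions.
import joblib
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.pkl")  # artifact produced by the training pipeline


class PredictRequest(BaseModel):
    features: dict  # feature name -> value


@app.post("/predict")
def predict(request: PredictRequest) -> dict:
    frame = pd.DataFrame([request.features])
    prediction = model.predict(frame)[0]
    return {"prediction": float(prediction)}
```

Consumers only see the /predict endpoint, so the model can be retrained and redeployed without touching the corporate systems that call it.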

3. The algorithm is part of the pipeline

Breaking the model's workflow into stages (data loading, processing, forecasting) simplified debugging, made it possible to parallelize tasks between colleagues and raised the quality of solutions through specialization. The ML system also makes it possible to automatically substitute fresh data for retraining.
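Since Airflow is one of the tools we list in the model cards, a pipeline of this kind can be sketched as a DAG with a separate task per stage. The task bodies, paths and schedule below are illustrative assumptions, not our actual pipeline:

```python
# Airflow 2.x TaskFlow-style sketch of a staged forecasting pipeline.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def freight_forecast_pipeline():

    @task
    def load_data() -> str:
        # pull fresh data from the source system, return a path/URI to it
        return "s3://bucket/raw/latest.parquet"

    @task
    def preprocess(raw_path: str) -> str:
        # clean the data, build features, persist the result
        return "s3://bucket/features/latest.parquet"

    @task
    def forecast(features_path: str) -> None:
        # load the current model and write predictions to the target system
        ...

    forecast(preprocess(load_data()))


freight_forecast_pipeline()
```

With the stages separated like this, each task can be owned and debugged independently, and a retraining task can be attached to the same DAG when fresh data arrives.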

4. A log of experiments is kept

Maintaining an experiment log and a model registry is an important requirement for ML development. At the same time, a quality assessment methodology is formulated, documented and agreed with the business for each algorithm. The selected metrics determine how the experiments will be designed.
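With MLflow, which we mention among our tools, a log entry boils down to recording the run's parameters, the agreed metrics and the resulting artifact. Below is a minimal sketch on synthetic data; the experiment name, model type and metric are illustrative assumptions:

```python
# Sketch of logging a single experiment run to MLflow.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

X, y = make_regression(n_samples=500, n_features=10, random_state=42)
mlflow.set_experiment("wagon-load-forecast")  # experiment name is illustrative

with mlflow.start_run(run_name="gbm-baseline"):
    params = {"n_estimators": 300, "learning_rate": 0.05}
    model = GradientBoostingRegressor(**params).fit(X, y)

    mlflow.log_params(params)
    # metric agreed with the business (here computed on training data for brevity)
    mlflow.log_metric("mae", mean_absolute_error(y, model.predict(X)))
    # saved as a run artifact; it can also be registered in the model registry
    mlflow.sklearn.log_model(model, artifact_path="model")
```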

5. The life cycle of the model is covered by tests

This practice reduces the number of errors during model operation. Tests cover the data, forecast quality, the code, and integrations with other systems. We also strive to maintain version control of code, data, artifacts and experiment results in the interest of transparency and reproducibility of training.
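As an illustration, data and quality checks of this kind can be expressed as ordinary pytest tests. The fixtures, column names and the 15% MAPE threshold below are assumptions for the example; in practice the threshold comes from the methodology agreed with the business:

```python
# Sketch of pytest-style checks; training_data, model and holdout are fixtures
# assumed to be provided by a conftest.py.
import pandas as pd
from sklearn.metrics import mean_absolute_percentage_error


def test_data_has_no_missing_targets(training_data: pd.DataFrame):
    assert training_data["target"].notna().all()


def test_features_are_in_expected_range(training_data: pd.DataFrame):
    assert (training_data["wagon_count"] >= 0).all()


def test_model_quality_meets_threshold(model, holdout: pd.DataFrame):
    predictions = model.predict(holdout.drop(columns=["target"]))
    assert mean_absolute_percentage_error(holdout["target"], predictions) < 0.15
```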

6. Monitoring is configured for the production model

For projects that have passed the pilot stage, we try to monitor:

  • Data (anomalies, distributions, etc.);

  • Queries to the model;

  • Model predictions;

  • Actual values of the predicted quantity;

  • System availability and performance.

In this way, we check that the model works according to its intended scenario and complies with the agreed quality assessment methodology. This approach also allows us to respond quickly to problems with the model in operation.
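One concrete example of monitoring the data is comparing the distribution the model sees in production against the distribution it was trained on. Below is a sketch using the population stability index; the bin count, the 0.2 alert threshold and the synthetic numbers are illustrative assumptions:

```python
# Sketch of a data-drift check based on the population stability index (PSI).
import numpy as np


def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_share = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_share = np.histogram(current, bins=edges)[0] / len(current)
    # avoid division by zero / log(0) on empty bins
    ref_share = np.clip(ref_share, 1e-6, None)
    cur_share = np.clip(cur_share, 1e-6, None)
    return float(np.sum((cur_share - ref_share) * np.log(cur_share / ref_share)))


rng = np.random.default_rng(0)
reference = rng.normal(100, 15, size=5_000)  # e.g. values seen at training time
current = rng.normal(110, 15, size=1_000)    # fresh production inputs

if psi(reference, current) > 0.2:  # 0.2 is a common rule-of-thumb threshold
    print("Feature distribution has drifted, consider retraining")
```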

Our checklist

In general, the ideas described in our basic approach can be presented as a checklist and used as a basis for building your own pipeline:

  1. The algorithm, data and model architecture are clearly documented

  2. A methodology for assessing model quality has been developed and documented

  3. The toolkit has been defined

  4. The model is part of the pipeline

  5. Version control of code, data, and experimental results is maintained

  6. A log of experiments is kept

  7. Testing of the entire life cycle of the algorithm has been established

  8. The possibility of automatic retraining has been implemented

  9. Collection of input data, forecasts and actual values in production has been set up

  10. Performance and quality monitoring has been configured

