“If you're so smart, can you tell which car will derail, and when?”
One of the railway operators put the task to us in roughly that spirit, only in more Russian and more railway-flavored language. The original idea was to predict which car should not be allowed onto which track in which weather. Naturally, that is not how we ended up solving it, but the result should still be fewer cases like this one:
A car is rolling along, as part of a passenger or freight train. At some point one wheelset loses contact with the rail. That is a derailment. In practice, all traffic on that section stops and a crew with a crane goes out to lift the car. In the worst case, it ends in an environmental or industrial disaster. Naturally, this is a problem for the company.
Naturally, there are standards for car maintenance schedules, for permissible speed, for permissible curve radius, gradient, and so on. But cars still derail from time to time. It seems to depend on their condition, their load, how sober the track workers are, how hard the driver pushes, and some other factors, and some of these can be estimated in advance.
We started by studying known derailments to understand what affects the cars. In parallel, we began hunting for all the data the carrier could give us.
A derailed car carrying ethanol in the USA, and the consequences.
After the preliminary study, it became clear that we were building a model to predict the conditions under which a car derails on a given track section.
We started by building a simple decision tree. The main thing that keeps us away from an ideal model, to this day, is the data. We care about the state of the car, the state of the track (as it changes over time, not a static snapshot), and how the car was serviced; it would also be good to know the weather, the profile of the locomotive crew, and the cargo the car is carrying. At the first stage, only the car characteristics were available.
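As a rough illustration of what "a simple decision tree on car characteristics" can look like, here is a minimal pure-Python sketch. Everything specific in it is an assumption made for the example: the two features (`car_age_years`, `months_since_overhaul`), the toy records, and the choice of Gini impurity as the split criterion. It is not the model from the project.

```python
from collections import Counter

# Hypothetical toy records: (car_age_years, months_since_overhaul, derailed).
RECORDS = [
    (5, 2, 0), (8, 30, 0), (12, 6, 0), (20, 40, 1),
    (25, 36, 1), (30, 4, 0), (18, 48, 1), (10, 12, 0),
]
FEATURES = ["car_age_years", "months_since_overhaul"]


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows):
    """Return (weighted_gini, feature_index, threshold) or None if no split exists."""
    best = None
    for f in range(len(FEATURES)):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r[-1] for r in rows if r[f] <= threshold]
            right = [r[-1] for r in rows if r[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def build_tree(rows, depth=0, max_depth=2):
    """Recursively split until the node is pure or max_depth is reached."""
    labels = [r[-1] for r in rows]
    leaf = {"leaf": True, "derail_share": sum(labels) / len(labels)}
    if depth == max_depth or gini(labels) == 0.0:
        return leaf
    split = best_split(rows)
    if split is None:  # all feature values identical, nothing to split on
        return leaf
    _, f, t = split
    return {
        "leaf": False, "feature": f, "threshold": t,
        "left": build_tree([r for r in rows if r[f] <= t], depth + 1, max_depth),
        "right": build_tree([r for r in rows if r[f] > t], depth + 1, max_depth),
    }


def predict(tree, row):
    """Share of derailed cars in the leaf that the row falls into."""
    while not tree["leaf"]:
        side = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["derail_share"]


tree = build_tree(RECORDS)
print(FEATURES[tree["feature"]], tree["threshold"])
```

On this toy data the tree splits on the months since overhaul. The leaf value is a share of derailed cars, not a verdict, which matches the probabilistic framing used later in the text.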
To make it clearer why the data is so hard to get: a lot is still done on paper. For example, when we arrived to export the database (or so we thought), we found a remarkable scene. The operator printed out a stack of sheets with a month of car data, laid them on top of the previous month's sheets so that a strip of the older sheets stuck out at the side … and began writing the differences onto them by hand. And that is not the only problem: there is also a sea of approvals and the other quirks of large systems with a taste for bureaucracy.
We had data on the tracks (their gradients and last known condition), but we cannot get up-to-date reports from the track workers. So all track data is static, taken from the last known tables, plus weather.
There is simply no data on crews, track walkers, and other personnel.
The car data is more interesting; for cars there is a genuinely good dataset:
I should say right away that getting even this much data on a real problem under such conditions is a real stroke of luck. In practice, you have to show at least some result on this data first, and only then will the company think about connecting other sources to the model.
Analysis of known derailments
On our limited datasets, the factors rank by weight as follows:
Naturally, this reflects only part of the real picture, but the data is already sufficient for an initial model.
These derailments cover only the situations that occurred without external factors, that is, for technical reasons or because of human error. That is about two-thirds of the derailments we know of. One example is a side frame fracture caused by a combination of metal fatigue and poor track maintenance. Another: the track was marked as cleared and the crew reported it, but in fact did nothing. There was even a manufacturer who, judging by the statistical analysis, produced defective frames for two years in a row.
The remaining third of derailments are accidents involving road vehicles driving onto the track, cattle (cows have considerable inertia), landslides, fires near the tracks, and other natural disasters.
The customer, of course, wanted an answer of the form “this one will derail, so repair it or keep it off the line” and “this one will not.” But that is not how models work; we operate with probabilities. This is where the difficulties of working with incomplete data began. We needed to come up with something that would be useful even on such data.
The practical essence of the task
For each car and each track section, we predict the probability of a derailment. In practice, given the lack of data, this means a lot of false positives, and our client, naturally, was in no hurry to treat every car the model flagged as potentially dangerous. Nevertheless, they looked closely at the highest probabilities, gave those cars deliberate attention, and warned the crews.
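To show why looking only at the highest probabilities is a practical way to live with false positives, here is a small sketch. The car and section identifiers, the scores, and the threshold are all hypothetical; the only idea taken from the text is that each (car, section) pair gets a probability and the top of the list gets attention.

```python
# Hypothetical model output: (car_id, section_id) -> predicted derailment probability.
SCORES = {
    ("car-101", "sec-A"): 0.04,
    ("car-101", "sec-B"): 0.12,
    ("car-202", "sec-A"): 0.31,
    ("car-202", "sec-B"): 0.09,
    ("car-303", "sec-A"): 0.15,
    ("car-303", "sec-B"): 0.27,
}


def flagged(scores, threshold):
    """Every pair at or above a fixed threshold; tends to flag far too much."""
    return [pair for pair, p in scores.items() if p >= threshold]


def top_risks(scores, k):
    """The k highest-probability pairs, for a fixed inspection budget."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


print(len(flagged(SCORES, 0.05)), "pairs flagged by a low threshold")
shortlist = top_risks(SCORES, 2)
for (car, section), p in shortlist:
    print(f"{car} on {section}: {p:.2f}")
```

A fixed low threshold flags almost every pair, while a ranked shortlist keeps the workload bounded regardless of how noisy the absolute probabilities are.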
The model trained on known derailments was poor at predicting new ones, but it made it possible to work out how to reduce the probability of a derailment: for example, which section to route a particular train over, given its specific cars and its specific cargo. Here’s an example of a problem and solution:
The same model can give fresh recommendations for each haul, depending on the weather, the season, and the specific car. For example, it can suggest a different driving mode.
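Choosing a section or driving mode from per-car probabilities can be sketched as below. This is an illustration under loud assumptions: the option names and probabilities are invented, and the per-car derailment events are treated as independent so that the risk for the whole train is 1 − ∏(1 − pᵢ). The source does not say how per-car scores were actually combined.

```python
from math import prod

# Hypothetical per-car probabilities for one train under three options
# (section plus driving mode). In a real setup these would come from the model.
OPTIONS = {
    "section_A_60kmh": [0.02, 0.01, 0.03],
    "section_A_40kmh": [0.01, 0.005, 0.02],
    "section_B_60kmh": [0.015, 0.02, 0.01],
}


def train_risk(car_probs):
    """Probability that at least one car derails, assuming independent cars."""
    return 1.0 - prod(1.0 - p for p in car_probs)


def best_option(options):
    """Name of the option with the lowest whole-train risk."""
    return min(options, key=lambda name: train_risk(options[name]))


for name, probs in OPTIONS.items():
    print(f"{name}: {train_risk(probs):.4f}")
print("recommended:", best_option(OPTIONS))
```

Even when absolute probabilities are unreliable, comparing options for the same train only needs the ordering to be roughly right, which is a weaker requirement than predicting individual derailments.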
Right now the model mainly assesses the condition of the car: its repairs, its passport (who built it and when), its type and, additionally, the cargo on the current trip. Intuitively, the model should have assigned extra checks to fully loaded old cars, but that turned out not to be the case: almost every car has been patched and repaired after a major overhaul. Most alerts come from the combination of car condition and track condition.
Once the model had proven somewhat useful, the customer began to look more closely at the cars and situations it highlighted. They now spend much more effort on repairs and inspections than before, but in exchange they should accumulate statistics showing safer traffic. If they do, that will be a complete success, because even one statistically averted derailment fully pays for all this paranoia.
What else we learned along the way
One of the important magic constants on the railway is a train length of 65 cars. The carrier naturally wants to pack in more, and the rail operator says, “No, 65 is the maximum.” Why? Because more is simply not done. Literally at the level of “our grandfathers said so” and “it's an omen.” Since the model is suitable for factor analysis, we ran trains with more cars through it and confirmed that the constant is quite reasonable. Going above it is not worth it. But now there is evidence. In other words, the initial data was already enough to start looking for unknown dependencies and to confirm or refute facts the customer took for granted.
We also learned why timber gets carried in tank cars. At first we were puzzled, but it turned out to be a way of cutting costs fraudulently. It starts logically enough: there is a tank car. When its tank fails, carrying liquid in it becomes difficult. Everything else on the car is fine; only the tank is broken. So the top is cut off, and you get an open-top wagon that can be loaded with timber. The catch is that this changes the freight charges. As a result, there was a period when, according to the documents, a tank car was carrying timber, while in fact it carried fuel and lubricants. The report said timber, and they paid the timber rate. At some point they began to carry too much, and it became noticeable.
We also saw a rerun of the story about the speed of light. We connected one of the data sources and suddenly saw half of the users drop off at some point, and precisely the ones who were calculating the economics of extra routine repairs against the risk of derailment. It turned out that AIX had been reinstalled on one of the machines. User groups had been added to a configuration file, and the maximum line length for that file was set as a constant in the AIX configuration itself. The reinstall reset it to the default. As a result, the file was parsed up to the first overlong line, and then processing stopped. Our users were listed below that line.
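The failure mode is easy to reproduce in miniature. The sketch below is not AIX code and does not reproduce any real AIX file format; the `group:member,member` layout, the group names, and the limits are hypothetical. It only shows how a parser with a line-length limit that silently stops at the first overlong line loses everything listed after it.

```python
# Hypothetical group file: one "group:member,member,..." entry per line.
LINES = [
    "ops:alice,bob",
    "analysts:" + ",".join(f"user{i:02d}" for i in range(20)),  # a long line
    "economists:carol,dave",
]


def parse_groups(lines, max_line_len):
    """Parse entries until the first line longer than max_line_len, then stop."""
    groups = {}
    for line in lines:
        if len(line) > max_line_len:
            break  # silent stop: every later entry is lost
        name, _, members = line.partition(":")
        groups[name] = members.split(",") if members else []
    return groups


after_reinstall = parse_groups(LINES, max_line_len=64)    # limit back at a small default
before_reinstall = parse_groups(LINES, max_line_len=1024)  # limit raised by hand

print(sorted(after_reinstall))
print(sorted(before_reinstall))
```

With the small limit only the first group survives, so the users below the long line disappear without any error being raised, which is what makes this kind of bug hard to trace.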
What is the bottom line with the project?
Now we are waiting to see how things develop. We need more data to keep going. In a “bloody enterprise,” connecting a data source can drag on for years, but we are a bloody enterprise too, and we know how to wait.
And yes, as they say here, our roads are iron, and people are golden. So everything will be fine with the project.