How I chose a model to forecast transport infrastructure needs

About Me

Hello everyone, this is Arseniy Eliseev. While reviewing a consulting case for one of our clients, a large industrial logistics operator, the need arose to integrate machine learning.

In this article, I want to share my thoughts and describe how the forecasting model was developed. Perhaps some of you will find it useful. Note that because of the NDA, financial indicators and other sensitive information are hidden.

Project context

First, some context. The client is a 3PL provider that organizes the transportation of industrial goods, mostly by road. An important point: the client has its own fleet of vehicles but, during periods of rising demand, sometimes outsources drivers and vehicles for a contractually fixed term.

While searching for problematic business processes, we identified pain points that required improvement. Four main problems were found that, directly or indirectly and to varying degrees, affected the quality of the logistics service and therefore the profit earned. The dysfunctions, their severity and their consequences are conveniently shown on an Ishikawa (“fishbone”) diagram, a cause-and-effect diagram popular in management and consulting (Siti Holifahtus Sakdiyah et al., 2022):

Figure 1. Ishikawa diagrams


I focused on a problem that has a critical impact and systematically lowers the quality indicators of the logistics service. The company often faces lost sales because it sometimes cannot accept a customer's order for lack of resources to fulfill it. If the order terms leave enough time to bring in vehicles and drivers from partners or competitors at very short notice, the company fulfills the order. Otherwise the sale is lost, and this fact is recorded in the information system.

A potential solution is to improve the business planning process so that, in addition to the usual scheduling of trips one or two weeks ahead using the information system, it also forecasts consumer demand in order to calculate the transport capacity needed for given periods.

Classification of consumer demand forecasting methods

A little theory is in order. It is useful to look at a classification of consumer demand forecasting methods to get acquainted with the tools that could be applied. There are many typologies of forecasting tools; in this article I share my own, which in my opinion describes their nature fully. It can be presented as follows:

Figure 2. Classification of consumer demand forecasting methods


Qualitative forecasting methods, despite their clarity and ease of implementation, are subjective and variable, not least because they rely on individuals' personal perception of the situation. They are usually used when accurate historical data is scarce. In our case, exports from the client's information system (IS) allow us to build a time series, so quantitative tools are of particular interest. For the same reason, simulation modeling seems redundant: since a dataset can be assembled directly, there is no need to run experiments to collect material. We are therefore left with forecasting methods from applied statistics and machine learning.

Moreover, consumer demand forecasting models can be combined, even those of different types. International practice across many fields offers successful examples of combining several forecasting models to obtain a plan more accurate than any individual method achieves in that case (Bunga Kharissa Laras Kemala et al., 2024) (Heng Wang et al., 2024) (Carlos Garcia-Aroca et al., 2024) (Jiang Cheng et al., 2024) (Heba-Allah I. El-Azab et al., 2023).

By the way, the model that combines individual forecasts is called a “meta-model”; it is also selected and, in some cases, trained separately. From personal experience, it is always worth trying to combine several forecasting methods, especially ones of different types. In practice, combining the strengths of each approach often produces the forecast model best adapted to the specific task. When combining individual forecasts obtained by different methods, there is only one simple condition to remember when assigning their weights: the weights must sum to one.
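Assuming the usual normalization of combination weights is meant, the condition can be written as follows. The symbols are introduced here only for illustration: $\hat{y}_{i,t}$ is the forecast of method $i$ for period $t$, $w_i$ is its weight, and $n$ is the number of methods being combined.

```latex
\hat{y}_t = \sum_{i=1}^{n} w_i \, \hat{y}_{i,t},
\qquad \sum_{i=1}^{n} w_i = 1,
\qquad w_i \ge 0
```

With weights that sum to one, the combined forecast stays on the same scale as the individual forecasts.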

Specifics of planning the size of a transport fleet

So far we have been talking about forecasting customer demand, but let us not forget the ultimate goal: a plan for the required transport infrastructure. At first glance, outsourcing a large number of trucks “on top” of our own fleet may seem to solve every problem. This does maintain a 100% service level, but what about the costs? The revenue from fulfilling every order may simply not cover the cost of maintaining such a large volume of transport capacity. The customers are happy, but there is no benefit for us…

Here we can draw a parallel with the well-known dilemma in the field of SCM between inventory costs and service levels (Shirell James, 2021), which can be expressed by the following relationship:

Figure 3. Trade-off between inventory cost and service level


The same relationship holds for fleet size: the higher the service level already maintained, the more additional fleet capacity is needed to raise it further.

In this case, we should aim for the service level that yields the highest income. In practice, quantile forecasting is often used here: instead of the expected (mean) value, it predicts a value that will not be exceeded with a fixed probability (loosely speaking, this probability can be called the “service level”). This approach also has certain advantages:

1) Resilience to outliers. This is especially useful in our case, where data will be aggregated over longer periods of time, during which additional vehicles may be leased;

2) Distribution agnosticism. Some forecasting methods assume normally distributed data, but forecasting quantiles carries no such requirement.

The approach based on quantile forecasting is widely used in practice in various fields (Honglin Wen, 2024) (Dazhi Yang et al., 2023) (Sheng-Long Jiang et al., 2023). It is also applied in our target area of transport management (Evgenii Genov et al., 2024) (Heba-Allah I. El-Azab et al., 2024).
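To make the idea of a quantile plan concrete, here is a minimal sketch on simulated data. The log-normal demand and all variable names are assumptions chosen for illustration; they are not taken from the client's dataset.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical skewed weekly demand in tonnes (log-normal, for illustration only)
demand = rng.lognormal(mean=4.0, sigma=0.6, size=5000)

level = 0.9
mean_plan = demand.mean()
quantile_plan = np.quantile(demand, level)

# Share of weeks in which each plan would have covered the demand
mean_cover = (demand <= mean_plan).mean()
quantile_cover = (demand <= quantile_plan).mean()

print(f"mean plan: {mean_plan:.1f} t, covers {mean_cover:.0%} of weeks")
print(f"{level:.0%} quantile plan: {quantile_plan:.1f} t, covers {quantile_cover:.0%} of weeks")
```

A plan sized by the mean leaves part of the weeks uncovered, while a plan sized by the 0.9 quantile covers roughly 90% of them, which is why the quantile level can loosely be read as a service level.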

Exploratory analysis

*All transformations and calculations are performed in Python 3.10. A link to the repository with the Jupyter Notebook is provided at the end of the article.

Before we start building pipelines for forecasting, it's worth examining the data we'll be working with. Let's look at the first line of the transposed dataset:

Figure 4. Transposed dataset row


Note that there are many “extra” columns that we do not need. For example, the order status does not interest us, since we consider all potential transactions. We keep only the shipment date, the mass of orders aggregated by that date, and a newly built column with the unit price of transporting 1 ton on each day, which will be needed later:

Figure 5. Dataset fragment with required columns


Since the cargo is mostly small-sized, the volume capacity of the vehicles can be ignored and only their load capacity considered. For this reason, aggregating the mass of cargo across orders is a convenient and acceptable strategy.

It is also worth checking the data for outliers. Here we can turn to a traditional visualization tool, the boxplot, often called a “box-and-whisker plot”:
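The column selection and aggregation described above can be sketched as follows. The column names (`shipment_date`, `mass_t`, `price`, `status`) and the sample rows are hypothetical; the client's actual schema is not shown in this article.

```python
import pandas as pd

# Hypothetical raw orders; the column names are assumptions, not the client's schema
orders = pd.DataFrame({
    "shipment_date": pd.to_datetime(
        ["2021-03-01", "2021-03-01", "2021-03-02", "2021-03-02", "2021-03-03"]),
    "mass_t": [3.0, 5.0, 2.5, 7.5, 4.0],
    "price": [30000.0, 50000.0, 20000.0, 80000.0, 44000.0],
    "status": ["done", "lost", "done", "done", "lost"],
})

# Keep every order regardless of status and aggregate by shipment date
daily = (orders.groupby("shipment_date")
               .agg(mass_t=("mass_t", "sum"), price=("price", "sum")))
# Unit price: total order price per tonne shipped on that day
daily["price_per_t"] = daily["price"] / daily["mass_t"]
daily = daily[["mass_t", "price_per_t"]]
print(daily)
```

Lost orders stay in the aggregation on purpose, because the forecast targets potential demand rather than fulfilled demand.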

Figure 6. Boxplot for mass column


Outliers make up only about 1% of all observations, so they can be removed without particular risk. We apply a “scalpel” method, the IQR (interquartile range) rule (Ch. Sanjeev Kumar Dash et al., 2023): we remove from the dataset those observations whose aggregate mass lies more than 1.5 IQR below the lower (25%) quartile or above the upper (75%) quartile, where the IQR is the difference between the two quartiles.

Now we can visualize the time series itself. Since this work relies on quantiles, it is worth plotting, alongside the raw dynamics, the dynamics of monthly quantiles (for example, the 90% quantile) to test whether any patterns can really be found in them. We calculate the quantiles from weekly aggregations, since in practice this makes the new time series smoother and more natural. We also add a trend line with its equation to the quantile graph to assess the overall direction:
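A minimal sketch of the IQR rule in its common form, which drops observations outside [Q1 - 1.5 IQR, Q3 + 1.5 IQR]. The function name, the column name `mass_t` and the sample values are illustrative assumptions.

```python
import pandas as pd

def remove_iqr_outliers(df, column, k=1.5):
    """Drop rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return df[df[column].between(lower, upper)]

# Hypothetical daily masses in tonnes with one obvious outlier
daily = pd.DataFrame({"mass_t": [10, 12, 11, 13, 12, 11, 10, 14, 13, 95]})
clean = remove_iqr_outliers(daily, "mass_t")
print(f"removed {len(daily) - len(clean)} of {len(daily)} observations")
```

The multiplier `k=1.5` is the conventional default; a larger value would make the filter more tolerant.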

Figure 7. Time series dynamics graph


The picture is quite interesting: the calculated monthly quantiles do reveal the trend and seasonality. The trend is visibly downward. This is confirmed analytically by its linear equation, in which the coefficient of x is negative.

To make sure, let's turn to time series decomposition, where each of these components is visualized separately. To choose between an additive and a multiplicative model, we prefer the one with lower autocorrelation in the residuals, which the Durbin-Watson statistic helps us determine:
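The monthly quantile series and its linear trend can be obtained roughly as follows. This is a sketch on simulated data: the declining series, the 0.9 level and the variable names are assumptions, and the notebook may compute them differently.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: declining level plus noise (for illustration only)
rng = np.random.default_rng(0)
days = pd.date_range("2018-01-01", "2021-12-26", freq="D")
level = np.linspace(60.0, 40.0, len(days))
mass = pd.Series(level + rng.normal(0.0, 5.0, len(days)), index=days, name="mass_t")

# 1) weekly totals, 2) 0.9 quantile of the weekly totals within each month
weekly = mass.resample("W").sum()
monthly_q90 = weekly.groupby(weekly.index.to_period("M")).quantile(0.9)

# Linear trend y = a*x + b fitted to the monthly quantiles (x = month number)
x = np.arange(len(monthly_q90))
a, b = np.polyfit(x, monthly_q90.to_numpy(), deg=1)
print(f"trend: y = {a:.2f}x + {b:.2f}")
```

A negative `a` corresponds to the downward trend; on this simulated series it is negative by construction.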

Figure 8. Time series decomposition


In the end, the analysis is based on the additive model. You can see a sharp decline in the trend in the second half of 2021, continuing until the end of the observation history. The seasonality is stable: September and October are the peak periods, while December and January are the quietest, which is easily explained by preparation for the holidays and then the holidays themselves. There is indeed some pattern in the residuals and therefore some autocorrelation. However, this is quite typical of real data, so there is no reason to panic.
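The choice between the two decomposition models by residual autocorrelation can be sketched like this. In practice the residuals would come from a decomposition routine such as `seasonal_decompose` in statsmodels; here they are simulated, and the helper names are hypothetical.

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson statistic: ~2 means no first-order autocorrelation,
    values toward 0 mean positive and toward 4 negative autocorrelation."""
    resid = np.asarray(resid, dtype=float)
    resid = resid[~np.isnan(resid)]          # decomposition residuals often have NaN edges
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

def pick_decomposition(resid_additive, resid_multiplicative):
    """Return the model whose residuals have a DW statistic closest to 2."""
    dw_add = durbin_watson(resid_additive)
    # Multiplicative residuals fluctuate around 1, so centre them at 0 first
    dw_mul = durbin_watson(np.asarray(resid_multiplicative, dtype=float) - 1.0)
    choice = "additive" if abs(dw_add - 2) <= abs(dw_mul - 2) else "multiplicative"
    return choice, dw_add, dw_mul

# Hypothetical residuals: white noise vs. strongly autocorrelated noise around 1
rng = np.random.default_rng(1)
white = rng.normal(0.0, 1.0, 2000)
ar = np.zeros(2000)
for t in range(1, 2000):
    ar[t] = 0.8 * ar[t - 1] + rng.normal(0.0, 0.05)
choice, dw_add, dw_mul = pick_decomposition(white, 1.0 + ar)
print(choice, round(dw_add, 2), round(dw_mul, 2))
```

Centering the multiplicative residuals matters: without it the statistic would be dominated by their level of about 1 rather than by their autocorrelation.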

Building forecasting models

Based on the exploratory analysis, we will use methods that work with non-stationary time series and account for both trend and seasonality. Without further ado, here are the most suitable ones:

  1. Simple Moving Average:

    The simplest method, belonging to the first generation of forecasting algorithms. It averages the values over a certain period, called a window. The window width determines how many past periods are taken into account when forecasting. As a rule, it suits only smooth linear dynamics; still, it will be interesting to see how it copes with this time series.

  2. Holt-Winters method:

    A modification of exponential smoothing for seasonal series. It needs relatively little computing power compared with other methods that account for both trend and seasonality.

  3. SARIMA:

    This modification of the autoregressive integrated moving average (ARIMA) model accounts for seasonality by adding a linear combination of past seasonal values and/or past forecast errors.

  4. LSTM:

    A special type of recurrent neural network architecture capable of learning long-term dependencies. It is this property that makes it suitable for forecasting time series.

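from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

# Build the LSTM: scaler to [0, 1], two stacked LSTM layers, two dense layers
scaler = MinMaxScaler(feature_range=(0, 1))
lstm = Sequential()
lstm.add(LSTM(128, return_sequences=True,
              input_shape=(df_qun.shape[0] - 12, 1)))
lstm.add(LSTM(64, return_sequences=False))
lstm.add(Dense(32))
lstm.add(Dense(1))
lstm.compile(optimizer="adam", loss="mean_squared_error")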
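For contrast with the network above, the simplest of the four methods fits in a few lines. This is a minimal sketch of a recursive simple moving average, not the implementation from the notebook; the function name and the sample values are illustrative.

```python
import numpy as np

def sma_forecast(history, window, horizon):
    """Recursive simple-moving-average forecast.

    Each step is the mean of the last `window` values; forecasts are fed
    back into the window, so the path flattens out after a few steps.
    """
    values = [float(v) for v in history]
    forecast = []
    for _ in range(horizon):
        step = float(np.mean(values[-window:]))
        forecast.append(step)
        values.append(step)
    return forecast

# Hypothetical monthly quantiles, tonnes
history = [120, 130, 125, 140, 150, 145]
print(sma_forecast(history, window=3, horizon=2))
```

Because each new step only averages recent values, the method cannot reproduce seasonality, which is why it serves here mainly as a baseline.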

Now let's consider strategies for combining forecasts. Since our time series covers only 5 years, the combination model is trained only on the last year preceding the forecast year. Other approaches that train on longer historical periods would not have enough observations for a proper “warm-up”. With this in mind, we can offer 2 methods:

  1. Uniform combination:

  2. Combination by MSE:
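A minimal sketch of the two strategies. It assumes that “combination by MSE” means weights inversely proportional to each model's MSE on the last known year, which is a common choice but an assumption here. The function names and sample numbers are illustrative.

```python
import numpy as np

def uniform_weights(n_models):
    """Every model gets the same weight 1/n."""
    return np.full(n_models, 1.0 / n_models)

def inverse_mse_weights(actual, forecasts):
    """Weights proportional to 1/MSE, normalised to sum to 1.

    Assumes no model has exactly zero error on the validation year.
    """
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)     # shape: (n_models, n_periods)
    mse = np.mean((forecasts - actual) ** 2, axis=1)
    inv = 1.0 / mse
    return inv / inv.sum()

def combine(forecasts, weights):
    """Weighted sum of the individual forecasts, period by period."""
    return np.asarray(weights) @ np.asarray(forecasts, dtype=float)

# Hypothetical validation year (4 periods) and three model forecasts
actual = [100, 110, 120, 130]
forecasts = [[102, 112, 122, 132],    # MSE = 4
             [104, 114, 124, 134],    # MSE = 16
             [ 96, 106, 116, 126]]    # MSE = 16
w_mse = inverse_mse_weights(actual, forecasts)
print(uniform_weights(3), w_mse, combine(forecasts, w_mse))
```

In both cases the weights sum to one, so the combined forecast stays on the same scale as the individual ones.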

Thus, we have 6 forecasting methods: 4 individual models and 2 combination approaches. We will also work with several quantile levels. The following levels were chosen empirically: 0.8, 0.85, 0.9, 0.9554, 0.9973. The last 2 values were taken from the six sigma rule (The council for Six Sigma Certification, 2018), which is often used in quality control and service-level management.

The forecast year is 2022. The forecasts will be assessed against actual observations. However, we hypothesize that each quantile level has its own most suitable forecasting method among those listed above. The models are therefore matched to quantile levels on the preceding year, 2021, by the minimum mean squared error (MSE) after training on the period from 2018 to 2020. Results:

Figure 9. Results of selection of optimal methods in the test year


Neural networks, in particular LSTM, turn out to be the most suitable solution most often, although the “chosen ones” also include the traditional SARIMA and one of the combination strategies proposed above.
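The selection step itself is simple: for each quantile level, keep the method with the smallest validation MSE. In the sketch below the method names are placeholders and the MSE values are made up purely to show the mechanics; they are not the results from Figure 9.

```python
# Hypothetical validation MSEs per quantile level and method;
# placeholder names and made-up numbers, not the article's results.
validation_mse = {
    0.80: {"model_a": 9.0, "model_b": 7.0, "model_c": 6.0},
    0.85: {"model_a": 8.0, "model_b": 5.0, "model_c": 7.0},
    0.90: {"model_a": 4.0, "model_b": 8.0, "model_c": 6.0},
}

# For each quantile level keep the method with the smallest validation MSE
best_method = {level: min(scores, key=scores.get)
               for level, scores in validation_mse.items()}

for level, method in best_method.items():
    print(f"quantile {level}: {method}")
```

The selected method is then retrained on the longer history before forecasting the target year.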

Forecasting results

Now we apply the methods selected for each quantile level to the target year. This time, the models are trained on the full historical period preceding 2022. Besides comparing the effectiveness of the quantile-based forecasts, we also calculate an “ideal” scenario in which all possible resources are brought in to fulfill every potential order. The following parameters are used in the modeling:

SELF_TRUCKS_COUNT = 5  # Size of the company's own fleet
TRUCK_CAPACITY_TONE_COUNT = 14  # Vehicle load capacity, tonnes (KamAZ trucks)
TRIP_DAYS_COUNT = 4  # Round-trip duration, days (one economic zone is considered)
RENT_TRUCK_PRICE = 250000  # Monthly cost of outsourcing a driver with a vehicle
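One way these parameters could turn a monthly mass forecast into a number of rented trucks is sketched below. The conversion logic is an assumption made for illustration (fully loaded trucks running back-to-back round trips in a 30-day month); the notebook may model it differently, and the function name and the forecast value are hypothetical.

```python
import math

SELF_TRUCKS_COUNT = 5
TRUCK_CAPACITY_TONE_COUNT = 14
TRIP_DAYS_COUNT = 4
RENT_TRUCK_PRICE = 250000

def rented_trucks_needed(monthly_mass_t, days_in_month=30):
    """Extra trucks to rent so the fleet can carry `monthly_mass_t` tonnes.

    Assumption: every truck runs back-to-back round trips, fully loaded.
    """
    trips_per_truck = days_in_month // TRIP_DAYS_COUNT
    tonnes_per_truck = trips_per_truck * TRUCK_CAPACITY_TONE_COUNT
    trucks_total = math.ceil(monthly_mass_t / tonnes_per_truck)
    return max(0, trucks_total - SELF_TRUCKS_COUNT)

forecast_mass_t = 1200          # hypothetical monthly quantile forecast, tonnes
rented = rented_trucks_needed(forecast_mass_t)
print(rented, rented * RENT_TRUCK_PRICE)
```

Under these assumptions, the rental cost grows in steps of `RENT_TRUCK_PRICE` as the forecast quantile rises, which is what makes the choice of quantile level an economic decision.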

Results:

Figure 10. Quantile forecasting results


The “ideal” scenario, in which we try to maintain a 100% service level, turns out to be the least profitable of the options presented. The most profitable is the LSTM forecast of the monthly 0.85 quantile. In this scenario the number of additionally hired vehicles does not exceed 15, while in the “ideal” scenario it sometimes exceeds 80. With some degree of confidence, we can now assume that the presented forecasting scheme can be used in the future for the case under consideration. However, it is worth being careful and revisiting the approach regularly, since other models and quantile levels may prove more effective in subsequent periods.

Conclusions

In this article, we tested the hypothesis that quantile forecasting can be applied to the transport sector. I think we obtained a clear and convincing result. This topic will be developed further in the future.

I hope that the work done will be useful for you. What do you think about such experimental approaches in general? Share your opinions in the comments!

*Source code on GitHub: https://github.com/yelis-alt/research/tree/prod/demand_planning_system
