This article covers recent work built on temporal convolutional networks (TCNs). We start with a motion detection case study and use it to review the TCN architecture and its advantages over traditional approaches such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs). We then look at recent TCN applications: improved traffic prediction, sound event localization and detection, and probabilistic forecasting.
A brief overview of TCN
The seminal work of Lea et al. (2016) first proposed temporal convolutional networks for video-based action segmentation. Conventionally, the process is split into two stages: first, low-level features that encode spatio-temporal information are computed, most often with a CNN; second, these features are fed into a classifier that captures high-level temporal information, most often an RNN. The main drawback of this approach is that it needs two separate models. A TCN offers a unified approach that captures both levels of information hierarchically.
The figure below shows the encoder-decoder structure; details of the architecture can be found in the first two references at the end of the article. The two key design points are as follows. First, a TCN can take a sequence of any length and produce an output of the same length. Second, it uses causal convolutions within a fully convolutional one-dimensional network architecture. The defining property is that the output at time t is convolved only with elements from time t and earlier.
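To make the two properties above concrete, here is a minimal sketch of a causal one-dimensional convolution in plain Python. The function name `causal_conv1d`, the kernel values, and the toy input are illustrative assumptions, not code from Lea et al.; a real TCN stacks many such layers with learned weights.

```python
def causal_conv1d(x, kernel, dilation=1):
    """Causal 1-D convolution: y[t] depends only on x[t], x[t-d], x[t-2d], ...

    Positions before the start of the sequence are treated as zeros
    (implicit left padding), so the output has the same length as the input.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            j = t - i * dilation      # index of the current or a past sample
            if j >= 0:                # samples before the start count as zero
                acc += w * x[j]
        y.append(acc)
    return y


x = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 0.5]                   # averages the current and previous sample
y = causal_conv1d(x, kernel)

# Changing "future" inputs must leave the earlier outputs untouched.
x_changed = x[:3] + [100.0, 100.0]
y_changed = causal_conv1d(x_changed, kernel)

print(len(y) == len(x))               # same length in, same length out
print(y_changed[:3] == y[:3])         # no leakage from the future
```

Both checks print `True`: the output keeps the input length, and the first three outputs do not change when only later inputs are modified.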
Interest in TCNs has reached the Nature family of journals: Yan et al. (2020) published a study in Scientific Reports on using TCNs to predict ENSO (the El Niño-Southern Oscillation), a weather forecasting task. The authors ran experiments comparing TCN with LSTM, and one of their conclusions was that TCN performs well on time series forecasting.
The following sections present implementations and extensions of the classic TCN.
Better traffic prediction
Ride-sharing and online navigation services rely on traffic forecasts, and better forecasts improve the experience on the road. Less congestion, less pollution, and safer, faster driving are just a few of the goals that better traffic prediction can serve. Since this is a real-time, data-driven problem, it has to make use of the accumulated data on upcoming traffic. For this reason, Dai et al. (2020) recently introduced the Hybrid Spatio-Temporal Graph Convolutional Network (H-STGCN). The basic idea is to exploit the piecewise-linear flow-density relationship and convert the upcoming traffic volume into its travel-time equivalent. One of the most interesting techniques in their work is the use of graph convolution to capture spatial dependency. A compound adjacency matrix captures the innate characteristics of traffic proximity (see Li et al., 2017, for more information). The architecture presented below comprises four modules that describe the entire forecasting process.
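As a rough illustration of how graph convolution uses an adjacency matrix, the sketch below performs a single propagation step over a tiny road graph. Everything in it is an assumption made for illustration: the function name `graph_conv`, the three-segment chain, the travel-time values, and the simple row-normalised adjacency with self-loops. It is not the compound adjacency matrix or the layer definition used in H-STGCN.

```python
def graph_conv(adj, features, weight):
    """One propagation step, H = D^-1 (A + I) X W, written with plain lists."""
    n = len(adj)
    # Add self-loops so that each node keeps its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Row-normalise: every node averages over itself and its neighbours.
    a_norm = [[v / sum(row) for v in row] for row in a_hat]
    # Aggregate neighbour features (a_norm @ features).
    n_feat = len(features[0])
    agg = [[sum(a_norm[i][k] * features[k][f] for k in range(n))
            for f in range(n_feat)] for i in range(n)]
    # Mix feature channels (agg @ weight).
    n_out = len(weight[0])
    return [[sum(agg[i][f] * weight[f][o] for f in range(n_feat))
             for o in range(n_out)] for i in range(n)]


# Hypothetical road with three segments in a chain: 0 - 1 - 2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
features = [[10.0], [20.0], [60.0]]   # e.g. travel time per segment, in seconds
weight = [[1.0]]                      # identity mixing, to isolate the averaging

out = graph_conv(adj, features, weight)
print(out)
```

With the identity weight, each segment's value becomes the average of its own travel time and that of its direct neighbours, which is the sense in which the adjacency matrix encodes proximity.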
Localization and detection of sound events
The field of sound event localization and detection (SELD) continues to grow. Understanding the environment plays a critical role in autonomous navigation. Guirguis et al. (2020) recently proposed a new architecture for sound events, SELD-TCN. The authors claim that their framework outperforms the state of the art in this field while training faster. In the existing SELDnet (architecture shown below), multichannel audio sampled at 44.1 kHz is passed through a short-time Fourier transform, and the phase and magnitude of the spectrum are extracted as separate input features. Convolutional blocks and recurrent blocks (bidirectional GRUs) follow, and then a fully connected block. The output of SELDnet is the detected sound events together with the direction from which the audio arrived.
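The feature extraction step described above can be sketched as follows. This is a naive, single-channel short-time Fourier transform in plain Python, written for readability rather than speed; the Hann window, the frame length of 32, the hop of 16, and the synthetic test tone are assumptions chosen for the example and are not the settings used in SELDnet or SELD-TCN.

```python
import cmath
import math


def stft(signal, frame_len, hop):
    """Naive short-time Fourier transform with a Hann window.

    Returns one list of complex bins (0 .. frame_len // 2) per frame.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = [sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len))
                for k in range(frame_len // 2 + 1)]
        frames.append(bins)
    return frames


# Synthetic test tone: exactly 4 cycles per 32-sample frame.
frame_len, hop = 32, 16
signal = [math.sin(2 * math.pi * 4 * n / frame_len) for n in range(128)]

spec = stft(signal, frame_len, hop)
# Magnitude and phase are kept as two separate feature maps.
magnitude = [[abs(c) for c in frame] for frame in spec]
phase = [[cmath.phase(c) for c in frame] for frame in spec]

peak_bin = max(range(len(magnitude[0])), key=lambda k: magnitude[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

For multichannel audio, the same transform would be applied to each channel, giving a magnitude map and a phase map per channel.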
To outperform this existing solution, the authors introduced SELD-TCN:
Since dilated convolutions allow the network to handle a variety of inputs, a deeper network may be required (and deeper networks are prone to unstable gradients during backpropagation). The authors addressed this problem by adapting the WaveNet architecture (Rethage et al., 2018). They showed that recurrent layers are not required for SELD tasks and that the network can still determine the start and end times of active sound events.
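One way to see why dilation matters is to count how many input steps can influence a single output step. The sketch below uses the standard receptive-field formula for a stack of stride-1 convolutions; the kernel size of 3, the six layers, and the doubling dilation schedule are WaveNet-style assumptions picked for illustration, not the configuration reported for SELD-TCN.

```python
def receptive_field(kernel_size, dilations):
    """Input steps that influence one output step of a stack of stride-1
    convolutions, with one layer per entry in `dilations`."""
    return 1 + (kernel_size - 1) * sum(dilations)


dilations = [2 ** i for i in range(6)]          # 1, 2, 4, 8, 16, 32
rf_dilated = receptive_field(3, dilations)      # six dilated layers
rf_plain = receptive_field(3, [1] * 6)          # six ordinary layers

print(rf_dilated, rf_plain)                     # 127 13
```

With the same depth and kernel size, the dilated stack sees 127 time steps instead of 13, which is how a convolutional network can cover long contexts without recurrent layers.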
Probabilistic forecasting
A new framework developed by Chen et al. (2020) can be applied to estimate probability density. Time series forecasting improves many business decision-making scenarios (for example, resource management). Probabilistic forecasting extracts information from historical data and reduces the uncertainty about future events. When the task is to forecast millions of related time series (as in retail), estimating the parameters requires prohibitive labor and computing resources. To resolve these difficulties, the authors proposed a CNN-based density estimation and forecasting framework that can learn the latent correlation among the series. The novelty of their work lies in the proposed deep TCN, shown in their architecture:
The implementation of encoder-decoder modules can assist in the development of large-scale applications.
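To illustrate what a probabilistic forecast is scored against, here is a minimal sketch of the quantile (pinball) loss, a common training objective when a model predicts quantiles instead of a single point. Using this loss here is an assumption made for illustration; the text above does not say which objective Chen et al. use, and the function name and sample numbers are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss for a quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += max(q * diff, (q - 1) * diff)
    return total / len(y_true)


y_true = [10.0, 12.0, 9.0, 15.0]      # hypothetical observed demand
low = [8.0, 8.0, 8.0, 8.0]            # forecast that is mostly too low
high = [16.0, 16.0, 16.0, 16.0]       # forecast that is mostly too high

loss_low = pinball_loss(y_true, low, q=0.9)
loss_high = pinball_loss(y_true, high, q=0.9)
print(loss_low > loss_high)           # True
```

At the 0.9 quantile, predicting too low costs more than predicting too high, so minimising this loss pushes the forecast towards the upper end of the distribution. Training one output per quantile level yields a prediction interval rather than a single value.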
Conclusion
In this article, we reviewed recent work on temporal convolutional networks. In the studies covered here, TCN-based models are reported to match or outperform classical CNN and RNN approaches on time series tasks.
References
- Lea, Colin, et al. “Temporal convolutional networks: A unified approach to action segmentation.” European Conference on Computer Vision. Springer, Cham, 2016.
- Lea, Colin, et al. “Temporal convolutional networks for action segmentation and detection.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
- Yan, Jining, et al. “Temporal convolutional networks for the advance prediction of ENSO.” Scientific Reports 10.1 (2020): 1-15.
- Li, Yaguang, et al. “Diffusion convolutional recurrent neural network: Data-driven traffic forecasting.” arXiv preprint arXiv:1707.01926 (2017).
- Rethage, Dario, Jordi Pons, and Xavier Serra. “A wavenet for speech denoising.” 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018.
- Chen, Yitian, et al. “Probabilistic forecasting with temporal convolutional neural network.” Neurocomputing (2020).
- Guirguis, Karim, et al. “SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks.” arXiv preprint arXiv:2003.01609 (2020).