Tuesday, January 9, 2024

The choice of machine learning model - LSTM vs RNN

 


The choice of which ML model to use for financial time series is important and should not be taken lightly. The LSTM, or Long Short-Term Memory model, has become a workhorse for many quant financial analysts. The LSTM is a type of recurrent neural network that can account for long-term dependencies by using gates to control its inputs, outputs, and memory (a forget gate).
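In one common formulation, the gates and the memory cell are updated at each time step t as follows (sigma is the logistic sigmoid and the circled dot is elementwise multiplication):

\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate memory)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden output)}
\end{aligned}

Here x_t is the input at time t and h_{t-1} is the previous hidden state; the forget gate decides how much of the old cell state c_{t-1} is carried forward, which is exactly the long-term memory mechanism described above.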

RNNs (recurrent neural networks) are useful for processing sequential data such as time series because they carry a hidden state forward from one time step to the next. This lets the RNN use context and dependencies between time steps, which is critical for financial data that may exhibit some form of autocorrelation. The LSTM is a type of RNN that adds a memory cell connected to gates, which can be activated when long-term information is a useful input. A simpler gated variant is the GRU, or gated recurrent unit, which merges the gating into a more compact design and can also help improve predictions.
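As a rough sketch of how these alternatives line up in practice, the snippet below assumes Keras/TensorFlow and a purely hypothetical dataset of return windows (the shapes and random data are placeholders, not a real backtest); the only difference between the three models is which recurrent layer is used.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, SimpleRNN, GRU, LSTM, Dense

window, n_features = 20, 1          # hypothetical look-back window of 20 days, one series

def build_model(cell):
    # cell is one of SimpleRNN, GRU, LSTM; everything else is held fixed
    model = Sequential([
        Input(shape=(window, n_features)),
        cell(32),                   # recurrent layer with 32 units
        Dense(1),                   # one-step-ahead forecast
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# placeholder data standing in for windows of returns and next-step targets
X = np.random.normal(size=(500, window, n_features)).astype("float32")
y = np.random.normal(size=(500, 1)).astype("float32")

for cell in (SimpleRNN, GRU, LSTM):
    model = build_model(cell)
    hist = model.fit(X, y, epochs=2, batch_size=32, verbose=0)
    print(cell.__name__, "final training loss:", hist.history["loss"][-1])

On real return data you would of course use a proper train/validation split and walk-forward evaluation; the point here is only that swapping the recurrent cell is a one-line change.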

All of these gating mechanisms attempt to address what is referred to as the "vanishing gradient" problem: when gradients are propagated back through many time steps they can become vanishingly small, so the weights barely update and there is limited learning or improvement. By carrying long-term information forward through the memory cell, the gates keep useful gradient signal flowing back to earlier time steps, which helps boost model performance. Using longer-term context, and not just the current input, can improve forecasting.
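As a toy illustration of the problem (a sketch with made-up numbers, not any real model), consider a one-unit RNN with a tanh activation. The gradient reaching back T steps is a product of T factors of the form w * tanh'(.), each no larger than w, so it shrinks geometrically:

import numpy as np

w = 0.9                              # hypothetical recurrent weight
h = 0.0
grad = 1.0                           # running product d h_T / d h_0
rng = np.random.default_rng(0)
for t in range(1, 51):
    pre = w * h + rng.normal()       # hypothetical input at step t
    h = np.tanh(pre)
    grad *= w * (1.0 - h ** 2)       # chain-rule factor d h_t / d h_{t-1}
    if t % 10 == 0:
        print(f"gradient through {t:2d} steps: {grad:.2e}")

The printed values collapse toward zero, which is why a plain RNN struggles to learn from inputs many steps in the past. The additive cell-state update in an LSTM or GRU (c_t = f_t * c_{t-1} + ...) provides a path along which the gradient is not repeatedly squashed, which is the sense in which gating mitigates the issue.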
