GRU (Gated Recurrent Unit): I am not sure whether there is a standard Chinese translation of its name. Like LSTM, it is a variant of the RNN (Recurrent Neural Network), and like LSTM, it aims to address the vanishing gradient problem in RNNs.
Compared with LSTM, which was proposed in 1997, GRU (proposed in 2014) is much newer. In actual use, my subjective impression is that GRU trains faster than LSTM, and the scores are no worse.
So below, I will walk through the GRU cell following the direction in which values flow through it.
GRU Architecture
Basically, there are only two kinds of inputs into the cell: the feature x_t from the training data at the current time step, and the hidden state h_{t-1} passed in from the previous cell (for the first step, h_0 is preset to an all-zero vector). Inside the cell, three computation paths combine these inputs to produce the output, as sketched in the equations below.
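For reference, these are the GRU update equations in their common form (following Cho et al., 2014; note that some write-ups swap the roles of $z_t$ and $1 - z_t$ in the last line, and the placement of biases varies between references):

$$
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) && \text{(update gate)} \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\bigl(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\bigr) && \text{(candidate state)} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(new hidden state)}
\end{aligned}
$$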
The way of reading the formulas and the diagram is exactly the same as for LSTM, but you can see that the calculation is much simpler.
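To make the data flow concrete, here is a minimal NumPy sketch of a single GRU step that follows the equations above. The parameter names (Wz, Uz, bz, ...), the shapes, and the toy sequence are illustrative assumptions of mine, not taken from any particular library.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU time step: combine input x_t with the previous hidden state h_prev."""
    z = sigmoid(params["Wz"] @ x_t + params["Uz"] @ h_prev + params["bz"])      # update gate
    r = sigmoid(params["Wr"] @ x_t + params["Ur"] @ h_prev + params["br"])      # reset gate
    h_tilde = np.tanh(params["Wh"] @ x_t + params["Uh"] @ (r * h_prev) + params["bh"])  # candidate state
    return (1 - z) * h_prev + z * h_tilde                                       # new hidden state

# Toy usage: input size 3, hidden size 4, h_0 initialized to the all-zero vector.
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
params = {}
for gate in ("z", "r", "h"):
    params[f"W{gate}"] = rng.normal(size=(hidden_size, input_size)) * 0.1
    params[f"U{gate}"] = rng.normal(size=(hidden_size, hidden_size)) * 0.1
    params[f"b{gate}"] = np.zeros(hidden_size)

h = np.zeros(hidden_size)                   # h_0: all-zero vector
for x in rng.normal(size=(5, input_size)):  # a short sequence of 5 time steps
    h = gru_step(x, h, params)
print(h)
```

Note how one step needs only two gates and one candidate state, compared with LSTM's three gates plus a separate cell state, which is where the simpler calculation (and, in my experience, the faster training) comes from.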