Bi-LSTM. The output of the dropout is the input for the bi-directional LSTM. A regular LSTM layer consists of four gates, which control the input and output of the neurons. An LSTM neuron is described by the following formulas (▇▇▇, ▇▇▇▇▇, and ▇▇▇▇, 2015):

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f), \quad (1)$$
$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i), \quad (2)$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o), \quad (3)$$
$$c_t = f_t \circ c_{t-1} + i_t \circ \tanh(W_c x_t + U_c h_{t-1} + b_c), \quad (4)$$
$$h_t = o_t \circ \tanh(c_t), \quad (5)$$

with $b_f$, $b_i$, $b_o$ and $b_c$ being biases. $h_t$ is the hidden state of the neuron, which is passed on to the next layer. It is the Hadamard product of the 'output gate' $o_t$ and the internal memory state $c_t$. The 'forget gate' $f_t$ and the 'input gate' $i_t$ regulate the balance between new input and previous information in the internal memory. This internal memory is what distinguishes an LSTM from a regular RNN: it can retain important information over longer periods. The difference between a regular LSTM and the bi-directional LSTM used here is the order in which the input sequence is processed. The bi-LSTM consists of two independent LSTM layers that receive the same input sequence, but one layer processes it from start to finish while the other processes it in reverse. This allows the network to give equal attention to the whole sequence. The outputs of the two layers are concatenated before being fed into another dropout layer (10%).
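The following NumPy sketch illustrates Eqs. (1)-(5) and the forward/backward concatenation of the bi-LSTM. It is a minimal illustration, not the paper's implementation: the function names, parameter dictionary, dimensions, and random initialization are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step implementing Eqs. (1)-(5); 'p' holds the weights W_*, U_* and biases b_*
    (names are illustrative, not taken from the paper)."""
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])                       # Eq. (1) forget gate
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])                       # Eq. (2) input gate
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])                       # Eq. (3) output gate
    c_t = f_t * c_prev + i_t * np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev + p["b_c"])  # Eq. (4) internal memory
    h_t = o_t * np.tanh(c_t)                                                           # Eq. (5) Hadamard product
    return h_t, c_t

def bilstm(xs, fwd_params, bwd_params, hidden):
    """Run one LSTM over the sequence start-to-finish and another in reverse,
    then concatenate the two hidden states at each position."""
    def run(seq, p):
        h, c, outs = np.zeros(hidden), np.zeros(hidden), []
        for x_t in seq:
            h, c = lstm_step(x_t, h, c, p)
            outs.append(h)
        return outs
    fwd = run(xs, fwd_params)
    bwd = run(xs[::-1], bwd_params)[::-1]  # reverse the backward outputs to original order
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

def init_params(input_dim, hidden, rng):
    """Small random weights for each gate; purely for the toy example."""
    p = {}
    for g in ("f", "i", "o", "c"):
        p[f"W_{g}"] = rng.standard_normal((hidden, input_dim)) * 0.1
        p[f"U_{g}"] = rng.standard_normal((hidden, hidden)) * 0.1
        p[f"b_{g}"] = np.zeros(hidden)
    return p

rng = np.random.default_rng(0)
seq = [rng.standard_normal(8) for _ in range(5)]  # toy sequence of 5 input vectors
outs = bilstm(seq, init_params(8, 16, rng), init_params(8, 16, rng), hidden=16)
print(outs[0].shape)  # (32,) = 16 forward units concatenated with 16 backward units
```

As in the text, the concatenated forward and backward states form the per-position output that would then pass through the second dropout layer.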