
What is: Deep LSTM Reader?

Source: Teaching Machines to Read and Comprehend
Year: 2015
Data Source: CC BY-SA - https://paperswithcode.com

The Deep LSTM Reader is a neural network for reading comprehension. We feed the document one word at a time into a Deep LSTM encoder; after a delimiter, we then feed the query into the same encoder, so the model processes each document-query pair as a single long sequence. Given the embedded document and query, the network predicts which token in the document answers the query.
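As a minimal illustration (not code from the paper), the sketch below shows how a document and query could be assembled into the single token sequence the encoder reads; the delimiter string and function name are hypothetical.

```python
def build_input_sequence(document_tokens, query_tokens, delimiter="|||"):
    """Concatenate document, delimiter, and query into the one long
    sequence the Deep LSTM encoder consumes token by token."""
    return document_tokens + [delimiter] + query_tokens

doc = ["the", "visitors", "toured", "the", "museum"]
query = ["who", "toured", "the", "museum"]
sequence = build_input_sequence(doc, query)
# The encoder reads `sequence` one token at a time; its output after the final
# token is used to score each document token as the answer to the query.
```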

The model consists of a Deep LSTM cell with skip connections from each input $x(t)$ to every hidden layer, and from every hidden layer to the output $y(t)$:

$$x'(t, k) = x(t)\,\|\,y'(t, k - 1), \qquad y(t) = y'(t, 1)\,\|\,\dots\,\|\,y'(t, K)$$

$$i(t, k) = \sigma\left(W_{kxi}x'(t, k) + W_{khi}h(t - 1, k) + W_{kci}c(t - 1, k) + b_{ki}\right)$$

$$f(t, k) = \sigma\left(W_{kxf}x(t) + W_{khf}h(t - 1, k) + W_{kcf}c(t - 1, k) + b_{kf}\right)$$

$$c(t, k) = f(t, k)c(t - 1, k) + i(t, k)\tanh\left(W_{kxc}x'(t, k) + W_{khc}h(t - 1, k) + b_{kc}\right)$$

$$o(t, k) = \sigma\left(W_{kxo}x'(t, k) + W_{kho}h(t - 1, k) + W_{kco}c(t, k) + b_{ko}\right)$$

$$h(t, k) = o(t, k)\tanh\left(c(t, k)\right)$$

$$y'(t, k) = W_{ky}h(t, k) + b_{ky}$$

where $\|$ indicates vector concatenation, $h(t, k)$ is the hidden state for layer $k$ at time $t$, and $i$, $f$, $o$ are the input, forget, and output gates respectively. Thus our Deep LSTM Reader is defined by $g^{\text{LSTM}}(d, q) = y(|d| + |q|)$ with input $x(t)$ the concatenation of $d$ and $q$ separated by the delimiter $|||$.
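As a rough NumPy sketch (not the authors' implementation), the function below advances the gate equations above by one time step across all $K$ layers. It assumes the peephole terms $W_{kci}$, $W_{kcf}$, $W_{kco}$ act elementwise (diagonal weights), and the `params` layout and key names are illustrative, not from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def deep_lstm_step(x_t, h_prev, c_prev, params):
    """One time step of a Deep LSTM with skip connections.

    x_t    : input embedding x(t)
    h_prev : list of h(t-1, k), one vector per layer k
    c_prev : list of c(t-1, k), one vector per layer k
    params : list of per-layer dicts of weights/biases (names mirror the equations)
    Returns updated hidden states, cell states, and y(t), the concatenation
    of all layer outputs y'(t, 1) || ... || y'(t, K).
    """
    K = len(params)
    h_new, c_new, y_parts = [], [], []
    y_below = np.zeros(0)  # y'(t, 0) is empty: layer 1 sees only x(t)
    for k in range(K):
        p = params[k]
        x_prime = np.concatenate([x_t, y_below])  # x'(t, k) = x(t) || y'(t, k-1)
        i = sigmoid(p["Wxi"] @ x_prime + p["Whi"] @ h_prev[k] + p["wci"] * c_prev[k] + p["bi"])
        f = sigmoid(p["Wxf"] @ x_t     + p["Whf"] @ h_prev[k] + p["wcf"] * c_prev[k] + p["bf"])
        c = f * c_prev[k] + i * np.tanh(p["Wxc"] @ x_prime + p["Whc"] @ h_prev[k] + p["bc"])
        o = sigmoid(p["Wxo"] @ x_prime + p["Who"] @ h_prev[k] + p["wco"] * c + p["bo"])
        h = o * np.tanh(c)
        y_k = p["Wy"] @ h + p["by"]               # y'(t, k), the skip output of layer k
        h_new.append(h)
        c_new.append(c)
        y_parts.append(y_k)
        y_below = y_k
    y_t = np.concatenate(y_parts)                 # y(t) = y'(t, 1) || ... || y'(t, K)
    return h_new, c_new, y_t
```

Running this step over the concatenated document-delimiter-query sequence and taking the output at the final position gives the $g^{\text{LSTM}}(d, q)$ representation described above, from which the answer token is predicted.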