Introduction
sequence_length

Passing the sequence_length argument to dynamic_rnn serves two purposes:

1. Save computational time.
2. Ensure correctness.
Suppose you have a batch of 2 samples: one of length 13 and the other of length 20, where each time step is a vector of 128 numbers. The length-13 sample is zero-padded to length 20, so your RNN input tensor has shape [2, 20, 128].
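As a sketch of how such a padded batch could be assembled (the random sequences here are just stand-ins for real data), in NumPy:

```python
import numpy as np

batch_size, max_len, input_dim = 2, 20, 128
lengths = [13, 20]  # true lengths of the two examples

# Stand-ins for the two variable-length sequences.
seq_a = np.random.randn(13, input_dim).astype(np.float32)
seq_b = np.random.randn(20, input_dim).astype(np.float32)

# Zero-pad both to max_len and stack into one input tensor.
batch = np.zeros((batch_size, max_len, input_dim), dtype=np.float32)
batch[0, :13] = seq_a
batch[1, :20] = seq_b

print(batch.shape)  # (2, 20, 128)
```

Steps 13 through 19 of the first example are all zeros; those are exactly the padded positions that sequence_length tells the RNN to skip.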
dynamic_rnn returns a tuple (outputs, state), where outputs is a tensor of shape [2, 20, ...] whose last dimension holds the RNN output at each time step, and state is the last state for each example, a tensor of shape [2, ...].
So, here’s the problem: once you reach time step 13, your first example in the batch is already “done” and you don’t want to perform any additional computation on it. The second example isn’t, and must go through the RNN until step 20. By passing sequence_length=[13, 20] you tell TensorFlow to stop calculations for example 1 at step 13.
Without passing sequence_length, TensorFlow will continue calculating the state until T=20 instead of simply copying the state from T=13. This means you would calculate the state using the padded elements, which is not what you want.
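To make the copy-versus-recompute distinction concrete, here is a toy vanilla RNN in NumPy that mimics the masking behavior described above. The function name, hidden size, and random weights are all illustrative assumptions, not TensorFlow internals:

```python
import numpy as np

def masked_rnn(inputs, lengths, hidden_dim=4, seed=0):
    """Toy vanilla RNN mimicking dynamic_rnn's sequence_length handling:
    past an example's length, its state is copied forward unchanged
    and its per-step output is zero (assumed behavior sketch)."""
    rng = np.random.default_rng(seed)
    batch, max_len, input_dim = inputs.shape
    Wx = rng.standard_normal((input_dim, hidden_dim)) * 0.1
    Wh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
    state = np.zeros((batch, hidden_dim))
    outputs = np.zeros((batch, max_len, hidden_dim))
    for t in range(max_len):
        new_state = np.tanh(inputs[:, t] @ Wx + state @ Wh)
        alive = (np.array(lengths) > t)[:, None]   # which examples still run at step t
        state = np.where(alive, new_state, state)  # copy the old state once "done"
        outputs[:, t] = np.where(alive, new_state, 0.0)  # zero outputs past length
    return outputs, state

inputs = np.random.randn(2, 20, 128)
outputs, state = masked_rnn(inputs, lengths=[13, 20])
```

With lengths=[13, 20], example 0's outputs are zero from step 13 on, and its final state equals its state at step 13 (the state is copied, not recomputed from padding), while example 1 runs through all 20 steps.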