Question
I have a stacked MultiRNNCell defined as below:
batch_size = 256
rnn_size = 512
keep_prob = 0.5
lstm_1 = tf.nn.rnn_cell.LSTMCell(rnn_size)
lstm_dropout_1 = tf.nn.rnn_cell.DropoutWrapper(lstm_1, output_keep_prob = keep_prob)
lstm_2 = tf.nn.rnn_cell.LSTMCell(rnn_size)
lstm_dropout_2 = tf.nn.rnn_cell.DropoutWrapper(lstm_2, output_keep_prob = keep_prob)
stacked_lstm = tf.nn.rnn_cell.MultiRNNCell([lstm_dropout_1, lstm_dropout_2])
rnn_inputs = tf.nn.embedding_lookup(embedding_matrix, ques_placeholder)
init_state = stacked_lstm.zero_state(batch_size, tf.float32)
rnn_outputs, final_state = tf.nn.dynamic_rnn(stacked_lstm, rnn_inputs, initial_state=init_state)
In this code, there are two RNN layers. I just want to process the final state of this dynamic RNN. I expected the state to be a 2D tensor of shape [batch_size, rnn_size*2].
The shape of the final_state is 4D: [2,2,256,512].
Can someone please explain why I am getting this shape? Also, how can I process this tensor so that I can pass it through a fully_connected layer?
Answer
I can't reproduce the [2,2,256,512] shape. But with this piece of code:
rnn_size = 512
batch_size = 256
time_size = 5
input_size = 2
keep_prob = 0.5
lstm_1 = tf.nn.rnn_cell.LSTMCell(rnn_size)
lstm_dropout_1 = tf.nn.rnn_cell.DropoutWrapper(lstm_1, output_keep_prob=keep_prob)
lstm_2 = tf.nn.rnn_cell.LSTMCell(rnn_size)
stacked_lstm = tf.nn.rnn_cell.MultiRNNCell([lstm_dropout_1, lstm_2])
rnn_inputs = tf.placeholder(tf.float32, shape=[None, time_size, input_size])
# Shape of the rnn_inputs is (batch_size, time_size, input_size)
init_state = stacked_lstm.zero_state(batch_size, tf.float32)
rnn_outputs, final_state = tf.nn.dynamic_rnn(stacked_lstm, rnn_inputs, initial_state=init_state)
print(rnn_outputs)
print(final_state)
I get the right shape for rnn_outputs: (batch_size, time_size, rnn_size)
Tensor("rnn/transpose_1:0", shape=(256, 5, 512), dtype=float32)
The final_state is indeed a pair of LSTMStateTuple (one for each of the 2 cells, lstm_dropout_1 and lstm_2):
(LSTMStateTuple(c=<tf.Tensor 'rnn/while/Exit_3:0' shape=(256, 512) dtype=float32>, h=<tf.Tensor 'rnn/while/Exit_4:0' shape=(256, 512) dtype=float32>),
LSTMStateTuple(c=<tf.Tensor 'rnn/while/Exit_5:0' shape=(256, 512) dtype=float32>, h=<tf.Tensor 'rnn/while/Exit_6:0' shape=(256, 512) dtype=float32>))
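The reported [2,2,256,512] shape is consistent with this nested structure being stacked into a single array at some point (for example by calling np.array on the evaluated state). A minimal sketch with NumPy standing in for the evaluated tensors; the conversion step itself is an assumption, since the answer above could not reproduce the shape directly:

```python
import numpy as np

batch_size, rnn_size = 256, 512

# final_state for a 2-layer MultiRNNCell is a nested tuple:
# ((c1, h1), (c2, h2)), each array of shape (batch_size, rnn_size).
final_state = tuple(
    (np.zeros((batch_size, rnn_size), np.float32),
     np.zeros((batch_size, rnn_size), np.float32))
    for _ in range(2)
)

# Stacking the nested tuple merges the two outer levels into axes:
# [num_layers, (c, h), batch_size, rnn_size]
stacked = np.array(final_state)
print(stacked.shape)  # (2, 2, 256, 512)
```

So the leading 2s are the number of layers and the (c, h) pair of each LSTMStateTuple, not anything batch- or feature-related.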
as described in the docstring of tf.nn.dynamic_rnn:
# 'outputs' is a tensor of shape [batch_size, max_time, 256]
# 'state' is a N-tuple where N is the number of LSTMCells containing a
# tf.contrib.rnn.LSTMStateTuple for each cell
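To address the second part of the question: one common way to obtain the desired [batch_size, rnn_size*2] tensor for a fully connected layer is to concatenate the hidden (h) states of both layers along the feature axis. In the TensorFlow 1.x graph this would be something like tf.concat([final_state[0].h, final_state[1].h], axis=1); the shape arithmetic is sketched here with NumPy standing in for the tensors:

```python
import numpy as np

batch_size, rnn_size = 256, 512

# h states of the two layers, each of shape (batch_size, rnn_size)
h1 = np.zeros((batch_size, rnn_size), np.float32)
h2 = np.zeros((batch_size, rnn_size), np.float32)

# Concatenate along the feature axis -> [batch_size, rnn_size * 2],
# a 2D tensor ready to feed into a fully connected layer.
state_h = np.concatenate([h1, h2], axis=1)
print(state_h.shape)  # (256, 1024)
```

Alternatively, if only the top layer's state is needed, final_state[-1].h already has shape [batch_size, rnn_size] and can be fed to the dense layer directly.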