Question
In the TensorFlow MNIST example, the optimizer is set up as follows:
# Optimizer: set up a variable that's incremented once per batch and
# controls the learning rate decay.
batch = tf.Variable(0, dtype=data_type())
# Decay once per epoch, using an exponential schedule starting at 0.01.
learning_rate = tf.train.exponential_decay(
    0.01,                # Base learning rate.
    batch * BATCH_SIZE,  # Current index into the dataset.
    train_size,          # Decay step.
    0.95,                # Decay rate.
    staircase=True)
# Use simple momentum for the optimization.
optimizer = tf.train.MomentumOptimizer(learning_rate,
                                       0.9).minimize(loss,
                                                     global_step=batch)
During training,
for step in xrange(int(num_epochs * train_size) // BATCH_SIZE):
    # skip some code here
    sess.run(optimizer, feed_dict=feed_dict)
My question is: when defining learning_rate, they use batch * BATCH_SIZE to compute the current index into the dataset. However, in the training loop we only have the loop variable step. How does the code connect (or pass) the step information to the global-step argument of tf.train.exponential_decay? I am not clear on how this Python parameter-passing mechanism works.
Answer
From the code you have linked, batch is the global step. Its value is updated by the optimizer (because of global_step=batch in minimize()), and the learning-rate node takes it as an input. Nothing is passed through Python at all: learning_rate is a node in the dataflow graph, so it is re-evaluated from the current value of batch on every sess.run call, and the Python loop variable step is never used.
The naming may be an issue. batch merely means the number of the current batch used for training (of size BATCH_SIZE). Perhaps a better name would have been step or even global_step.
Most of the global_step code seems to be in a single source file. It is quite short, and perhaps a good way to see how the pieces work together.
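The connection can be sketched in plain Python (an analogy, not real TensorFlow code; GlobalStep, run_optimizer, and the constants below are all hypothetical). The point is that the learning rate is a function re-evaluated from the mutable step variable on every call, while the optimizer step increments that variable:

```python
# Pure-Python analogy of `global_step=batch` + exponential_decay(staircase=True).
BATCH_SIZE = 64
TRAIN_SIZE = 640   # made-up dataset size, so one "epoch" is 10 batches

class GlobalStep:
    """Plays the role of the `batch` tf.Variable."""
    def __init__(self):
        self.value = 0

batch = GlobalStep()

def learning_rate():
    # Like a graph node: re-reads batch's *current* value on every call,
    # mimicking staircase exponential decay from a base rate of 0.01.
    return 0.01 * 0.95 ** ((batch.value * BATCH_SIZE) // TRAIN_SIZE)

def run_optimizer():
    # The real optimizer updates the weights, then increments the global
    # step, because minimize() was given `global_step=batch`.
    batch.value += 1

history = []
for step in range(20):          # the Python loop variable is never used
    history.append(learning_rate())
    run_optimizer()
```

Within the first simulated epoch the rate stays at 0.01; once batch.value * BATCH_SIZE reaches TRAIN_SIZE it drops to 0.01 * 0.95 = 0.0095, without step ever being passed anywhere.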