About names of variable scope in tensorflow

Updated: 2024-10-25 22:29:51

Question

Recently I have been trying to learn to use TensorFlow, and I do not understand how variable scopes work exactly. In particular, I have the following problem:

import tensorflow as tf
from tensorflow.models.rnn import rnn_cell
from tensorflow.models.rnn import rnn

inputs = [tf.placeholder(tf.float32,shape=[10,10]) for _ in range(5)]
cell = rnn_cell.BasicLSTMCell(10)
outpts, states = rnn.rnn(cell, inputs, dtype=tf.float32)

print outpts[2].name
# ==> u'RNN/BasicLSTMCell_2/mul_2:0'

Where does the '_2' in 'BasicLSTMCell_2' come from? How does it work when later using tf.get_variable(reuse=True) to get the same variable again?

Edit: I think I found a related problem:

def creating(s):
    with tf.variable_scope('test'):
        with tf.variable_scope('inner'):
            a=tf.get_variable(s,[1])
    return a

def creating_mod(s):
    with tf.variable_scope('test'):
        with tf.variable_scope('inner'):
            a=tf.Variable(0.0, name=s)
    return a

tf.ops.reset_default_graph()
a=creating('a')
b=creating_mod('b')
c=creating('c')
d=creating_mod('d')

print a.name, '\n', b.name,'\n', c.name,'\n', d.name

The output is

test/inner/a:0 
test_1/inner/b:0 
test/inner/c:0 
test_3/inner/d:0

I am confused...

Recommended answer

The "_2" in "BasicLSTMCell_2" comes from the name scope in which the op outpts[2] was created. Every time you create a new name scope (with tf.name_scope()) or variable scope (with tf.variable_scope()), a new component based on the given string is appended to the current name scope, with a numeric suffix added when needed to make it unique. The call to rnn.rnn(...) has the following pseudocode (simplified and using public API methods for clarity):

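# Pseudocode for the body of rnn.rnn(cell, inputs, ...)
outputs = []
with tf.variable_scope("RNN"):
  for timestep, input_t in enumerate(inputs):
    if timestep > 0:
      # After the first timestep, reuse existing variables instead of creating new ones.
      tf.get_variable_scope().reuse_variables()
    # Each entry opens a fresh name scope: BasicLSTMCell, BasicLSTMCell_1, ...
    with tf.variable_scope("BasicLSTMCell"):
      outputs.append(...)
return outputs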

If you look at the names of the tensors in outpts, you'll see that they are the following:

>>> print [o.name for o in outpts]
[u'RNN/BasicLSTMCell/mul_2:0',
 u'RNN/BasicLSTMCell_1/mul_2:0',
 u'RNN/BasicLSTMCell_2/mul_2:0',
 u'RNN/BasicLSTMCell_3/mul_2:0',
 u'RNN/BasicLSTMCell_4/mul_2:0']

When you enter a new name scope (by entering a with tf.name_scope("..."): or with tf.variable_scope("..."): block), TensorFlow creates a new, unique name for the scope. The first time the "BasicLSTMCell" scope is entered, TensorFlow uses that name verbatim, because it hasn't been used before (in the "RNN/" scope). The next time, TensorFlow appends "_1" to the scope to make it unique, and so on up to "RNN/BasicLSTMCell_4".
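
The uniquifying rule described above can be sketched as a toy model in plain Python. This is not TensorFlow code: ToyNameScopes and unique_name are hypothetical names invented for illustration, and the sketch ignores corner cases that the real implementation handles (for example, a user explicitly asking for a scope that is already called "BasicLSTMCell_1").

```python
class ToyNameScopes:
    """Toy stand-in (an assumption, not TensorFlow) for a graph's record of used scope names."""

    def __init__(self):
        # Full scope name -> number of times that name has been requested.
        self._used = {}

    def unique_name(self, parent, name):
        """Return a scope name under `parent` that has not been handed out before."""
        full = parent + "/" + name if parent else name
        count = self._used.get(full, 0)
        self._used[full] = count + 1
        # First request: use the name verbatim. Later requests: append _1, _2, ...
        return full if count == 0 else "{}_{}".format(full, count)


scopes = ToyNameScopes()
rnn_scope = scopes.unique_name("", "RNN")
# One scope per timestep, mirroring the five inputs in the question.
step_scopes = [scopes.unique_name(rnn_scope, "BasicLSTMCell") for _ in range(5)]
print(step_scopes)
```

Under these assumptions the five scope names follow the same pattern as the tensor-name prefixes listed above, from "RNN/BasicLSTMCell" to "RNN/BasicLSTMCell_4".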

The main difference between variable scopes and name scopes is that a variable scope also has a set of name-to-tf.Variable bindings. By calling tf.get_variable_scope().reuse_variables(), we instruct TensorFlow to reuse rather than create variables for the "RNN/" scope (and its children), after timestep 0. This ensures that the weights are correctly shared between the multiple RNN cells.
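
The same distinction appears to explain the output in the question's edit: tf.get_variable() names a variable after the variable scope, which is used verbatim, while tf.Variable() names it after the current name scope, which is made unique each time it is entered. The following toy model in plain Python sketches that behaviour. It is not TensorFlow code; ToyGraph and all of its methods are hypothetical, and the per-call reuse flag is a simplification of the per-scope reuse setting.

```python
import contextlib


class ToyGraph:
    """Toy model (an assumption, not TensorFlow) of name scopes vs. variable scopes."""

    def __init__(self):
        self._name_counts = {}   # name-scope string -> times it was requested
        self._name_stack = [""]  # current name scope (made unique on entry)
        self._var_stack = [""]   # current variable scope (used verbatim)
        self._bindings = {}      # variable-scope bindings: full name -> variable

    @staticmethod
    def _join(parent, name):
        return parent + "/" + name if parent else name

    def _unique(self, full):
        count = self._name_counts.get(full, 0)
        self._name_counts[full] = count + 1
        return full if count == 0 else "{}_{}".format(full, count)

    @contextlib.contextmanager
    def variable_scope(self, name):
        # Entering opens a variable scope (verbatim) and a name scope (uniquified).
        self._var_stack.append(self._join(self._var_stack[-1], name))
        self._name_stack.append(self._unique(self._join(self._name_stack[-1], name)))
        try:
            yield
        finally:
            self._var_stack.pop()
            self._name_stack.pop()

    def get_variable(self, name, reuse=False):
        # Named after the variable scope; looked up in the bindings when reusing.
        full = self._join(self._var_stack[-1], name) + ":0"
        if reuse:
            return self._bindings[full]
        if full in self._bindings:
            raise ValueError("variable already exists: " + full)
        self._bindings[full] = full
        return full

    def variable(self, name):
        # Named after the (uniquified) name scope; never shared.
        return self._unique(self._join(self._name_stack[-1], name)) + ":0"


def creating(graph, s, reuse=False):
    with graph.variable_scope("test"):
        with graph.variable_scope("inner"):
            return graph.get_variable(s, reuse=reuse)


def creating_mod(graph, s):
    with graph.variable_scope("test"):
        with graph.variable_scope("inner"):
            return graph.variable(s)


g = ToyGraph()
names = [creating(g, "a"), creating_mod(g, "b"), creating(g, "c"), creating_mod(g, "d")]
shared = creating(g, "a", reuse=True)
print("\n".join(names))
print(shared)
```

In this sketch every entry into "test" consumes one name-scope suffix, including the entries made by creating(), which is why the two creating_mod() results land in "test_1" and "test_3" rather than "test_1" and "test_2". The reuse lookup returns the existing "test/inner/a:0" binding even though the name scope has moved on.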

Published: 2023-05-01 04:51:01
Article link: https://www.elefans.com/category/jswz/34/1403890.html