How to freeze particular nodes in a TensorFlow variable while training?

Question

Currently I am having trouble making a few elements of a variable non-trainable. That is, given a variable such as x,

x = tf.Variable(tf.zeros([2, 2]))

I wish to train only x[0,0] and x[1,1] while keeping x[0,1] and x[1,0] fixed during training.

Currently TensorFlow does provide the option to make a variable non-trainable by using trainable=False or tf.stop_gradient(). However, these methods make all elements of x non-trainable. My question is: how can I obtain this selectivity?
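For context, a minimal sketch (TF 1.x graph mode assumed) of the two options mentioned above; both act on the whole variable at once, which is exactly the limitation in question:

import tensorflow as tf  # TF 1.x graph-mode API

# Option 1: the variable is excluded from TRAINABLE_VARIABLES entirely.
x_fixed = tf.Variable(tf.zeros([2, 2]), trainable=False)

# Option 2: gradients are blocked for every element flowing through this op.
x = tf.Variable(tf.zeros([2, 2]))
x_blocked = tf.stop_gradient(x)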

Recommended answer

There is no selective lack of updates for now; however, you can achieve this effect indirectly by explicitly specifying which variables should be updated. Both .minimize and all the gradient functions accept a list of variables to optimize over, so just create a list that omits some of them, for example:

v1 = tf.Variable( ... )  # we want to freeze it in one op
v2 = tf.Variable( ... )  # we want to freeze it in another op
v3 = tf.Variable( ... )  # we always want to train this one

loss = ...
optimizer = tf.train.GradientDescentOptimizer(0.1)
trainable = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
op1 = optimizer.minimize(loss, var_list=[v for v in trainable if v is not v1])
op2 = optimizer.minimize(loss, var_list=[v for v in trainable if v is not v2])

Now you can call these ops whenever you want to train with respect to a subset of the variables. Note that this might require two separate optimizers if you are using Adam or some other method that gathers statistics (and you will end up with separate statistics per optimizer!). However, if there is just one set of frozen variables per training run, everything is straightforward with var_list.
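To illustrate the Adam caveat, a minimal sketch (TF 1.x assumed; the variables and loss are illustrative) using two separate optimizers so that each keeps its own statistics:

import tensorflow as tf  # TF 1.x graph-mode API

v1 = tf.Variable(tf.zeros([2, 2]))
v2 = tf.Variable(tf.zeros([2, 2]))
loss = tf.reduce_sum(tf.square(v1)) + tf.reduce_sum(tf.square(v2))  # illustrative loss

# Each Adam optimizer maintains its own slot variables (first/second
# moment estimates) for the variables it updates, so the two training
# ops do not share statistics.
opt1 = tf.train.AdamOptimizer(0.01)
opt2 = tf.train.AdamOptimizer(0.01)

trainable = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
op1 = opt1.minimize(loss, var_list=[v for v in trainable if v is not v1])
op2 = opt2.minimize(loss, var_list=[v for v in trainable if v is not v2])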

However, there is no way to freeze training of a subset of a single variable; TensorFlow always treats a variable as a single unit. To achieve this, you have to specify the computation differently. One way is to:

  • Create a binary mask M with 1s at the entries of X where updates should stop.
  • Create a separate variable X' that is non-trainable, and tf.assign the value of X to it.
  • Output X' * M + (1 - M) * X.

For example:

x = tf.Variable( ... )
xp = tf.Variable( ..., trainable=False)
m = tf.constant( ... )  # mask
cp = tf.assign(xp, x)
with tf.control_dependencies([cp]):
    x_frozen = m * xp + (1 - m) * x

and you just use x_frozen instead of x. Note that we need the control dependency because tf.assign can execute asynchronously, and here we want to make sure it always has the most up-to-date value of x.
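Putting it together for the 2x2 case from the question, a minimal end-to-end sketch (TF 1.x assumed; the loss is illustrative) that trains x[0,0] and x[1,1] while keeping x[0,1] and x[1,0] frozen:

import tensorflow as tf  # TF 1.x graph-mode API

x = tf.Variable(tf.zeros([2, 2]))
xp = tf.Variable(tf.zeros([2, 2]), trainable=False)
m = tf.constant([[0.0, 1.0], [1.0, 0.0]])  # 1s mark the frozen entries x[0,1] and x[1,0]

cp = tf.assign(xp, x)  # keep the non-trainable copy in sync with x
with tf.control_dependencies([cp]):
    x_frozen = m * xp + (1 - m) * x

# Illustrative loss: gradients reach x only through the (1 - m) * x term,
# so the masked entries receive zero gradient.
loss = tf.reduce_sum(tf.square(x_frozen - tf.ones([2, 2])))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op)
    print(sess.run(x))  # the diagonal approaches 1; the off-diagonal stays 0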
