TensorFlow: Computing the Hessian matrix (diagonal part only) with respect to a high-rank tensor

This post covers computing the first and second derivatives (the diagonal of the Hessian) of a loss with respect to a high-rank tensor in TensorFlow.

Problem Description


I would like to compute the first and second derivatives (the diagonal part of the Hessian) of my specified loss with respect to each feature map of a vgg16 conv4_3 layer's kernel, which is a 3x3x512x512-dimensional matrix. I know how to compute these derivatives with respect to a low-rank tensor, following How to compute all second derivatives (only the diagonal of the Hessian matrix) in Tensorflow? However, when it comes to a higher-rank tensor, I get completely lost.

# Inspecting variables under IPython notebook
In : Loss
Out : <tf.Tensor 'local/total_losses:0' shape=() dtype=float32>

In : conv4_3_kernel.get_shape()
Out : TensorShape([Dimension(3), Dimension(3), Dimension(512), Dimension(512)])

## Compute derivatives
Grad = tf.gradients(Loss, conv4_3_kernel)
Hessian = tf.gradients(Grad, conv4_3_kernel)

In : Grad
Out : [<tf.Tensor 'gradients/vgg/conv4_3/Conv2D_grad/Conv2DBackpropFilter:0' shape=(3, 3, 512, 512) dtype=float32>]

In : Hessian
Out : [<tf.Tensor 'gradients_2/vgg/conv4_3/Conv2D_grad/Conv2DBackpropFilter:0' shape=(3, 3, 512, 512) dtype=float32>]


Please help me check my understanding. For conv4_3_kernel, the dims stand for [Kx, Ky, in_channels, out_channels], so Grad should be the partial derivatives of Loss with respect to each element (pixel) in each feature map, and Hessian should be the second derivatives.


But Hessian computes all the derivatives; how can I compute only the diagonal part? Should I use tf.diag_part()? Many thanks in advance!

Recommended Answer


tf.gradients computes the derivative of a scalar quantity. If the quantity provided isn't a scalar, it is turned into a scalar by summing up its components, which is what's happening in your example.
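To see the summing behavior concretely: for a vector-valued quantity, tf.gradients returns the gradient of the sum of its components. Since TensorFlow may not be available to run here, this is a minimal NumPy finite-difference sketch of the same identity; the toy function f and the sample point are made up for illustration:

```python
import numpy as np

def f(x):
    # a vector-valued "loss" with components x0^2 and x1^3
    return np.array([x[0]**2, x[1]**3])

def grad_of_sum(f, x, eps=1e-6):
    # central-difference gradient of sum(f(x)) -- the scalar that
    # tf.gradients implicitly differentiates when f(x) is not scalar
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e).sum() - f(x - e).sum()) / (2 * eps)
    return g

x = np.array([3.0, 2.0])
print(grad_of_sum(f, x))  # analytically: [2*x0, 3*x1^2] = [6, 12]
```

The gradient of the sum stacks the per-component partial derivatives back into the shape of x, which is why Grad in the question comes out with the same 3x3x512x512 shape as the kernel.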


To compute the full Hessian you need n calls to tf.gradients; an example is here. If you want just the diagonal part, then modify the arguments to the ith call of tf.gradients to differentiate with respect to the ith variable, rather than with respect to all variables.
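The diagonal-only idea can be checked numerically: the ith diagonal entry of the Hessian is just the second derivative of the loss along coordinate i alone, with all other coordinates held fixed. A NumPy sketch under an assumed toy scalar loss (not the VGG loss from the question):

```python
import numpy as np

def loss(x):
    # assumed toy scalar loss for illustration only
    return x[0]**2 * x[1] + np.sin(x[1])

def hessian_diag(f, x, eps=1e-4):
    # central second difference along each coordinate:
    # d^2 f / dx_i^2 ~= (f(x+e_i) - 2 f(x) + f(x-e_i)) / eps^2
    d = np.zeros_like(x)
    fx = f(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        d[i] = (f(x + e) - 2 * fx + f(x - e)) / eps**2
    return d

x = np.array([3.0, 2.0])
# analytic diagonal: [2*x1, -sin(x1)] = [4, -sin(2)]
print(hessian_diag(loss, x))
```

This only ever evaluates one coordinate direction at a time, which mirrors the answer's advice: each of the n calls to tf.gradients differentiates the ith component of the gradient with respect to the ith variable only, so no off-diagonal entries (and no n-by-n matrix) are ever materialized.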


Published: 2023-11-30 10:23:23
Source: https://www.elefans.com/category/jswz/34/1649642.html