In the paper 'Fully Convolutional Networks for Semantic Segmentation', the authors distinguish between input stride and output stride in the context of deconvolution. How do these terms differ?
Recommended answer
Input stride is the stride of the filter: how far the filter shifts across the input for each step in the output.
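To make the two directions concrete, here is a minimal pure-Python sketch of a 1-D strided convolution and its transposed ("deconvolution") counterpart. The function names `conv1d` and `conv_transpose1d` are illustrative and not taken from the paper or any library, and no padding is assumed. In the forward pass the stride is measured on the input; in the transposed pass the same number spaces the filter copies in the output.

```python
def conv1d(x, w, stride=1):
    """Unpadded 1-D correlation: the filter advances `stride` input samples per output."""
    k = len(w)
    n_out = (len(x) - k) // stride + 1
    return [sum(x[i * stride + j] * w[j] for j in range(k)) for i in range(n_out)]


def conv_transpose1d(x, w, stride=1):
    """Transposed 1-D convolution: each input sample adds a scaled copy of the
    filter to the output, and consecutive copies sit `stride` samples apart."""
    k = len(w)
    out = [0.0] * ((len(x) - 1) * stride + k)
    for i, xi in enumerate(x):
        for j in range(k):
            out[i * stride + j] += xi * w[j]
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kernel = [1.0, 1.0, 1.0]

down = conv1d(signal, kernel, stride=2)        # 7 samples -> 3 samples
up = conv_transpose1d(down, kernel, stride=2)  # 3 samples -> 7 samples
print(len(signal), len(down), len(up))
```

With a stride of 2 the forward pass shrinks 7 samples to 3, and the transposed pass with the same stride restores the length to 7, although not the original values.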
Output stride is actually a nominal value. A CNN produces its feature map after several convolution and max-pooling operations. Let's say our input image is 224 * 224 and our final feature map is 7 * 7.
Then we say our output stride is 224 / 7 = 32 (approximately the factor by which the image has been downsampled).
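The arithmetic above can be sketched in a few lines of Python. The helper names are hypothetical, and the list of five stride-2 stages is only an assumed example of how a network might reach a total factor of 32, not a description of any specific architecture.

```python
def output_stride(input_size, feature_size):
    """Nominal output stride: ratio of input resolution to feature-map resolution."""
    return input_size / feature_size


def cumulative_stride(layer_strides):
    """Total downsampling factor from the per-layer strides (their product)."""
    total = 1
    for s in layer_strides:
        total *= s
    return total


ratio = output_stride(224, 7)

# Assumed example: five downsampling stages, each with stride 2.
stages = [2, 2, 2, 2, 2]
total = cumulative_stride(stages)

print(ratio, total)
```

Both views give 32: the ratio 224 / 7 and the product of the five per-stage strides.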
This TensorFlow script describes what output stride is and how it is used in an FCN, which is a dense-prediction setting:
"one uses inputs with spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In this case the feature maps at the ResNet output will have spatial shape [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1] and corners exactly aligned with the input image corners, which greatly facilitates alignment of the features to the image. Using as input [225, 225] images results in [8, 8] feature maps at the output of the last ResNet block."
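The shape formula in the quoted passage is easy to check numerically. The sketch below uses hypothetical helper names and assumes integer (floor) division, which is exact whenever the input size is a multiple of the output stride plus 1.

```python
def feature_map_size(size, output_stride=32):
    """Spatial size from the quoted formula: (size - 1) / output_stride + 1."""
    return (size - 1) // output_stride + 1


def corners_aligned(size, output_stride=32):
    """True when size is a multiple of output_stride plus 1, so the first and
    last feature positions fall exactly on the image corners."""
    return (size - 1) % output_stride == 0


for size in (224, 225, 321):
    print(size, feature_map_size(size), corners_aligned(size))
```

For 225 this gives 8 and for 321 it gives 11, both with aligned corners. For 224 it gives 7, matching the earlier example, but the corners are not aligned.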