[Paper Notes] An Improved Deep Learning Approach for Retrieving Outfalls Into Rivers From UAS Imagery


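- 1 Dataset
- 2 Detection Method
  - 1 Faster R-CNN
  - 2 Faster R-CNN Optimization (Parameter Tuning)
  - 3 GDCNN-Outfalls (Proposed Method)
    - 1 DSM Enhancement
    - 2 Spatial Activation
- 3 Experimental Results
  - 1 Tuning Results
    - 1 Adjusting Anchors
    - 2 Adjusting RoIs
    - 3 Hard Negative Mining
    - 4 Additional Experiment
  - 2 GDCNN-Outfalls Results
  - 3 GDCNN-Outfalls Test
- 4 Discussion
- Summary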

An Improved Deep Learning Approach for Retrieving Outfalls Into Rivers From UAS Imagery
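This paper, from the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, was published in TGRS in 2021. It detects sewage outfalls and reads like an experiment-driven paper.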

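Overview: the authors use three tuning strategies to improve how well Faster R-CNN detects outfalls. They also exploit the DSM obtained as a by-product of the aerial survey, together with GIS data for the rivers, to propose DSM Enhancement and Spatial Activation modules and integrate them into Faster R-CNN, which markedly lowers the false detection rate.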

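1 Dataset

Background: in 2019 China's Ministry of Ecology and Environment began surveying sewage outfalls in the Yangtze River basin, and it plans to finish surveying outfalls along the main streams and major tributaries of seven major river basins, including the Yangtze and the Yellow River, by the end of 2023.

The dataset contains 8691 image tiles along the Yellow River and the Yangtze River, with 9696 outfalls. The source images are 6000 × 4000 pixels at a spatial resolution of 10 cm, and the extracted tiles are 600 × 600 pixels (covering 60 m × 60 m on the ground).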

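Outfalls are divided into three types by shape:

Type 1: culverts, shaped like the Chinese characters "八" and "门", with a clear built outline.
Type 2: confluence points and suspicious points, with no clear outline.
Type 3: pipes or channels, mostly round or long and narrow.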

train : val : test ≈ 7 : 1 : 2 (6790 : 1049 : 1857)
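A quick arithmetic check of the split, as a minimal sketch. The three counts add up to 9696, which matches the number of outfalls rather than the 8691 image tiles, so the split appears to be counted over labeled outfalls; that reading is an inference from the numbers, not something the notes state.

```python
# Split sizes as reported in the notes.
splits = {"train": 6790, "val": 1049, "test": 1857}
total = sum(splits.values())

# Fraction of the total held by each split, rounded to three decimals.
fractions = {name: round(count / total, 3) for name, count in splits.items()}

print(total)      # 9696
print(fractions)  # {'train': 0.7, 'val': 0.108, 'test': 0.192}
```

The fractions are close to, but not exactly, 7 : 1 : 2.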

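2 Detection Method

The paper proposes an improved geographic deep learning method based on Faster R-CNN for retrieving outfalls into rivers, called GDCNN-outfalls.
First, the classic Faster R-CNN serves as the base model.
Next, the anchor size and the number of regions of interest (RoIs) are tuned, and hard negative mining is used to augment the training data.
Finally, two strategies, DSM enhancement and spatial activation, are introduced as a geo-classifier to improve the performance of GDCNN-outfalls.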

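1 Faster R-CNN

The model has three modules: a feature extraction network, an RPN, and a classification network.

ResNet50 is the feature extraction network. The input is fixed at 600 × 600 pixels so that the image is not rescaled so heavily that detail is lost.
The RPN module consists of two convolutional layers; the kernel size is 18 × 1 × 1 in the classifier and 36 × 1 × 1 in the regressor. The shared feature map from the feature extractor is used to generate a dense set of reference boxes called anchors. Bounding-box regression and non-maximum suppression (NMS) refine the anchors, so that the proposed regions become RoIs likely to contain outfalls.
The classification module consists of two fully connected layers with 3 and 4 neurons respectively. It classifies the RoIs produced by the RPN into specific outfall types through softmax regression and determines a more precise location through bounding-box regression.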
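The NMS step mentioned above can be sketched as a greedy loop in plain Python. This is a generic illustration, not the authors' code: the `(x1, y1, x2, y2)` box format, the function names, and the 0.7 IoU threshold are all assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.7):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate proposals and one separate proposal (made-up values).
boxes = [(100, 100, 240, 236), (104, 98, 244, 238), (400, 400, 464, 464)]
scores = [0.90, 0.80, 0.75]
kept = nms(boxes, scores, iou_threshold=0.7)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.92, so it is suppressed; the third box is disjoint and survives.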

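2 Faster R-CNN Optimization (Parameter Tuning)

Anchor Size
The anchor scales tried are [128, 256, 576], [96, 256, 576], and [64, 256, 576], each with aspect ratios [1:1, 2:1, 1:2]. The figure that accompanies this step shows the size statistics of the labeled boxes of the outfall samples.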
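Three scales times three aspect ratios give nine anchors per feature-map location, which is consistent with the 18-channel (2 × 9) classifier and 36-channel (4 × 9) regressor kernels of the RPN. The sketch below enumerates the nine shapes for the [64, 256, 576] setting. Reading each ratio as width : height and keeping the area equal to scale² are assumptions borrowed from common Faster R-CNN implementations, not details given in these notes.

```python
import math

SCALES = [64, 256, 576]            # anchor side lengths in pixels
RATIOS = [(1, 1), (2, 1), (1, 2)]  # assumed to mean width : height


def make_anchors(scales, ratios):
    """Return (width, height) pairs whose area stays equal to scale**2."""
    anchors = []
    for s in scales:
        for rw, rh in ratios:
            r = rw / rh
            anchors.append((s * math.sqrt(r), s / math.sqrt(r)))
    return anchors


anchors = make_anchors(SCALES, RATIOS)
k = len(anchors)
print(k, 2 * k, 4 * k)  # 9 18 36
```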
Region of Interest
In this study, the original 600 × 600 pixel image generates 38 × 38 feature images from ResNet50, and 12996 (38 × 38 × 9) anchors are generated pixel by pixel. According to the statistics of GT in Section III-B1, the average size of GT is 141 × 136, which is approximately (141/16) × (136/16) ∼ 9 × 9 pixels in the feature map. On the feature maps, each pixel takes the three anchors with the largest IoUs (based on the anchor size settings in Section III-B1) because these three anchors are sufficient to completely cover an outfall.
To balance training speed against model accuracy, experiments were run to decide the number of RoIs. The candidate values were 100 (approximately 9 × 9 × 1), 200 (approximately 9 × 9 × 2), and 300 (approximately 9 × 9 × 3).
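The numbers in this passage can be reproduced with a few lines of arithmetic. The stride of 16 comes from the quoted "(141/16) × (136/16)"; rounding up with `ceil` is an assumption that happens to reproduce the quoted 38 × 38 map and 9 × 9 footprint.

```python
import math

image_size, stride, anchors_per_cell = 600, 16, 9

# 600 / 16 = 37.5, rounded up to a 38 x 38 feature map.
feat = math.ceil(image_size / stride)
total_anchors = feat * feat * anchors_per_cell

# Average GT box (141 x 136 pixels) projected onto the feature map.
gt_w, gt_h = 141, 136
gt_cells = math.ceil(gt_w / stride) * math.ceil(gt_h / stride)

print(feat, total_anchors, gt_cells)        # 38 12996 81
print([gt_cells * n for n in (1, 2, 3)])    # [81, 162, 243]
```

The products 81, 162, and 243 are what the candidate RoI counts 100, 200, and 300 approximate.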
Hard Negative Mining
The IoU threshold is lowered from 0.7 to 0.5, …
In this study, the hard negative class is defined as FPs with a score >0.5 and has no overlap with any GT. By adding the hard negative class into the training dataset, the model is retrained, and the operation is iterated until the percentage is convergent.
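The selection rule quoted above (score > 0.5 and no overlap with any GT) can be sketched as a filter. The detection format, the helper names, and the sample boxes are hypothetical; the retraining loop itself is not shown.

```python
def overlap_area(a, b):
    """Intersection area of two (x1, y1, x2, y2) boxes; 0 if disjoint."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)


def select_hard_negatives(detections, gt_boxes, score_threshold=0.5):
    """Keep detections scoring above the threshold that touch no GT box."""
    return [
        det for det in detections
        if det["score"] > score_threshold
        and all(overlap_area(det["box"], gt) == 0 for gt in gt_boxes)
    ]


gt_boxes = [(100, 100, 240, 236)]
detections = [
    {"box": (110, 105, 250, 240), "score": 0.92},  # overlaps the GT box
    {"box": (400, 400, 520, 500), "score": 0.81},  # confident, no overlap
    {"box": (300, 50, 360, 110), "score": 0.30},   # no overlap, low score
]
hard = select_hard_negatives(detections, gt_boxes)
print([d["box"] for d in hard])  # [(400, 400, 520, 500)]
```

Only the confident detection with zero overlap is returned; in the procedure described, such boxes would be added to the training set as a hard negative class before retraining.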

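3 GDCNN-Outfalls (Proposed Method)

Introducing geographic knowledge about the specific object into a CNN is more effective than the original CNN algorithm alone. This may be the key to applying deep learning to retrieving physical-geography objects from remote sensing imagery in the future.

A separate geo-classifier module is introduced.

Figure: structure of GDCNN-Outfalls.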

1 DSM Enhancement

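The authors argue that the outfall in panel (a) and the building outline in panel (b) look so similar that the model easily confuses them. They therefore use the UAV imagery and DOM data to bring in a DSM and build a dataset called RGBDSM, which is used to learn the inherent spatial characteristics of outfalls and suppress false positives.

In practice, a geo-classifier is appended to the output layer of Faster R-CNN to form GDCNN-outfalls. The geo-classifier is a CNN made of three convolutional layers (kernel sizes 32 × 3 × 3, 32 × 3 × 3, and 64 × 3 × 3, with ReLU activations) and two fully connected layers (64 neurons; one uses ReLU and the other Sigmoid).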

The geo-classifier samples the DSM data in the predicted boxes output by the Faster R-CNN to a size of 128 × 128, which is close to the average size of the GT boxes, then further classifies the DSM and finally determines the correctness of the outfalls.
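The resampling step can be illustrated with a dependency-free crop-and-resize. Nearest-neighbour interpolation, the `(x1, y1, x2, y2)` box format, and the synthetic DSM are assumptions for the sketch; the notes do not say which interpolation the authors used.

```python
def crop_and_resize(dsm, box, size=128):
    """Nearest-neighbour resample of the DSM inside box to size x size.

    dsm is a list of rows (dsm[y][x]); box is (x1, y1, x2, y2) in pixels,
    with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    out = []
    for i in range(size):
        src_y = y1 + i * h // size          # source row for output row i
        row = [dsm[src_y][x1 + j * w // size] for j in range(size)]
        out.append(row)
    return out


# Synthetic 600 x 600 DSM whose elevation simply equals the row index.
dsm = [[float(y)] * 600 for y in range(600)]
patch = crop_and_resize(dsm, (100, 200, 241, 336))
print(len(patch), len(patch[0]), patch[0][0], patch[127][0])  # 128 128 200.0 334.0
```

A 141 × 136 box, the average GT size, is mapped to a 128 × 128 patch with only slight shrinking, which fits the remark that 128 is close to the average GT size.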

2 Spatial Activation

Outfalls should sit close to rivers, which matches human intuition. GIS information is brought in to locate the river areas.

The spatial activation function SAF(·) is defined in terms of the input raster R(x, y) and the buffer overlay indicator I(x, y), where (x, y) is the spatial location of the pixel.
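The exact formula is not reproduced in these notes. One plausible reading, offered purely as an assumption, is a mask: SAF passes R(x, y) through where the river-buffer indicator I(x, y) equals 1 and outputs 0 elsewhere. The sketch below implements that guessed form on small made-up grids.

```python
def spatial_activation(raster, indicator):
    """Assumed form: keep R(x, y) where I(x, y) == 1, output 0.0 elsewhere."""
    return [
        [value if inside == 1 else 0.0 for value, inside in zip(r_row, i_row)]
        for r_row, i_row in zip(raster, indicator)
    ]


# Hypothetical per-pixel scores and a river-buffer mask (1 = inside buffer).
scores = [[0.9, 0.2, 0.7],
          [0.4, 0.8, 0.6]]
river_buffer = [[1, 1, 0],
                [0, 1, 0]]
activated = spatial_activation(scores, river_buffer)
print(activated)  # [[0.9, 0.2, 0.0], [0.0, 0.8, 0.0]]
```

Under this reading, any response outside the river buffer is zeroed, which is one way a detector could discard candidates far from water.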

3 Experimental Results

Five metrics are used: F1-score, FPs, precision, recall, and average precision (AP).
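For reference, the standard definitions of three of these metrics in terms of true positives (TP), false positives (FP), and false negatives (FN) are given below. These are the textbook forms; the notes do not state how the paper computes AP, which is usually the area under the precision-recall curve.

```latex
\mathrm{precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}
```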

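1 Tuning Results

1 Adjusting Anchors

Three anchor sizes with scales of [128, 256, 576], [96, 256, 576], and [64, 256, 576] and a ratio scale of [1:1, 2:1, 1:2] were compared.

Model 4 performs best, with an AP of 65.96 and a recall of 75.4%, so the scales are fixed at [64, 256, 576].
Sizing the anchors to the objects being retrieved helps model performance.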

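2 Adjusting RoIs

The number of RoIs is set to 100, 200, and 300 in turn.
Model 7 performs best, so the number of RoIs is fixed at 100.
Why lowering the number of RoIs improves model performance is not yet clear.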

3 Hard Negative Mining

Hard negative mining helps raise the F1-score and precision.
It clearly suppressed FPs from 43.96% to 37.37% and improved the precision by 6.59%.

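4 Additional Experiment

The errors in the experiments above are attributed to confusion between Type 1 and Type 3. When the experiment is repeated with this misclassification error ignored, the F1-score, precision, recall, and AP improve to 0.72, 66.48%, 77.79%, and 71.16%, respectively.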

2 GDCNN-Outfalls Results

This section evaluates the effect of introducing DSM enhancement and spatial activation.
As shown in Table VI, GDCNN-outfalls attained 0.75 F1 score and 73.86% precision, which outperformed optimized Faster R-CNN (F1 = 0.72, precision = 66.48%). Adding a geo-classifier branch to Faster R-CNN improved precision in our validation set at a cost of a small decrement in recall from 77.79% to 76.72%.
It is common and acceptable to trade-off recall for precision in object autodetection.
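The reported F1 scores can be checked against the reported precision and recall using the standard harmonic-mean formula. This is only a consistency check on the numbers quoted in these notes.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


optimized = f1(0.6648, 0.7779)  # optimized Faster R-CNN
gdcnn = f1(0.7386, 0.7672)      # GDCNN-outfalls

print(round(optimized, 2), round(gdcnn, 2))  # 0.72 0.75
```

Both values match the quoted F1 scores, so the 7.38-point gain in precision outweighs the 1.07-point loss in recall.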

3 GDCNN-Outfalls Test

Real-world test

Faster and more accurate.

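4 Discussion

Three optimization strategies are adopted to make the deep learning model more robust for retrieving sewage outfalls from UAS imagery.
The anchor scales are set to [64, 256, 576]. Refining the number of RoIs removes redundant proposals during RPN training, which improves both training speed and model accuracy. Hard negative mining also proves to be an effective tuning strategy for outfall retrieval: it helps the model augment the training set and suppresses FPs introduced by low IoU values.
The authors note that some techniques used in YOLO, such as feature pyramid network (FPN), leaky ReLU, and joint learning, may also further improve the performance of GDCNN-outfalls in future research.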

In the original Faster R-CNN, the image features are extracted by a convolution function that relies only on the pixel value of the input RGB image. Therefore, the accuracy is limited by the information that RGB images provide, e.g., the common problems of “different objects with the same spectral characteristics,” “same spectrum with different objects,” and “different objects with similar structures and convolutional features”

Summary

Three tactics, namely anchor size, RoI number, and hard negative mining, improved the performance of outfall retrieval by suppressing FPs and increasing precision.

In addition, a geo-classifier module with DSM enhancement and a spatial activation function are proposed to improve the Faster R-CNN architecture, producing GDCNN-outfalls.
