Preface
The SAR ship detection dataset SSDD (SAR Ship Detection Dataset) is one of the classic datasets in this field. The paper SAR Ship Detection Dataset (SSDD): Official Release and Comprehensive Data Analysis contains the following passage:
The images with the last digits of the file number 1 and 9 are uniquely determined as the test set, and the rest are regarded as the training set. Such a rule can also maintain the distribution consistency of the training set and test set, which is conducive to network feature learning.
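In other words, the images whose file number ends in 1 or 9 are fixed as the test set, and all remaining images are treated as the training set (my note: this includes the validation set). According to the paper, this rule also keeps the distributions of the training set and the test set consistent, which helps the network learn features.
The dataset is small, with only 1160 samples, so a random split could break the distribution consistency between the training set and the test set and lead to results that differ from one split to the next. Every single sample is also valuable. The paper does not, however, specify how to divide the remaining images into training and validation sets; it only suggests building a cross-validation set. Here I use the images whose file number ends in 8 as the validation set, so that the validation set contains both inshore and offshore targets.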
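To make the rule concrete, here is a minimal sketch that assigns each file number to a split by its last digit and counts the resulting set sizes. It assumes the file numbers run consecutively from 1 to 1160; the helper name `split_of` and its parameters are hypothetical and only used for illustration.

```python
def split_of(index, test_digits=(1, 9), val_digits=(8,)):
    """Return 'test', 'val', or 'train' from the last digit of a file number."""
    last = index % 10
    if last in test_digits:
        return "test"
    if last in val_digits:
        return "val"
    return "train"


NUM_IMAGES = 1160  # sample count stated in the text; numbering 1..1160 is an assumption

counts = {"train": 0, "val": 0, "test": 0}
for i in range(1, NUM_IMAGES + 1):
    counts[split_of(i)] += 1

print(counts)  # {'train': 812, 'val': 116, 'test': 232}
```

Under these assumptions the split works out to 70% training, 10% validation, and 20% test. Passing a different `val_digits` value, such as `(5,)`, moves the validation set to another suffix without touching the fixed test set.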
So I wrote a script to split the dataset into training, validation, and test sets.
Code
# Paths are relative to the current working directory;
# the ImageSets/Main directory must already exist.
suffix_1 = list(range(1, 1160, 10))  # file numbers ending in 1
suffix_9 = list(range(9, 1160, 10))  # file numbers ending in 9
suffix_8 = list(range(8, 1160, 10))  # validation set; change this if you don't want to use suffix 8
suffix_1_9 = suffix_1 + suffix_9
suffix_1_9.sort()
#-----------------------test---------------------#
# Test set: file numbers ending in 1 or 9, zero-padded to six digits
test = [str(i).zfill(6) for i in suffix_1_9]
with open("ImageSets/Main/test.txt", 'w') as f:
    for i in test:
        f.write(i + '\n')
#-------------------train&val--------------------#
# Everything that is not in the test set
suf_not_1_9 = []
for i in list(range(1, 1161)):
    if i not in suffix_1_9:
        suf_not_1_9.append(i)
trainval = [str(i).zfill(6) for i in suf_not_1_9]
with open("ImageSets/Main/trainval.txt", 'w') as f:
    for i in trainval:
        f.write(i + '\n')
#-----------------val----------------------------#
val = [str(i).zfill(6) for i in suffix_8]
with open("ImageSets/Main/val.txt", 'w') as f:
    for i in val:
        f.write(i + '\n')
#-----------------train--------------------------#
# Training set: trainval minus the validation set
suf_not_1_8_9 = []
for i in suf_not_1_9:
    if i not in suffix_8:
        suf_not_1_8_9.append(i)
train = [str(i).zfill(6) for i in suf_not_1_8_9]
with open("ImageSets/Main/train.txt", 'w') as f:
    for i in train:
        f.write(i + '\n')
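After the four files are written, a quick consistency check can catch mistakes such as overlapping splits. The following is only a sketch: the helper names `read_ids` and `check_splits` are hypothetical, and it assumes the four files sit in one directory with one ID per line, as the script above writes them.

```python
from pathlib import Path


def read_ids(path):
    """Return the non-empty lines of a split file as a set of ID strings."""
    return {line.strip() for line in Path(path).read_text().splitlines() if line.strip()}


def check_splits(main_dir="ImageSets/Main"):
    """Verify that the split files are mutually consistent and return their sizes."""
    main_dir = Path(main_dir)
    train = read_ids(main_dir / "train.txt")
    val = read_ids(main_dir / "val.txt")
    trainval = read_ids(main_dir / "trainval.txt")
    test = read_ids(main_dir / "test.txt")

    assert train | val == trainval, "train and val together must equal trainval"
    assert not (train & val), "train and val must not overlap"
    assert not (trainval & test), "test must not overlap with trainval"
    assert all(i[-1] in "19" for i in test), "test IDs must end in 1 or 9"

    return {"train": len(train), "val": len(val),
            "trainval": len(trainval), "test": len(test)}


# Example call, once the files exist:
# print(check_splits("ImageSets/Main"))
```

For the 1160-image numbering used above, the returned sizes should be 812 for train, 116 for val, 928 for trainval, and 232 for test.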