CenterPoint Source Code Walkthrough (Part 2)

Updated: 2024-10-16 00:16:37

Continued from the previous post, CenterPoint Source Code Walkthrough (Part 1).

Contents:
2. Backbone – Feature Extraction
2.1 voxelize: voxelization
2.2 Point cloud voxel encoder: PillarFeatureNet (PFN)
2.3 Point cloud middle encoder: PointPillarsScatter
2.4 backbone: SECOND
3. Neck
4. Head and loss
4.1 CenterHead
4.2 loss

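2. Backbone – Feature Extraction

Reference: 激光点云3D目标检测算法之PointPillars (PointPillars, a 3D object detection algorithm for LiDAR point clouds)

2.1 voxelize: voxelization

The main implementation class is Voxelization, which converts the point cloud into a voxel representation.

voxels: 30000*20*5, i.e. 30000 voxels, up to 20 points per voxel, 5 features per point
coors: voxel coordinates, 30000*3
num_points_per_voxel: number of points in each voxel

These shapes correspond to max_voxels = 30000 and max_points = 20 (see the inline comments in the code below), not to the default arguments in the function signature.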

    def forward(ctx,
                points,
                voxel_size,
                coors_range,
                max_points=35,
                max_voxels=20000,
                deterministic=True):
        """convert kitti points(N, >=3) to voxels. """
        if max_points == -1 or max_voxels == -1:
            coors = points.new_zeros(size=(points.size(0), 3), dtype=torch.int)
            dynamic_voxelize(points, coors, voxel_size, coors_range, 3)
            return coors
        else:
            voxels = points.new_zeros(
                size=(max_voxels, max_points, points.size(1)))  # (30000, 20, 5)
            coors = points.new_zeros(
                size=(max_voxels, 3), dtype=torch.int)  # (30000, 3)
            num_points_per_voxel = points.new_zeros(
                size=(max_voxels, ), dtype=torch.int)
            # CUDA voxelization op; voxel_num was 29249 in this run
            voxel_num = hard_voxelize(points, voxels, coors,
                                      num_points_per_voxel, voxel_size,
                                      coors_range, max_points, max_voxels, 3,
                                      deterministic)
            # select the valid voxels (drop the empty ones)
            voxels_out = voxels[:voxel_num]
            coors_out = coors[:voxel_num]
            # number of points in each voxel
            num_points_per_voxel_out = num_points_per_voxel[:voxel_num]
            return voxels_out, coors_out, num_points_per_voxel_out
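To make the behaviour of hard voxelization concrete, here is a minimal pure-Python sketch. It is not the CUDA op behind hard_voxelize: the name hard_voxelize_sketch and the ragged-list output are assumptions made for illustration (the real op fills zero-padded tensors), and the (z, y, x) storage order is chosen to match the coors indexing used later in this article.

```python
import math
from typing import Dict, List, Sequence, Tuple


def hard_voxelize_sketch(
    points: Sequence[Sequence[float]],
    voxel_size: Sequence[float],
    coors_range: Sequence[float],
    max_points: int,
    max_voxels: int,
) -> Tuple[List[List[Tuple[float, ...]]], List[Tuple[int, int, int]], List[int]]:
    """Hypothetical helper: group points into at most max_voxels voxels."""
    # Number of grid cells along x, y, z.
    grid_size = [
        int(round((coors_range[i + 3] - coors_range[i]) / voxel_size[i]))
        for i in range(3)
    ]
    slot_of: Dict[Tuple[int, int, int], int] = {}
    voxels: List[List[Tuple[float, ...]]] = []
    coors: List[Tuple[int, int, int]] = []
    num_points_per_voxel: List[int] = []

    for point in points:
        # Integer cell index of the point along x, y, z.
        cell = [
            math.floor((point[i] - coors_range[i]) / voxel_size[i])
            for i in range(3)
        ]
        if any(c < 0 or c >= grid_size[i] for i, c in enumerate(cell)):
            continue  # outside the point cloud range
        key = (cell[2], cell[1], cell[0])  # stored as (z, y, x)
        slot = slot_of.get(key)
        if slot is None:
            if len(voxels) >= max_voxels:
                continue  # voxel budget exhausted, drop the point
            slot = len(voxels)
            slot_of[key] = slot
            voxels.append([])
            coors.append(key)
            num_points_per_voxel.append(0)
        if num_points_per_voxel[slot] < max_points:
            voxels[slot].append(tuple(point))
            num_points_per_voxel[slot] += 1

    return voxels, coors, num_points_per_voxel
```

Points beyond max_points in a voxel, and voxels beyond max_voxels, are simply dropped, which is why the number of valid voxels (voxel_num) can be smaller than max_voxels.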

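2.2 Point cloud voxel encoder: PillarFeatureNet (PFN)

Its main job is to encode the voxelized point cloud and build a dense feature tensor.

Each point of the voxelized cloud from the previous step is encoded as a 10-dimensional vector D = (x, y, z, r, delt_t, xc, yc, zc, xp, yp). Here x, y, z, r, delt_t are the point's three coordinates, its reflectance intensity, and its timestamp offset when multiple sweeps are stacked; xc, yc, zc are the offsets from the arithmetic mean (centroid) of all points in the pillar; and xp, yp are the offsets from the pillar's x, y centre. This gives a dense (P, N, D) tensor. The tensor then passes through the stacked PFNLayers, where each PFNLayer = linear layer + BatchNorm + ReLU + max pooling. The per-point features have shape (P, N, C), where P is the number of pillars, N the number of points per pillar and C the number of channels; max pooling over the points of each pillar finally yields a (P, C) tensor.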

    def forward(self, features, num_points, coors):
        """Forward function."""
        features_ls = [features]
        # Find distance of x, y, and z from cluster center
        # (the mean of the points in each pillar)
        if self._with_cluster_center:
            points_mean = features[:, :, :3].sum(
                dim=1, keepdim=True) / num_points.type_as(features).view(
                    -1, 1, 1)
            f_cluster = features[:, :, :3] - points_mean
            features_ls.append(f_cluster)

        # Find distance of x, y, and z from pillar center
        # (the geometric centre of the pillar cell)
        dtype = features.dtype
        if self._with_voxel_center:
            if not self.legacy:
                f_center = torch.zeros_like(features[:, :, :2])
                f_center[:, :, 0] = features[:, :, 0] - (
                    coors[:, 3].to(dtype).unsqueeze(1) * self.vx +
                    self.x_offset)
                f_center[:, :, 1] = features[:, :, 1] - (
                    coors[:, 2].to(dtype).unsqueeze(1) * self.vy +
                    self.y_offset)
            else:
                f_center = features[:, :, :2]
                f_center[:, :, 0] = f_center[:, :, 0] - (
                    coors[:, 3].type_as(features).unsqueeze(1) * self.vx +
                    self.x_offset)
                f_center[:, :, 1] = f_center[:, :, 1] - (
                    coors[:, 2].type_as(features).unsqueeze(1) * self.vy +
                    self.y_offset)
            features_ls.append(f_center)

        # Distance from each point to the origin
        if self._with_distance:
            points_dist = torch.norm(features[:, :, :3], 2, 2, keepdim=True)
            features_ls.append(points_dist)

        # Combine together feature decorations
        features = torch.cat(features_ls, dim=-1)
        # The feature decorations were calculated without regard to whether
        # pillar was empty. Need to ensure that
        # empty pillars remain set to zeros.
        voxel_count = features.shape[1]
        mask = get_paddings_indicator(num_points, voxel_count, axis=0)
        mask = torch.unsqueeze(mask, -1).type_as(features)
        features *= mask

        for pfn in self.pfn_layers:
            features = pfn(features, num_points)

        return features.squeeze()  # [P, C], e.g. (27059, 64)
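The 10-dimensional decoration can be illustrated for a single pillar with plain Python lists. This is only a sketch: decorate_pillar and max_over_points are hypothetical names, the pillar centre offset is passed in explicitly rather than derived from the point cloud range, padding points are assumed to have been removed already, and the learned linear + BatchNorm + ReLU step that precedes the pooling in a PFNLayer is omitted.

```python
from typing import List, Sequence, Tuple


def decorate_pillar(
    points: Sequence[Sequence[float]],
    coor_yx: Tuple[int, int],
    voxel_size_xy: Tuple[float, float],
    offset_xy: Tuple[float, float],
) -> List[List[float]]:
    """Append (xc, yc, zc, xp, yp) to every (x, y, z, r, delt_t) point."""
    num_points = len(points)
    # Arithmetic mean of the points in this pillar (cluster centre).
    mean = [sum(p[i] for p in points) / num_points for i in range(3)]
    # Geometric centre of the pillar cell in metric coordinates.
    center_x = coor_yx[1] * voxel_size_xy[0] + offset_xy[0]
    center_y = coor_yx[0] * voxel_size_xy[1] + offset_xy[1]

    decorated = []
    for p in points:
        f_cluster = [p[0] - mean[0], p[1] - mean[1], p[2] - mean[2]]
        f_center = [p[0] - center_x, p[1] - center_y]
        decorated.append(list(p) + f_cluster + f_center)
    return decorated


def max_over_points(features: Sequence[Sequence[float]]) -> List[float]:
    """Channel-wise max over the points of one pillar: (N, C) -> (C,)."""
    return [max(channel) for channel in zip(*features)]
```

For a pillar with two points, decorate_pillar returns two 10-dimensional rows, and max_over_points collapses them into one vector per pillar, which is the (P, N, C) to (P, C) reduction described above.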

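2.3 Point cloud middle encoder: PointPillarsScatter

Purpose: scatter the learned dense pillar features [P, C] (transposed to [C, P] in the code) back onto the grid to form a pseudo-image [C, H, W], with H = ny and W = nx.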

    def forward_batch(self, voxel_features, coors, batch_size):
        """Scatter features of single sample."""
        # batch_canvas will be the final output.
        batch_canvas = []
        for batch_itt in range(batch_size):
            # Create the canvas for this sample
            canvas = torch.zeros(
                self.in_channels,
                self.nx * self.ny,
                dtype=voxel_features.dtype,
                device=voxel_features.device)

            # Only include non-empty pillars
            batch_mask = coors[:, 0] == batch_itt
            this_coors = coors[batch_mask, :]
            indices = this_coors[:, 2] * self.nx + this_coors[:, 3]
            indices = indices.type(torch.long)
            voxels = voxel_features[batch_mask, :]
            voxels = voxels.t()

            # Now scatter the blob back to the canvas.
            canvas[:, indices] = voxels

            # Append to a list for later stacking.
            batch_canvas.append(canvas)

        # Stack to 3-dim tensor (batch-size, in_channels, nrows*ncols)
        batch_canvas = torch.stack(batch_canvas, 0)

        # Undo the column stacking to final 4-dim tensor
        batch_canvas = batch_canvas.view(batch_size, self.in_channels, self.ny,
                                         self.nx)

        return batch_canvas
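The scatter step for one sample can be sketched with nested lists. The function name scatter_to_canvas is hypothetical and the batch dimension is left out; the flat index y * nx + x mirrors the indices computation in forward_batch.

```python
from typing import List, Sequence, Tuple


def scatter_to_canvas(
    voxel_features: Sequence[Sequence[float]],
    coors_yx: Sequence[Tuple[int, int]],
    in_channels: int,
    ny: int,
    nx: int,
) -> List[List[List[float]]]:
    """Place each pillar feature at its (y, x) cell; returns [C][ny][nx]."""
    # Flat canvas of shape [C][ny * nx], zero everywhere (empty pillars).
    canvas = [[0.0] * (ny * nx) for _ in range(in_channels)]
    for feature, (y, x) in zip(voxel_features, coors_yx):
        index = y * nx + x  # row-major flat index
        for c in range(in_channels):
            canvas[c][index] = feature[c]
    # Undo the flattening: split every channel into ny rows of nx cells.
    return [
        [channel[row * nx:(row + 1) * nx] for row in range(ny)]
        for channel in canvas
    ]
```

Cells that contain no pillar stay at zero, so the resulting pseudo-image is mostly sparse but has a regular layout that 2D convolutions can consume.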

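2.4 backbone: SECOND

Features are extracted with stacked conv + BN + ReLU triples. The three stages contain [4, 6, 6] such triples (in the code below, each stage is one strided triple followed by layer_num stride-1 triples), and their channel dimensions are [64, 128, 256] respectively.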

        blocks = []
        for i, layer_num in enumerate(layer_nums):
            block = [
                build_conv_layer(
                    conv_cfg,
                    in_filters[i],
                    out_channels[i],
                    3,
                    stride=layer_strides[i],
                    padding=1),
                build_norm_layer(norm_cfg, out_channels[i])[1],
                nn.ReLU(inplace=True),
            ]
            for j in range(layer_num):
                block.append(
                    build_conv_layer(
                        conv_cfg,
                        out_channels[i],
                        out_channels[i],
                        3,
                        padding=1))
                block.append(build_norm_layer(norm_cfg, out_channels[i])[1])
                block.append(nn.ReLU(inplace=True))

            block = nn.Sequential(*block)
            blocks.append(block)

        self.blocks = nn.ModuleList(blocks)
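The layer counts and feature-map sizes that result from this loop can be checked with a few lines of arithmetic. The sketch below is illustrative only: second_stage_summary is a hypothetical helper, and the values layer_nums = [3, 5, 5], layer_strides = [2, 2, 2] and a 512 x 512 input are assumptions that are not stated in this article (they are chosen because they reproduce the [4, 6, 6] triples and the 128 x 128 head input quoted here).

```python
from typing import Dict, List, Sequence


def conv_out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def second_stage_summary(
    input_size: int,
    layer_nums: Sequence[int],
    layer_strides: Sequence[int],
    out_channels: Sequence[int],
) -> List[Dict[str, int]]:
    """Count conv+BN+ReLU triples and track the spatial size per stage."""
    summary = []
    size = input_size
    for layer_num, stride, channels in zip(layer_nums, layer_strides,
                                           out_channels):
        # First triple of the stage: 3x3 conv with the stage stride.
        size = conv_out_size(size, kernel=3, stride=stride, padding=1)
        # The remaining layer_num triples use stride 1 and padding 1,
        # so they keep the spatial size unchanged.
        summary.append({
            "conv_bn_relu": 1 + layer_num,
            "channels": channels,
            "size": size,
        })
    return summary
```

Under these assumed settings the three stages produce 256, 128 and 64 pixel feature maps with 64, 128 and 256 channels, which are the multi-scale inputs the neck has to bring back to one resolution.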

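3. Neck

SECONDFPN post-processes and fuses the features produced by the backbone. It is again built from conv + BN + ReLU style triples that resample the three scales to a common resolution (in the printout below: a stride-2 Conv2d, a 1x1 Conv2d and a stride-2 ConvTranspose2d). The channels [64, 128, 256] from the previous step are each mapped to 128 and then concatenated, giving a [B, C, W, H] tensor with C = 128*3 = 384. The structure is as follows: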

  (pts_neck): SECONDFPN(
    (deblocks): ModuleList(
      (0): Sequential(
        (0): Conv2d(64, 128, kernel_size=(2, 2), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (1): Sequential(
        (0): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(128, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
      (2): Sequential(
        (0): ConvTranspose2d(256, 128, kernel_size=(2, 2), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
        (2): ReLU(inplace=True)
      )
    )
  )
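As a sanity check on the printout, the sketch below computes the spatial size and channel count after the three branches are concatenated. The kernel sizes, strides and channels are read off the printout; the input sizes 256, 128 and 64 are an assumption (they are not given in this article), and neck_output_shape is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple

# (kind, in_channels, out_channels, kernel_size, stride), from the printout.
DEBLOCKS = [
    ("conv", 64, 128, 2, 2),
    ("conv", 128, 128, 1, 1),
    ("deconv", 256, 128, 2, 2),
]


def resampled_size(kind: str, size: int, kernel: int, stride: int) -> int:
    """Output size of an unpadded Conv2d or ConvTranspose2d along one axis."""
    if kind == "conv":
        return (size - kernel) // stride + 1
    return (size - 1) * stride + kernel


def neck_output_shape(input_sizes: Sequence[int]) -> Tuple[int, int]:
    """Return (channels, spatial size) after concatenating all branches."""
    sizes: List[int] = []
    channels = 0
    for size, (kind, _in_ch, out_ch, kernel, stride) in zip(input_sizes,
                                                            DEBLOCKS):
        sizes.append(resampled_size(kind, size, kernel, stride))
        channels += out_ch
    if len(set(sizes)) != 1:
        # Concatenation along the channel axis needs equal spatial sizes.
        raise ValueError(f"branches disagree on spatial size: {sizes}")
    return channels, sizes[0]
```

With the assumed inputs, all three branches land on 128 x 128 and the concatenated tensor has 384 channels, matching the [B, 384, 128, 128] input that CenterHead receives.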

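4. Head and loss

4.1 CenterHead

CenterHead first applies a shared convolution that turns the features from [B, 384, 128, 128] into [B, 64, 128, 128]. It then runs inference for each task separately and finally returns a dictionary of predictions per task.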

    def forward(self, feats):
        """Forward pass."""
        return multi_apply(self.forward_single, feats)

    def forward_single(self, x):
        """Forward function for CenterPoint."""
        ret_dicts = []

        x = self.shared_conv(x)  # shared conv (conv + BN + ReLU triple)

        for task in self.task_heads:
            ret_dicts.append(task(x))

        return ret_dicts

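Each group of classes forms one task, and each task corresponds to one SeparateHead that contains 6 heads to regress. The configuration therefore has 6 tasks with 6 heads each, i.e. 6*6 = 36 heads in total. The structure of one SeparateHead is shown below; its 6 heads are reg, height, dim, rot, vel and heatmap. After CenterHead, the result is a list with one entry for each of the 6 tasks.

Note: classes whose sizes differ in the BEV view, such as car and pedestrian, are assigned to different tasks, whereas pedestrian and traffic_cone have similar BEV sizes and are therefore regressed within the same task.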

(0): SeparateHead(
  (reg): Sequential(
    (0): ConvModule(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (activate): ReLU(inplace=True)
    )
    (1): Conv2d(64, 2, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  )
  (height): Sequential(
    (0): ConvModule(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (activate): ReLU(inplace=True)
    )
    (1): Conv2d(64, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  )
  (dim): Sequential(
    (0): ConvModule(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (activate): ReLU(inplace=True)
    )
    (1): Conv2d(64, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  )
  (rot): Sequential(
    (0): ConvModule(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (activate): ReLU(inplace=True)
    )
    (1): Conv2d(64, 2, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  )
  (vel): Sequential(
    (0): ConvModule(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (activate): ReLU(inplace=True)
    )
    (1): Conv2d(64, 2, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  )
  (heatmap): Sequential(
    (0): ConvModule(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (activate): ReLU(inplace=True)
    )
    (1): Conv2d(64, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  )
)
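The head layout can be summarised as a small table of output channels per task. The sketch below is an illustration, not the CenterHead constructor: build_head_channels is a hypothetical helper, and the grouping of the ten classes into six tasks is an assumption based on the commonly used nuScenes configuration (this article only confirms that car is separate from pedestrian and that pedestrian shares a task with traffic_cone). The regression channel counts are read off the printout above.

```python
from typing import Dict, List, Sequence

# Output channels of the five regression heads, from the printout.
REGRESSION_HEADS: Dict[str, int] = {
    "reg": 2,      # centre offset (x, y)
    "height": 1,   # centre height z
    "dim": 3,      # box size
    "rot": 2,      # sin and cos of the yaw angle
    "vel": 2,      # velocity (vx, vy)
}

# Assumed task grouping (one list of class names per task).
TASKS: List[List[str]] = [
    ["car"],
    ["truck", "construction_vehicle"],
    ["bus", "trailer"],
    ["barrier"],
    ["motorcycle", "bicycle"],
    ["pedestrian", "traffic_cone"],
]


def build_head_channels(tasks: Sequence[Sequence[str]]) -> List[Dict[str, int]]:
    """Return, per task, the output channels of each of its heads."""
    layout = []
    for class_names in tasks:
        heads = dict(REGRESSION_HEADS)
        # One heatmap channel per class handled by this task.
        heads["heatmap"] = len(class_names)
        layout.append(heads)
    return layout
```

Calling build_head_channels(TASKS) gives 6 dictionaries with 6 entries each (36 heads in total); the five regression heads of a task add up to the 10 box parameters used by the loss, and the first task has a single heatmap channel, consistent with Conv2d(64, 1, ...) in the printout.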

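4.2 loss

Reference: CenterHead的loss函数 (the loss function of CenterHead)

For each task, get_targets uses the ground-truth boxes (gt boxes) to produce heatmaps, anno_boxes, inds and masks. The meaning of these four quantities is given in the table below: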

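Parameter	Description	Shape	Example values
heatmap	Centre-point heatmap scores	[class_num, 128, 128]	One heatmap per class
anno_box	Ground-truth box parameters	[500, 10]	10 values: dims 1-2 are the centre offsets offset_x, offset_y; dim 3 is the centre height z; dims 4-6 are the box size box_dim (length, width, height); dims 7-8 are the rotation sin(α), cos(α); dims 9-10 are the velocity vx, vy
ind	Position of the box centre in the heatmap	[500]	ind[idx] = y*128 + x (row-major flat index, matching the permute and view in the loss code)
mask	Mask of valid boxes (1 or 0)	[500]	mask[idx] = 1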
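The role of ind becomes clearer with a tiny example of how predictions are looked up at the ground-truth centres. This is a sketch with nested lists and hypothetical helper names; it assumes the row-major flattening implied by permute(0, 2, 3, 1) followed by view in the loss code, under which a centre at column x and row y maps to the flat index y * W + x.

```python
from typing import List, Sequence


def flat_index(x: int, y: int, width: int) -> int:
    """Row-major position of heatmap cell (row y, column x)."""
    return y * width + x


def to_hw_c(pred_chw: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Rearrange [C][H][W] into [H*W][C], like permute + view in the loss."""
    channels = len(pred_chw)
    height = len(pred_chw[0])
    width = len(pred_chw[0][0])
    return [
        [pred_chw[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]


def gather_feat(pred_hw_c: Sequence[Sequence[float]],
                inds: Sequence[int]) -> List[List[float]]:
    """Pick the C-dimensional prediction at every flat index in inds."""
    return [list(pred_hw_c[i]) for i in inds]
```

For a map with C = 2, H = 2, W = 3, the cell at x = 2, y = 1 has flat index 5, and gather_feat returns the two channel values stored at that cell. Only these gathered predictions, one per ground-truth box, enter the bbox loss.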

There are two main losses: a focal loss on the heatmap and an L1 loss on the bbox.

    def loss(self, gt_bboxes_3d, gt_labels_3d, preds_dicts, **kwargs):
        """Loss function for CenterHead."""
        heatmaps, anno_boxes, inds, masks = self.get_targets(
            gt_bboxes_3d, gt_labels_3d)
        loss_dict = dict()
        for task_id, preds_dict in enumerate(preds_dicts):
            # loss 1: heatmap focal loss
            preds_dict[0]['heatmap'] = clip_sigmoid(preds_dict[0]['heatmap'])
            num_pos = heatmaps[task_id].eq(1).float().sum().item()
            loss_heatmap = self.loss_cls(
                preds_dict[0]['heatmap'],
                heatmaps[task_id],
                avg_factor=max(num_pos, 1))
            target_box = anno_boxes[task_id]
            # reconstruct the anno_box from multiple reg heads
            preds_dict[0]['anno_box'] = torch.cat(
                (preds_dict[0]['reg'], preds_dict[0]['height'],
                 preds_dict[0]['dim'], preds_dict[0]['rot'],
                 preds_dict[0]['vel']),
                dim=1)

            # Regression loss for dimension, offset, height, rotation
            ind = inds[task_id]
            num = masks[task_id].float().sum()
            pred = preds_dict[0]['anno_box'].permute(0, 2, 3, 1).contiguous()
            pred = pred.view(pred.size(0), -1, pred.size(3))
            pred = self._gather_feat(pred, ind)
            mask = masks[task_id].unsqueeze(2).expand_as(target_box).float()
            isnotnan = (~torch.isnan(target_box)).float()
            mask *= isnotnan

            code_weights = self.train_cfg.get('code_weights', None)
            bbox_weights = mask * mask.new_tensor(code_weights)
            # loss 2: bbox loss
            loss_bbox = self.loss_bbox(
                pred, target_box, bbox_weights, avg_factor=(num + 1e-4))
            loss_dict[f'task{task_id}.loss_heatmap'] = loss_heatmap
            loss_dict[f'task{task_id}.loss_bbox'] = loss_bbox
        return loss_dict
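The two loss terms can be written out in scalar form. The sketch below is a hedged illustration, not the configured loss classes: it assumes the Gaussian (CornerNet-style) focal loss with alpha = 2 and gamma = 4 for the heatmap, a clip value of 1e-4 for clip_sigmoid, and a weighted L1 loss for the boxes; the function signatures are hypothetical and operate on flat Python lists. Entries whose target is NaN are skipped, which is the apparent intent of isnotnan in the code above.

```python
import math
from typing import Sequence


def clip_sigmoid(x: float, eps: float = 1e-4) -> float:
    """Sigmoid clamped to [eps, 1 - eps] so the logs below stay finite."""
    y = 1.0 / (1.0 + math.exp(-x))
    return min(max(y, eps), 1.0 - eps)


def gaussian_focal_loss(preds: Sequence[float], targets: Sequence[float],
                        alpha: float = 2.0, gamma: float = 4.0,
                        eps: float = 1e-12) -> float:
    """Focal loss on a Gaussian heatmap, averaged over positive cells."""
    num_pos = sum(1 for t in targets if t == 1.0)
    total = 0.0
    for p, t in zip(preds, targets):
        if t == 1.0:
            # Positive cell: penalise a low predicted score.
            total += -math.log(p + eps) * (1.0 - p) ** alpha
        else:
            # Negative cell: penalty is reduced close to a centre (t near 1).
            total += -math.log(1.0 - p + eps) * p ** alpha * (1.0 - t) ** gamma
    return total / max(num_pos, 1)


def masked_l1_loss(pred_boxes: Sequence[Sequence[float]],
                   target_boxes: Sequence[Sequence[float]],
                   mask: Sequence[int],
                   code_weights: Sequence[float]) -> float:
    """Weighted L1 over valid boxes, averaged by the number of valid boxes."""
    num = sum(mask)
    total = 0.0
    for pred, target, valid in zip(pred_boxes, target_boxes, mask):
        if not valid:
            continue  # padded slot, no ground-truth box
        for p, t, w in zip(pred, target, code_weights):
            if math.isnan(t):
                continue  # missing annotation (assumed handling)
            total += w * abs(p - t)
    return total / (num + 1e-4)
```

The heatmap term is normalised by the number of positive cells (at least 1), and the box term by the number of valid boxes plus 1e-4, mirroring the avg_factor arguments in the code above.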

Published: 2024-03-10 17:22:00

Article link: https://www.elefans.com/category/jswz/34/1728569.html