# Tips for custom dataset

## Data preparation

  • Step 1: split your data into train/val folders
  • Step 2: for each scene, construct a .pth file that contains:
    • point XYZ coordinates: shape of (N, 3)
    • RGB colors: shape of (N, 3)
    • semantic labels: shape of (N, )
    • instance labels: shape of (N, )

Note that colors should be normalized to the range [-1, 1], see here. A minimal preprocessing sketch is shown below.
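
As a rough guide, here is a minimal sketch assuming per-scene NumPy arrays and 0-255 colors. The `prepare_scene` helper, the file layout, and the saved tuple order are illustrative assumptions; match them against the dataset loader actually used in this repo.

```python
# Hypothetical sketch: save one scene in a .pth layout with the four fields above.
# The tuple order and dtypes are assumptions; check the repo's dataset loader
# for the exact format it reads back.
import os
import numpy as np
import torch


def prepare_scene(xyz, rgb, semantic_labels, instance_labels, out_path):
    xyz = np.ascontiguousarray(xyz, dtype=np.float32)        # (N, 3) coordinates
    # rgb is assumed to be uint8 in [0, 255]; normalize to [-1, 1]
    rgb = np.ascontiguousarray(rgb, dtype=np.float32) / 127.5 - 1.0
    semantic_labels = semantic_labels.astype(np.int64)        # (N,)
    instance_labels = instance_labels.astype(np.int64)        # (N,)
    torch.save((xyz, rgb, semantic_labels, instance_labels), out_path)


# Usage: write each scene into its split folder (train/ or val/), e.g.
# os.makedirs("train", exist_ok=True)
# prepare_scene(xyz, rgb, sem, ins, os.path.join("train", "scene0000.pth"))
```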

## Config

The following configs may need to be modified for a custom dataset (an illustrative sketch of how they relate follows the list).

  • semantic_classes: the number of classes for semantic segmentation.
  • instance_classes: the number of semantic classes considered for instance segmentation. For example, in the ScanNet dataset config, wall and floor are not considered for instance segmentation, so instance_classes = semantic_classes - 2.
  • sem2ins_classes: use this when you directly use semantic segmentation results as instance segmentation results for the specified classes. For example, in the S3DIS dataset, the classes floor and ceiling (indices [0, 1]) are specified since, in most cases, each scene has only one floor and one ceiling.
  • class_numpoint_mean: the mean number of points per instance for each class; shape of (semantic_classes, ).
  • scale: the point coordinates are scaled up for voxelization. From scale, we can infer voxel_size = 1 / scale. Indoor datasets often use scale = 50 (voxel_size = 0.02m). In outdoor datasets, the voxel size should be larger due to higher sparsity. For example, in the STPLS3D dataset, scale is set to 3 (voxel_size ≈ 0.33m). Ablation may be needed to figure out which scale is most suitable for your dataset.
  • grouping_cfg.radius: the radius for grouping. This value is related to voxel_size: when the voxel size is larger, the radius should also be larger.
  • grouping_cfg.ignore_classes: the semantic class indices that are not considered for grouping.
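
To make these relationships concrete, below is a purely illustrative Python sketch for a made-up dataset with 10 semantic classes. The numbers (and the plain-Python form, which is not the repo's config file format) are assumptions; take the real values from your own data and the repo's existing configs.

```python
# Illustrative numbers only -- not a real config file.
semantic_classes = 10                       # total classes for semantic segmentation
instance_classes = semantic_classes - 2     # e.g. two background classes excluded

# Classes whose semantic prediction is reused directly as the instance result.
sem2ins_classes = [0, 1]

# Mean number of points per instance for each class (length == semantic_classes).
class_numpoint_mean = [5000] * semantic_classes

scale = 50                                  # indoor-style value
voxel_size = 1 / scale                      # -> 0.02 m; sparse outdoor data may need scale ~ 3 (~0.33 m)

grouping_cfg = dict(
    radius=0.04,                            # grow this together with voxel_size
    ignore_classes=[0, 1],                  # semantic indices skipped during grouping
)

print(instance_classes, voxel_size)         # 8 0.02
```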

For further information, you can compare the configs of STPLS3D and ScanNet.