Generative Range Imaging for Learning Scene Priors of 3D LiDAR Data
Kazuto Nakashima, Yumi Iwashita, Ryo Kurazume
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023
project | paper | supplemental | arxiv | slide
We propose GAN-based LiDAR data priors for sim2real and restoration tasks, extending our previous work DUSty [Nakashima et al. IROS'21].
The core idea is to represent LiDAR range images as a continuous generative model, i.e., 2D neural fields. Given a laser radiation angle, the model generates a range value and the corresponding dropout probability. The generative process is trained within a GAN framework. For more details on the architecture, please refer to our paper and supplementary materials.
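As a rough illustration only (hypothetical names and layer sizes, not the actual architecture; see the paper for that), the generative process could be sketched as:

```python
# Sketch of a 2D neural field mapping a laser radiation angle (plus a
# latent code) to a range value and a ray-drop probability. Hypothetical
# names and sizes; not the repository's actual architecture.
import torch
import torch.nn as nn

class RangeField(nn.Module):
    def __init__(self, latent_dim=512, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # -> (inverse depth, drop logit)
        )

    def forward(self, angles, z):
        # angles: (N, 2) normalized azimuth/elevation; z: (N, latent_dim)
        out = self.mlp(torch.cat([angles, z], dim=-1))
        inv_depth = torch.sigmoid(out[..., :1])   # range branch
        drop_prob = torch.sigmoid(out[..., 1:])   # per-ray dropout probability
        return inv_depth, drop_prob

# Render one 64x512 scan: query every laser angle with a shared latent code.
h, w = 64, 512
ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
angles = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
z = torch.randn(1, 512).expand(angles.shape[0], -1)
depth, prob = RangeField()(angles, z)
# Training uses a differentiable relaxation of the ray-drop mask; here we
# simply draw a hard Bernoulli mask to emulate the noise.
scan = (depth * torch.bernoulli(prob)).reshape(h, w)
```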
The environment can be built using Anaconda. The command below installs the CUDA 11.X runtime; however, the code under `gans/` requires PyTorch JIT compilation, so please also install the corresponding CUDA toolkit locally.
$ conda env create -f environment.yaml
$ conda activate dusty-gan-v2
The following demo generates random range images.
$ python quick_demo.py --arch dusty_v2
Pretrained weights are automatically downloaded.
The `--arch` option can also be set to our baselines: `vanilla` and `dusty_v1`.
To train models on your own or to run the other demos, please download the KITTI Raw dataset and create a symbolic link.
$ ln -sf <path to kitti raw root> ./data/kitti_raw
$ ls ./data/kitti_raw
2011_09_26 2011_09_28 2011_09_29 2011_09_30 2011_10_03
To check the KITTI data loader:
$ python -m gans.datasets.kitti
To train GANs on KITTI:
$ python train_gan.py --config configs/gans/dusty_v2.yaml # ours
$ python train_gan.py --config configs/gans/dusty_v1.yaml # baseline
$ python train_gan.py --config configs/gans/vanilla.yaml # baseline
To monitor losses and images:
$ tensorboard --logdir ./logs/gans
To evaluate trained GANs:
$ python test_gan.py --ckpt_path <path to *.pth file> --metrics swd,jsd,1nna,fpd,kpd
options | modality | metrics
---|---|---
`swd` | 2D inverse depth maps | Sliced Wasserstein distance (SWD)
`jsd` | 3D point clouds | Jensen–Shannon divergence (JSD)
`1nna` | 3D point clouds | Coverage (COV), minimum matching distance (MMD), and 1-nearest neighbor accuracy (1-NNA), based on the earth mover's distance (EMD)
`fpd` | PointNet features | Fréchet pointcloud distance (FPD)
`kpd` | PointNet features | Squared maximum mean discrepancy (analogous to KID in the image domain)
Note: `--ckpt_path` can also be set to the keywords `dusty_v2`, `dusty_v1`, or `vanilla`, in which case the pre-trained weights are automatically downloaded.
To run the latent interpolation demo:
$ python demo_interpolation.py --mode 2d --ckpt_path <path to *.pth file>
Each mode produces a result video:
- `--mode 2d`: demo_interpolation_2d.mp4
- `--mode 3d`: demo_interpolation_3d.mp4
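Under the hood, the demo walks between two latent codes and renders one frame per step. A minimal sketch (whether the script uses spherical or linear interpolation is an implementation detail; slerp is shown here):

```python
# Spherical interpolation between two latent codes, a common choice for
# GAN latent walks. Names and the number of steps are hypothetical.
import torch

def slerp(z0, z1, steps=60):
    z0n, z1n = z0 / z0.norm(), z1 / z1.norm()
    omega = torch.acos((z0n * z1n).sum().clamp(-1, 1))   # angle between codes
    t = torch.linspace(0, 1, steps).view(-1, 1)
    return (torch.sin((1 - t) * omega) * z0 + torch.sin(t * omega) * z1) / torch.sin(omega)

frames = slerp(torch.randn(1, 512), torch.randn(1, 512))
# Feed each row of `frames` to the generator to render one video frame.
```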
To run the GAN inversion demo:
$ python demo_inversion.py --ckpt_path <path to *.pth file>
Example result: demo_inversion_0000017000.mp4
The `semseg/` directory includes an implementation of sim2real semantic segmentation. The basic setup is to train the SqueezeSegV2 model [Wu et al. ICRA'19] on GTA-LiDAR (simulation) and test it on KITTI (real). To mitigate the domain gap, our paper proposes reproducing the ray-drop noise on the simulation data using our learned GAN. For details, please refer to Section 4.2 of our paper.
- Please set up the GTA-LiDAR (simulation) and KITTI (real) datasets provided by the SqueezeSegV2 repository.
├── GTAV # GTA-LiDAR
│ ├──1
│ │ ├── 00000000.npy
│ │ └── ...
│ └── ...
├── ImageSet # KITTI
│ ├── all.txt
│ ├── train.txt
│ └── val.txt
└── lidar_2d # KITTI
├── 2011_09_26_0001_0000000000.npy
└── ...
- Compute the raydrop probability map (64×512) for each GTA-LiDAR depth map (`*.npy`) using GAN inversion, and save the maps with the same directory structure (see the sketch after this list). We will also release the pre-computed data.
data/kitti_raw_frontal
├── GTAV
│ ├──1
│ │ ├── 00000000.npy
│ │ └── ...
│ └── ...
├── GTAV_noise_v1 # computed with DUSty v1
│ ├──1
│ │ ├── 00000000.npy
│ │ └── ...
│ └── ...
├── GTAV_noise_v2 # computed with DUSty v2
│ ├──1
│ │ ├── 00000000.npy
│ │ └── ...
│ └── ...
- Finally, please make a symbolic link.
$ ln -sf <a path to the root above> ./data/kitti_raw_frontal
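As referenced in the list above, here is a hedged sketch of the GAN inversion step. `G` stands in for a trained generator mapping a latent code to a (depth, drop probability) pair of 64×512 maps; every name and hyperparameter here is hypothetical:

```python
# Fit a latent code to one simulated depth map, then read off the
# generator's dropout branch as the raydrop probability map.
import torch

def invert(G, target, steps=500, lr=0.1):
    # target: (64, 512) depth map; pixels with value 0 are treated as dropped
    z = torch.randn(1, 512, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    valid = target > 0                     # fit only observed pixels
    for _ in range(steps):
        pred_depth, _ = G(z)
        loss = ((pred_depth - target)[valid] ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        _, drop_prob = G(z)
    return drop_prob                       # (64, 512) map, saved as *.npy
```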
Training configuration files can be found in `configs/semseg/`. We compare five simulation-trained configurations with different raydrop noise treatments (configs A–E), plus a model trained on real data as a reference (config F).
$ python train_semseg.py --config <path to *.yaml file>
config | training domain | raydrop probability | file
---|---|---|---
A | Simulation | N/A (no noise) | `configs/semseg/sim2real_wo_noise.yaml`
B | Simulation | Global frequency | `configs/semseg/sim2real_w_uniform_noise.yaml`
C | Simulation | Pixel-wise frequency | `configs/semseg/sim2real_w_spatial_noise.yaml`
D | Simulation | Computed w/ DUSty v1 | `configs/semseg/sim2real_w_gan_noise_dustyv1.yaml`
E | Simulation | Computed w/ DUSty v2 | `configs/semseg/sim2real_w_gan_noise_dustyv2.yaml`
F | Real | N/A | `configs/semseg/real2real.yaml`
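For intuition, the raydrop treatments in configs B–E can be sketched roughly as follows. Paths and shapes here are assumptions (each `*.npy` is taken to be a 64×512 depth map with 0 marking a dropped ray), not the repository's exact preprocessing:

```python
import numpy as np

# Configs B/C estimate drop frequencies from real scans; configs D/E load
# the probability maps computed via GAN inversion (see the sketch above).
real = np.load("kitti_depth_stack.npy")          # hypothetical (N, 64, 512) stack
drop = (real == 0).astype(np.float32)

p_b = drop.mean()                                # config B: one global frequency
p_c = drop.mean(axis=0)                          # config C: per-pixel (64, 512) map
p_d = np.load("GTAV_noise_v1/1/00000000.npy")    # config D: computed w/ DUSty v1
p_e = np.load("GTAV_noise_v2/1/00000000.npy")    # config E: computed w/ DUSty v2

sim = np.load("GTAV/1/00000000.npy")             # simulated depth map (64, 512)
noisy = sim * (np.random.rand(64, 512) > p_e)    # Bernoulli ray-drop sampling
```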
To monitor losses and images:
$ tensorboard --logdir ./logs/semseg
To evaluate the trained segmentation models:
$ python test_semseg.py --ckpt_path <path to *.pth file>
Note: `--ckpt_path` can also be set to the keywords `clean`, `uniform`, `spatial`, `dusty_v1`, `dusty_v2`, or `real`, in which case the pre-trained weights are automatically downloaded.
@InProceedings{nakashima2023wacv,
author = {Nakashima, Kazuto and Iwashita, Yumi and Kurazume, Ryo},
title = {Generative Range Imaging for Learning Scene Priors of 3{D} LiDAR Data},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
year = {2023},
pages = {1256-1266}
}