
How to convert a depth map to png? #2

Open
SZUshenyan opened this issue Dec 3, 2023 · 1 comment

Comments

@SZUshenyan

When I download the depth map, what I get is a depth map stored as float16. How do I convert it to PNG format? Please let me know, thanks!

@gcr
Collaborator

gcr commented Dec 12, 2023

Hello! Are you interested in visualizing the depth map for human consumption, or do you want to convert it to a 16-bit greyscale format to work with other systems?

First, load your depth map:

>>> sample['metric_depth']
<tf.Tensor: shape=(720, 1280, 1), dtype=float32, numpy=
array([[[9.671875 ],
        [9.65625  ],
        [9.6640625],
        ...,
        [5.2539062],
        [5.2578125],
        [5.2695312]],

       ...,

       [[2.2324219],
        [2.2304688],
        [2.2304688],
        ...,
        [4.0546875],
        [4.0546875],
        [4.0585938]],

       [[2.2304688],
        [2.2285156],
        [2.2304688],
        ...,
        [4.0429688],
        [4.046875 ],
        [4.046875 ]],

       [[2.2304688],
        [2.2285156],
        [2.2304688],
        ...,
        [4.0351562],
        [4.0390625],
        [4.0429688]]], dtype=float32)>
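
(For context, sample here is just one element of a tf.data pipeline. The exact loading code depends on how you read SANPO; the snippet below uses a dummy dataset purely as a stand-in, so treat it as a sketch rather than the official loader:)

import tensorflow as tf

# Hypothetical stand-in for your real SANPO input pipeline; the real dataset
# should yield dicts with a 'metric_depth' key shaped like the tensor above.
dataset = tf.data.Dataset.from_tensors(
    {'metric_depth': tf.zeros((720, 1280, 1), dtype=tf.float32)})
sample = next(iter(dataset))
print(sample['metric_depth'].shape, sample['metric_depth'].dtype)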

During training, it's common for models to consume tensors like this directly. However, if you're working with a framework that needs 16-bit integer grayscale .PNG files, you can convert them like this:

import numpy as np
from PIL import Image

UINT16_MAX = 65535
metric_depth_in_meters = sample['metric_depth'].numpy().squeeze()
# Store millimeters so the integer quantization step is 1 mm.
metric_depth_in_mm = metric_depth_in_meters * 1000.0
metric_depth_in_mm_uint16 = np.clip(metric_depth_in_mm, 0, UINT16_MAX).astype('uint16')
pil_image = Image.fromarray(metric_depth_in_mm_uint16)
assert pil_image.mode == 'I;16' # 16-bit grayscale; see https://pillow.readthedocs.org/handbook/concepts.html#concept-modes
pil_image.save('/tmp/image_16bit_gray.png')

[attached image: image_16bit_gray.png, the 16-bit grayscale depth PNG produced above]

Encoding .png files this way is common, but depths beyond 65.535 meters must be clipped, which becomes an issue with the synthetic data. The conversion from float to int also adds some quantization noise: at 10m you lose ~1cm of resolution, and at 40m you lose around ~3cm. SANPO-Real can't promise that high a resolution or range anyway, but SANPO-Synthetic can. Just something to be mindful of. I recommend double-checking your framework's expected clipping range and units.
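
If you want a quick sanity check on those numbers, you can read the PNG back and compare it against the float depth from the snippet above (plain numpy/Pillow, nothing SANPO-specific):

import numpy as np
from PIL import Image

# Read the 16-bit grayscale PNG back and convert millimeters -> meters.
depth_mm = np.asarray(Image.open('/tmp/image_16bit_gray.png')).astype(np.uint16)
depth_m = depth_mm.astype(np.float32) / 1000.0

# Compare against the original float depth; anything beyond 65.535 m was clipped.
error = np.abs(depth_m - metric_depth_in_meters)
print('max round-trip error (m):', error.max())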

If you just want to visualize the depth maps for display, good old matplotlib can help you here:

import matplotlib.pyplot as plt
plt.imshow(sample['metric_depth'].numpy().squeeze(), cmap='turbo', vmin=0.0, vmax=40.0)

[attached image: matplotlib rendering of the depth map with the turbo colormap]
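
If you want to save that figure with a colorbar so the colors are interpretable in meters, you can keep going from the imshow line above (the output path is just an example):

plt.colorbar(label='metric depth (m)')
plt.axis('off')
plt.savefig('/tmp/depth_preview.png', bbox_inches='tight', dpi=150)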

You can also apply the colormap object to the image directly, which avoids losing any resolution and also gets rid of the axes:

import matplotlib.pyplot as plt
from PIL import Image

MAX_DEPTH = 40.0
colormap = plt.matplotlib.colormaps.get('turbo')
colored_image = colormap(sample['metric_depth'] / MAX_DEPTH)  # values above MAX_DEPTH map to the top color
colored_image = (255 * colored_image).astype('uint8').squeeze()
pil_image = Image.fromarray(colored_image)
pil_image.save('/tmp/image.png')

[attached image: colormapped depth map saved via PIL, no axes]
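
One small caveat: the colormap call returns RGBA, so the PNG saved above carries an alpha channel. If whatever consumes it expects plain RGB, drop the last channel before saving (continuing from the snippet above):

# colormap() produces RGBA; keep only the RGB channels if alpha isn't wanted.
rgb_image = colored_image[..., :3]
Image.fromarray(rgb_image).save('/tmp/image_rgb.png')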

Hope that helps!
