Add docker-compose support for easier and more portable environment set-up #519

Open
wants to merge 22 commits into
base: main

apockill

@apockill apockill commented Nov 23, 2024

What this does

This PR lets users easily build and enter a Docker container that has lerobot and its dependencies installed.

Installing lerobot is now as easy as:

docker compose build
docker compose run lerobot

and that plops you right into a container with a tested lerobot environment.

Features:

  • It mounts the current directory at runtime, so development works exactly as it does locally
  • It mounts "/dev/", allowing serial communication as though you were running on the host
  • It will help Linux users who are suffering from environment issues. More on this below.
  • It opens up ports that are used within the lerobot scripts (like visualize_dataset and visualize_dataset_html)
  • Wandb credentials are mounted into the container
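
One way to confirm the "/dev/" mount from inside the container is to list the serial devices that are visible there. This is a minimal sketch, not part of the PR: the `find_serial_devices` helper and the `ttyUSB*`/`ttyACM*` patterns are assumptions (USB serial adapters on Linux commonly use those names, but your hardware may differ).

```python
from pathlib import Path

# Assumed device-name patterns; USB serial adapters on Linux commonly appear
# as ttyUSB* or ttyACM*, but your hardware may use different names.
SERIAL_PATTERNS = ("ttyUSB*", "ttyACM*")


def find_serial_devices(dev_dir="/dev", patterns=SERIAL_PATTERNS):
    """Return sorted paths under dev_dir whose names match any pattern."""
    root = Path(dev_dir)
    found = set()
    for pattern in patterns:
        found.update(str(p) for p in root.glob(pattern))
    return sorted(found)


if __name__ == "__main__":
    devices = find_serial_devices()
    if devices:
        print("Serial devices visible in this environment:")
        for device in devices:
            print("  " + device)
    else:
        print("No ttyUSB*/ttyACM* devices found; check the /dev mount and cabling.")
```

Running the same script on the host and inside the container should give the same list if the mount behaves as described.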

Why:

  • I'm using Ubuntu 22.04 and have been frustrated several times debugging different dependency-related issues. For example, the control_robot record functionality expects that I will want to save my video using libsvtav1 (see video_utils.encode_video_frames). However, for whatever reason my version of ffmpeg doesn't support that encoder, and I had to manually edit code to get off the ground.
  • There are currently incompatibilities between dependencies and their build versions. For example, I ran into this issue: Torchvision import leads to OpenCV imshow freeze pytorch/vision#5940
    That froze my development, since I wanted to run imshow and check out the cameras when debugging policies.
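
To see whether a given ffmpeg build has the libsvtav1 encoder mentioned above, its encoder list can be inspected. The sketch below is illustrative only: `encoder_listed` and `ffmpeg_has_encoder` are hypothetical helper names, not lerobot functions, and the parsing assumes the usual `ffmpeg -encoders` layout where each row starts with a flags column followed by the encoder name.

```python
import shutil
import subprocess


def encoder_listed(encoders_output, name):
    """Return True if `name` appears as an encoder row in `ffmpeg -encoders` text."""
    for line in encoders_output.splitlines():
        fields = line.split()
        # Encoder rows look like: "<flags> <name> <description...>"
        if len(fields) >= 2 and fields[1] == name:
            return True
    return False


def ffmpeg_has_encoder(name="libsvtav1"):
    """Ask the ffmpeg found on PATH whether it lists the given encoder."""
    if shutil.which("ffmpeg") is None:
        return False  # no ffmpeg installed at all
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True,
        text=True,
        check=False,
    )
    return encoder_listed(result.stdout, name)


if __name__ == "__main__":
    print("libsvtav1 available:", ffmpeg_has_encoder("libsvtav1"))
```

Comparing the result on the host and in the container is a quick way to tell whether the container's ffmpeg sidesteps the encoding problem.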

Anyway, I think it'd be exciting to give users another way to install dependencies.

I don't want to get too far into this without some buy-in. Please let me know if this is interesting and something you'd want. Cheers!

How it was tested

If folks agree this is desired, I'll add a CI pipeline that validates the build.

Personally, I'm still working on testing this. I'll report back after feedback from reviewers.

Here's what I've personally tested so far:

  • visualize_dataset: Works, but you need the rerun server locally. I forward the ports in the docker-compose file.
  • visualize_dataset_html.py: Works great
  • control_robot, train, and replay
  • Run unit tests
  • Validate cv2.imshow works
  • Wandb credential passthrough works
  • Huggingface credential passthrough works
  • Anything else I'm missing? Let me know! 😁
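
For the cv2.imshow item, a freeze is awkward to test by hand because the process never returns. A possible approach, sketched here under assumptions, is to run the check in a child interpreter with a timeout. `finishes_in_time` and `IMSHOW_SNIPPET` are hypothetical names; the snippet needs torchvision, OpenCV, numpy, and a display, and it only approximates the scenario from pytorch/vision#5940.

```python
import subprocess
import sys

# Snippet mirroring the reported scenario: import torchvision first, then open
# an OpenCV window. It is passed to a child interpreter, never run in-process.
IMSHOW_SNIPPET = """
import torchvision
import cv2
import numpy as np
cv2.imshow("check", np.zeros((64, 64, 3), dtype=np.uint8))
cv2.waitKey(200)
cv2.destroyAllWindows()
"""


def finishes_in_time(code, timeout_s=20.0):
    """Run `code` in a fresh interpreter; True only if it exits 0 before the timeout."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        # The child was killed after hanging past the timeout.
        return False
    return result.returncode == 0
```

Calling `finishes_in_time(IMSHOW_SNIPPET)` inside the container returns False both when the window call hangs and when it fails outright (for example, with no display available), so a False result still needs a manual look.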

How to checkout & try? (for the reviewer)

Just go ahead and check out my fork, and run the commands I documented in the README. I'm most interested in tests done on different hardware. I was thinking of supporting GPU and CPU cases, but is it worth supporting ARM images (a la Jetson) as well?

TODO:

  • Fix cv2.imshow crashing if torchvision has been installed
  • Record, train, test a policy as a sanity test
  • Split docker-compose into multiple 'services', to allow users to 'docker run lerobot-gpu' or 'docker run lerobot-cpu'. It might "just work" already on CPU machines; I'll test that later. It would be easier to maintain just one Dockerfile.

@@ -0,0 +1,79 @@
FROM nvidia/cuda:12.2.2-devel-ubuntu22.04

@apockill apockill Nov 23, 2024

Most of this build is identical to docker/lerobot-gpu-dev, except:

  • added the poetry install step and POETRY_EXTRAS environment variable, using BuildKit caching
  • added poetry as the ENTRYPOINT
