-
Notifications
You must be signed in to change notification settings - Fork 746
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reward classifier and training #528
base: user/michel-aractingi/2024-11-27-port-hil-serl
Are you sure you want to change the base?
Reward classifier and training #528
Conversation
Co-authored-by: Remi <[email protected]>
Co-authored-by: Remi <[email protected]>
Co-authored-by: Remi <[email protected]>
…lassifier_and_training
Great work! looking forward to trying it, let me know when I can review it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of adding lerobot/common/policies/classifier/README.md
with very short explanation and example commands?
Also Classifier is not a policy (it doesnt output actions). However I dont know what to do, so current implementation is good for now. Does anyone have a better idea?
(PS: waiting for the first branch to merge before approving)
tests/test_train_classifier.py
Outdated
@@ -0,0 +1,280 @@ | |||
# test_train_classifier.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice file!
Out of curiosity, why this comment # test_train_classifier.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No good reason haha force of habit, for some pre-commit setups you need a comment at the beginning of the file, so I start by a dummy one. will take off
@@ -0,0 +1,49 @@ | |||
# @package _global_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice file!
Nice work @ChorntonYoel ! Could you move the classifier directory to Since now we will only use the reward classifier for |
What this does
This PR is meant to add a reward classifier (used to classify if an image of a robot performing a task should get a reward or not), a training file allowing the training of the classifier (with logging + resuming), a config.yaml file that can be used to start a training, and a few tests for the training loop
How it was tested
Using 10 episodes made with the reward system of this PR: #518
Also I added a test file for the training classifier file. Lots of things are mocked but it covers the basics I believe.
How to checkout & try? (for the reviewer)
With the wandb entity and the dataset name adapted.
I was able to reproduce 95%+ after a few epochs with
facebook/convnext-base-224
as backbone and a dataset of 10 epsiodes of ~15 sec.This branch was built on top of the branch from #518 so will need to wait for this one to be merged befre merging