
Reward classifier and training #528

Open · wants to merge 40 commits into base: user/michel-aractingi/2024-11-27-port-hil-serl
Conversation

@ChorntonYoel commented Nov 26, 2024

What this does

Item 2 of Reward Classifier in issue #504 (Label: Feature)

This PR adds a reward classifier (used to classify whether an image of a robot performing a task should receive a reward or not), a training script with logging and resuming support, a config.yaml file that can be used to launch a training run, and a few tests for the training loop.
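
For intuition, here is a minimal sketch of what such a classifier can look like: a pretrained vision backbone with a single-logit binary head. The names are illustrative rather than the PR's actual code, and it assumes the transformers library plus the facebook/convnext-base-224 checkpoint mentioned below.

import torch
import torch.nn as nn
from transformers import AutoModel

class RewardClassifier(nn.Module):
    """Hypothetical sketch: pretrained backbone + binary reward head."""

    def __init__(self, backbone_name: str = "facebook/convnext-base-224"):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(backbone_name)
        hidden = self.backbone.config.hidden_sizes[-1]  # width of the last ConvNeXt stage
        self.head = nn.Linear(hidden, 1)  # one logit: reward (1) vs. no reward (0)

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(pixel_values).pooler_output  # pooled image features
        return self.head(feats)  # train with BCE-with-logits against 0/1 labels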

How it was tested

Using 10 episodes recorded with the reward system from this PR: #518.
I also added a test file for the classifier training script. Lots of things are mocked, but I believe it covers the basics.
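
To give a sense of the mocking pattern (a sketch only: train_step and the logger interface here are hypothetical stand-ins, not the PR's actual test code), a test like the following exercises one training step without a real dataset or a wandb account:

from unittest.mock import MagicMock

import torch

def train_step(model, batch, optimizer, logger):
    # One optimization step on binary reward labels.
    logits = model(batch["image"]).squeeze(-1)
    loss = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, batch["reward"].float()
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    logger.log({"loss": loss.item()})
    return loss

def test_train_step_runs_and_logs():
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 1))
    batch = {"image": torch.randn(4, 3, 8, 8), "reward": torch.randint(0, 2, (4,))}
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    logger = MagicMock()  # stands in for a wandb-style logger

    loss = train_step(model, batch, optimizer, logger)

    assert torch.isfinite(loss)
    logger.log.assert_called_once()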

How to checkout & try? (for the reviewer)

python lerobot/scripts/train_classifier.py \
    --config-name policy/reward_classifier.yaml

With the wandb entity and the dataset name adapted.
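
For example, with Hydra-style overrides (the key names wandb.entity and dataset_repo_id are assumptions and may differ in the actual config):

python lerobot/scripts/train_classifier.py \
    --config-name policy/reward_classifier.yaml \
    wandb.entity=<your-entity> \
    dataset_repo_id=<your-dataset>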

I was able to reach 95%+ accuracy after a few epochs with facebook/convnext-base-224 as the backbone and a dataset of 10 episodes of ~15 sec each.
This branch was built on top of the branch from #518, so it will need to wait for that one to be merged before this can be merged.

@michel-aractingi (Collaborator)

Great work! Looking forward to trying it; let me know when I can review it.

@ChorntonYoel changed the base branch from user/aliberts/2024_09_25_reshape_dataset to user/michel-aractingi/2024-11-27-port-hil-serl November 27, 2024 17:36
@Cadene (Collaborator) left a comment


What do you think of adding lerobot/common/policies/classifier/README.md with a very short explanation and example commands?

Also, Classifier is not a policy (it doesn't output actions). However, I don't know what to do about it, so the current implementation is good for now. Does anyone have a better idea?

(PS: waiting for the first branch to merge before approving)

@@ -0,0 +1,280 @@
# test_train_classifier.py
Collaborator
Nice file!

Out of curiosity, why this comment # test_train_classifier.py?

@ChorntonYoel (Author)
No good reason, haha, force of habit. For some pre-commit setups you need a comment at the beginning of the file, so I start with a dummy one. Will take it off.

@@ -0,0 +1,49 @@
# @package _global_
Collaborator
Nice file!

lerobot/scripts/control_robot.py: review thread (outdated, resolved)
@ChorntonYoel marked this pull request as ready for review November 29, 2024 16:25
@michel-aractingi (Collaborator)

Nice work @ChorntonYoel! Could you move the classifier directory from lerobot/common/policies/classifier/ to lerobot/common/policies/hilserl/classifier?

Since for now we will only use the reward classifier for hil-serl, we will put everything in its directory. In the future, when classifiers are more established, we can have a separate directory in lerobot/common/classifiers to host different kinds of reward recognition models.
