Unable to start reward-model-and-critic-server #109
Replies: 2 comments · 17 replies
-
Hello! When we run the critic, we load with strict=True, which requires the RM head to exist when launching the server. Could you give me more details on the checkpoint you're trying to start the critic server with? If you have a
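To illustrate what a strict load means here, below is a minimal sketch in plain Python. It is not NeMo-Aligner's actual loading code, and the parameter names (`rm_head.weight` and so on) are hypothetical. It only shows the general idea: with strict loading, a checkpoint that lacks the reward-model head fails immediately instead of being loaded with the head left uninitialized.

```python
def check_strict_load(model_keys, checkpoint_keys):
    """Mimic a strict state-dict load: fail on any missing or unexpected key."""
    missing = sorted(set(model_keys) - set(checkpoint_keys))
    unexpected = sorted(set(checkpoint_keys) - set(model_keys))
    if missing or unexpected:
        raise RuntimeError(
            f"strict load failed: missing={missing}, unexpected={unexpected}"
        )


# Hypothetical parameter names, for illustration only.
critic_model_keys = ["decoder.layers.0.weight", "rm_head.weight", "rm_head.bias"]
rm_checkpoint_keys = ["decoder.layers.0.weight", "rm_head.weight", "rm_head.bias"]
base_lm_checkpoint_keys = ["decoder.layers.0.weight"]  # no RM head saved

# A checkpoint trained as a reward model has the head, so this passes.
check_strict_load(critic_model_keys, rm_checkpoint_keys)

# A plain language-model checkpoint does not, so the strict load is rejected.
try:
    check_strict_load(critic_model_keys, base_lm_checkpoint_keys)
except RuntimeError as err:
    print(err)
```

So if the critic server is pointed at a checkpoint that was never trained as a reward model, a failure at launch is the expected behavior under strict loading.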
-
Sure. Will try this. Thanks a lot.
-
I created an issue (#115) to keep track of this feature request.
-
I am still unable to run with the given .nemo file (NV-Llama2-13B-RLHF-RM/Llama2-13B-RLHF-RM.nemo).
-
Which Docker image exactly?
-
I am using the nemofw-training:24.01 container. pip freeze gives the modules below.
-
I'm unable to repro your bug. I used the and then I ran this script
Can you give these exact instructions a try and see if you hit the same error? Also let me know what
-
Yes, my md5sum is 8794fb021391067217cf0179a90eb09a.
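For anyone who wants to repeat this check from Python rather than with the `md5sum` command, here is a small sketch using only the standard library. The helper name `md5_of_file` is just for illustration, and the path in the usage comment is the one mentioned elsewhere in this thread, so adjust it to wherever your file lives.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so a multi-GB .nemo file is never fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Checksum quoted in this thread for the reward-model .nemo file.
EXPECTED_MD5 = "8794fb021391067217cf0179a90eb09a"

# Usage (path is an assumption, adjust to your setup):
# print(md5_of_file("/workspace/Llama2-13B-RLHF-RM.nemo") == EXPECTED_MD5)
```

A mismatch would point to a corrupted or incomplete download rather than a configuration problem.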
-
Hi @gshennvm,
Critic server: CHECKPOINT_NEMO_FILE="/workspace/Llama2-13B-RLHF-RM.nemo"
Actor server: PRETRAINED_ACTOR_NEMO_FILE="/workspace/Llama-2-13b-chat.nemo"
I am getting an out-of-memory error with this configuration. Can you suggest the minimal configuration that should be able to run on this hardware?
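As a rough sanity check on why two 13B models can run out of memory, here is a back-of-the-envelope estimate. It rests on assumptions that are not from this thread: 2 bytes per parameter for half-precision weights, and the common rule of thumb of about 16 bytes per parameter for mixed-precision Adam training (weights, gradients, and fp32 optimizer state). It ignores activations and does not model how NeMo shards state across GPUs, so treat the numbers as order-of-magnitude only.

```python
def estimate_memory_gib(num_params, bytes_per_param):
    """Rough memory footprint in GiB for a given per-parameter cost."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 13e9  # nominal parameter count of a "13B" model

# Assumption: half-precision weights only (e.g. inference), 2 bytes/param.
weights_only_gib = estimate_memory_gib(NUM_PARAMS, 2)

# Assumption: mixed-precision Adam rule of thumb, ~16 bytes/param
# (2 weights + 2 gradients + 12 fp32 master weights and Adam moments).
training_gib = estimate_memory_gib(NUM_PARAMS, 16)

print(f"weights only: ~{weights_only_gib:.0f} GiB per model")
print(f"training state: ~{training_gib:.0f} GiB per model, before activations")
```

Both the critic and the actor are 13B models here, so the total has to be split across enough GPUs through tensor or pipeline parallelism to fit.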
-
If it helps, this is the point where I'm getting the OOM.
-
You should try with
-
Hi Team,
I am trying to run PPO.
I am following this user guide: https://github.com/NVIDIA/NeMo-Aligner/blob/main/docs/user-guide/RLHF.rst
I have written code similar to the one here: https://github.com/NVIDIA/NeMo-Aligner/blob/main/docs/user-guide/RLHF.rst#launching-the-reward-model-and-critic-server
But I am getting the error below. Can you please help?
tensorstore version: tensorstore==0.1.45
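If it is useful for comparing environments, the installed version can also be confirmed from inside the container's Python interpreter. This is a small standard-library sketch, and the helper name `installed_version` is only for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version seen by this interpreter (None if not installed).
print("tensorstore:", installed_version("tensorstore"))
```

This reports what the running interpreter actually sees, which can differ from `pip freeze` output if more than one Python environment is present in the container.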