Training log not printed out, model based RL noisy #7

Open
xzxzxzxz opened this issue Mar 3, 2021 · 4 comments

Comments

@xzxzxzxz

xzxzxzxz commented Mar 3, 2021

Hi Changan, thanks for the great work! I finished training, but I cannot generate the log plot with the command
python utils/plot.py data/output/output.log
I looked into the output.log file and found that it is corrupted: some of the text is missing. Do you have any idea how this could happen?

Also, I inspected the VAL return by eye and noticed that the training curve can be very noisy.
The average VAL return from 1k to 10k episodes is something like: 0.4698, 0.5208, 0.3657, 0.2763, 0.3791, ..., 0.1896, 0.5917.
The final performance is indeed as good as the result reported in your paper, but this may suggest that model-based RL training is not stable. Is this also what you observed?

Also, I trained the model on a 2080 Ti GPU for 27 hours. Is it supposed to be this slow? The networks look pretty small...
Thank you so much for your attention!

@ChanganVR
Owner

Hi @xzxzxzxz, I'm not exactly sure why it's broken. If you can paste the log here, maybe I can help you diagnose it. The code in plot.py is mostly a few lines of regular expressions. If the output format has somehow changed, you should change the expressions accordingly. Online interactive tools such as https://regexr.com/ can help you debug them.
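
For anyone hitting the same problem, here is a minimal sketch of regex-based log parsing that skips damaged lines instead of failing on them. The log line format, the `VAL_PATTERN` expression, and the `parse_val_returns` helper are assumptions made for illustration, not the actual format or code in plot.py; adapt the expression to whatever your output.log contains.

```python
import re

# Hypothetical line format; the real output.log may look different.
VAL_PATTERN = re.compile(
    r"VAL\s+in episode (?P<episode>\d+).*?reward: (?P<reward>-?\d+\.\d+)"
)


def parse_val_returns(lines):
    """Return (episode, reward) pairs and the number of damaged VAL lines."""
    pairs, skipped = [], 0
    for line in lines:
        match = VAL_PATTERN.search(line)
        if match:
            pairs.append((int(match.group("episode")), float(match.group("reward"))))
        elif "VAL" in line:
            # Mentions VAL but does not match: likely truncated or corrupted.
            skipped += 1
    return pairs, skipped


# Made-up sample lines, including one truncated entry.
sample = [
    "INFO: VAL   in episode 1000 has success rate: 0.90, total reward: 0.4698",
    "INFO: VAL   in episode 2000 has succ",
    "INFO: TRAIN in episode 2001 has success rate: 0.50, total reward: 0.1000",
]
pairs, skipped = parse_val_returns(sample)
print(pairs, skipped)
```

Counting the skipped lines gives a quick sense of how much of the log is damaged before trying to plot it.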

I remember the validation performance oscillated during training, but I don't remember whether it was this much since it's been a while.
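
One way to judge whether the oscillation hides an upward trend is to smooth the validation returns with a trailing moving average. The sketch below uses only the first five VAL returns quoted earlier in this thread; the `moving_average` helper and the window size of 3 are illustrative choices, not part of the repository.

```python
def moving_average(values, window):
    """Trailing mean over the last `window` points (fewer at the start)."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# First five VAL returns reported above; later values were elided in the report.
val_returns = [0.4698, 0.5208, 0.3657, 0.2763, 0.3791]
smoothed = moving_average(val_returns, window=3)
print([round(v, 4) for v in smoothed])
```

A larger window suppresses more noise but also delays the point at which a real change in performance becomes visible.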

I used to train the policy on CPU instead of GPU, since most of the time was spent on the simulation and look-ahead calculation. I remember that the GPU didn't bring much speed gain. As for the training time, it was about half a day to one day. There are some optimizations you could make to speed the computation up; however, at this point, I'm not able to spend more time on this project. If you could spend some time optimizing the code, I believe you would also gain a deeper understanding of it.
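
Before optimizing, it may help to confirm where the time actually goes. The following is a rough sketch using Python's built-in cProfile; `simulate_step`, `policy_forward`, and `train_episode` are hypothetical stand-ins, not functions from this repository, and would be replaced by the real training entry point.

```python
import cProfile
import io
import pstats


def simulate_step(n=20000):
    # Hypothetical stand-in for the simulator and look-ahead computation.
    return sum(i * i for i in range(n))


def policy_forward(n=200):
    # Hypothetical stand-in for a small network forward pass.
    return sum(i for i in range(n))


def train_episode(steps=50):
    total = 0
    for _ in range(steps):
        total += simulate_step() + policy_forward()
    return total


profiler = cProfile.Profile()
profiler.enable()
train_episode()
profiler.disable()

# Collect the five entries with the highest cumulative time into a string.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

If the simulation dominates the report, as described above, moving the network to a GPU would be expected to change little, and effort is better spent on the simulation and look-ahead code.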

@xzxzxzxz
Author

Thank you, Changan, for your reply! It seems the log file is too large and got corrupted, but it does not matter that much. The output performance is good and stable. I am interested in working on the multi-agent planning code base you created, and I hope we can keep in touch and discuss more. I also wanted to note a minor issue with the human motion prediction network: in the paper the network structure is (64, 5), but in the implementation it is (64, 5) (self.model_predictive_rl.motion_predictor_dims = [64, 5]).

@ChanganVR
Owner

Hi @xzxzxzxz, sorry, I don't quite follow your question. Do you mean that the structure in the paper does not match the one in the code?

@xzxzxzxz
Author

Yes, the network structure seems to be slightly different, but this should be fine.
