
how many hours does the qpic model need to be trained on HICO-DET and V-COCO respectively? #20

Open
truetone2022 opened this issue Jul 12, 2021 · 5 comments

Comments

@truetone2022

No description provided.

@tamtamz

tamtamz commented Jul 13, 2021

With a batch size of 2 and 8 V100 GPUs, training takes about 38 hours for HICO-DET and 8 hours for V-COCO.
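For a rough sense of scale, the figures can be turned into a per-epoch time. This is only a sketch: it assumes the 150-epoch HICO-DET schedule mentioned later in this thread, and the "10 hours for 10 epochs" input is the approximate observation reported there, not a measurement. The function names are made up for illustration.

```python
def minutes_per_epoch(total_hours, epochs):
    """Average wall-clock minutes per epoch for a full training run."""
    return total_hours * 60.0 / epochs


def projected_hours(hours_elapsed, epochs_done, total_epochs):
    """Extrapolate total training time from the progress so far."""
    return hours_elapsed / epochs_done * total_epochs


# Reported: about 38 hours on 8 V100 GPUs; 150 epochs is an assumption.
reported = minutes_per_epoch(38, 150)

# Observed in the thread (approximate): about 10 hours for about 10 epochs.
projected = projected_hours(10, 10, 150)

print(f"reported pace: {reported:.1f} min/epoch")
print(f"projected total at the observed pace: {projected:.0f} h")
print(f"slowdown vs. reported: {projected / 38:.1f}x")
```

A gap of several times between the two paces usually points at the input pipeline or the hardware setup rather than at the model itself.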

@truetone2022
Author

truetone2022 commented Jul 14, 2021 via email

@dragen1860

Thanks for your reply! I have trained your QPIC model on HICO-DET and V-COCO with all settings the same as in your released GitHub code, but I encountered two problems. First, most of the time the GPU utilization hovers around 0% and the program seems stuck reading data. Is this normal? Second, after 10+ hours of training, only 10+ epochs had finished, so it seems to need 150+ hours to train 150 epochs on HICO-DET. Is this normal? By the way, the same thing happens on V-COCO. Thanks for your helpful reply! Best wishes!

------------------ Original Message ------------------ From: "hitachi-rd-cv/qpic"; Sent: Tuesday, July 13, 2021, 2:26 PM; Subject: Re: [hitachi-rd-cv/qpic] how many hours does the qpic model need to be trained on HICO-DET and V-COCO respectively? (#20) With the batch size of 2 and 8 V100 GPUs, it approximately takes about 38 hours for HICO-DET and 8 hours for V-COCO.

Maybe you should check your disk I/O performance.
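One quick way to check is to time raw file reads from the dataset directory, independent of the training code. The sketch below uses only the Python standard library; the function name and the `max_files` default are arbitrary choices for illustration, not part of the QPIC repository. Note that the OS page cache can make a second run look much faster than the first.

```python
import os
import time


def measure_read_throughput(root, max_files=500, chunk_size=1 << 20):
    """Read up to max_files files under root and report raw read speed."""
    total_bytes = 0
    n_files = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if n_files >= max_files:
                break
            # Read each file fully in chunks, as an image loader would.
            with open(os.path.join(dirpath, name), "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total_bytes += len(chunk)
            n_files += 1
        if n_files >= max_files:
            break
    elapsed = time.perf_counter() - start
    return {
        "files": n_files,
        "bytes": total_bytes,
        "seconds": elapsed,
        "files_per_sec": n_files / elapsed if elapsed > 0 else float("inf"),
        "mb_per_sec": total_bytes / 1e6 / elapsed if elapsed > 0 else float("inf"),
    }


# Example call; replace the placeholder with your own image directory:
# print(measure_read_throughput("path/to/images"))
```

If the files-per-second figure is far below the number of images one training step consumes per second across all GPUs, storage is a likely bottleneck.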

@truetone2022
Author


My disk I/O performance should be all right, because I can train the similar HoiTransformer code normally, so I'm confused about where the problem is.

@DavidHuji

How many workers are you using?
Try adding "--num_workers 4" to see whether it solves both the slow training and the low GPU utilization (usually they are the same problem).
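To confirm that data loading is the bottleneck before and after changing the worker count, you can measure how much of each iteration is spent waiting for the next batch. This is a minimal sketch that works with any iterable of batches (for example a PyTorch `DataLoader`); `data_wait_fraction` and `step_fn` are hypothetical names, and `step_fn` stands in for your forward/backward step. On a GPU, `step_fn` should end with `torch.cuda.synchronize()`, otherwise asynchronous kernels make the compute time look too small.

```python
import time


def data_wait_fraction(loader, step_fn, max_batches=50):
    """Return the share of wall-clock time spent waiting for batches."""
    wait = 0.0
    compute = 0.0
    it = iter(loader)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time blocked on the input pipeline
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # time spent in the training step
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    total = wait + compute
    return wait / total if total > 0 else 0.0
```

A fraction close to 1 means the GPUs are mostly idle waiting for data, which matches the near-0% utilization described above; rerunning the measurement with a higher worker count shows whether the change helps.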

4 participants