training.py consumes too many resources #17

Open
Vino-git opened this issue Aug 25, 2018 · 3 comments

Comments

@Vino-git

Hi. After installation and processing completed successfully, I tried to run training.py. My system hangs and I cannot do anything except force a shutdown. Please help me get mimic2 running successfully.

@Vino-git

Below is the output from the point where training.py ended. This training was run inside Docker.

2018-08-27 15:40:09.343384: W tensorflow/core/framework/allocator.cc:101] Allocation of 18124800 exceeds 10% of system memory.
2018-08-27 15:40:09.826597: W tensorflow/core/framework/allocator.cc:101] Allocation of 77332480 exceeds 10% of system memory.
2018-08-27 15:40:10.362115: W tensorflow/core/framework/allocator.cc:101] Allocation of 16142400 exceeds 10% of system memory.
2018-08-27 15:40:10.542931: W tensorflow/core/framework/allocator.cc:101] Allocation of 16142400 exceeds 10% of system memory.
2018-08-27 15:40:10.666453: W tensorflow/core/framework/allocator.cc:101] Allocation of 17369600 exceeds 10% of system memory.
2018-08-27 15:40:10.673025: W tensorflow/core/framework/allocator.cc:101] Allocation of 17936000 exceeds 10% of system memory.
2018-08-27 15:40:10.676968: W tensorflow/core/framework/allocator.cc:101] Allocation of 18691200 exceeds 10% of system memory.
2018-08-27 15:40:10.704691: W tensorflow/core/framework/allocator.cc:101] Allocation of 17369600 exceeds 10% of system memory.
2018-08-27 15:40:10.718342: W tensorflow/core/framework/allocator.cc:101] Allocation of 17936000 exceeds 10% of system memory.
2018-08-27 15:40:10.733771: W tensorflow/core/framework/allocator.cc:101] Allocation of 19635200 exceeds 10% of system memory.
2018-08-27 15:40:10.738065: W tensorflow/core/framework/allocator.cc:101] Allocation of 18691200 exceeds 10% of system memory.
2018-08-27 15:40:10.829940: W tensorflow/core/framework/allocator.cc:101] Allocation of 19163200 exceeds 10% of system memory.
2018-08-27 15:40:14.460210: W tensorflow/core/framework/allocator.cc:101] Allocation of 17301504 exceeds 10% of system memory.
2018-08-27 15:40:14.596566: W tensorflow/core/framework/allocator.cc:101] Allocation of 34603008 exceeds 10% of system memory.
Step 1 [102.949 sec/step, loss=0.96979, avg_loss=0.96979]
2018-08-27 15:40:29.697392: W tensorflow/core/framework/allocator.cc:101] Allocation of 101680000 exceeds 10% of system memory.
2018-08-27 15:40:31.111335: W tensorflow/core/framework/allocator.cc:101] Allocation of 43778048 exceeds 10% of system memory.
2018-08-27 15:40:31.122356: W tensorflow/core/framework/allocator.cc:101] Allocation of 43778048 exceeds 10% of system memory.
Killed

@LearnedVector

@Vino-git you may need to lower the batch size. It's currently set to 32, and that may be taking up too much memory.

@JasonGhent

sed -i 's/batch_size=32/batch_size=16/g' hparams.py

The above fixed this for me.
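
For anyone without sed (or who wants to try a value other than 16), the same edit can be sketched in Python. This is only an illustration, not part of mimic2: the helper names `set_batch_size` and `update_hparams` are made up here, and it assumes hparams.py contains a literal `batch_size=<integer>` assignment, as the sed command above implies. Unlike the sed one-liner, it rewrites whatever integer is currently there, not only 32.

```python
import re
from pathlib import Path

# Matches a literal `batch_size=<integer>` assignment (assumed form in hparams.py).
_BATCH_SIZE = re.compile(r"\bbatch_size\s*=\s*\d+")


def set_batch_size(source, new_size):
    """Return (updated_text, replacement_count) for the given file contents."""
    if new_size < 1:
        raise ValueError("batch size must be a positive integer")
    return _BATCH_SIZE.subn(f"batch_size={new_size}", source)


def update_hparams(path, new_size):
    """Rewrite the file at `path` in place; hypothetical helper, not a mimic2 API."""
    path = Path(path)
    updated, count = set_batch_size(path.read_text(), new_size)
    if count == 0:
        # Fail loudly instead of silently leaving the file unchanged.
        raise ValueError(f"no batch_size assignment found in {path}")
    path.write_text(updated)
    return count


# Example call, run from the directory that contains hparams.py:
# update_hparams("hparams.py", 16)
```

If 16 still gets the process killed, trying a smaller value such as 8 is the obvious next step, at the cost of slower training.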
