code #13
base: master
Conversation
Interesting.
First, could you please rename the file to 09_?_nes_env_name.py? This is a local/random search algorithm, so it belongs in the 09 series ('10_xx' is reserved for actor-critic methods).
However, I'm not so sure about using the low-level API for Pong. It could have been done in a single line with the high-level API. If @hunkim is okay with it, then I guess it's okay.
I left comments only on the bipedal example, but the corresponding issues in the Pong example should be fixed as well.
import gym
import numpy as np
import cPickle as pickle
Are you using Python 2? cPickle was renamed to pickle in Python 3, so there is no need to import cPickle.
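If Python 2 compatibility is still wanted, a version-agnostic import is a common pattern (a minimal sketch; the pickled payload below is just an example):

```python
try:
    import cPickle as pickle  # Python 2: C-accelerated pickle module
except ImportError:
    import pickle  # Python 3: cPickle was merged into the standard pickle

# Round-trip check to show either module behaves the same way.
blob = pickle.dumps({"weights": [1, 2, 3]})
restored = pickle.loads(blob)
```

On Python 3 the first import fails and the standard pickle (which is C-accelerated there anyway) is used instead.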
env = gym.make('BipedalWalker-v2')
np.random.seed(10)

hl_size = 100
Can you add a comment for each hyperparameter?
Also, please follow standard naming conventions: aver_reward should be renamed to something like avg_reward.
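For example, the hyperparameter block could look like this (hl_size = 100 is from the diff; the other values are illustrative placeholders, not the PR's actual settings):

```python
hl_size = 100      # hidden-layer size of the policy network
npop = 50          # population size: perturbations sampled per update (illustrative)
sigma = 0.1        # standard deviation of the parameter noise (illustrative)
alpha = 0.03       # learning rate for the parameter update (illustrative)
avg_reward = None  # running average of episode reward (renamed from aver_reward)
```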
model[k] = model[k] + alpha/(npop*sigma) * np.dot(N[k].transpose(1, 2, 0), A)

cur_reward = f(model)
aver_reward = aver_reward * 0.9 + cur_reward * 0.1 if aver_reward != None else cur_reward
Why do we need aver_reward? Why did you use an EMA?
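For context, the line in the diff is an exponential moving average: it smooths the noisy per-episode reward into a readable running average. A minimal sketch of that update, factored into a function (the 0.9 decay is the value used in the diff; the function name is ours):

```python
def update_avg_reward(avg_reward, cur_reward, decay=0.9):
    """Exponential moving average of episode rewards.

    Smooths noisy per-episode rewards; the first episode simply
    initializes the average (avg_reward starts as None).
    """
    if avg_reward is None:
        return cur_reward
    return avg_reward * decay + cur_reward * (1.0 - decay)

avg = None
for r in [10.0, 0.0, 0.0]:  # a noisy reward sequence
    avg = update_avg_reward(avg, r)
# avg decays 10.0 -> 9.0 -> 8.1
```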
"high-level API" preferred.
Added two evolution strategy implementations
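For reference, the parameter update quoted above (model[k] = model[k] + alpha/(npop*sigma) * np.dot(...)) follows the natural-evolution-strategies recipe: sample Gaussian perturbations of the parameters, evaluate each candidate, standardize the rewards, and step along the reward-weighted noise. A self-contained sketch on a toy quadratic objective (the objective, dimensionality, and hyperparameter values are illustrative, not the PR's):

```python
import numpy as np

np.random.seed(10)

def f(w):
    # Toy stand-in for the episode-reward function: higher near `target`.
    target = np.array([0.5, -0.3, 0.8])
    return -np.sum((w - target) ** 2)

npop, sigma, alpha = 50, 0.1, 0.03  # population size, noise scale, learning rate
w = np.zeros(3)
for _ in range(300):
    N = np.random.randn(npop, 3)                 # one perturbation per population member
    R = np.array([f(w + sigma * n) for n in N])  # reward of each perturbed candidate
    A = (R - R.mean()) / (R.std() + 1e-8)        # standardized rewards
    w = w + alpha / (npop * sigma) * N.T.dot(A)  # NES gradient-ascent step
```

After a few hundred updates, w drifts toward the toy target without ever computing an analytic gradient, which is the whole point of the ES approach in this PR.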