Ch 8 predicting? #35

nisbus · 2018-06-13T15:30:17Z

I've managed to run the code from chapter 8 successfully, and update_q seems to be creating Q values for states.

I now want to run the simulation 100 times and then use the learnt knowledge to predict on the same (or other) prices.

I tried adding the following method to QDecisionPolicy:

    def predict(self, state):
        # Evaluate the Q network for the given state (no exploration, no update)
        action_q_vals = self.sess.run(self.q, feed_dict={self.x: state})
        # Pick the action with the highest Q value
        action_idx = np.argmax(action_q_vals)
        action = self.actions[action_idx]
        print('Action {}, Q {}, STATE :{}'.format(action, action_q_vals, state))
        return action

This always prints Action 'HOLD', Q [[0. 0. 0.]] for any given state,
even though I've printed the same values inside update_q and seen that the state I'm passing to predict gets updated to non-zero Q values.
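
To make the symptom easier to isolate, the argmax step can be separated from the TensorFlow session call. The sketch below is framework-free and only illustrates greedy selection over a single row of Q values; `greedy_action` and `ACTIONS` are hypothetical names used for illustration, not part of the book's code, and the action order is an assumption. It also flags the case where all Q values are equal, because `np.argmax` then returns the first index regardless of the state.

```python
import numpy as np

# Hypothetical action list; the real one is whatever was passed to QDecisionPolicy.
ACTIONS = ['Buy', 'Sell', 'Hold']


def greedy_action(q_vals, actions):
    """Pick the greedy action from a batch-of-one Q output such as [[q0, q1, q2]].

    Returns (action, degenerate), where degenerate is True when every Q value
    is identical, so the argmax carries no information about the state.
    """
    q = np.asarray(q_vals, dtype=float).reshape(-1)
    if q.size != len(actions):
        raise ValueError('expected {} Q values, got {}'.format(len(actions), q.size))
    idx = int(np.argmax(q))  # on ties, argmax returns the first index
    degenerate = bool(np.all(q == q[0]))
    return actions[idx], degenerate


if __name__ == '__main__':
    print(greedy_action([[0.0, 0.0, 0.0]], ACTIONS))
    print(greedy_action([[0.1, 0.7, 0.2]], ACTIONS))
```

If the degenerate flag is always set for the output seen in predict but never for the values printed in update_q, that would suggest (though not prove) that the two calls are not evaluating the same trained variables or are not being fed the same input.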

How can I query the policy, or is there some other mechanism that I should be using to predict using the learnt policy?

Thanks
