I've managed to run the code from chapter 8 successfully, and update_q seems to be creating Q values for states.
I now want to run the simulation 100 times and then predict actions for the same (or other) prices using the learnt knowledge.
I tried adding the following method to QDecisionPolicy:
This always prints Action 'HOLD', Q [[0. 0. 0.]] for any state given,
even though I've tested printing the same values in update_q and can see that the Q values for the state I'm passing to predict are being updated to non-zero values.
How can I query the policy, or is there some other mechanism that I should be using to predict using the learnt policy?
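To make the question concrete, here is a minimal sketch of the kind of greedy query meant here. It is not the book's code: TinyQPolicy, its linear Q-function, and the lr and gamma values are hypothetical stand-ins written in plain Python. It assumes that predict should take the action with the highest Q value, with no exploration, and that it reads the same parameters that update_q modifies.

```python
class TinyQPolicy:
    """Hypothetical stand-in for a Q-learning policy (not the book's QDecisionPolicy)."""

    def __init__(self, actions, input_dim, lr=0.1, gamma=0.9):
        self.actions = actions
        self.lr = lr        # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # One weight vector per action: Q(s, a) = dot(weights[a], s).
        self.weights = [[0.0] * input_dim for _ in actions]

    def q_values(self, state):
        # Q value of every action for the given state.
        return [sum(w * s for w, s in zip(row, state)) for row in self.weights]

    def update_q(self, state, action, reward, next_state):
        a = self.actions.index(action)
        target = reward + self.gamma * max(self.q_values(next_state))
        error = target - self.q_values(state)[a]
        # Gradient step on the squared TD error, for the chosen action only.
        self.weights[a] = [w + self.lr * error * s
                           for w, s in zip(self.weights[a], state)]

    def predict(self, state):
        # Greedy query: no exploration, uses the weights trained by update_q.
        q = self.q_values(state)
        best = max(range(len(q)), key=q.__getitem__)
        return self.actions[best], q


policy = TinyQPolicy(["BUY", "SELL", "HOLD"], input_dim=2)
state, next_state = [1.0, 0.5], [0.5, 1.0]
for _ in range(100):
    policy.update_q(state, "BUY", reward=1.0, next_state=next_state)

action, q = policy.predict(state)
print(action, q)
```

In this sketch a freshly constructed policy returns all-zero Q values, while the instance that went through update_q does not. All zeros from predict might therefore mean it is reading parameters other than the ones update_q trained, though that is only a guess and not something confirmed against the book's code.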
Thanks