============ Sim set up ===========
(DONE) Car distribution: 100 HVs, 1 AV
(DONE) AV Behavior initially: Baseline HV Like
Simulation terminates when:
(DONE) AV makes 10 cycles
Implement Reinforcement Learning Regime 1:
-> optimize P(lc)
-> reward fastest time taken to finish (record sim time)
-> allow AV to change its own P(lc)
Implement Reinforcement Learning Regime 2:
-> optimize v
-> penalize if AV speed is below the average speed
-> allow AV to change its own v_max
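Sketch of the two reward schemes above. Assumptions, not taken from the existing code: the function names, the choice of negative sim time as the Regime 1 reward (paid only once the AV finishes its 10 cycles), and a Regime 2 penalty that grows linearly with the gap to the average speed.

```python
def regime1_reward(sim_time, done):
    """Regime 1 (assumed form): reward a fast finish.

    Returns 0 until the episode terminates, then the negative of the
    recorded sim time, so shorter runs score higher.
    """
    return -float(sim_time) if done else 0.0


def regime2_reward(av_speed, mean_speed):
    """Regime 2 (assumed form): penalize the AV only when it is slower
    than the average speed; the penalty is the size of the gap."""
    if av_speed < mean_speed:
        return -float(mean_speed - av_speed)
    return 0.0
```

Example: regime2_reward(3, 5) gives -2.0, and regime2_reward(5, 3) gives 0.0. Other shapes (per-step time penalty, squared gap) would fit the same plan.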
============= to do =============
> environment from road.py
> qlearning.py -> uses batch mode and environment file to train agents
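Sketch of what the core of qlearning.py could look like: an epsilon-greedy action choice and the standard tabular Q-learning update. The function names, the (state, action) dictionary layout, and the default alpha/gamma values are assumptions for illustration; the real state and action encoding depends on the environment built from road.py.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.95, done=False):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    No bootstrapping from next_state when the episode has ended.
    """
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Q-table that returns 0.0 for unseen (state, action) pairs.
q_table = defaultdict(float)
```

In batch mode, the driver would call epsilon_greedy and q_update once per simulation step, and reset the environment at the end of each episode.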
============== PLAN =============
1) think of simulation scheme (state, actions, rewards, termination, outputs) -> Done!
2) implement changes to current software to incorporate Qlearning
> set up/think about simulation in episodic learning format
gameEngine = nagel.py
driver = startLearning.py
(DONE) task 1) make nagel.py in class format
task 2) nagel.py -> environment format
task 3) gameEngine in batch mode
> implement changes to agent in car.py (add agent_car.py)
> create a running model
> batch mode functionality
Try taxiSim set up
3) use VI to populate Qtable
4) get results
5) write report and interpretations
6) learn CNN and Deep Qlearning
7) change VI to NN system
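Sketch for step 3, assuming "VI" means value iteration and that a tabular transition model is available. The model layout P[s][a] = list of (prob, next_state, reward, done) tuples is an assumption for illustration; the traffic sim would first have to be discretized into such a table, or sampled to estimate one.

```python
def q_value_iteration(P, n_states, n_actions,
                      gamma=0.95, tol=1e-8, max_iters=10_000):
    """Populate a Q-table by value iteration on a known tabular model.

    P[s][a] is a list of (prob, next_state, reward, done) tuples.
    Sweeps are repeated until the largest change in any entry
    falls below tol, or max_iters sweeps have been made.
    """
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(max_iters):
        delta = 0.0
        for s in range(n_states):
            for a in range(n_actions):
                new_q = 0.0
                for prob, s2, reward, done in P[s][a]:
                    # Terminal transitions contribute no future value.
                    future = 0.0 if done else max(Q[s2])
                    new_q += prob * (reward + gamma * future)
                delta = max(delta, abs(new_q - Q[s][a]))
                Q[s][a] = new_q
        if delta < tol:
            break
    return Q
```

For step 7, the table Q would be replaced by a network that maps a state to one value per action; the update target stays the same form.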