d0p3t/notes for dqn

## notes for dqn
DQN setup steps for Cloney

1. Init the DQN in setup_play().
2. enter_train_mode()? (What is the difference between the RUN, OBSERVE, and TRAIN modes?)
3. next_step()?
4. build_frame_stack() and then/or update_frame_stack()?
5. pick_action(): RANDOM or PREDICTED at 20%. How should the action mapping be formatted?
6. Now observe = next_step()?
7. append_to_replay_memory(): how to use the reward parameter?
8. calculate_target_error(): how to use the observation parameter from the previous step?
9. Keep going until the game_over context.
10. Now train_on_mini_batch.
11. save_model_weights every ~100 runs, and checkpoint too?
12. update_target_model.
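
To see how the steps above fit together, here is a minimal, framework-free sketch of a generic DQN-style loop. It does not answer the open questions about the real method signatures: `ToyEnv`, `TabularQ`, `td_target`, `pick_action`, and `train` are all hypothetical names made up for illustration, a lookup table stands in for the neural network, and "RANDOM at 20%" is assumed to mean a random action 20% of the time.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical stand-in for the game: walk right along a short track."""

    def __init__(self, length=6):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 0 moves left, action 1 moves right; the left wall is at 0.
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos >= self.length - 1
        return self.pos, (1.0 if done else 0.0), done


class TabularQ:
    """Stand-in for the network: a table from (state, action) to a value."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.table = {}

    def predict(self, state):
        return [self.table.get((state, a), 0.0) for a in range(self.n_actions)]

    def update(self, state, action, target, lr):
        old = self.table.get((state, action), 0.0)
        self.table[(state, action)] = old + lr * (target - old)

    def copy_from(self, other):
        self.table = dict(other.table)


def td_target(reward, next_q_values, done, gamma):
    # Terminal transitions use the reward alone; otherwise bootstrap.
    return reward if done else reward + gamma * max(next_q_values)


def pick_action(model, state, epsilon, rng):
    # Assumption: explore with probability epsilon, otherwise act greedily.
    if rng.random() < epsilon:
        return rng.randrange(model.n_actions)
    q = model.predict(state)
    best = max(q)
    return rng.choice([a for a, v in enumerate(q) if v == best])


def train(episodes=300, gamma=0.95, lr=0.5, epsilon=0.2, batch_size=16,
          target_sync_every=10, max_steps=200, seed=0):
    rng = random.Random(seed)
    env = ToyEnv()
    model, target_model = TabularQ(2), TabularQ(2)        # step 1
    replay = deque(maxlen=5000)
    for episode in range(1, episodes + 1):
        frames = deque([env.reset()] * 4, maxlen=4)       # step 4: build stack
        state, done, steps = tuple(frames), False, 0
        while not done and steps < max_steps:             # step 9
            action = pick_action(model, state, epsilon, rng)   # step 5
            obs, reward, done = env.step(action)          # step 6
            frames.append(obs)                            # step 4: update stack
            next_state = tuple(frames)
            replay.append((state, action, reward, next_state, done))  # step 7
            state, steps = next_state, steps + 1
        if len(replay) >= batch_size:                     # step 10
            for s, a, r, s2, d in rng.sample(list(replay), batch_size):
                target = td_target(r, target_model.predict(s2), d, gamma)  # step 8
                model.update(s, a, target, lr)
        # Step 11 would go here: a real model would save weights every N runs.
        if episode % target_sync_every == 0:              # step 12
            target_model.copy_from(model)
    return model, target_model, replay
```

Calling `train(episodes=20)` returns the online table, the target table, and the replay memory. The reward stored in step 7 is only consumed later, when `td_target` combines it with the target model's values for the next state; the previous state is what gets updated.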