Produced using the PGQ implementation from tensorflow-rl at commit daf75a33a20aae461a63c5b650b61216117b3f7b. Evaluation generated from the checkpoint at 260M agent steps.
To reproduce, run:
python main.py Boxing-v0 \
--initial_lr 0.00025 \
--momentum .99 \
--clip_norm_type ignore \
--frame_skip 2 4 \