Skip to content

Instantly share code, notes, and snippets.

A3C

Deep reinforcement learning using an asynchronous advantage actor-critic (A3C) model written in TensorFlow.

This AI does not rely on hand-engineered rules or features. Instead, it masters the environment by looking at raw pixels and learning from experience, just as humans do.

The Flappy Bird Gym environment was mastered in 48 hours of training using a CPU with 8 cores. Training and evaluation code is available at github.com/andreimuntean/a3c.

Dependencies

  • OpenAI Gym 0.8
  • TensorFlow 1.0
@sloria
sloria / bobp-python.md
Last active May 12, 2024 06:54
A "Best of the Best Practices" (BOBP) guide to developing in Python.

The Best of the Best Practices (BOBP) Guide for Python

A "Best of the Best Practices" (BOBP) guide to developing in Python.

In General

Values

  • "Build tools for others that you want to be built for you." - Kenneth Reitz
  • "Simplicity is alway better than functionality." - Pieter Hintjens