@awjuliani
Last active November 14, 2016 20:13
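# Add this to the network to compute Boltzmann (softmax) action probabilities.
# Temp is the temperature: higher values flatten the distribution (more exploration),
# lower values sharpen it toward the greedy action.
Temp = tf.placeholder(shape=None,dtype=tf.float32)
Q_dist = slim.softmax(Q_out/Temp)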
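# Use this for action selection.
t = 0.5  # Temperature; must be greater than 0.
Q_probs = sess.run(Q_dist,feed_dict={inputs:[state],Temp:t})
# Sample the action index directly. Sampling a probability value and then locating it
# with argmax always returns the first match when two actions share the same probability.
action = np.random.choice(len(Q_probs[0]),p=Q_probs[0])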
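
The same selection rule can be tried without a TensorFlow session. The block below is a minimal NumPy sketch, not part of the original gist: the helper names `boltzmann_probs` and `sample_action` and the example Q-values are hypothetical, chosen only to illustrate how the temperature reshapes the distribution and how an action index is sampled from it.

```python
import numpy as np

def boltzmann_probs(q_values, temperature):
    """Return softmax(q_values / temperature) as a 1-D probability vector."""
    if temperature <= 0:
        raise ValueError("temperature must be greater than 0")
    scaled = np.asarray(q_values, dtype=np.float64) / temperature
    scaled -= scaled.max()  # Shift for numerical stability; softmax is unchanged.
    exp = np.exp(scaled)
    return exp / exp.sum()

def sample_action(q_values, temperature, rng):
    """Sample an action index in proportion to its Boltzmann probability."""
    probs = boltzmann_probs(q_values, temperature)
    return int(rng.choice(len(probs), p=probs))

# Hypothetical Q-values with a tie between actions 1 and 2.
q_example = [1.0, 2.0, 2.0]
rng = np.random.default_rng(0)

probs_sharp = boltzmann_probs(q_example, 0.5)   # Low temperature: closer to greedy.
probs_flat = boltzmann_probs(q_example, 10.0)   # High temperature: closer to uniform.
action = sample_action(q_example, 0.5, rng)
```

Sampling over indices rather than over probability values means tied actions, such as actions 1 and 2 here, are each chosen with their own share of the probability.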