@horoiwa
Created May 2, 2023 09:17
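import tensorflow as tf


def update_value(self, states, actions):
    """Update the value network by expectile regression.

    Regresses V(s) toward the tau-expectile of the target Q-values.
    """
    # Take the minimum of the two target Q estimates. This is computed
    # outside the tape, so no gradient flows into the target Q-network.
    q1, q2 = self.target_qnet(states, actions)
    target_values = tf.minimum(q1, q2)

    with tf.GradientTape() as tape:
        values = self.valuenet(states)
        error = target_values - values
        # Asymmetric weights: tau where the target exceeds V(s), else 1 - tau.
        weights = tf.where(error > 0, self.tau, 1. - self.tau)
        loss = tf.reduce_mean(weights * tf.square(error))

    variables = self.valuenet.trainable_variables
    grads = tape.gradient(loss, variables)
    self.v_optimizer.apply_gradients(zip(grads, variables))
    return loss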
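
To see what the asymmetric weighting does, here is a minimal pure-Python sketch of the same loss applied to a single scalar estimate instead of a network. The helper names `expectile_loss` and `fit_expectile`, the toy targets, and the learning rate are assumptions made for illustration; none of them come from the gist.

```python
def expectile_loss(targets, values, tau):
    """Mean asymmetric squared error: weight tau if target > value, else 1 - tau."""
    total = 0.0
    for t, v in zip(targets, values):
        error = t - v
        weight = tau if error > 0 else 1.0 - tau
        total += weight * error ** 2
    return total / len(targets)


def fit_expectile(targets, tau, lr=0.1, steps=2000):
    """Plain gradient descent on one scalar v to minimise the expectile loss."""
    v = 0.0
    for _ in range(steps):
        grad = 0.0
        for t in targets:
            error = t - v
            weight = tau if error > 0 else 1.0 - tau
            # d/dv of weight * (t - v)^2
            grad += -2.0 * weight * error
        v -= lr * grad / len(targets)
    return v


# Hypothetical toy targets standing in for min(q1, q2) at a single state.
targets = [0.0, 1.0, 2.0, 10.0]
v_mean = fit_expectile(targets, tau=0.5)  # symmetric weights: recovers the mean
v_high = fit_expectile(targets, tau=0.9)  # pulled toward the larger targets
print(v_mean, v_high)
```

With `tau=0.5` the loss is half the mean squared error, so the fit settles at the mean of the targets (3.25 here). With `tau=0.9` underestimates are penalised nine times more than overestimates, and the fit moves up to about 7.75.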