@Chion82
Created April 2, 2020 15:25
Implementations of a discount function used to calculate discounted rewards in reinforcement learning
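import numpy as np
from scipy.signal import lfilter

def discount_readable(r, gamma):
    """ Compute the gamma-discounted rewards over an episode:
    discounted_r[t] = r[t] + gamma * r[t+1] + gamma**2 * r[t+2] + ...
    """
    # Float buffer, so integer rewards are not truncated on assignment.
    discounted_r, cumul_r = np.zeros_like(r, dtype=float), 0.0
    # Walk backwards so each step reuses the discounted return of the next one.
    for t in reversed(range(len(r))):
        cumul_r = r[t] + cumul_r * gamma
        discounted_r[t] = cumul_r
    return discounted_r

# Same result without the Python loop: the recurrence y[n] = x[n] + gamma * y[n-1]
# is a first-order IIR filter, applied here to the time-reversed rewards.
discount_wtf = lambda x, gamma: lfilter([1], [1, -gamma], x[::-1])[::-1]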
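
A quick way to see that the loop and the filter compute the same thing is to run both on a tiny episode. The sketch below is illustrative only: `discount_loop` and `discount_filter` are hypothetical names restating the two functions above so the block runs on its own, and the rewards and `gamma` are made-up values.

```python
import numpy as np
from scipy.signal import lfilter


def discount_loop(r, gamma):
    """Backward recurrence: G[t] = r[t] + gamma * G[t+1]."""
    out = np.zeros(len(r), dtype=float)
    running = 0.0
    for t in reversed(range(len(r))):
        running = r[t] + gamma * running
        out[t] = running
    return out


def discount_filter(r, gamma):
    """Same recurrence as a first-order IIR filter on the reversed rewards."""
    r = np.asarray(r, dtype=float)
    # lfilter([1], [1, -gamma], x) computes y[n] = x[n] + gamma * y[n-1].
    return lfilter([1], [1, -gamma], r[::-1])[::-1]


# Made-up three-step episode, chosen so the arithmetic is easy to follow.
rewards = [1.0, 0.0, 2.0]
gamma = 0.5

loop_result = discount_loop(rewards, gamma)
filter_result = discount_filter(rewards, gamma)

# By hand: G[2] = 2, G[1] = 0 + 0.5 * 2 = 1, G[0] = 1 + 0.5 * 1 = 1.5
print(loop_result)
print(filter_result)
```

Reversing the rewards before filtering turns the backward-in-time recurrence into the forward recurrence that `lfilter` evaluates; the second reversal puts the results back in episode order.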