class ReplayMemory:
    ...
    def update_all_q_values(self):
        """
        Update all Q-values in the replay-memory.
        When states and Q-values are added to the replay-memory, the
        Q-values have been estimated by the Neural Network. But we now
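The docstring above is cut off, so the actual update rule is not visible here. As a rough sketch only, assuming a standard backward sweep with a discounted one-step target, the idea of overwriting the stored estimates could look like the following. The function signature, the array names and the `discount_factor` value are all hypothetical and are not taken from the original class.

```python
import numpy as np


def update_all_q_values(q_values, actions, rewards, end_episode,
                        discount_factor=0.97):
    """Return a copy of q_values with the taken action's value updated.

    Hypothetical stand-alone sketch; the original method works on
    attributes of the ReplayMemory object instead of arguments.
    """
    q_values = q_values.copy()
    num_used = len(actions)

    # Sweep backwards so each target can use the already-updated
    # Q-values of the following step. The last stored step has no
    # successor in memory, so it is left as estimated.
    for k in reversed(range(num_used - 1)):
        if end_episode[k]:
            # No future reward after the end of an episode.
            target = rewards[k]
        else:
            # Assumed target: reward plus discounted best next Q-value.
            target = rewards[k] + discount_factor * np.max(q_values[k + 1])

        # Only the Q-value of the action that was actually taken changes.
        q_values[k, actions[k]] = target

    return q_values
```

Sweeping backwards means a reward at the end of an episode reaches earlier steps in a single pass.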
class NeuralNetwork:
    def __init__(self):
        # Initializer for the layers in the Neural Network.
        # If you change the architecture of the network, particularly
        # if you add or remove layers, then you may have to change
        # the stddev-parameter here. The initial weights must result
        # in the Neural Network outputting Q-values that are very close
        # to zero - but the network weights must not be too low either
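To illustrate why the stddev-parameter matters, here is a small NumPy sketch of a two-layer ReLU network with no biases, whose weights are drawn from a zero-mean normal distribution. The layer sizes, the plain (untruncated) normal distribution and the function name `initial_q_values` are assumptions made for this illustration; they are not the initializer used in the original class.

```python
import numpy as np


def initial_q_values(stddev, num_inputs=16, num_hidden=32, num_actions=4,
                     batch_size=8, seed=0):
    """Q-values output by a freshly initialized toy network (hypothetical)."""
    rng = np.random.default_rng(seed)

    # Random inputs in [0, 1), standing in for scaled image pixels.
    x = rng.random((batch_size, num_inputs))

    # Weights drawn from N(0, stddev^2); biases are omitted for simplicity.
    w1 = rng.normal(0.0, stddev, size=(num_inputs, num_hidden))
    w2 = rng.normal(0.0, stddev, size=(num_hidden, num_actions))

    hidden = np.maximum(x @ w1, 0.0)   # ReLU activation
    return hidden @ w2                 # one Q-value per action


small = initial_q_values(stddev=2e-2)
large = initial_q_values(stddev=1.0)
print(np.abs(small).max(), np.abs(large).max())
```

In this toy network the outputs scale with the square of stddev, so adding or removing layers changes how strongly stddev affects the initial Q-values. That is consistent with the comment's advice to revisit the parameter when the architecture changes.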
class NeuralNetwork:
    def __init__(self):
        ...
        # TensorFlow has a built-in loss-function for doing regression:
        # self.loss = tf.nn.l2_loss(self.q_values - self.q_values_new)
        # But it uses tf.reduce_sum() rather than tf.reduce_mean()
        # which is used by PrettyTensor. This means the scale of the
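The difference in scale between a summed and an averaged squared error can be seen in a few lines of NumPy. `l2_loss_sum` mirrors the formula of `tf.nn.l2_loss`, which is `sum(t ** 2) / 2`. `squared_error_mean` is only an assumed stand-in for a mean-based loss, since the exact PrettyTensor formula is not shown in the snippet.

```python
import numpy as np


def l2_loss_sum(diff):
    # Same formula as tf.nn.l2_loss: half the sum of squared elements.
    return np.sum(diff ** 2) / 2.0


def squared_error_mean(diff):
    # Assumed mean-based alternative: average of squared elements.
    return np.mean(diff ** 2)


# The same per-element error of 1.0, for two different batch sizes
# (4 actions per sample in this made-up example).
for batch_size in (32, 128):
    diff = np.ones((batch_size, 4))
    print(batch_size, l2_loss_sum(diff), squared_error_mean(diff))
```

The summed loss grows with the batch size and the number of actions, while the mean stays the same. A learning rate tuned for one of the two is therefore unlikely to suit the other.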
def _rgb_to_black_and_white(image):
    """
    Convert an RGB-image into gray-scale using a formula from Wikipedia:
    https://en.wikipedia.org/wiki/Grayscale
    """
    # Get the separate colour-channels.
    r, g, b = image[:, :, 0], image[:, :, 1], image[:, :, 2]
    # Convert to gray-scale using the Wikipedia formula.
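The snippet stops before the formula itself, so the weights the author used are not visible here. As a sketch, assuming the common Rec. 601 luma weights (0.299, 0.587, 0.114), which are one of the formulas listed on the Wikipedia page, a self-contained version could look like this. The function name `rgb_to_gray` is hypothetical.

```python
import numpy as np


def rgb_to_gray(image):
    """Convert an RGB image of shape (height, width, 3) to gray-scale."""
    # Get the separate colour-channels.
    r, g, b = image[:, :, 0], image[:, :, 1], image[:, :, 2]

    # Weighted sum of the channels. The weights are an assumption
    # (Rec. 601 luma) and sum to 1.0, so the value range is preserved.
    return 0.299 * r + 0.587 * g + 0.114 * b


# Example: a random 4x4 RGB image becomes a 4x4 gray-scale image.
image = np.random.default_rng(0).integers(0, 256, size=(4, 4, 3))
gray = rgb_to_gray(image)
print(gray.shape)
```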
class EpsilonGreedy:
    """
    The epsilon-greedy policy either takes a random action with
    probability epsilon, or it takes the action for the highest
    Q-value.
    If epsilon is 1.0 then the actions are always random.
    If epsilon is 0.0 then the actions are always argmax for the Q-values.
    Epsilon is typically decreased linearly from 1.0 to 0.1 during training
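The policy described in the docstring can be sketched with two small functions. This is a minimal illustration, not the original class: the names `linear_epsilon` and `choose_action`, and the default of one million iterations for the decay, are assumptions.

```python
import numpy as np


def linear_epsilon(iteration, start_value=1.0, end_value=0.1,
                   num_iterations=1_000_000):
    """Linearly decrease epsilon, then hold it at end_value."""
    fraction = min(iteration / num_iterations, 1.0)
    return start_value + fraction * (end_value - start_value)


def choose_action(q_values, epsilon, rng):
    """Random action with probability epsilon, otherwise the argmax."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))


rng = np.random.default_rng(0)
q_values = np.array([0.1, 0.7, 0.2])

# Early in training the action is almost always random; later it is
# mostly the action with the highest Q-value.
for iteration in (0, 500_000, 1_000_000):
    epsilon = linear_epsilon(iteration)
    print(iteration, epsilon, choose_action(q_values, epsilon, rng))
```

Because `rng.random()` returns values in [0, 1), an epsilon of 0.0 never triggers the random branch and an epsilon of 1.0 always does. This matches the two limiting cases in the docstring.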