Gandalf Saxe gandalfsaxe

## gist:4ca72ed004b832f4721e5274e2f1bb48
Automatically generated by Mendeley Desktop 1.17.13
Any changes to this file will be lost if it is regenerated by Mendeley.

BibTeX export options can be customized via Preferences -> BibTeX in Mendeley Desktop

@article{Such2017,
abstract = {Deep artificial neural networks (DNNs) are typically trained via gradient-based learning algorithms, namely backpropagation. Evolution strategies (ES) can rival backprop-based algorithms such as Q-learning and policy gradients on challenging deep reinforcement learning (RL) problems. However, ES can be considered a gradient-based algorithm because it performs stochastic gradient descent via an operation similar to a finite-difference approximation of the gradient. That raises the question of whether non-gradient-based evolutionary algorithms can work at DNN scales. Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid lo
	Automatically generated by Mendeley Desktop 1.17.13
	Any changes to this file will be lost if it is regenerated by Mendeley.

	BibTeX export options can be customized via Preferences -> BibTeX in Mendeley Desktop

	@article{Such2017,
	abstract = {Deep artificial neural networks (DNNs) are typically trained via gradient-based learning algorithms, namely backpropagation. Evolution strategies (ES) can rival backprop-based algorithms such as Q-learning and policy gradients on challenging deep reinforcement learning (RL) problems. However, ES can be considered a gradient-based algorithm because it performs stochastic gradient descent via an operation similar to a finite-difference approximation of the gradient. That raises the question of whether non-gradient-based evolutionary algorithms can work at DNN scales. Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid lo