# Newton's method in TensorFlow
# 'Vanilla' Newton's method is intended for the case where the loss function being optimized is convex.
# A one-layer linear network with no activation has a convex error surface.
# If the activation function is monotonic, the error surface associated with a single-layer model is also convex.
# In other cases, the Hessian will have negative eigenvalues at saddle points and in other non-convex regions of the surface.
# To fix that, you can try different methods. One approach is to eigendecompose H and flip the sign of its negative
# eigenvalues (i.e. use |H|), so that the update "pushes out" of the saddle in those directions, as described in the paper
# "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"
# (https://papers.nips.cc/paper/5486-identifying-and-attacking-the-saddle-point-problem-in-high-dimensional-non-convex-optimization.pdf).
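#
# --- Illustrative sketch (not part of the original gist) ---
# A minimal saddle-free Newton step for a small, flat parameter vector, assuming TensorFlow 2.x
# eager mode and a user-supplied scalar `loss_fn(w)` (both names are hypothetical). The Hessian is
# eigendecomposed and its negative eigenvalues replaced by their absolute values (|H|), so the
# update moves away from saddle points instead of toward them.
import tensorflow as tf


def saddle_free_newton_step(w, loss_fn, damping=1e-4):
    """One Newton step using |H| in place of H (saddle-free Newton sketch)."""
    # Nested tapes: the inner tape gives the gradient, the outer tape its Jacobian (the Hessian).
    with tf.GradientTape() as outer:
        with tf.GradientTape() as inner:
            loss = loss_fn(w)
        grad = inner.gradient(loss, w)      # shape [n]
    hess = outer.jacobian(grad, w)          # shape [n, n]

    # Eigendecomposition of the (symmetric) Hessian.
    eigvals, eigvecs = tf.linalg.eigh(hess)
    # |H| = V |Lambda| V^T; small damping keeps the inverse well conditioned near zero eigenvalues.
    abs_eigvals = tf.abs(eigvals) + damping
    h_abs_inv = eigvecs @ tf.linalg.diag(1.0 / abs_eigvals) @ tf.transpose(eigvecs)

    # Newton update: w <- w - |H|^{-1} g
    w.assign_sub(tf.linalg.matvec(h_abs_inv, grad))
    return loss


# Example usage on a tiny non-convex toy loss (hypothetical, for illustration only):
#   w = tf.Variable([1.0, -2.0])
#   toy_loss = lambda w: w[0] ** 2 - w[1] ** 2 + 0.1 * tf.reduce_sum(w ** 4)
#   for _ in range(20):
#       saddle_free_newton_step(w, toy_loss)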