Based on: https://sudeepraja.github.io/Neural/
- Input matrix is $x_0$
- Layer 1 weight matrix is $W_1$
- Layer 1 output is $x_1 = f_1(W_1x_0)$, where $f_1$ is the activation function for layer 1
- There are 4 layers (including input layer)
- Hence, network output is $x_3 = f_3(W_3x_2)$
- Assuming MSE loss function, with $t$ as target variable, $E = \frac{1}{2}\lVert x_3 - t \rVert_2^2$
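
The setup above amounts to the composition $x_3 = f_3(W_3 f_2(W_2 f_1(W_1 x_0)))$ followed by the loss $E$. Here is a minimal sketch of that forward pass in plain Python. The sigmoid activation, the layer sizes, the weight values, and the helper names (`matmul`, `sigmoid`, `forward`, `mse_loss`) are all illustrative assumptions; the notes do not fix any of them.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sigmoid(M):
    """Element-wise logistic sigmoid (an assumed choice of activation)."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in M]

def forward(x0, weights, activations):
    """Return [x_0, x_1, ..., x_L], where x_i = f_i(W_i x_{i-1})."""
    xs = [x0]
    for W, f in zip(weights, activations):
        xs.append(f(matmul(W, xs[-1])))
    return xs

def mse_loss(x_out, t):
    """E = 1/2 * ||x_out - t||_2^2, summed over all entries."""
    return 0.5 * sum((a - b) ** 2
                     for row_x, row_t in zip(x_out, t)
                     for a, b in zip(row_x, row_t))

# Hypothetical sizes: 2 inputs -> 3 hidden -> 3 hidden -> 1 output.
W1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
W2 = [[0.2, 0.1, -0.3], [0.0, 0.5, 0.4], [-0.1, 0.3, 0.2]]
W3 = [[0.3, -0.4, 0.6]]
x0 = [[1.0], [0.5]]  # a single sample stored as one column
t = [[1.0]]          # target for that sample

xs = forward(x0, [W1, W2, W3], [sigmoid, sigmoid, sigmoid])
x3 = xs[-1]
E = mse_loss(x3, t)
print("x3 =", x3, "E =", E)
```

`forward` keeps every intermediate $x_i$ rather than only the final output, since a backward pass would typically need them. Adding more columns to `x0` and `t` would process several samples at once, with the loss summed over all entries.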