Related tweets
- Today, we’re going to play a game I’m calling “IT’S JUST A LINEAR MODEL” (IJALM)...
- How about deep learning? Super non-linear, right? Well, as a function of some non-linear activations, it's IJALM...
Code is available in both Julia and R, with no dependencies: you can simply run it or play around with it.
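Before the code, the whole trick in one line: a two-layer network is just one linear model, passed through an activation, fed into another linear model. In the 1-dimensional dummy below (where both layers happen to reuse the same β0 and β), the network collapses to

network(x) = β0 + β * max(0, β0 + β * x)

which is exactly what the next block implements.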
A dummy, 1-dimensional architecture of a deep learning network
Using the mathematical notation of linear regression, β0 and β
#' Non-linear activation function e.g. Rectified Linear Unit (RELU)
activation(x) = max(0, x)
#' Layer by layer...the input is transformed using both linear and non-linear functions
β0, β = 1, 2 # Arbitrary numbers
layer_1(x) = activation( β0 + β * x ) # Linear and non-linear
layer_2(x) = β0 + β * x # Linear
network(x) = layer_2(layer_1(x)) # Multi-layer linear and non-linear
output = network(123)
#' Input = 123, output = 495
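To see where 495 comes from, the forward pass can be traced by hand with the β0 = 1 and β = 2 set above (a quick sanity check, reusing the definitions from the block just shown):

layer_1(123) # max(0, 1 + 2 * 123) = 247
layer_2(247) # 1 + 2 * 247 = 495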
A simple architecture of a two-layer deep learning network
Using the conventional mathematical notation of deep learning
input = [1, 2, 3] # Vector of size 3
#' Non-linear activation function e.g. Rectified Linear Unit (RELU)
activation(x) = max(0, x)
#' Weights and biases in matrix form, just like the β0 and β1, β2, ... of a linear model
#' Here, input dimension = 3, intermediate dimension = 5, output dimension = 1
W1, b1 = rand(5, 3), rand(5)
W2, b2 = rand(1, 5), rand(1)
#' Each layer is a simple matrix product (linear), optionally followed by the non-linear activation
layer_1(x) = activation.( W1 * x .+ b1) # Linear and non-linear
layer_2(x) = W2 * x .+ b2 # Linear
network(x) = layer_2(layer_1(x)) # Multi-layer linear and non-linear
input # 3-element Array{Int64,1}: 1 2 3
#' Layer by layer...the input is transformed using both linear and non-linear functions
#' (weights and biases are random, so exact values vary per run)
output = network(input) # 1-element Array{Float64,1}: 4.5
#' Internally
output_1 = layer_1(input) # 5-element Array{Float64,1}: 2.3 2.6 3.0 1.4 3.4
output = layer_2(output_1) # 1-element Array{Float64,1}: 4.5
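To drive the IJALM point home: each hidden unit is its own little linear model. The sketch below (unit_1 is an illustrative name, not part of the original listing) checks that the first entry of output_1 is just a plain β0 + β1*x1 + β2*x2 + β3*x3 passed through the activation:

unit_1 = W1[1, :]' * input + b1[1]  # One row of W1 plus one bias = one linear model
output_1[1] ≈ activation(unit_1)    # true: the network's first hidden unit, recovered by hand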
A dummy, 1-dimensional architecture of a deep learning network
Using the mathematical notation of linear regression, β0 and β
#' Non-linear activation function e.g. Rectified Linear Unit (RELU)
activation <- function(x) max(0, x)
#' Layer by layer...the input is transformed using both linear and non-linear functions
β0 <- 1 ; β <- 2 # Arbitrary numbers
layer_1 <- function(x) activation( β0 + β * x ) # Linear and non-linear
layer_2 <- function(x) β0 + β * x # Linear
network <- function(x) layer_2(layer_1(x)) # Multi-layer linear and non-linear
output <- network(123) # Input = 123, output = 495
A simple architecture of a two-layer deep learning network
Using the conventional mathematical notation of deep learning
input <- c(1, 2, 3) # Vector of size 3
#' Non-linear activation function e.g. Rectified Linear Unit (RELU)
activation <- function(x) sapply(x, max, 0) # Element-wise, so it also works on vectors and matrices
#' Weights and biases in matrix form, just like the β0 and β1, β2, ... of a linear model
#' Here, input dimension = 3, intermediate dimension = 5, output dimension = 1
W1 <- matrix(runif(5*3), 5) ; b1 <- runif(5)
W2 <- matrix(runif(1*5), 1) ; b2 <- runif(1)
#' Each layer is a simple matrix product (linear), optionally followed by the non-linear activation
layer_1 <- function(x) activation( W1 %*% x + b1 ) # Linear and non-linear
layer_2 <- function(x) W2 %*% x + b2 # Linear
network <- function(x) layer_2(layer_1(x)) # Multi-layer linear and non-linear
input # [1] 1 2 3
#' Layer by layer...the input is transformed using both linear and non-linear functions
#' (weights and biases are random, so exact values vary per run)
( output <- network(input) ) # [1] 10.311
#' Internally
( output_1 <- layer_1(input) ) # [1] 1.9659 3.0138 5.3722 3.0259 3.8878
( output <- layer_2(output_1) ) # [1] 10.311