An Artificial Neural Network (ANN) with a Multi-Layer Structure can be seen as a Cascade of Linear Combinations of Neurons Transfer Functions
In case of a Binary Classification Task the implicit goal for the ANN is to learn a Function which separates the ANN Input Space in 2 regions: one for each possible label. Let's call this function the Input Space Separation Function.
What if the Neuron Transfer Function would be linear ?
In that case the ANN would only be able to learn a Linear Input Space Separation Function hence