Data is often split into "signal" + "noise", with
|data|^2 = |signal|^2 + |noise|^2
where each |.|^2 is a sum of squares, data1^2 + data2^2 + ...
Sums of squares can be rather non-intuitive. For example,
101^2 = 99^2 + 20^2
|data|^2 = |signal|^2 + |noise|^2
|signal| / |data| = 99 / 101 ~ 98 % -- sounds good
|noise| / |data| = 20 / 101 ~ 20 % -- not so good
|signal| / |noise| = 99 / 20 ~ 5 -- ?
Giving only one of these ratios -- 98 %, 20 %, 5 -- is misleading. Giving all 3 numbers, though, can be confusing.
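To see all three ratios side by side, here is a small sketch of the 99 / 20 / 101 example above:

```python
import math

# The decomposition |data|^2 = |signal|^2 + |noise|^2, with the numbers above:
signal, noise = 99.0, 20.0
data = math.hypot(signal, noise)  # sqrt(99^2 + 20^2) = 101

print(f"|signal| / |data|  = {signal / data:.0%}")    # ~98 %
print(f"|noise|  / |data|  = {noise / data:.0%}")     # ~20 %
print(f"|signal| / |noise| = {signal / noise:.1f}")   # ~5
print(f"R^2 = |signal|^2 / |data|^2 = {signal**2 / data**2:.0%}")  # ~96 %
```

All four numbers describe the same split; which one gets quoted changes the impression considerably.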
What to do ? I like to print / plot "signal" not squared, and perhaps squared too, e.g.
PCA eigenvalues %: [ 2 4 6 8 9 11 12 13 15 16 ...
PCA variance %: [ 9 17 23 28 32 36 39 42 45 47 ...
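A sketch of how such a table might be printed; the data here are random, and the convention (cumulative percentages of the singular values, then of their squares) is my guess at the one used above:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 10))  # toy data
X -= X.mean(axis=0)  # center columns

# Singular values s are the "not squared" sizes; s^2 are proportional
# to the PCA eigenvalues, i.e. the variances along the components.
s = np.linalg.svd(X, compute_uv=False)

sv_pct  = 100 * np.cumsum(s)    / s.sum()       # cumulative, not squared
var_pct = 100 * np.cumsum(s**2) / (s**2).sum()  # cumulative, squared

print("singular value %:", sv_pct.round().astype(int))
print("variance %:      ", var_pct.round().astype(int))
```

Since the singular values are sorted in decreasing order, the squared (variance) percentages always run ahead of the unsquared ones, which is exactly the 98 % vs 96 % vs 20 % ambiguity again.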
Statisticians use a ratio called "R squared", which in this context is |signal|^2 / |data|^2, e.g. 99^2 / 101^2 ~ 96 % -- impressive ?
R^2 gives the 'percentage of variance explained' by the regression, an expression that, for most social scientists, is of doubtful meaning but great rhetorical value.
-- Wikipedia Explained variation
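For concreteness, a tiny least-squares fit on invented data, showing that the fitted values ("signal") and residuals ("noise") really do split the centered data Pythagorean-style, so R^2 = |signal|^2 / |data|^2:

```python
import numpy as np

# Invented data, just for illustration.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 50)
y = 3 * x + 0.3 * rng.standard_normal(50)
y -= y.mean()  # center, so |data|^2 is the total sum of squares

# Simple least-squares slope on centered x.
xc = x - x.mean()
slope = (xc @ y) / (xc @ xc)
fit = slope * xc   # "signal"
res = y - fit      # "noise"

# Residuals are orthogonal to the fit, hence the sums of squares add up:
print("Pythagoras holds:", np.allclose(y @ y, fit @ fit + res @ res))
print("R^2 =", (fit @ fit) / (y @ y))
```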
For a lovely picture of the squares that least-squares minimizes, see Coefficient of determination.
("Why sums of squares ?" I don't know of a brief answer for laymen, beyond "nice math", "commonly used"; comments welcome.)
cheers
-- denis