Skip to content

Instantly share code, notes, and snippets.

@nir-krakauer
Created June 15, 2018 16:24
Show Gist options
  • Save nir-krakauer/3aaf08d3f21d169635576bf70949d31d to your computer and use it in GitHub Desktop.
Save nir-krakauer/3aaf08d3f21d169635576bf70949d31d to your computer and use it in GitHub Desktop.
Reduced rank regression function
## Copyright (C) 2018 Nir Krakauer mail@nirkrakauer.net
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; If not, see <http://www.gnu.org/licenses/>.
##RRR_BASIC Basic multivariate reduced-rank regression.
## Given an n-by-T matrix X and an m-by-T matrix Y, both with zero row means,
## find r-by-n B and m-by-r A that minimize the sum of squares of Y - ABX.
##
## The minimizer of the weighted sum of squares sumsq(W(Y - Aw*Bw*X)), where W is m by m with inverse Wi,
## can be found with the substitutions Yw = W*Y, [A, Bw] = rrr_basic (Yw, X, r), and Aw=Wi*A.
##
## References:
## Reinsel, G. C. & Velu, R. P. (1998)
## Multivariate Reduced-Rank Regression : Theory and Applications, Springer, Chapter 2.
## Mo, R. (2003)
## Efficient algorithms for maximum covariance analysis of datasets with many variables and fewer realizations: a revisit
## Journal of Atmospheric and Oceanic Technology, 20(12):1804-1809.
function [A, B] = rrr_basic (Y, X, r)
n = size (X, 1);
m = size (Y, 1);
T = size (X, 2); #or size (Y, 2)
few_T = (T < n) & (T < m); #case with few realizations compared to number of variables, in which case QR decomposition is used to reduce computation and memory requirements compared to directly computing Y*X' and its SVD
[U, S, ~] = svd (X, 'econ'); #or [Q, R] = qr (X, 0); [U, S, ~] = svd (R, 'econ'); U = Q * U;, but this isn't obviously faster, even in the few_T case
s = diag(S);
rX = sum (s > max(n, T)*eps*s(1)); #numeric rank of X
U = U(:, 1:rX); s = s(1:rX); #discard effectively-zero singular values for computing (pseudo)inverse
#iZ = T*U*diag(1 ./ (s.^2))*U'; # = inv((X*X') / T)
if few_T
[Qy, Ry] = qr (Y, 0);
[U, S, ~] = svd (Ry*X'*U*diag(1 ./ s)/sqrt(T), 'econ');
U = Qy * U;
else
Z = sqrt(T)*U*diag(1 ./ s)*U'; # = sqrtm(inv((X*X') / T))
Sh_yx = (Y*X') / T;
R = Sh_yx*Z;
[U, S, ~] = svd (R, 'econ');
endif
A = U(:, 1:r);
B = (U(:, 1:r)'*Y) / X; #or U(:, 1:r)'*Sh_yx*iZ, or U(:, 1:r)'*Y*X'*Ux*diag(1 ./ (s.^2))*Ux'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment