@SirmaXX
Last active April 27, 2022 19:39
CHAPTER10 Solution power of (-1/2)
---
title: "R Notebook"
output: html_notebook
---
# Chapter 10: Canonical Correlation Analysis (p. 539)
This chapter builds on the partitioning of the covariance matrix (Chapter 2, p. 73). Canonical correlation analysis is abbreviated CCA.
CCA seeks to identify and quantify the associations between two sets of variables.
The two sets are denoted $X^{(1)}$ and $X^{(2)}$.
(Developed by Hotelling.)
CCA focuses on the correlation between a linear combination of the variables in one set and a linear combination of the variables in the other set.
The idea in CCA is first to determine the pair of linear combinations having the largest correlation.
(Possible exam question: what is the purpose of CCA?)
Next, we determine the pair of linear combinations having the largest correlation among all pairs uncorrelated with the initially selected pair, and so on.
The pairs of linear combinations are called the **canonical variables**,
and their correlations are called the **canonical correlations**.
## Canonical variates and canonical correlations
For two random vectors (data arrays) $X^{(1)}$ ($p \times 1$) and $X^{(2)}$ ($q \times 1$), the mean vectors are $E(X^{(1)})=\mu^{(1)}$ and $E(X^{(2)})=\mu^{(2)}$.
(The covariance matrices must be of full rank; otherwise they cannot be inverted.)
The joint covariance matrix is partitioned as
$$\Sigma = Cov(X)=
\begin{bmatrix}
\Sigma_{11} & \Sigma_{12} \\
\Sigma_{21} & \Sigma_{22}\\
\end{bmatrix}
$$
where the blocks are the covariances
$Cov(X^{(1)})=\Sigma_{11}$, $Cov(X^{(2)})=\Sigma_{22}$, and $Cov(X^{(1)},X^{(2)})=\Sigma_{12}=\Sigma_{21}'$.
The stacked data vector is
$X= [X^{(1)'},X^{(2)'}]'=[X_1^{(1)},X_2^{(1)},\dots,X_p^{(1)} \mid X_1^{(2)},X_2^{(2)},\dots,X_q^{(2)}]'$
with mean vector
$E(X)=\mu= [\mu^{(1)'},\mu^{(2)'}]'$, of dimension $(p+q) \times 1$.
Linear combinations provide simple summary measures of a set of variables.
Set $U=a' X^{(1)}$ and $V=b' X^{(2)}$ for some pair of coefficient vectors $a$ and $b$. Then, using the results from Chapter 2, we obtain

$$Var(U)=a'\,Cov(X^{(1)})\,a= a' \Sigma_{11} a$$
$$Var(V)=b'\,Cov(X^{(2)})\,b= b' \Sigma_{22} b$$
$$Cov(U,V)=a' \Sigma_{12} b$$
$$Corr(U,V)=\frac{a' \Sigma_{12} b}{\sqrt{a' \Sigma_{11} a}\sqrt{b' \Sigma_{22} b}}$$
Note: review what scalars, vectors, and matrices are.
If there are several linear combinations, their correlations form a matrix.
The correlation should be as large as possible.
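
The correlation formula for $U=a'X^{(1)}$ and $V=b'X^{(2)}$ can be tried out numerically. The following is only a minimal sketch: the function name `corr_uv` and the coefficient vectors `a` and `b` are hypothetical choices for illustration, and the correlation blocks are borrowed from the example at the end of these notes.

```r
# Correlation blocks (same values as the example at the end of these notes).
S11 <- matrix(c(1.0, 0.4,
                0.4, 1.0), ncol = 2, byrow = TRUE)
S22 <- matrix(c(1.0, 0.2,
                0.2, 1.0), ncol = 2, byrow = TRUE)
S12 <- matrix(c(0.5, 0.6,
                0.3, 0.4), ncol = 2, byrow = TRUE)

# Hypothetical helper: Corr(U, V) for U = a'X1 and V = b'X2.
corr_uv <- function(a, b, S11, S12, S22) {
  cov_uv <- drop(t(a) %*% S12 %*% b)  # a' S12 b
  var_u  <- drop(t(a) %*% S11 %*% a)  # a' S11 a
  var_v  <- drop(t(b) %*% S22 %*% b)  # b' S22 b
  cov_uv / sqrt(var_u * var_v)
}

a <- c(1, 0)  # illustrative choice: U is the first variable of set 1
b <- c(0, 1)  # illustrative choice: V is the second variable of set 2

r_plain  <- corr_uv(a, b, S11, S12, S22)          # equals S12[1, 2]
r_scaled <- corr_uv(2 * a, 5 * b, S11, S12, S22)  # rescaling a and b changes nothing
c(r_plain, r_scaled)
```

Because the correlation does not depend on the scale of `a` and `b`, the coefficient vectors can be normalized so that $U$ and $V$ have unit variance, which is what the definitions of the canonical variables require.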
### We define the following
The first pair of canonical variables (canonical variates) is the pair of linear combinations $U_{1}$, $V_{1}$ having unit variances that maximizes the correlation.
The second pair of canonical variables is the pair of linear combinations $U_2$, $V_{2}$ having unit variances that maximizes the correlation among all choices uncorrelated with the first pair of canonical variables.
....
At the $k^{th}$ step,
the $k^{th}$ pair of canonical variables is the pair of linear combinations $U_k$, $V_{k}$ having unit variances that maximizes the correlation among all choices uncorrelated with the previous $k-1$ canonical variable pairs.
(Definitions such as canonical variables and canonical correlation may be asked on the exam; memorize them.)
The correlation between the $k^{th}$ pair of canonical variables is called the $k^{th}$ canonical correlation.
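(In CCA our goal is the largest possible correlation, i.e. the maximum of $Corr(U,V)$.)
$\max_{a,b} Corr(U,V)=\rho_1^{*}$ is attained by the linear combinations $U_1=a_1'X^{(1)}$ and $V_1=b_1'X^{(2)}$.
(Compare with standard PCA.)
Note: to obtain the coefficient vectors, find the eigenvectors.
The spectral decomposition $A=\sum_{i=1}^{k} \lambda_i e_i e_i'$ of a symmetric positive definite matrix helps to find the inverse of a covariance matrix and its inverse square root:
$$A^{-1}=\sum_{i=1}^{k} \frac{1}{\lambda_i} e_i e_i', \qquad A^{-1/2}=\sum_{i=1}^{k} \frac{1}{\sqrt{\lambda_i}} e_i e_i'$$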
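
The spectral decomposition can be turned into a small matrix-power routine. This is a sketch under the assumption that `A` is symmetric and positive definite; `mat_power` is a hypothetical helper name and the matrix `A` is only an illustrative value.

```r
# Hypothetical helper: A^n = sum_i lambda_i^n e_i e_i' for symmetric positive definite A.
mat_power <- function(A, n) {
  eig <- eigen(A, symmetric = TRUE)
  eig$vectors %*% diag(eig$values^n, nrow = nrow(A)) %*% t(eig$vectors)
}

A <- matrix(c(1.0, 0.4,
              0.4, 1.0), ncol = 2, byrow = TRUE)  # illustrative matrix

A_inv      <- mat_power(A, -1)    # inverse
A_inv_half <- mat_power(A, -0.5)  # inverse square root

all.equal(A_inv, solve(A))                   # agrees with the direct inverse
all.equal(A_inv_half %*% A_inv_half, A_inv)  # (A^{-1/2})^2 = A^{-1}
```

Only the eigenvalues are raised to the power; the eigenvectors stay the same, which is why a non-integer power such as $-1/2$ poses no difficulty.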
$$U_{1}=e_{1}' \Sigma_{11}^{-1/2} X^{(1)}, \qquad V_{1}=f_{1}' \Sigma_{22}^{-1/2} X^{(2)}$$
Here $\rho_1^{*2} \ge \rho_2^{*2} \ge \dots \ge \rho_p^{*2}$ are the eigenvalues of $\Sigma_{11}^{-1/2}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11}^{-1/2}$ (the canonical correlations $\rho_k^{*}$ are their square roots),
and $e_1,e_2,\dots,e_p$ are the associated eigenvectors.
$f_{1},f_{2},\dots,f_{p}$ are the corresponding eigenvectors of $\Sigma_{22}^{-1/2}\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1/2}$.
Each $f_i$ is proportional to $\Sigma_{22}^{-1/2}\Sigma_{21}\Sigma_{11}^{-1/2} e_i$ (a matrix times the vector $e_i$).
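
The relation between the eigenvectors $e_i$ and $f_i$ can be checked numerically. The sketch below assumes the correlation blocks of the example at the end of these notes; `mat_power`, `M1`, `M2`, and `g1` are hypothetical names introduced only for this illustration.

```r
# Correlation blocks (same values as the example at the end of these notes).
S11 <- matrix(c(1.0, 0.4, 0.4, 1.0), ncol = 2, byrow = TRUE)
S22 <- matrix(c(1.0, 0.2, 0.2, 1.0), ncol = 2, byrow = TRUE)
S12 <- matrix(c(0.5, 0.6, 0.3, 0.4), ncol = 2, byrow = TRUE)
S21 <- t(S12)

# Hypothetical helper: symmetric matrix power via the spectral decomposition.
mat_power <- function(A, n) {
  eig <- eigen(A, symmetric = TRUE)
  eig$vectors %*% diag(eig$values^n, nrow = nrow(A)) %*% t(eig$vectors)
}

S11_ih <- mat_power(S11, -0.5)
S22_ih <- mat_power(S22, -0.5)

# The two matrices whose eigenvectors are e_i and f_i.
M1 <- S11_ih %*% S12 %*% solve(S22) %*% S21 %*% S11_ih
M2 <- S22_ih %*% S21 %*% solve(S11) %*% S12 %*% S22_ih

eig1 <- eigen(M1, symmetric = TRUE)
eig2 <- eigen(M2, symmetric = TRUE)
e1 <- eig1$vectors[, 1]
f1 <- eig2$vectors[, 1]

# Direction of S22^{-1/2} S21 S11^{-1/2} e1, normalized to unit length.
g1 <- drop(S22_ih %*% S21 %*% S11_ih %*% e1)
g1 <- g1 / sqrt(sum(g1^2))

abs(sum(f1 * g1))                    # 1 up to rounding: f1 and g1 are parallel
all.equal(eig1$values, eig2$values)  # both matrices share the same eigenvalues
```

The absolute value is taken because eigenvectors are only determined up to sign.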
## Uncorrelated canonical variates
The canonical variates have unit variances, $Var(U_k)=Var(V_k)=1$, and satisfy:

1. $Cov(U_k,U_l)=0 \quad k\neq l$
2. $Cov(V_k,V_l)=0 \quad k\neq l$
3. $Cov(U_k,V_l)=0 \quad k\neq l$, for $k,l=1,\dots,p$
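
These properties can be verified numerically by collecting the coefficient vectors as columns of two matrices. This is only a sketch: the names `mat_power`, `C`, `A_coef`, and `B_coef` are hypothetical, the blocks are those of the example at the end of these notes, and the $f_k$ are built from the $e_k$ so that the signs of each pair match.

```r
# Correlation blocks (same values as the example at the end of these notes).
S11 <- matrix(c(1.0, 0.4, 0.4, 1.0), ncol = 2, byrow = TRUE)
S22 <- matrix(c(1.0, 0.2, 0.2, 1.0), ncol = 2, byrow = TRUE)
S12 <- matrix(c(0.5, 0.6, 0.3, 0.4), ncol = 2, byrow = TRUE)

# Hypothetical helper: symmetric matrix power via the spectral decomposition.
mat_power <- function(A, n) {
  eig <- eigen(A, symmetric = TRUE)
  eig$vectors %*% diag(eig$values^n, nrow = nrow(A)) %*% t(eig$vectors)
}

S11_ih <- mat_power(S11, -0.5)
S22_ih <- mat_power(S22, -0.5)
C <- S11_ih %*% S12 %*% S22_ih       # C C' = S11^{-1/2} S12 S22^{-1} S21 S11^{-1/2}

eig <- eigen(C %*% t(C), symmetric = TRUE)
rho <- sqrt(eig$values)              # canonical correlations
E   <- eig$vectors                   # columns e_k
Fm  <- t(C) %*% E %*% diag(1 / rho)  # columns f_k, scaled to unit length

A_coef <- S11_ih %*% E               # columns a_k, with U_k = a_k' X1
B_coef <- S22_ih %*% Fm              # columns b_k, with V_k = b_k' X2

cov_U  <- t(A_coef) %*% S11 %*% A_coef  # identity: Var(U_k) = 1, Cov(U_k, U_l) = 0
cov_V  <- t(B_coef) %*% S22 %*% B_coef  # identity: Var(V_k) = 1, Cov(V_k, V_l) = 0
cov_UV <- t(A_coef) %*% S12 %*% B_coef  # diagonal with rho_k, zero off the diagonal

round(cov_U, 10)
round(cov_V, 10)
round(cov_UV, 10)
```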
### Example (p. 543)
Suppose $Z^{(1)}=[Z_1^{(1)},Z_2^{(1)}]'$ are standardized variables and $Z^{(2)}=[Z_1^{(2)},Z_2^{(2)}]'$ are also standardized variables.
Source: https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html

    # Correlation blocks of the standardized variables
    p11 <- matrix(c(1.0, 0.4,
                    0.4, 1.0), ncol = 2, byrow = TRUE)
    p12 <- matrix(c(0.5, 0.6,
                    0.3, 0.4), ncol = 2, byrow = TRUE)
    p21 <- matrix(c(0.5, 0.3,
                    0.6, 0.4), ncol = 2, byrow = TRUE)
    p22 <- matrix(c(1.0, 0.2,
                    0.2, 1.0), ncol = 2, byrow = TRUE)
    # Matrix power of a symmetric matrix via its spectral decomposition
    "%^%" <- function(x, n)
      with(eigen(x), vectors %*% (values^n * t(vectors)))
    # p11^(-1/2) p12 p22^(-1) p21 p11^(-1/2)
    result <- (p11 %^% (-0.5)) %*% p12 %*% solve(p22) %*% p21 %*% (p11 %^% (-0.5))
    result
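
The matrix `result` is the one whose eigenvalues are the squared canonical correlations, so the example can be finished along the following lines. This is a sketch, not part of the original notes: it repeats the setup so that the chunk runs on its own, and the names `eig`, `rho_sq`, `rho`, and `a1` are assumptions introduced here.

```r
# Setup repeated from the example so this chunk is self-contained.
p11 <- matrix(c(1.0, 0.4, 0.4, 1.0), ncol = 2, byrow = TRUE)
p12 <- matrix(c(0.5, 0.6, 0.3, 0.4), ncol = 2, byrow = TRUE)
p21 <- t(p12)
p22 <- matrix(c(1.0, 0.2, 0.2, 1.0), ncol = 2, byrow = TRUE)

"%^%" <- function(x, n)
  with(eigen(x), vectors %*% (values^n * t(vectors)))

result <- (p11 %^% (-0.5)) %*% p12 %*% solve(p22) %*% p21 %*% (p11 %^% (-0.5))

eig    <- eigen(result, symmetric = TRUE)
rho_sq <- eig$values   # squared canonical correlations
rho    <- sqrt(rho_sq) # canonical correlations
round(rho, 2)          # approximately 0.74 and 0.03

# Coefficient vector of the first canonical variate U1 = a1' Z1 (sign is arbitrary).
a1 <- drop((p11 %^% (-0.5)) %*% eig$vectors[, 1])
a1

# Check: U1 has unit variance.
drop(t(a1) %*% p11 %*% a1)
```

The second canonical correlation is close to zero, so almost all of the linear association between the two sets is carried by the first pair of canonical variates.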