Skip to content

Instantly share code, notes, and snippets.

@bvancil
Created March 28, 2023 19:59
Show Gist options
  • Save bvancil/85ef5d28ce7cd23519462158219918cb to your computer and use it in GitHub Desktop.
Save bvancil/85ef5d28ce7cd23519462158219918cb to your computer and use it in GitHub Desktop.
Compositional PCA
---
title: "Compositional PCA"
---
## Problem
From [https://sciencemastodon.com/\@Alex_Koiter/109909545205270622](https://sciencemastodon.com/@Alex_Koiter/109909545205270622), "Question for the [#soil](https://fosstodon.org/tags/soil) and [#stats](https://fosstodon.org/tags/stats) peeps. Can you include the percent sand, silt, and clay in to a PCA? The reason I am asking is that the three variables always sum to 100% My lurking on stack exchange etc seems to suggest that it is ok, but some how it doesn't feel right."
Let's check this for three variables. We don't have a proof but suspect different eigenvalues to $X^TX$ . Let's try to shoot for a negative case.
## Trying for a counterexample
First, we'll create some sample data.
```{r}
library(withr)
get_sample_data <- function(n = 100L) {
withr::with_seed(20230320L, {
x1 <- runif(n, 0, 1)
x2 <- runif(n, 0, 1-x1)
x3 <- 1 - x1 - x2
X <- cbind(x1, x2, x3)
X
})
}
X <- get_sample_data()
```
Next let's do the PCA with all three variables.
```{r}
pca_X <- princomp(X, cor = TRUE)
print(summary(pca_X))
print(loadings(pca_X))
biplot(pca_X)
```
We compare this to the PCA with two variables.
```{r}
pca_X12 <- princomp(X[,1:2], cor = TRUE)
print(summary(pca_X12))
print(loadings(pca_X12))
biplot(pca_X12)
```
Are the PCA-coded loadings the same?
```{r}
X_prime <- predict(pca_X, X)
X12_prime <- predict(pca_X12, X[,1:2])
head(X_prime)
head(X12_prime)
cor(t(X_prime[,1:2])) - cor(t(X12_prime))
```
## Compositional approach
```{r}
library(compositions)
```
```{r}
X_comp <- acomp(X)
pca_X_comp <- princomp(X_comp)
summary(pca_X_comp)
plot(pca_X_comp, type="biplot")
plot(X_comp)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment