@PeteHaitch
Created August 11, 2017 14:31
colCounts subject to (in)equality
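library(profmem)  # memory profiling: profmem(), total()
library(Matrix)   # sparse matrix classes, readMM(), colSums()

# Read a sparse matrix in Matrix Market format (triplet form), then convert it
# to compressed sparse column form (dgCMatrix)
dgt <- readMM("~/sparse.mtx.gz")
dgc <- as(dgt, "dgCMatrix")
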
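# Number of zero entries in each column of a sparse matrix
colCountsEqualZero <- function(x) {
  if (length(x@x) / length(x) > 0.5) {
    # More than half of the entries are stored, i.e. mostly non-zero:
    # count the zeros directly
    Matrix::colSums(x == 0)
  } else {
    # Mostly zero: count the non-zero entries and subtract from the row count.
    # `x != 0` keeps the logical result sparse and also covers negative values.
    nrow(x) - Matrix::colSums(x != 0)
  }
}
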
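# Number of entries in each column that are less than `threshold`
colCountsLessThan <- function(x, threshold) {
  if (length(x@x) / length(x) > 0.5) {
    # Mostly non-zero: count directly
    Matrix::colSums(x < threshold)
  } else {
    # Mostly zero: count the complement and subtract from the row count.
    # `x >= threshold` is sparse only when threshold > 0.
    nrow(x) - Matrix::colSums(x >= threshold)
  }
}
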
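# Number of entries in each column that are greater than `threshold`
colCountsGreaterThan <- function(x, threshold) {
  if (length(x@x) / length(x) > 0.5) {
    # Mostly non-zero: count the complement and subtract from the row count
    nrow(x) - Matrix::colSums(x <= threshold)
  } else {
    # Mostly zero: count directly.
    # `x > threshold` is sparse only when threshold >= 0.
    Matrix::colSums(x > threshold)
  }
}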
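
# A minimal sanity check of the complement identity used in the mostly-zero
# branches above: count(x < t) == nrow(x) - count(x >= t). This is only a
# sketch: the toy matrix `m`, its dimensions, its density, and the threshold 5
# are assumptions chosen for illustration and are not taken from sparse.mtx.gz.
library(Matrix)
set.seed(1)
m <- rsparsematrix(100, 20, density = 0.1,
                   rand.x = function(n) as.numeric(sample(1:10, n, replace = TRUE)))
# Reference count computed on the dense copy of the toy matrix
direct <- colSums(as.matrix(m) < 5)
# Complement count; `m >= 5` stays sparse because 0 >= 5 is FALSE
viaComplement <- nrow(m) - Matrix::colSums(m >= 5)
stopifnot(isTRUE(all.equal(direct, viaComplement)))
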
# Total memory allocated by each call, in bytes, as reported by profmem
total(profmem(colCountsEqualZero(dgc)))
#> [1] 295577544
total(profmem(colCountsLessThan(dgc, 5)))
#> [1] 787204208
total(profmem(colCountsGreaterThan(dgc, 4)))
#> [1] 787098032
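
# A possible alternative for the equal-to-zero case, sketched under the
# assumption that the input is a dgCMatrix: count the stored non-zero values
# per column straight from the @p and @x slots, which avoids building an
# intermediate logical matrix. colCountsEqualZeroSlots() is a hypothetical
# helper, not part of the original gist, and it has not been profiled here.
library(Matrix)
colCountsEqualZeroSlots <- function(x) {
  stopifnot(is(x, "dgCMatrix"))
  # diff(x@p) is the number of stored entries in each column, so this
  # recovers the 1-based column index of every stored entry
  j <- rep.int(seq_len(ncol(x)), diff(x@p))
  # Stored entries may include explicit zeros, so count only the non-zero ones
  nonzero <- tabulate(j[x@x != 0], nbins = ncol(x))
  nrow(x) - nonzero
}
# Usage on the matrix loaded above: colCountsEqualZeroSlots(dgc)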