@njtierney
Created January 23, 2023 03:04
# ABS - what is statistically special about these numbers?
set_one <- c(5, 2, 7, 3, 5, 1)
set_two <- c(9, 6, 3, 5, 8, 6)

summary(set_one)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   1.000   2.250   4.000   3.833   5.000   7.000
summary(set_two)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   3.000   5.250   6.000   6.167   7.500   9.000

# nothing seems the same here so far

# variance?
var(set_one)
#> [1] 4.966667
var(set_two)
#> [1] 4.566667

# close, but no

# standard deviation?
sd(set_one)
#> [1] 2.228602
sd(set_two)
#> [1] 2.136976

# they have the same median absolute deviation?
mad(set_one)
#> [1] 2.2239
mad(set_two)
#> [1] 2.2239

mad(set_one) == mad(set_two)
#> [1] TRUE
identical(mad(set_one), mad(set_two))
#> [1] TRUE

# Answer: they have the same median absolute deviation
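
# A minimal sketch of what mad() computes under its defaults, assuming the
# documented default scaling constant of 1.4826. `manual_mad` is a
# hypothetical helper name used for illustration, not part of base R.
manual_mad <- function(x, constant = 1.4826) {
  # median of the absolute deviations from the median, then rescaled
  constant * median(abs(x - median(x)))
}

manual_mad(set_one)
#> [1] 2.2239
manual_mad(set_two)
#> [1] 2.2239

# without the scaling constant, both sets have a raw MAD of 1.5
mad(set_one, constant = 1)
#> [1] 1.5
mad(set_two, constant = 1)
#> [1] 1.5

# `==` on doubles can be brittle in general; all.equal() compares with a
# tolerance and agrees here
isTRUE(all.equal(mad(set_one), mad(set_two)))
#> [1] TRUE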

# other ideas
# population vs sample standard deviation
# it won't be the same, but for the sake of being exhaustive:

# sd() and var() in R divide by n - 1 in their calculation - the sample calculation
# we can try using the *population* calculation, which divides by n

# sum_squares = sum((xi - xbar)^2)
# var = sum_squares / n
# sd = sqrt(var)

sum_squares <- function(x){
  mean_x <- mean(x)
  res <- (x - mean_x)
  sum(res^2)
}

# dividing by n (not n - 1) gives the *population* variance
population_var <- function(x){
  sum_squares(x) / (length(x))
}

population_var(set_one)
#> [1] 4.138889
population_var(set_two)
#> [1] 3.805556
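
# A quick cross-check, as a sketch: dividing the sum of squares by n is the
# same as rescaling var() by (n - 1) / n. `pop_var_from_var` is a
# hypothetical helper name used for illustration.
pop_var_from_var <- function(x) {
  n <- length(x)
  var(x) * (n - 1) / n
}

pop_var_from_var(set_one)
#> [1] 4.138889
pop_var_from_var(set_two)
#> [1] 3.805556

# the population standard deviations (sd = sqrt(var)) differ as well
sqrt(pop_var_from_var(set_one)) == sqrt(pop_var_from_var(set_two))
#> [1] FALSE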

sort(set_one)
#> [1] 1 2 3 5 5 7
sort(set_two)
#> [1] 3 5 6 6 8 9

# someone suggested that, once sorted, the sets share the same mean, median, and mode
# unfortunately this isn't the case (and sorting them doesn't change the
# result of any of these calculations anyway)
identical(summary(sort(set_one)), summary(set_one))
#> [1] TRUE
identical(summary(sort(set_two)), summary(set_two))
#> [1] TRUE

# but let's calculate the mode anyway:
modes <- function(x) {
  ux <- unique(x)
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}

modes(set_one)
#> [1] 5
modes(set_two)
#> [1] 6

# and show that sorting doesn't change the mode
modes(sort(set_one))
#> [1] 5
modes(sort(set_two))
#> [1] 6
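
# A small illustration of why the helper is plural: when values tie for the
# highest count, modes() returns all of them, in order of first appearance.
# The input vector here is made up for the example.
modes(c(2, 1, 1, 2, 3))
#> [1] 2 1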

# graphic relationship?
plot(set_one, set_two)

# nah

Created on 2023-01-23 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Monterey 12.3.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Australia/Hobart
#>  date     2023-01-23
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.4.1   2022-09-23 [1] CRAN (R 4.2.0)
#>  curl          4.3.3   2022-10-06 [1] CRAN (R 4.2.0)
#>  digest        0.6.30  2022-10-18 [1] CRAN (R 4.2.0)
#>  evaluate      0.18    2022-11-07 [1] CRAN (R 4.2.0)
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.0)
#>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.2.0)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.2.0)
#>  htmltools     0.5.3   2022-07-18 [1] CRAN (R 4.2.0)
#>  httr          1.4.4   2022-08-17 [1] CRAN (R 4.2.0)
#>  knitr         1.41    2022-11-18 [1] CRAN (R 4.2.0)
#>  lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.2.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
#>  mime          0.12    2021-09-28 [1] CRAN (R 4.2.0)
#>  purrr         0.3.5   2022-10-06 [1] CRAN (R 4.2.0)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.2.0)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.2.0)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.2.0)
#>  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.2.0)
#>  rmarkdown     2.18    2022-11-09 [1] CRAN (R 4.2.0)
#>  rstudioapi    0.14    2022-08-22 [1] CRAN (R 4.2.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
#>  stringi       1.7.8   2022-07-11 [1] CRAN (R 4.2.0)
#>  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.2.0)
#>  styler        1.8.1   2022-11-07 [1] CRAN (R 4.2.0)
#>  vctrs         0.5.1   2022-11-16 [1] CRAN (R 4.2.0)
#>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun          0.35    2022-11-16 [1] CRAN (R 4.2.0)
#>  xml2          1.3.3   2021-11-30 [1] CRAN (R 4.2.0)
#>  yaml          2.3.6   2022-10-18 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────