Question

Dear #rstats community, I have a problem. I have a df with multiple char columns which I want to coerce to factors and a list of pre-defined factor level orders (vectors) that I want to apply in the process. Now apparently this isn't a wide-spread thing to do. I am using #tidyverse code.

My approach of

df <- df |> mutate(across(all_of(charvars), ~factor(., levels= levellist[[which(factors==cur_column())]])))

throws me an error telling me it "cant compute the first column ("who")" followed by "internal error"...backtrace attached. My problem is: I used the same approach literally 3 lines above on different subsets before and it works!

I'm stuck. Help would be very appreciated.

https://infosec.exchange/@odr_k4tana/110515347173876260

Code

Here's a simpler approach: index directly into a named list of levels using the current column name, rather than looking up a position with which(). This relies on the list names matching the column names:

library(dplyr)

df <- tibble(lower = c("b", "c", "a"), upper = c("C", "A", "B"))

# The desired factor levels are: a, b, c and A, B, C.
factor_levels <- list(lower = letters[1:3], upper = LETTERS[1:3])

df <- df |>
  mutate(
    across(
      where(is.character),
      \(.x) factor(.x, levels = factor_levels[[cur_column()]])
    )
  )

# Both columns are factors now.
df
#> # A tibble: 3 × 2
#>   lower upper
#>   <fct> <fct>
#> 1 b     C    
#> 2 c     A    
#> 3 a     B

# And their levels are in the desired order:
str(df)
#> tibble [3 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ lower: Factor w/ 3 levels "a","b","c": 2 3 1
#>  $ upper: Factor w/ 3 levels "A","B","C": 3 1 2

Created on 2023-06-09 with reprex v2.0.2
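
One caveat with where(is.character): it selects every character column, including any that have no entry in factor_levels, and for those the lookup factor_levels[[cur_column()]] returns NULL, which is unlikely to give the intended result. A minimal sketch of a more defensive variant, assuming a hypothetical extra character column id that should stay as-is, selects columns by the names of the level list instead:

```r
library(dplyr)

# Hypothetical data: `id` is a character column with no predefined levels.
df <- tibble(
  id    = c("x1", "x2", "x3"),
  lower = c("b", "c", "a"),
  upper = c("C", "A", "B")
)

factor_levels <- list(lower = letters[1:3], upper = LETTERS[1:3])

# Convert only the columns named in `factor_levels`.
# all_of() raises an error if any of those names is missing from `df`.
df <- df |>
  mutate(
    across(
      all_of(names(factor_levels)),
      \(.x) factor(.x, levels = factor_levels[[cur_column()]])
    )
  )

# `id` is still character; `lower` and `upper` are factors.
str(df)
```

Using any_of() in place of all_of() would instead silently skip names that are absent from the data frame; which behavior is preferable depends on whether a missing column should count as an error.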
