@riinuots
Last active July 23, 2017 21:08
library(dplyr)   # gives mutate_if
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(forcats) # gives fct_explicit_na

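#example dataframe, a and c are factors,
#b is numeric, d is boolean (TRUE/FALSE)
#(stringsAsFactors = TRUE is needed from R 4.0.0 onwards,
# where strings are no longer converted to factors by default)
mydata = data.frame(
  a = c( 'Yes', 'No',  NA),
  b = c(  0.5,   NA,   0.6),
  c = c( 'No',   NA,  'Yes'),
  d = c( TRUE,   NA,   FALSE),
  stringsAsFactors = TRUE
)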

# Making the missing values explicit in every column that is a factor
#(by default, fct_explicit_na changes NAs to "(Missing)"):
newdata1 = mydata %>% 
  mutate_if(is.factor, fct_explicit_na)

# changing the missing label to "Dunno"
#(note how the syntax is a little bit different 
# than when using fct_explicit_na on a single column)
newdata2 = mydata %>% 
  mutate_if(is.factor, fct_explicit_na, na_level = 'Dunno')


# on a single column it would look like:
mydata$a %>% fct_explicit_na(na_level = 'Dunno')
#> [1] Yes   No    Dunno
#> Levels: No Yes Dunno

mydata 
#>      a   b    c     d
#> 1  Yes 0.5   No  TRUE
#> 2   No  NA <NA>    NA
#> 3 <NA> 0.6  Yes FALSE
newdata1
#>           a   b         c     d
#> 1       Yes 0.5        No  TRUE
#> 2        No  NA (Missing)    NA
#> 3 (Missing) 0.6       Yes FALSE
newdata2
#>       a   b     c     d
#> 1   Yes 0.5    No  TRUE
#> 2    No  NA Dunno    NA
#> 3 Dunno 0.6   Yes FALSE
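
# A possible alternative for newer package versions (an assumption, not part
# of the original gist): dplyr >= 1.0.0 offers across() in place of
# mutate_if(), and forcats >= 1.0.0 deprecates fct_explicit_na() in favour
# of fct_na_value_to_level(). A minimal sketch under those assumptions;
# sketchdata and newdata3 are illustrative names only.
library(dplyr)
library(forcats)

# small example rebuilt here so the sketch runs on its own
sketchdata = data.frame(
  a = c('Yes', 'No', NA),
  b = c(0.5, NA, 0.6),
  stringsAsFactors = TRUE
)

# relabel NAs in factor columns only; the numeric column b keeps its NA
newdata3 = sketchdata %>%
  mutate(across(where(is.factor),
                ~ fct_na_value_to_level(.x, level = 'Dunno')))

levels(newdata3$a)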