R: Replacing NAs in all factors with 'Missing'
library(dplyr) # gives mutate_if
library(forcats) # gives fct_explicit_na (deprecated since forcats 1.0.0; fct_na_value_to_level is its successor)
# example data frame: a and c are factors,
# b is numeric, d is logical (TRUE/FALSE)
mydata = data.frame(
  a = c('Yes', 'No', NA),
  b = c(0.5, NA, 0.6),
  c = c('No', NA, 'Yes'),
  d = c(TRUE, NA, FALSE),
  stringsAsFactors = TRUE # needed in R >= 4.0.0, where strings stay character by default
)
# Make the missing values explicit in every factor column
# (by default, fct_explicit_na changes NAs to "(Missing)"):
newdata1 = mydata %>%
  mutate_if(is.factor, fct_explicit_na)
# Change the missing label to "Dunno"
# (note that the syntax differs slightly from the single-column call:
# the function is passed by name and its extra arguments follow it)
newdata2 = mydata %>%
  mutate_if(is.factor, fct_explicit_na, na_level = 'Dunno')
# on a single column it would look like:
mydata$a %>% fct_explicit_na(na_level = 'Dunno')
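
For comparison, the same relabelling can be sketched in base R, with no dplyr or forcats. This is only an illustration under stated assumptions: `explicit_na()` and `newdata3` are hypothetical names that do not appear in the snippet above, and the helper only approximates `fct_explicit_na` by turning `NA` into a level with `addNA()` and then renaming that level.

```r
# Sketch only: explicit_na() is a hypothetical helper, not part of forcats.
mydata <- data.frame(
  a = c("Yes", "No", NA),
  b = c(0.5, NA, 0.6),
  c = c("No", NA, "Yes"),
  d = c(TRUE, NA, FALSE),
  stringsAsFactors = TRUE  # make a and c factors regardless of R version
)

explicit_na <- function(f, na_level = "(Missing)") {
  f <- addNA(f, ifany = TRUE)               # add NA as a level only if NAs occur
  levels(f)[is.na(levels(f))] <- na_level   # rename the NA level, if present
  f
}

# Apply the helper to the factor columns only; b and d keep their NAs
is_fac <- vapply(mydata, is.factor, logical(1))
newdata3 <- mydata
newdata3[is_fac] <- lapply(mydata[is_fac], explicit_na, na_level = "Dunno")
```

The new level is appended after the existing ones, so `levels(newdata3$a)` ends with `"Dunno"`.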