Skip to content

Instantly share code, notes, and snippets.

@bhive01
Last active February 4, 2016 15:55
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save bhive01/6698e012c4388c8d3c14 to your computer and use it in GitHub Desktop.
Save bhive01/6698e012c4388c8d3c14 to your computer and use it in GitHub Desktop.
require(dplyr)
# input data frame
df1 <- data.frame(trait=rep(1:10, 4), Trait.Code = rep(LETTERS[1:4], each = 10), Alpha_Value= rep(c(NA, "1", NA, "Alphastuff"), each = 10), Number_Value = rep(c(3, NA, 6, NA), each = 10), stringsAsFactors= FALSE)
df1 %>%
mutate(n=row_number()) %>% #add index for later join
filter(!is.na(Alpha_Value), Alpha_Value != "NULL") %>% # Alpha_Value != NULL or NA
group_by(Trait.Code) %>% #group for mutate
mutate(Converted_Value = ifelse(any(is.na(as.numeric(.$Alpha_Value))), NA, as.numeric(Alpha_Value))) %>%
# if upon conversion within a group any values are not convertible to numeric, then return NA, else convert number to number
ungroup() %>% # ungroup for select to work
select(n, Converted_Value)%>% # select key columns for join
left_join(mutate(df1, n=row_number()), ., key = "n") %>% # join converted back
mutate(Number_Value = ifelse(is.na(Converted_Value), Number_Value, Converted_Value)) %>% # copy numeric values over to numeric column
select(-Converted_Value)
## Works! Thanks Ben
df1 %>%
filter(!is.na(Alpha_Value), Alpha_Value != "NULL") %>% # Alpha_Value != NULL or NA
group_by(Trait.Code) %>%
summarise(Converted_Value = ifelse(any(is.na(as.numeric(Alpha_Value))), NA, TRUE)) %>%
ungroup() %>%
left_join(df1, ., key = "Trait.Code") %>%
## needed to add this afterwards. If value is NA, ifelse returns NA not FALSE in final statement
mutate(Converted_Value = ifelse(is.na(Converted_Value), FALSE, Converted_Value)) %>%
mutate(Number_Value = ifelse(Converted_Value == TRUE, as.numeric(Alpha_Value), Number_Value))
@bpbond
Copy link

bpbond commented Feb 3, 2016

Wouldn't it be simpler to just do

  mutate(Converted_Value = suppressWarnings(as.numeric(Alpha_Value)))

This seems to work fine.

@bhive01
Copy link
Author

bhive01 commented Feb 3, 2016

@bpbond, easier but not really what I need. I'm trying to isolate poor data entry practices. In our DB, numeric data can be stored in "text" column. If all of the data under a given Trait.Code is numeric I want to convert that over to numeric. If all of the data is not numeric... I want to leave it as it is. Which is why I think I need the group_by and the conditional mutate.

@bpbond
Copy link

bpbond commented Feb 3, 2016

Behavior still seems correct to me though. For example, this correctly identifies groups by the criterion:

df1 %>% 
  group_by(Trait.Code) %>% 
  summarise(something_not_numeric = any(is.na(as.numeric(Alpha_Value))))

@bhive01
Copy link
Author

bhive01 commented Feb 3, 2016

Yup... combine that with a join to get that information back and it works. Just wasn't thinking about it in steps... trying to get everything in a single line. Thanks @bpbond!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment