-
-
Save dieulinh110/0b5b17f10b2807c648b482682449b3fc to your computer and use it in GitHub Desktop.
## LOAD PACKAGE | |
library(dplyr) | |
## INPUT | |
input <- data.frame(president = c("A1", "A1", "A1", "A1", "A1", "A1", "A1"), | |
vice_president = c("B", "B", "B", "B", "C", "C", "C"), | |
manager = c("D", "D", "E", "E", "F", "G", "H"), | |
staff = c("I", "J", "K", "L", "M", "N", "O"), | |
qty_1 = c(1:7), | |
qty_2 = c(2:8), | |
qty_3 = c(1:7), | |
stringsAsFactors = FALSE) | |
## PROCESSING | |
### Group manager | |
out_manager <- input %>% | |
group_by(president, vice_president, manager) %>% | |
summarise_at(vars(qty_1, qty_2, qty_3), funs(sum(., na.rm = TRUE))) %>% | |
mutate(manager = paste(manager, "Total", sep = " ")) %>% | |
bind_rows(input) %>% | |
arrange(president, vice_president, manager) | |
### Group vice-president | |
out_vice_president <- input %>% | |
group_by(president, vice_president) %>% | |
summarise_at(vars(qty_1, qty_2, qty_3), funs(sum(., na.rm = TRUE))) %>% | |
mutate(vice_president = paste(vice_president, "Total")) %>% | |
bind_rows(out_manager) %>% | |
arrange(president, vice_president) | |
### Group president | |
out_president <- input %>% | |
group_by(president) %>% | |
summarise_at(vars(qty_1, qty_2, qty_3), funs(sum(., na.rm = TRUE))) %>% | |
mutate(president = paste(president, "Total")) %>% | |
bind_rows(out_vice_president) %>% | |
arrange(president) %>% | |
mutate_all(funs(replace(., is.na(.), ""))) %>% | |
mutate_at(vars(president, vice_president), funs(ifelse(grepl("Total", manager), "", .))) %>% | |
mutate(president = ifelse(grepl("Total", vice_president), "", president)) | |
## OUTPUT | |
output <- out_president %>% | |
select(president, vice_president, manager, | |
staff, qty_1, qty_2, qty_3) | |
write.csv(output, "output.csv") | |
My code is still long. That is 3 groups. In the future, the number of groups is raising. So, with my code, they're impossile. Do you have any other solutions? I think I will write a function, but they have fewer common charateristics. I'm a failure in working that.
Yes, writing your own function(s) would be very necessary if you're thinking of scaling up this operation. Why don't you try it out? Here are two tips:
- Write out the logic first.
- Be ready to break the problem(s) down into smaller pieces.
I think you will have to break up pipe-chains a bit to accomplish that and if necessary mix in some base R. For instance the first 4 or 5 links in the pipe-chains are quite similar and in line with the DRY principle, should be put into one function. Because you have a progressive decrease in the number of variables by which the grouping is done, you might want to use a recursive function or a loop.
For example
## define a function
rearrange_data <- function(input) {
## -- some code to extract the names of variables --
input %>%
group...
summarise...
mutate...
bind...
arrange...
## return the modified input object
}
## Use the function
blah <- rearrange_data(input) %>%
##... other operations
Usually if you start by dealing with the repetitive point, you should come up with something more manageable. Let me know how it goes!
Oh yeah, I will try it. Many thank you. It's useful. If I have some problems again, I will question to you.
Alright then. Cheers!
Hello. Did you get around to doing it?
Hi BroVic, I've been busy with my works, so I don't see your inbox. With my problems, I've had some cases and made it clear. I tried with your solution. It's useful, but I'm having other solution and try it. Good things we can discuss with each other. Don't hesitate if you have problems. I'm ready to help you in any case.
Okay, I've studied the code and also run in locally. It all works well. What issues do you have with it?