Skip to content

Instantly share code, notes, and snippets.

@dieulinh110
Last active April 20, 2018 06:16
Show Gist options
  • Save dieulinh110/0b5b17f10b2807c648b482682449b3fc to your computer and use it in GitHub Desktop.
Save dieulinh110/0b5b17f10b2807c648b482682449b3fc to your computer and use it in GitHub Desktop.
Calculating the sum of rows in each column group
## LOAD PACKAGE
library(dplyr)
## INPUT
input <- data.frame(president = c("A1", "A1", "A1", "A1", "A1", "A1", "A1"),
vice_president = c("B", "B", "B", "B", "C", "C", "C"),
manager = c("D", "D", "E", "E", "F", "G", "H"),
staff = c("I", "J", "K", "L", "M", "N", "O"),
qty_1 = c(1:7),
qty_2 = c(2:8),
qty_3 = c(1:7),
stringsAsFactors = FALSE)
## PROCESSING
### Group manager
out_manager <- input %>%
group_by(president, vice_president, manager) %>%
summarise_at(vars(qty_1, qty_2, qty_3), funs(sum(., na.rm = TRUE))) %>%
mutate(manager = paste(manager, "Total", sep = " ")) %>%
bind_rows(input) %>%
arrange(president, vice_president, manager)
### Group vice-president
out_vice_president <- input %>%
group_by(president, vice_president) %>%
summarise_at(vars(qty_1, qty_2, qty_3), funs(sum(., na.rm = TRUE))) %>%
mutate(vice_president = paste(vice_president, "Total")) %>%
bind_rows(out_manager) %>%
arrange(president, vice_president)
### Group president
out_president <- input %>%
group_by(president) %>%
summarise_at(vars(qty_1, qty_2, qty_3), funs(sum(., na.rm = TRUE))) %>%
mutate(president = paste(president, "Total")) %>%
bind_rows(out_vice_president) %>%
arrange(president) %>%
mutate_all(funs(replace(., is.na(.), ""))) %>%
mutate_at(vars(president, vice_president), funs(ifelse(grepl("Total", manager), "", .))) %>%
mutate(president = ifelse(grepl("Total", vice_president), "", president))
## OUTPUT
output <- out_president %>%
select(president, vice_president, manager,
staff, qty_1, qty_2, qty_3)
write.csv(output, "output.csv")
@BroVic
Copy link

BroVic commented Mar 10, 2018

Yes, writing your own function(s) would be very necessary if you're thinking of scaling up this operation. Why don't you try it out? Here are two tips:

  1. Write out the logic first.
  2. Be ready to break the problem(s) down into smaller pieces.

I think you will have to break up pipe-chains a bit to accomplish that and if necessary mix in some base R. For instance the first 4 or 5 links in the pipe-chains are quite similar and in line with the DRY principle, should be put into one function. Because you have a progressive decrease in the number of variables by which the grouping is done, you might want to use a recursive function or a loop.

For example

## define a function
rearrange_data <- function(input) {
  ## -- some code to extract the names of variables --
  input %>%
    group...
    summarise...
    mutate...
    bind...
    arrange...
    
    ## return the modified input object
} 

## Use the function
blah <- rearrange_data(input) %>%
    ##... other operations

Usually if you start by dealing with the repetitive point, you should come up with something more manageable. Let me know how it goes!

@dieulinh110
Copy link
Author

Oh yeah, I will try it. Many thank you. It's useful. If I have some problems again, I will question to you.

@BroVic
Copy link

BroVic commented Mar 10, 2018

Alright then. Cheers!

@BroVic
Copy link

BroVic commented Mar 19, 2018

Hello. Did you get around to doing it?

@dieulinh110
Copy link
Author

Hi BroVic, I've been busy with my works, so I don't see your inbox. With my problems, I've had some cases and made it clear. I tried with your solution. It's useful, but I'm having other solution and try it. Good things we can discuss with each other. Don't hesitate if you have problems. I'm ready to help you in any case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment