@mattparker-wf
Created November 7, 2014 16:25
# An alternative approach
# First - a quick function for counting unique values that excludes NAs
count.unique <- function(x) { length(unique(x[!is.na(x)])) }
# Compare:
length(unique(NA))   # 1, because NA is counted as a value
count.unique(NA)     # 0, because NA is dropped before counting
# Then, using plyr to aggregate
library(plyr)
collapsed_timelog <- ddply(collapsed_history,
  .variables = c("Account.Name", "Quarter.End", "filing.estimate"),
  .fun = function(x) {
    # Grab the appropriate subset of timelog: billable entries for this
    # account, dated between the quarter end and the filing estimate
    x_timelog <- subset(timelog,
      subset = Account.Name %in% x$Account.Name &
        Billable %in% 1 &
        Date >= x$Quarter.End &
        Date <= x$filing.estimate &
        !is.na(Date)
    )
    # Aggregate using summarise - basically generates a data.frame with just
    # the variables I name on the 2nd and 3rd lines
    summarise(x_timelog,
      billable_time = sum(Hours),
      concurrent_services = count.unique(x_timelog$Services.ID)
    )
  })
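# Then merging the aggregated results
# (joining on all three grouping variables avoids duplicated
# Quarter.End.x / Quarter.End.y columns in the result)
collapsed_history_time <- merge(x = collapsed_history,
  y = collapsed_timelog,
  by = c("Account.Name", "Quarter.End", "filing.estimate"),
  all = TRUE
)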
@mbreecher

Figured out the warning. The subset was comparing Date against a whole vector of dates (one per row of the group) rather than a single date. So, if we say Date >= unique(x$Quarter.End) & Date <= unique(x$filing.estimate), each comparison is against one value and the function finishes without any drama. Thanks so much for this.
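
For anyone who wants to see the warning in isolation, here is a minimal base-R sketch. The toy `timelog` and `x` data frames are made up for illustration and are not the real data from this gist; they only assume that every row in a group shares the same `Quarter.End`, as happens inside a `ddply()` group.

```r
# Hypothetical toy data: four timelog entries
timelog <- data.frame(
  Date  = as.Date(c("2014-01-05", "2014-02-10", "2014-03-20", "2014-04-15")),
  Hours = c(1, 2, 3, 4)
)

# Hypothetical group of three rows that all share one quarter end
x <- data.frame(Quarter.End = as.Date(rep("2014-02-01", 3)))

# Comparing 4 dates against 3 dates recycles the shorter vector and warns,
# because 4 is not a multiple of 3
warned <- tryCatch(
  {
    timelog$Date >= x$Quarter.End
    FALSE
  },
  warning = function(w) TRUE
)

# Collapsing the group's dates to a single value makes the comparison
# unambiguous: every Date is checked against the same quarter end
keep <- timelog$Date >= unique(x$Quarter.End)
billable_time <- sum(timelog$Hours[keep])
```

Note that when the lengths happen to divide evenly, R recycles silently with no warning at all, so collapsing with `unique()` (or taking the first element) is worth doing even if the warning never shows up.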