Skip to content

Instantly share code, notes, and snippets.

View githoov's full-sized avatar

Scott Hoover githoov

View GitHub Profile
# query: https://metanew.looker.com/x/HBhS99t
# median pickup time
df <- read.csv(file = "~/Downloads/pickups.csv", header = TRUE)
names(df) <- c("id", "month", "pickup_time")
ddply(df, .(month), summarize, median_duration = median(pickup_time, na.rm = TRUE))
# load data
durations <- read.csv(file = "~/Downloads/trial_durations.csv", header = TRUE)
names(durations) <- c("id", "month", "duration", "won")
# summary of time to close by closed won
summary.df <- ddply(durations, .(month, won), summarize, median_duration = median(duration))
# time to close
ggplot(durations, aes(x = month, y = duration, colour = won)) + geom_boxplot() + ggtitle("Trials: Time to Close") + xlab("Time") + ylab("Time to Close (Days)")
# load data
df <- read.csv(file = "~/Downloads/cadence.csv", header = TRUE)
names(df) <- c("id", "month", "in_trial", "topic", "number_of_chats", "median_time_to_reply", "iqr")
# create a summary table for inspection
summary.df <- ddply(na.omit(subset(df, number_of_chats > 5)), .(month), summarize, cadence = median(median_time_to_reply), spread = median(iqr))
# plot median of medians over time
ggplot(na.omit(subset(df, number_of_chats > 5)), aes(x = month, y = median_time_to_reply)) + stat_summary(fun.data = mean_cl_normal)
# input queries
# tickets: https://metanew.looker.com/x/CQ9zcpw
# discourse views: https://metanew.looker.com/x/yYf3CPH
# load data
tickets <- read.csv(file = "~/Downloads/ticket_topics.csv", header = TRUE)
articles <- read.csv(file = "~/Downloads/discourse_topics.csv", header = TRUE)
# rename columns
names(tickets) <- c("week", "topic", "chats")
# preliminaries
# using pinger
# 2014 (control): https://metanew.looker.com/sql/cb77dv2thvbrxd
# 2015 (treat): https://metanew.looker.com/sql/xqttksqxp3qgrn
# using license
# 2014 (control): https://metanew.looker.com/sql/fm8hgny7f9yrkh
# 2015 (treat): https://metanew.looker.com/sql/rq2sxqkwqmc7z2
# read in 2014 and 2015 data sets
Y2014 <- read.csv(file = "~/Downloads/sql_runner_fm8hgny7f9yrkh_2016-04-09_05-37-37.csv", header = TRUE)
@githoov
githoov / zendesk_chat_events.yaml
Last active February 22, 2016 06:49
Zendesk Chat Events
# preliminaries #
- connection: your_connection
- scoping: true
- case_sensitive: false
- explore: zendesk_chat_events
- view: zendesk_chat_events
derived_table:
@githoov
githoov / view_definitions.yml
Created January 7, 2016 21:23
Redshift View Definitions
- explore: view_definitions
from: pg_views
hidden: true
- view: pg_views
sql_table_name: pg_views
fields:
- dimension: definition
sql: ${TABLE}.definition
@githoov
githoov / recent_data_load.yml
Created January 7, 2016 21:21
Redshift Recent Data Loads
- explore: data_loads
hidden: true
- view: data_loads
derived_table:
sql: |
select replace(regexp_substr(filename, '//[a-zA-Z0-9\-]+/'), '/', '') as root_bucket
, case
when replace(regexp_substr(filename, '//[a-zA-Z0-9\-]+/'), '/', '') = 'snowplow-looker-emr-out'
then 'snowplow'
else regexp_replace(regexp_substr(filename, '/[a-zA-Z0-9\\-\\_]+\\.[a-zA-Z]{3,4}'), '(/|[0-9]{2,})', '')
@githoov
githoov / load_errors.yml
Created January 7, 2016 21:19
Redshift Load Errors
- explore: etl_errors
label: 'ETL Errors'
hidden: true
- view: etl_errors
derived_table:
sql: |
select distinct errors.starttime as error_time
, detail.filename as file_name
, errors.err_reason
from stl_loaderror_detail as detail, stl_load_errors as errors
@githoov
githoov / table_skew.yml
Created January 7, 2016 21:18
Redshift Table Skew
- explore: table_skew
hidden: true
- view: table_skew
derived_table:
sql: |
SELECT *
FROM svv_table_info
fields:
- dimension: database