Scott Hoover githoov

@githoov
githoov / gist:02057029962363eb6167
Created September 5, 2014 19:41
Interquartile Rule for Outliers
- view: outlier
  derived_table:
    sql: |
      SELECT (PERCENTILE_DISC(0.75) WITHIN GROUP(ORDER BY days_between_posts) OVER()
        + (PERCENTILE_DISC(0.75) WITHIN GROUP(ORDER BY days_between_posts) OVER()
        - PERCENTILE_DISC(0.25) WITHIN GROUP(ORDER BY days_between_posts) OVER()) * 1.5)::INT AS cutoff
      FROM ${consumer_posts.SQL_TABLE_NAME} AS consumer_posts
      LIMIT 1
    sql_trigger_value: SELECT CURRENT_DATE
    sortkeys: [cutoff]
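
The `cutoff` above is the upper fence of the interquartile rule, Q3 + 1.5 × (Q3 − Q1). As a rough illustration, here is a minimal Python sketch of the same arithmetic. The function names and the sample data are hypothetical. `percentile_disc` only approximates the SQL function by returning the first sorted value whose cumulative share reaches the requested fraction, and the sketch skips the `::INT` cast.

```python
import math
from fractions import Fraction


def percentile_disc(values, fraction):
    """Approximate SQL PERCENTILE_DISC: the first sorted value whose
    cumulative share of rows is at least `fraction`."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    # Fraction avoids float rounding when multiplying by the row count.
    rank = math.ceil(Fraction(fraction) * len(ordered))
    return ordered[max(rank, 1) - 1]


def upper_cutoff(values):
    """Upper fence of the interquartile rule: Q3 + 1.5 * (Q3 - Q1)."""
    q1 = percentile_disc(values, 0.25)
    q3 = percentile_disc(values, 0.75)
    return q3 + 1.5 * (q3 - q1)


def find_outliers(values):
    """Values strictly above the upper fence."""
    cutoff = upper_cutoff(values)
    return [v for v in values if v > cutoff]


# Hypothetical days_between_posts values, for illustration only.
days_between_posts = [1, 2, 3, 4, 5, 6, 7, 100]
print(upper_cutoff(days_between_posts))
print(find_outliers(days_between_posts))
```

Only the upper fence is computed here, matching the single `cutoff` column in the derived table. A lower fence would follow the same pattern with Q1 − 1.5 × (Q3 − Q1).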
- base_view: cohorts
  always_filter:
    event_date: 'this week'
    type: activity, user
- view: cohorts
  derived_table:
    sql: |
      SELECT 'activity' AS type
        , user_id
githoov / gist:c0122dfaafaf5fdd1804
Last active August 29, 2015 14:06
production_vs_development_derived_tables

Quite often it's beneficial to restrict a SQL query while prototyping or testing a transformation. To make this a bit easier, one can place an in-line comment within a derived table so that a restriction applies while in developer mode and, once the derived table is pushed, either no longer applies or differs in some way.

In the following example, I would like to do a funnel analysis, but limit this transformation to the past 30 days while in developer mode. No restriction exists once pushed to production.

- view: funnel
  derived_table:
    sql: |
      SELECT user_id
        , COUNT(*) AS lifetime_events
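
The preview above is cut off before the conditional comment appears. To make the idea concrete, here is a small Python sketch that mimics how such a comment could be resolved. It assumes the `-- if dev --` and `-- if prod --` markers described in Looker's derived-table documentation; the gist itself may use a different form. The `render_sql` function, the `events` table, and the `event_date` column are hypothetical. This is a toy emulation, not Looker's implementation.

```python
def render_sql(sql, mode):
    """Activate lines that start with the marker for `mode` ("dev" or "prod").

    Lines carrying the other mode's marker are left untouched, so they
    remain ordinary SQL comments and have no effect on the query.
    """
    if mode not in ("dev", "prod"):
        raise ValueError("mode must be 'dev' or 'prod'")
    marker = f"-- if {mode} --"
    rendered = []
    for line in sql.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(marker):
            indent = line[: len(line) - len(stripped)]
            # Drop the marker so the rest of the line becomes live SQL.
            line = indent + stripped[len(marker):].lstrip()
        rendered.append(line)
    return "\n".join(rendered)


# Hypothetical query: restrict to the past 30 days in developer mode only.
FUNNEL_SQL = """SELECT user_id
  , COUNT(*) AS lifetime_events
FROM events
WHERE 1 = 1
  -- if dev -- AND event_date > CURRENT_DATE - 30
GROUP BY user_id"""

print(render_sql(FUNNEL_SQL, "dev"))
print(render_sql(FUNNEL_SQL, "prod"))
```

In `dev` mode the 30-day filter becomes part of the query. In `prod` mode the line stays a comment, so the full history is scanned.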
githoov / gist:942cc3addc82813790ed
Created September 16, 2014 02:28
date_series_mysl
- view: day_sequence
  derived_table:
    sql: |
      SELECT DATE(DATE_ADD('2010-01-01', INTERVAL @i := @i + 1 day)) AS series
      FROM orders, (SELECT @i := 0) AS i_table
      WHERE @i < DATEDIFF(CURDATE(), '2010-01-01') + 1
    sql_trigger_value: SELECT CURDATE()
    indexes: [series]
  fields:
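
The derived table above builds a calendar of consecutive days by incrementing a MySQL user variable once per row of `orders`. For readers who want to check the expected shape of the result, here is a minimal Python sketch that produces one date per day between two bounds. The function name and the inclusive handling of both bounds are assumptions for illustration. The exact first and last dates returned by the SQL depend on how MySQL evaluates the variable.

```python
from datetime import date, timedelta


def daily_series(start, end):
    """Return every calendar day from `start` through `end`, inclusive."""
    if end < start:
        return []
    span = (end - start).days
    return [start + timedelta(days=offset) for offset in range(span + 1)]


# Example: a series from the start date used in the query through today.
series = daily_series(date(2010, 1, 1), date.today())
print(series[0], series[-1], len(series))
```

A series like this is typically joined against fact tables so that days with no activity still appear in the output.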
githoov / output.json
Created February 27, 2015 19:40
us_msa
githoov / gist:c7fdc8904ffd24ae6b34
Created May 26, 2015 23:16
Custom JS Visualizations

# Custom Visualization Framework

## The Basics

# PRELIMINARIES #
- connection: meta
- scoping: true # for backward compatibility
- include: "dwh.mrr.view.lookml"
- include: "dwh.mrr_planned.view.lookml"
- include: "dwh.account.view.lookml"
- include: "dwh.account_facts.view.lookml"
- include: "dwh.opportunity.view.lookml"
- include: "dwh.opportunity_actualized.view.lookml"
githoov / config.yml
Created October 20, 2015 17:53
Snowplow Config
aws:
  access_key_id: my_id
  secret_access_key: my_key
  s3:
    region: us-east-1
    buckets:
      assets: s3://snowplow-hosted-assets # DO NOT CHANGE unless you are hosting the jarfiles etc yourself in your own bucket
      jsonpath_assets: s3://snowplow-looker/jsonpaths
      log: s3n://snowplow-looker-emr-log/
      raw:
githoov / generate.r
Last active September 1, 2016 17:22
R Script to Create a Survival Plot and to Generate a Sample Data Set
# preliminaries
library("ggplot2")
library("zoo")
set.seed(111)
# generate plot of survival curve
x <- sort(dexp(seq(0, 1, 0.01)), decreasing = TRUE)
ggplot(data.frame(x = c(0, 5)), aes(x)) +
  stat_function(fun = dexp, args = list(rate = 1)) +
  scale_x_continuous(labels = c(expression(t["0"], t["1"], t["2"], t["3"], t["4"], t["5"]))) +
  labs(x = "Time", y = expression(y = P(T > t["i"])), title = "Survival Function")
# simulate subscription data