VedAustin / hackathon_bp.py
Last active Mar 24, 2018
How to join disparate data sources and map the customer journey through various touch points
from pyspark.sql import functions as F
from pyspark.sql import Window
# Assumes an existing SparkSession named `spark` (it is not created in this script).

# Read the per-channel ID maps (JSON).
user_guid_email = spark.read.json("/mnt/public-blobs/attribution-modelling/data2/id-maps/id-map-email.json")
user_guid_paid_search = spark.read.json("/mnt/public-blobs/attribution-modelling/data2/id-maps/id-map-paid-search.json")
user_guid_social = spark.read.json("/mnt/public-blobs/attribution-modelling/data2/id-maps/id-map-social.json")

# Read the per-channel event data (Parquet).
guid_event_email = spark.read.parquet("/mnt/public-blobs/attribution-modelling/data2/events-email")
guid_event_paid_search = spark.read.parquet("/mnt/public-blobs/attribution-modelling/data2/events-paid-search")
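
The script stops after loading the data, so the join itself is not shown. As a rough illustration of the idea in the description (join each channel's ID map to its events, then order each user's touch points in time), here is a minimal plain-Python sketch rather than Spark code. The field names `user_id`, `guid`, `timestamp`, and `event`, the sample rows, and the helper `build_journeys` are all hypothetical, since the real schemas do not appear in the gist.

```python
from collections import defaultdict

# Hypothetical sample rows; the real column names are not shown in the gist.
id_maps = {
    "email": [{"user_id": "u1", "guid": "g-email-1"}],
    "paid_search": [{"user_id": "u1", "guid": "g-ps-1"},
                    {"user_id": "u2", "guid": "g-ps-2"}],
}
events = {
    "email": [{"guid": "g-email-1", "timestamp": 3, "event": "open"}],
    "paid_search": [{"guid": "g-ps-1", "timestamp": 1, "event": "click"},
                    {"guid": "g-ps-2", "timestamp": 2, "event": "click"},
                    {"guid": "g-unknown", "timestamp": 4, "event": "click"}],
}


def build_journeys(id_maps, events):
    """Join events to users by guid per channel, then sort each user's touch points by time."""
    journeys = defaultdict(list)
    for channel, mapping in id_maps.items():
        # Lookup table for this channel: guid -> user_id.
        guid_to_user = {row["guid"]: row["user_id"] for row in mapping}
        for ev in events.get(channel, []):
            user = guid_to_user.get(ev["guid"])
            if user is None:
                continue  # inner-join semantics: drop events with no known user
            journeys[user].append((ev["timestamp"], channel, ev["event"]))
    # Tuples sort by timestamp first, giving a chronological journey per user.
    return {user: sorted(touch_points) for user, touch_points in journeys.items()}


journeys = build_journeys(id_maps, events)
```

In Spark, the same shape would presumably be a join on the shared key per channel, a union across channels, and an ordering by timestamp within each user, but the exact columns depend on schemas not shown here.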