Vedraj VedAustin
VedAustin / Slots_demo.py
Created June 19, 2022 19:40
Performance improvement using slots
import logging
import timeit
from dataclasses import dataclass
from functools import partial
from statistics import median
logging.basicConfig(level=logging.INFO)
@dataclass(slots=True)
VedAustin / hackathon_bp.py
Last active March 24, 2018 16:08
How to join disparate data sources and map the customer journey through various touch points
from pyspark.sql import functions as F
from pyspark.sql import Window
# Read data (assumes `spark` is an existing SparkSession, e.g. one provided by the notebook environment)
base_path = "/mnt/public-blobs/attribution-modelling/data2"
# ID maps (JSON), one file per channel
user_guid_email = spark.read.json(f"{base_path}/id-maps/id-map-email.json")
user_guid_paid_search = spark.read.json(f"{base_path}/id-maps/id-map-paid-search.json")
user_guid_social = spark.read.json(f"{base_path}/id-maps/id-map-social.json")
# Event logs (Parquet), one dataset per channel
guid_event_email = spark.read.parquet(f"{base_path}/events-email")
guid_event_paid_search = spark.read.parquet(f"{base_path}/events-paid-search")
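The preview ends before the join itself, and the column names in these datasets are not shown. To illustrate the idea in the description (resolve each channel's guid to a user, then order that user's touch points in time), here is a minimal sketch in plain Python rather than Spark, so it runs without a cluster. The field names `guid`, `user_id`, and `ts`, the sample records, and both helper functions are assumptions made for illustration, not the gist's schema or method.

```python
from collections import defaultdict

# Hypothetical id maps: channel-specific guid -> user id
id_map_email = [{"guid": "e1", "user_id": "u1"}, {"guid": "e2", "user_id": "u2"}]
id_map_social = [{"guid": "s1", "user_id": "u1"}]

# Hypothetical events keyed by guid, with an integer timestamp
events_email = [
    {"guid": "e1", "ts": 3},
    {"guid": "e2", "ts": 1},
    {"guid": "e9", "ts": 2},  # no matching id-map row, dropped by the inner join
]
events_social = [{"guid": "s1", "ts": 1}]


def attach_user(events, id_map, channel):
    """Inner-join events to an id map on guid and tag each row with its channel."""
    lookup = {row["guid"]: row["user_id"] for row in id_map}
    return [
        {"user_id": lookup[e["guid"]], "channel": channel, "ts": e["ts"]}
        for e in events
        if e["guid"] in lookup
    ]


def customer_journeys(touches):
    """Group touches by user and list their channels in timestamp order."""
    journeys = defaultdict(list)
    for touch in sorted(touches, key=lambda t: t["ts"]):
        journeys[touch["user_id"]].append(touch["channel"])
    return dict(journeys)


touches = attach_user(events_email, id_map_email, "email") + attach_user(
    events_social, id_map_social, "social"
)
journeys = customer_journeys(touches)
print(journeys)
```

In PySpark the same shape would typically be a join on the guid column, a union across channels, and a `Window` partitioned by user and ordered by timestamp, which is presumably why the gist imports `Window`.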