🌪️
learning to embrace chaos

Zubin J valiantone

@valiantone
valiantone / pandas_s3_streaming.py
Created November 30, 2021 15:01 — forked from uhho/pandas_s3_streaming.py
Streaming pandas DataFrame to/from S3 with on-the-fly processing and GZIP compression
import gzip

import pandas as pd


def s3_to_pandas(client, bucket, key, header=None):
    # get key using boto3 client
    obj = client.get_object(Bucket=bucket, Key=key)
    gz = gzip.GzipFile(fileobj=obj['Body'])
    # load stream directly to DF
    return pd.read_csv(gz, header=header, dtype=str)
def s3_to_pandas_with_processing(client, bucket, key, header=None):
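
The preview above is cut off, so here is a minimal sketch of how `s3_to_pandas` can be exercised without a real bucket. `FakeS3Client`, the bucket name, and the key are hypothetical stand-ins introduced only for illustration. The sketch assumes the client's `get_object` returns a dict whose `'Body'` is a file-like stream of gzipped CSV bytes, which is the shape the function relies on.

```python
import gzip
import io

import pandas as pd


class FakeS3Client:
    """Hypothetical stand-in for a boto3 S3 client; serves one gzipped CSV from memory."""

    def __init__(self, payload):
        self._payload = payload

    def get_object(self, Bucket, Key):
        # Mimic the response shape: a dict whose 'Body' is a readable stream.
        return {"Body": io.BytesIO(self._payload)}


def s3_to_pandas(client, bucket, key, header=None):
    # Fetch the object and wrap its body so it is decompressed while being read.
    obj = client.get_object(Bucket=bucket, Key=key)
    gz = gzip.GzipFile(fileobj=obj["Body"])
    # Parse the decompressed stream straight into a DataFrame of strings.
    return pd.read_csv(gz, header=header, dtype=str)


# Assumed sample data: two rows, no header line.
payload = gzip.compress(b"1,alice\n2,bob\n")
df = s3_to_pandas(FakeS3Client(payload), "example-bucket", "example.csv.gz")
print(df)
```

Because `dtype=str` is passed, every value stays a string, and with `header=None` the columns are labelled `0` and `1`.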
@valiantone
valiantone / window_functions.md
Last active June 19, 2021 08:30
SQL queries and examples

Weekly, Monthly Active Emails

WITH
    -- this is your original query, with the ISO week and month number added.
    members_log_aggr(login_date,  year_nbr, iso_week_nbr, month_nbr, email_count) AS
    (
        SELECT
            CAST(ml.login AS Date),
            DATEPART(YEAR, ml.login),
            DATEPART(ISO_WEEK, ml.login),
@valiantone
valiantone / bellhops-archive.md
Last active April 9, 2024 10:28
Work Hard, Play Harder

Transitioning from Tag Clouds to Tag Trees

In the packages paradigm, each packaged selection/offering is a bundle of semi-unique characteristics; let's call these attributes. When a package bundle is selected, it immediately tells our controller that certain attributes are ground truth for this move. The current process, by contrast, introduces assumptions rather than ground truth. For instance, consider Use Case 1.

Use Case 1: Studio Package Bundle

Facts:
    - Two Bellhops on the move
    - Duration of ~2 hours
    - 16-foot moving truck required
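
One way to picture a bundle whose attributes become ground truth on selection is a small immutable record. This is only a sketch: `PackageBundle`, its field names, and `ground_truth` are hypothetical names chosen for illustration, and the values are taken from the Use Case 1 facts listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageBundle:
    """Hypothetical container for attributes fixed by selecting a package."""
    name: str
    crew_size: int          # number of Bellhops on the move
    duration_hours: float   # approximate duration of the move
    truck_length_ft: int    # required moving truck length in feet


# Use Case 1 expressed with the facts listed above.
STUDIO = PackageBundle(name="Studio", crew_size=2, duration_hours=2.0, truck_length_ft=16)


def ground_truth(bundle):
    """Return the attributes a controller could take as given for this move."""
    return {
        "crew_size": bundle.crew_size,
        "duration_hours": bundle.duration_hours,
        "truck_length_ft": bundle.truck_length_ft,
    }


print(ground_truth(STUDIO))
```

Making the record frozen reflects the idea that these attributes are facts once the bundle is chosen, not estimates to be adjusted later.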

Installing Cool-Retro-Term on Windows 10

First of all, this document is a compilation of resources that already existed on the web. I tested them personally: some worked and others did not. I liked the idea of writing a full guide from start to finish so that all of you can also enjoy playing with cool-retro-term on Windows 10. I installed it on Windows 10 Pro. Fingers crossed!


import pandas as pd

train = pd.DataFrame([
    {"Name": "Olyphant", "FamilySize": 1},
    {"Name": "Rodent", "FamilySize": 3},
    {"Name": "Possum", "FamilySize": 1},
])

sub = train[train["FamilySize"] == 1]
sub["isAlone"] = 1
train
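
In the snippet above, the assignment targets `sub`, a filtered selection of `train`, so `train` itself never gains an `isAlone` column, and depending on the pandas version a `SettingWithCopyWarning` may be emitted. If the goal is to flag single-person families on the original frame (an assumption about the intent), a minimal sketch using `.loc` looks like this:

```python
import pandas as pd

train = pd.DataFrame([
    {"Name": "Olyphant", "FamilySize": 1},
    {"Name": "Rodent", "FamilySize": 3},
    {"Name": "Possum", "FamilySize": 1},
])

# Default every row to 0, then set the flag on matching rows of `train` itself.
train["isAlone"] = 0
train.loc[train["FamilySize"] == 1, "isAlone"] = 1

print(train)
```

Selecting rows and the target column in a single `.loc` call writes to the original DataFrame rather than to an intermediate selection.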
@valiantone
valiantone / mongo_to_csv.py
Created February 28, 2020 15:19 — forked from mieitza/mongo_to_csv.py
python mongo to csv use pandas.
# @Author: xiewenqian <int>
# @Date: 2016-11-28T20:35:09+08:00
# @Email: wixb50@gmail.com
# @Last modified by: int
# @Last modified time: 2016-12-01T19:32:48+08:00
import pandas as pd
from pymongo import MongoClient
@valiantone
valiantone / pandas_labeled_csv_import_to_mongo.py
Created February 28, 2020 15:18 — forked from jxub/pandas_labeled_csv_import_to_mongo.py
A simple mongoimport for importing csv files with python and pymongo
import pandas as pd
from pymongo import MongoClient
import json
def mongoimport(csv_path, db_name, coll_name, db_url='localhost', db_port=27000):
    """ Imports a csv file at path csv_path to a mongo collection
    returns: count of the documents in the new collection
    """
    client = MongoClient(db_url, db_port)
    db = client[db_name]