@austospumanto
austospumanto / execute_concurrently_without_pool.py
Created May 8, 2019 21:33
Executing CPU-intensive workloads via multiprocessing.Process sub-processes
import multiprocessing
import pickle
import struct
from typing import Optional, Callable, List, Any
from tqdm import tqdm
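
The preview above is cut off after the imports, so the gist's own implementation is not visible here. As a rough illustration of the idea in its description (one `multiprocessing.Process` per task rather than a pool), here is a minimal sketch. The names `run_in_processes`, `_worker`, and `square` are hypothetical, and the `"fork"` start method is an assumption that restricts the sketch to Unix-like systems.

```python
import multiprocessing
from typing import Any, Callable, List


def _worker(fn: Callable[[Any], Any], arg: Any, conn) -> None:
    # Runs in the child process: compute, then send (ok, payload) to the parent.
    try:
        conn.send((True, fn(arg)))
    except Exception as exc:
        conn.send((False, repr(exc)))
    finally:
        conn.close()


def run_in_processes(fn: Callable[[Any], Any], args: List[Any]) -> List[Any]:
    # Assumption: "fork" start method (Unix only), so fn need not be importable.
    ctx = multiprocessing.get_context("fork")
    jobs = []
    for arg in args:
        # duplex=False returns (receive end, send end).
        parent_conn, child_conn = ctx.Pipe(duplex=False)
        proc = ctx.Process(target=_worker, args=(fn, arg, child_conn))
        proc.start()
        child_conn.close()  # the parent only needs the receive end
        jobs.append((proc, parent_conn))

    results = []
    for proc, conn in jobs:
        # Receive before joining so a child with a large result cannot block
        # forever on a full pipe while the parent waits for it to exit.
        ok, payload = conn.recv()
        proc.join()
        if not ok:
            raise RuntimeError(f"worker failed: {payload}")
        results.append(payload)
    return results


def square(x: int) -> int:
    return x * x
```

Calling `run_in_processes(square, [1, 2, 3])` starts three child processes and returns their results in input order. Starting one process per argument is only sensible for a small number of heavy tasks; a real implementation would probably cap the number of concurrent children.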
austospumanto / SharedMemoryManager.py
Last active June 24, 2019 06:28
SharedMemoryManager
"""
System/Runtime Requirements:
    Python3.7
    Linux / Mac
Must pip install:
    numpy, pandas
Must build from source:
    shared_memory (https://github.com/SleepProgger/py_shared_memory)
austospumanto / gbq_utils.py
Created July 23, 2019 01:57
Python Google BigQuery Utility Functions
"""
pip install \
    bigquery-schema-generator \
    google-api-python-client \
    pandas-gbq \
    pandas
"""
import logging
import os
austospumanto / correlation.py
Created July 23, 2019 02:36
Use Numba to Quickly Calculate Pearson and Cramer's Correlation Coefficients
"""
pip install \
    numpy \
    pandas \
    numba
"""
import time
from dataclasses import dataclass
from functools import wraps
austospumanto / pdutils.py
Created July 23, 2019 02:54
Some Python Pandas Utility Functions (Serialization, Data Type Casting)
"""
pip install \
    pandas \
    sqlalchemy \
    pyarrow
"""
import json
import os
import time
austospumanto / diskcached.py
Created August 3, 2019 02:00
Diskcached: Like functools.lru_cache, but pickled to disk
"""
from .diskcached import clear_all, diskcached
@diskcached()
def cached_fn(x):
    print(x)
    return x + 2
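
The gist body is truncated in this listing, so the following is only a sketch of the general technique its title describes: memoizing a function's results as pickle files on disk. It is not the gist's `diskcached` implementation. The names `disk_memoize`, `add_two`, and `calls` are hypothetical, and keying the cache on a SHA-256 hash of the pickled arguments is an assumption.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


def disk_memoize(cache_dir: Path):
    """Hypothetical decorator: cache return values as pickle files in cache_dir."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            cache_dir.mkdir(parents=True, exist_ok=True)
            # Assumption: arguments are picklable and pickle to stable bytes.
            key = pickle.dumps(
                (fn.__module__, fn.__qualname__, args, sorted(kwargs.items()))
            )
            path = cache_dir / (hashlib.sha256(key).hexdigest() + ".pkl")
            if path.exists():
                # Cache hit: load the stored result instead of calling fn.
                with path.open("rb") as fh:
                    return pickle.load(fh)
            result = fn(*args, **kwargs)
            with path.open("wb") as fh:
                pickle.dump(result, fh)
            return result

        return wrapper

    return decorator


calls = []  # records real invocations, to make cache hits observable


@disk_memoize(Path(tempfile.mkdtemp()))
def add_two(x):
    calls.append(x)
    return x + 2
```

Calling `add_two(1)` twice runs the function body once; the second call is served from the pickle file. Unlike `functools.lru_cache`, this sketch never evicts entries, so the cache directory grows until it is cleared manually.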
austospumanto / base.py
Last active August 28, 2019 05:49
"sqlalchemy/dialects/mssql/base.py" for `sqlacodegen` w/ Azure SQL Data Warehouse
"""
Modified file (<VIRTUALENV>/lib/python3.7/site-packages/sqlalchemy/dialects/mssql/base.py)
for sqlalchemy==1.3.7 to get it to work with sqlacodegen==2.1.0
for an Azure SQL Data Warehouse server using the following get_engine function:
```
def get_engine(username=None, password=None, hostname=None, database=None) -> Engine:
    from skykick_ds.runconfig import RunConfig
    username = username or RunConfig.sql_username
austospumanto / profiling.py
Last active November 20, 2019 05:33
Memory & Time Profiling Decorators
"""
################
# Installation #
################
memory_profiler:
    pip install memory-profiler
pympler:
    pip install pympler
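
The decorators themselves are not shown in this truncated preview. To illustrate what a combined time and memory profiling decorator can look like, here is a minimal sketch that swaps in the standard library's `time.perf_counter` and `tracemalloc` in place of `memory_profiler` and `pympler`. The names `profiled`, `last_stats`, and `build_list` are hypothetical.

```python
import functools
import time
import tracemalloc


def profiled(fn):
    """Hypothetical decorator: record wall time and peak traced memory per call."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        tracemalloc.start()
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            # get_traced_memory() returns (current, peak) in bytes.
            _, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            wrapper.last_stats = {"seconds": elapsed, "peak_bytes": peak}

    wrapper.last_stats = None
    return wrapper


@profiled
def build_list(n):
    return [0] * n
```

After `build_list(100_000)` returns, `build_list.last_stats` holds the elapsed seconds and the peak number of bytes that `tracemalloc` observed during the call. The sketch assumes decorated functions are not nested and that nothing else is using `tracemalloc`, because it stops tracing when each call ends.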
austospumanto / processit.py
Last active August 31, 2020 02:36
processit
"""
System/Runtime Requirements:
    >=Python3.7
    Linux / Mac
    >=2 CPU Cores
Must pip install to run `processit`:
    pickle5, tqdm
Must pip install to run the tests and some functions:
austospumanto / pd_processit.py
Last active August 31, 2020 02:36
Pandas groupby-apply using processit
"""
System/Runtime Requirements:
    >=Python3.7
    Linux / Mac
    >=2 CPU Cores
Must pip install to use `pd_processit`:
    pickle5, tqdm, numpy, pandas
To use this file, have processit.py in same folder as pd_processit.py (this file).