Karen Ng karenyyng

@ltrainpr
ltrainpr / software_to_data_engineer.md
Last active July 8, 2023 21:22
From Software to Data Engineer

Data Engineer's Responsibilities (not all-encompassing):

  • Build data platforms
  • Define data architecture and data models
  • Handle data in various formats
  • Create ETL or ELT pipelines as well as streaming data pipelines
  • Schedule and deploy pipelines
  • Build frameworks or code for data management activities
  • Make data accessible with the right governance in place
  • Enable self-service access to data
@cccntu
cccntu / csv.py
Created February 8, 2021 11:58
python mmap to concatenate csv files
❯ rm out.csv
❯ cat 1.py
from glob import glob
import mmap
files = glob("data/*")
files.sort(key=lambda x: int(x.split("/")[-1].split(".")[0]))
write_f = open("out.csv", "w+b")
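The preview above is cut off after the output file is opened. As a rough illustration of the technique named in the title, here is a minimal sketch of byte-for-byte concatenation with `mmap`. It is not the remainder of the gist: the helper names `numeric_stem` and `concat_files` are hypothetical, and the sketch assumes the input files are named by integer (for example `1.csv`, `2.csv`, `10.csv`), as the sort key in the preview suggests.

```python
import mmap
import os
from glob import glob


def numeric_stem(path):
    """Sort key: the integer before the first dot in the file name (hypothetical helper)."""
    return int(os.path.basename(path).split(".")[0])


def concat_files(paths, out_path):
    """Copy each file in `paths`, in order, into `out_path` through memory maps.

    Returns the number of bytes written. This is an illustrative helper,
    not the gist author's code.
    """
    sizes = [os.path.getsize(p) for p in paths]
    total = sum(sizes)
    with open(out_path, "w+b") as out:
        if total == 0:
            return 0  # an empty file cannot be memory-mapped
        out.truncate(total)  # pre-size the output so it can be mapped
        with mmap.mmap(out.fileno(), total) as dst:
            offset = 0
            for path, size in zip(paths, sizes):
                if size == 0:
                    continue  # skip empty inputs; they cannot be mapped either
                with open(path, "rb") as src:
                    with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as m:
                        # m[:] materialises the input as bytes before copying
                        dst[offset:offset + size] = m[:]
                offset += size
            dst.flush()
    return total
```

A call might look like `concat_files(sorted(glob("data/*"), key=numeric_stem), "out.csv")`. Because the copy is byte-for-byte, any header row present in each input is repeated in the output, and each input needs a trailing newline for the rows to stay separate.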
@wesm
wesm / parquet-benchmark-20170210.py
Created February 10, 2017 18:07
Parquet multithreaded benchmarks
import gc
import os
import time
import numpy as np
import pandas as pd
from pyarrow.compat import guid
import pyarrow as pa
import pyarrow.parquet as pq
import snappy
@nfaggian
nfaggian / pool.py
Last active July 30, 2021 17:12
Multiprocessing example
from __future__ import print_function
import multiprocessing
import ctypes
import numpy as np
def shared_array(shape):
    """
    Form a shared memory numpy array.
@dfm
dfm / line.py
Created July 15, 2013 15:38
emcee: line with 2D errors.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import division, print_function
import emcee
import numpy as np
import matplotlib.pyplot as pl
np.random.seed(123)
@disnet
disnet / gist:4489250
Last active August 13, 2019 15:04
osx - force skim to always autoupdate
defaults write -app Skim SKAutoReloadFileUpdate -boolean true
@uasi
uasi / vim.rb
Created November 30, 2010 16:46
Vim formula for Homebrew (EDIT: recent versions of the official Homebrew distribution include one)
require 'formula'
class Vim < Formula
  homepage 'http://www.vim.org/'
  url 'ftp://ftp.vim.org/pub/vim/unix/vim-7.3.tar.bz2'
  head 'https://vim.googlecode.com/hg/'
  sha256 '5c5d5d6e07f1bbc49b6fe3906ff8a7e39b049928b68195b38e3e3d347100221d'
  version '7.3.682'
  def features; %w(tiny small normal big huge) end