Solid Git PR Contributor Workflow

A solid Git pull request workflow keeps you out of trouble when contributing to projects you care about. The core idea is simple: treat your local master branch purely as a mirror of the project's official Git repo, used only to receive the latest official updates. Create a new branch from master for each change you want to make, and always open PRs from those branches. Once a PR is merged into the official repo, switch back to master, pull the official changes, and check out a brand new branch for the next item you wish to work on.
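
The cycle described above can be sketched as an ordered list of Git commands. This is a minimal illustration, not the author's tooling: the function name `pr_cycle_commands`, the remote names `upstream` (the official repo) and `origin` (your own fork), and the example branch name are all assumptions.

```python
import shlex


def pr_cycle_commands(branch, official_remote="upstream", fork_remote="origin",
                      main_branch="master"):
    """Return the Git commands for one pull-request cycle, in order.

    All names here are illustrative assumptions: the text only says to sync
    master from the official repo and to branch from it for each change.
    """
    return [
        # 1. Return to the local mirror branch and pull the official updates.
        ["git", "checkout", main_branch],
        ["git", "pull", official_remote, main_branch],
        # 2. Start a fresh branch for the change; all commits happen here.
        ["git", "checkout", "-b", branch],
        # 3. Publish the branch so that a PR can be opened from it.
        ["git", "push", "-u", fork_remote, branch],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is modified.
    for cmd in pr_cycle_commands("fix-readme-typo"):
        print(shlex.join(cmd))
```

After the PR is merged, the same sequence starts again with a new branch name, so master never carries local commits of its own.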

Rsync Tips & Tricks

  • rsync -auzPhv --delete --exclude-from=rsync_exclude.txt SOURCE/ DEST/ -n
    • -a -> --archive; recursively sync, preserving symbolic links, permissions, timestamps, owner, and group
    • -u -> --update; skip files that are newer on the receiver; this relies on modification times, so it can be inaccurate when timestamps were reset (Git, for example, stamps files with the checkout time)
    • -z -> --compress; compress file data during the transfer
    • -P -> --progress + --partial; show progress and keep partially transferred files so that interrupted transfers can resume
    • -h -> --human-readable; print numbers in a human-readable format
    • -v -> --verbose; verbose output
    • -n -> --dry-run; use this to test, and then remove it to actually execute the sync
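
The "dry run first, then execute" habit from the list above can be sketched as a small command builder. This is a hedged example under stated assumptions: the helper name `rsync_command` and its parameters are hypothetical, and only the flags shown in the command above are used.

```python
import shlex


def rsync_command(source, dest, exclude_from="rsync_exclude.txt", dry_run=True):
    """Build the rsync argument list from the tip above.

    Hypothetical helper: dry_run defaults to True so the destructive
    --delete step is previewed before it is ever executed.
    """
    cmd = ["rsync", "-auzPhv", "--delete", f"--exclude-from={exclude_from}"]
    if dry_run:
        cmd.append("-n")  # --dry-run: report what would change, change nothing
    # A trailing slash on the source means "the contents of this directory".
    cmd += [source.rstrip("/") + "/", dest.rstrip("/") + "/"]
    return cmd


if __name__ == "__main__":
    # Preview first, then print the real command once the preview looks right.
    print(shlex.join(rsync_command("SOURCE", "DEST")))
    print(shlex.join(rsync_command("SOURCE", "DEST", dry_run=False)))
```

Passing the resulting list to `subprocess.run` would execute it; printing it first keeps the example free of side effects.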
@sgouda0412
sgouda0412 / install-jupyter.sh
Created October 23, 2025 02:03 — forked from cosmincatalin/install-jupyter.sh
AWS EMR bootstraps to install Jupyter (R, SparkR, Python 2, Python 3, PySpark)
#!/bin/bash
MINICONDA_VERSION="4.3.21"
PANDAS_VERSION="0.20.3"
SCIKIT_VERSION="0.19.0"
# Use -gt for a numeric comparison; inside [[ ]], > compares strings.
while [[ $# -gt 1 ]]; do
    key="$1"
    case $key in
from __future__ import print_function
import sys
import re
from operator import add
import pandas as pd
from pyspark.sql.types import StructField, StructType, StringType
from pyspark.sql import Row
from pyspark.sql.types import *
from pyspark.sql import SQLContext
import json
#!/usr/bin/env python
# encoding: utf-8
# This file lives in tests/project_test.py in the usual distutils structure
# Remember to set the SPARK_HOME environment variable to the path of your Spark installation
import logging
import sys
import unittest
sgouda0412 / primes_with_numba.py
Created October 22, 2025 05:24 — forked from SajjadAemmi/primes_with_numba.py
Numba makes Python code fast
import math
import time
from numba import njit
@njit(fastmath=True, cache=True)
def is_prime(num):
    if num == 2:
        return True
    elif num <= 1 or num % 2 == 0:
sgouda0412 / tweetthreader.py
Created October 22, 2025 05:21 — forked from pratikone/tweetthreader.py
This script fetches the statuses of a Twitter profile and assembles them into threads. A thread is a series of tweets created by replying to your own tweet.
import os
import re
import time
from collections import namedtuple
import codecs
import tweepy
import json
from datetime import datetime
from requests.exceptions import Timeout, ConnectionError
from requests.packages.urllib3.exceptions import ReadTimeoutError, ProtocolError
sgouda0412 / functional_style.py
Created October 22, 2025 05:18 — forked from serge-sans-paille/functional_style.py
Python - functional style!
import ast
import sys
import shutil
import unparse
import unittest
import doctest
import StringIO
import os
from copy import deepcopy
sgouda0412 / useful_python_snippets.py
Created October 22, 2025 05:16 — forked from fomightez/useful_python_snippets.py
Useful Python snippets
# These are meant to work in both Python 2 and 3, except where noted.
# See my useful_pandas_snippets.py for those related to dataframes (such as pickling/`df.to_pickle(save_as)`)
# https://gist.github.com/fomightez/ef57387b5d23106fabd4e02dab6819b4
# also see https://gist.github.com/fomightez/324b7446dc08e56c83fa2d7af2b89a33 for examples of my
# frequently used Python functions and slight variations for more expanded, modular structures.
#argparse
# good snippet collection at https://mkaz.tech/code/python-argparse-cookbook/
sgouda0412 / processify.py
Created October 22, 2025 05:05 — forked from iamzjk/processify.py
processify
import os
import sys
import traceback
from functools import wraps
from multiprocessing import Process, Queue
def processify(func):
    '''Decorator to run a function as a process.
    Be sure that every argument and the return value