gyli /
Created April 13, 2019 03:48
Processing large CSV chunks unevenly with Pandas and multiprocessing
import multiprocessing
import time
class WorkerPool:
    def __init__(self, worker_number):
        self.worker_number = worker_number
        self.pool = [multiprocessing.Process()] * worker_number

    def run(self, target, args=None, sleep_time=1):
gyli /
Last active April 20, 2019 19:22
Find the best number for parameter number_of_routing_shards in Elasticsearch
# The number_of_routing_shards parameter controls how an index can be split in Elasticsearch.
# Since ES 7.0 it has a default value, which is designed to allow splitting by factors of 2 up to a maximum of 1024 shards.
# However, depending on the original number of primary shards, the default value might not be the best choice,
# since it might not offer the largest set of shard counts the index can be split to.
# For example, if the current primary shard number is 5, ES would give number_of_routing_shards 640 as the default value,
# which allows the index to be split to 10, 20, 40, 80, 160, 320 or 640 shards.
# However, assuming the maximum shard number is still 1024, setting number_of_routing_shards to 900 gives the split
# more options: 10, 15, 20, 25, 30, 45, 50, 60, 75, 90, 100, 150, 180, 225, 300, 450, 900
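The comparison above can be checked with a short sketch. It assumes the rule described in the comments: a split target must be a multiple of the current primary shard count and a factor of number_of_routing_shards. The function names `split_targets` and `best_routing_shards` are hypothetical, chosen for illustration, and the 1024 cap is taken from the comments rather than read from a cluster.

```python
from typing import List


def split_targets(primary_shards: int, routing_shards: int) -> List[int]:
    """Shard counts an index with `primary_shards` shards could be split to.

    Assumed rule: the target is a multiple of the current shard count
    and a factor of `routing_shards`.
    """
    if routing_shards % primary_shards != 0:
        raise ValueError("routing_shards must be a multiple of primary_shards")
    return [
        n
        for n in range(2 * primary_shards, routing_shards + 1, primary_shards)
        if routing_shards % n == 0
    ]


def best_routing_shards(primary_shards: int, max_shards: int = 1024) -> int:
    """Pick the multiple of `primary_shards` (<= max_shards) with the most split targets."""
    candidates = range(primary_shards, max_shards + 1, primary_shards)
    return max(candidates, key=lambda r: len(split_targets(primary_shards, r)))


print(split_targets(5, 640))   # [10, 20, 40, 80, 160, 320, 640]
print(best_routing_shards(5))  # 900
```

With 5 primary shards, 640 yields 7 possible targets while 900 yields 17, which is why 900 comes out ahead under this counting.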
gyli / CopyObjectRecursively.scala
Created September 23, 2019 21:10
def CopyObjectRecursively(
    s3client: AmazonS3Client,
    sourcePath: String,
    targetPath: String,
    includeSourceBucketInTargetPath: Boolean = false): Unit = {
  val sourceURI: AmazonS3URI = new AmazonS3URI(sourcePath)
  val targetURI: AmazonS3URI = new AmazonS3URI(targetPath)
  val sourceBucket = sourceURI.getBucket()
gyli /
Last active October 9, 2019 06:24
Load parameters with default values in an elegant way in Python
import csv
class CustomCSVReader:
"delimiter": ",",
"doublequote": True,
"escapechar": None,
"lineterminator": "\r\n",
"quotechar": '"',
gyli /
Created February 14, 2020 07:55
Split string but ignore separator wrapped in brackets
import re
example = """Ann,Bob,Cat(Tom,Max),Dave"""
re.split(r',(?![^\(\)]*\))', example)
# Output:
# ['Ann', 'Bob', 'Cat(Tom,Max)', 'Dave']
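The lookahead pattern above handles one level of brackets. When brackets can nest, a comma inside the outer pair may still be treated as a separator. A minimal sketch of a depth-counting alternative follows; `split_top_level` is a hypothetical helper, not part of the original snippet.

```python
from typing import List


def split_top_level(text: str, sep: str = ",") -> List[str]:
    """Split `text` on `sep`, ignoring separators inside (possibly nested) parentheses."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        if ch == sep and depth == 0:
            # Separator at the top level: close the current piece.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


print(split_top_level("Ann,Bob,Cat(Tom,Max),Dave"))
# ['Ann', 'Bob', 'Cat(Tom,Max)', 'Dave']
print(split_top_level("a,b(c,(d,e),f),g"))
# ['a', 'b(c,(d,e),f)', 'g']
```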
gyli /
Created March 31, 2020 22:43
Upgrade pipx packages when Python is upgraded
# After brew upgrades Python, packages installed through pipx might need to be reinstalled
rm -rf ~/.local/pipx/shared/
pipx reinstall-all
gyli /
Created April 27, 2020 08:25
Fetch nested bracketed values from string
template = """head{var1}middle{var2{nested}}end"""
def parenthetic_contents(string):
    """Generate parenthesized contents in string as pairs (level, contents)."""
    stack = []
    for i, c in enumerate(string):
        if c == '{':
            stack.append(i)
        elif c == '}' and stack:
            start = stack.pop()
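The preview above is cut off after the pop. A self-contained sketch of the same stack-based idea is shown here. The final yield step and the name `bracket_contents` are assumptions, based on the docstring's "(level, contents)" description rather than on the rest of the original code.

```python
from typing import Iterator, List, Tuple


def bracket_contents(string: str, open_ch: str = "{", close_ch: str = "}") -> Iterator[Tuple[int, str]]:
    """Yield (nesting level, contents) for each bracketed span, innermost first."""
    stack: List[int] = []  # positions of currently open brackets
    for i, c in enumerate(string):
        if c == open_ch:
            stack.append(i)
        elif c == close_ch and stack:
            start = stack.pop()
            # After the pop, len(stack) is the depth of the span just closed.
            yield len(stack), string[start + 1:i]


template = "head{var1}middle{var2{nested}}end"
print(list(bracket_contents(template)))
# [(0, 'var1'), (1, 'nested'), (0, 'var2{nested}')]
```

Level 0 entries are the top-level placeholders. Deeper levels appear before the span that encloses them because a span is emitted when its closing bracket is reached.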
gyli /
Last active July 27, 2020 01:36
Bash Alias for Cisco AnyConnect VPN Connection
function vpn {
  if [ "$1" = "c" ]; then
    /opt/cisco/anyconnect/bin/vpn connect "";
  elif [ "$1" = "d" ]; then
    /opt/cisco/anyconnect/bin/vpn disconnect;
    kill $(ps aux | grep "[C]isco AnyConnect Secure Mobility Client" | awk '{print $2}') 2>/dev/null;
  elif [ "$1" = "k" ]; then
    kill $(ps aux | grep "[C]isco AnyConnect Secure Mobility Client" | awk '{print $2}') 2>/dev/null;
    kill -9 $(ps aux | grep "[/]opt/cisco/anyconnect/bin/vpn connect" | awk '{print $2}') 2>/dev/null;
gyli /
Created December 14, 2021 05:35
Calculate range of Decimal with precision and scale
from decimal import Decimal
from typing import Tuple


def calculate_decimal_range(precision: int, scale: int) -> Tuple[Decimal, Decimal]:
    """
    This method calculates the range of Decimal with given precision and scale.
    :return: (min_value, max_value)
    """
    precision, scale = Decimal(precision), Decimal(scale)
    max_value = 10**(precision-scale) - 10**-scale
    return -max_value, max_value
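As a worked check of the formula, a DECIMAL(5, 2) column has 3 integer digits and 2 fractional digits, so the largest value is 10^3 - 10^-2 = 999.99. The sketch below restates the same formula with integer exponents and adds a range check. The names `decimal_range` and `fits` are hypothetical, used only for this illustration.

```python
from decimal import Decimal
from typing import Tuple


def decimal_range(precision: int, scale: int) -> Tuple[Decimal, Decimal]:
    """Return (min_value, max_value) for a decimal with the given precision and scale."""
    max_value = Decimal(10) ** (precision - scale) - Decimal(10) ** -scale
    return -max_value, max_value


def fits(value: Decimal, precision: int, scale: int) -> bool:
    """Check only that `value` lies within the range; digit count is not inspected."""
    low, high = decimal_range(precision, scale)
    return low <= value <= high


print(decimal_range(5, 2))           # (Decimal('-999.99'), Decimal('999.99'))
print(fits(Decimal("1000"), 5, 2))   # False
```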
gyli / youtube-restyle.js
Created May 14, 2022 15:55
TamperMonkey Script: Youtube Restyle
// ==UserScript==
// @name Youtube Restyle
// @namespace
// @version 0.1
// @description Update Youtube player position to avoid unexpected scrolling
// @author Guangyang Li
// @match*
// @grant GM_addStyle
// @run-at document-start
// ==/UserScript==