Eric Luo lxneng

@lxneng
lxneng / latency_numbers.md
Created March 1, 2024 10:02 — forked from GLMeece/latency_numbers.md
Latency Numbers Every Programmer Should Know - MarkDown Fork

Latency Comparison Numbers

Note: "Forked" from Latency Numbers Every Programmer Should Know

| Event | Nanoseconds | Microseconds | Milliseconds | Comparison |
| --- | ---: | ---: | ---: | --- |
| L1 cache reference | 0.5 | - | - | - |
| Branch mispredict | 5.0 | - | - | - |
| L2 cache reference | 7.0 | - | - | 14x L1 cache |
| Mutex lock/unlock | 25.0 | - | - | - |
@lxneng
lxneng / latency.txt
Created March 1, 2024 09:58 — forked from jboner/latency.txt
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
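
The unit conversions and the "x L1 cache" comparisons in the table can be reproduced with a few lines of Python. This is only an illustrative sketch: the dictionary copies a subset of the figures listed above, and the names `LATENCY_NS`, `ns_to_us`, and `times_slower_than_l1` are made up for this example.

```python
# Latencies in nanoseconds, copied from the table above (~2012 figures).
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 1K bytes over 1 Gbps network": 10_000,
    "Read 4K randomly from SSD": 150_000,
}


def ns_to_us(ns: float) -> float:
    """Convert nanoseconds to microseconds."""
    return ns / 1_000


def times_slower_than_l1(event: str) -> float:
    """Return how many times slower `event` is than an L1 cache reference."""
    return LATENCY_NS[event] / LATENCY_NS["L1 cache reference"]


if __name__ == "__main__":
    for event, ns in LATENCY_NS.items():
        ratio = times_slower_than_l1(event)
        print(f"{event:<36} {ns:>10,.1f} ns  {ratio:>10,.0f}x L1")
```

Expressing every entry relative to one baseline makes the orders of magnitude easier to compare than the raw nanosecond values.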
import streamlit as st
import os
import sys
import importlib.util

# Parse command-line arguments.
if len(sys.argv) > 1:
    folder = os.path.abspath(sys.argv[1])
else:
    folder = os.path.abspath(os.getcwd())
@lxneng
lxneng / notes.md
Created November 24, 2022 15:59 — forked from ian-whitestone/notes.md
Best practices for presto sql

Presto Specific

  • Don't use SELECT *; name the columns you need explicitly (columnar store, so only the columns you name are read)
  • Avoid large JOINs (filter each table first)
    • In Presto, tables are joined in the order they are listed!
    • Join small tables earlier in the plan and leave larger fact tables to the end
    • Avoid cross joins and one-to-many joins, as these can degrade performance
  • ORDER BY and GROUP BY are expensive
    • Only use ORDER BY in subqueries if it is really necessary
  • When using GROUP BY, order the columns from the highest cardinality (that is, the most unique values) to the lowest.
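
The GROUP BY tip above can be sketched in Python as a small helper that sorts columns by their distinct-value counts before building the clause. This is a hypothetical illustration, not part of the original notes: the function name `group_by_clause`, the column names, and the counts are all assumptions, and in practice the counts might come from running something like `approx_distinct()` on each column.

```python
def group_by_clause(distinct_counts: dict[str, int]) -> str:
    """Build a GROUP BY clause with the highest-cardinality column first."""
    # Sort column names by their distinct-value count, largest first.
    ordered = sorted(distinct_counts, key=distinct_counts.get, reverse=True)
    return "GROUP BY " + ", ".join(ordered)


# Hypothetical distinct-value counts for three columns.
counts = {"country": 200, "user_id": 5_000_000, "device_type": 4}
clause = group_by_clause(counts)
print(clause)  # GROUP BY user_id, country, device_type
```

The helper only orders the column names; it does not validate them or handle an empty input, which a real query builder would need to do.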
@lxneng
lxneng / sidekiq_cheat_sheet.md
Created November 16, 2021 07:10 — forked from wakproductions/sidekiq_cheat_sheet.md
Sidekiq Commands Cheat Sheet
@lxneng
lxneng / mysql-big-deletes.py
Last active August 24, 2021 03:22
Delete millions of rows from MySQL/TiDB
import os
import pymysql

if __name__ == '__main__':
    ret = 1
    conn = pymysql.connect(
        host='tidb-cluster.dm',
        port=4000,
        user='dm',
@lxneng
lxneng / install-azkaban.md
Created January 19, 2021 05:54 — forked from greenqy/install-azkaban.md
install-azkaban.md
@lxneng
lxneng / LearnGoIn5mins.md
Created January 15, 2021 02:57 — forked from prologic/LearnGoIn5mins.md
Learn Go in ~5mins
@lxneng
lxneng / StreamCatsToHBase.py
Created August 20, 2020 14:21 — forked from MallikarjunaG/StreamCatsToHBase.py
PySpark HBase and Spark Streaming: Save RDDs to HBase - http://cjcroix.blogspot.in/
import sys
import json
from pyspark import SparkContext
from pyspark.streaming import StreamingContext


def SaveRecord(rdd):
    host = 'sparkmaster.example.com'
    table = 'cats'
    keyConv = "org.apache.spark.examples.pythonconverters.StringToImmutableBytesWritableConverter"