# Quicksort algorithm
# Takes in a list and a pivot index and returns a partitioned list
comparisons = 0

def read_file(filename):
    with open(filename) as text_file:
        numbers = text_file.read().splitlines()
    num_array = []
    for number in numbers:
        num_array.append(int(number))
    return num_array
@lieuzhenghong
lieuzhenghong / telegram-bot-primer.md
Last active September 12, 2017 16:04
Telegram Bot Primer

So You Want to Build a Telegram Bot

What's a Telegram Bot and how does it work?

tl;dr

A Telegram Bot is a client that makes requests to the Telegram server and makes further requests depending on the Telegram server's response.

Harh?
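Concretely, that request/response loop can be sketched against the Bot API's `getUpdates` and `sendMessage` methods. The echo-style handler below is a placeholder of my own, not from the primer:

```python
import json
import urllib.request

API_URL = "https://api.telegram.org/bot{token}/{method}"

def build_reply(update):
    """Turn one Telegram update into a sendMessage payload (echo bot)."""
    msg = update.get("message", {})
    return {"chat_id": msg.get("chat", {}).get("id"),
            "text": "You said: " + msg.get("text", "")}

def poll_once(token, offset=0):
    """One getUpdates long-poll; returns the server's list of updates."""
    url = API_URL.format(token=token, method="getUpdates")
    url += f"?offset={offset}&timeout=30"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["result"]
```

The bot's main loop is just: call `poll_once`, pass each update through a handler like `build_reply`, POST the payload back to `sendMessage`, and advance `offset` past the last `update_id` so the server stops resending it.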

@lieuzhenghong
lieuzhenghong / bloom_filter_false_positives.ipynb
Created September 20, 2017 09:29
False positives of a Bloom filter
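Since the notebook does not render here, a minimal sketch of the idea it covers: a Bloom filter over m bits with k hash functions has an expected false-positive rate of roughly (1 - e^(-kn/m))^k after n insertions. The toy filter below is illustrative, not the notebook's code:

```python
import hashlib
import math

class BloomFilter:
    """Toy Bloom filter: m bits, k hashes derived from salted sha256."""
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _indices(self, item):
        # Derive k bit positions by salting the hash with 0..k-1
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for idx in self._indices(item):
            self.bits[idx] = True

    def __contains__(self, item):
        # May return True for items never added: a false positive
        return all(self.bits[idx] for idx in self._indices(item))

def theoretical_fp_rate(m, k, n):
    """Expected false-positive rate after n insertions: (1 - e^(-kn/m))^k."""
    return (1 - math.exp(-k * n / m)) ** k
```

Membership tests can lie (false positives) but never miss a real member, which is why the rate above only ever counts positives.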
@lieuzhenghong
lieuzhenghong / setup.md
Last active September 16, 2019 14:16
Setting up Apache Spark on the Raspberry Pi Cluster

Preliminaries

Have the spark folder in a directory of your choice.

For my master it was /home/lieu/dev/spark and for my slaves it was /home/pirate/spark.

Run the following export on the master, because the slaves and the master have their Spark folders in different directories (we'll make use of this later):

export SLAVE_SPARK_HOME=/home/pirate/spark/spark-2.4.4-bin-without-hadoop

@lieuzhenghong
lieuzhenghong / SimpleApp.scala
Created September 16, 2019 14:26
Scala code for doing data analysis
object SimpleApp {
  def main(args: Array[String]) {
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.explode
    val spark = SparkSession.builder.appName("TripAnalysis").getOrCreate()
    import spark.implicits._
    val results_path = "s3a://results/"
    val paths = "s3a://trips/*"
    val tripDF = spark.read.option("multiline", "true").json(paths)
    // "Explode" the data array into individual rows
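For readers unfamiliar with Spark's `explode`, a rough pure-Python analogue of what `explode($"data")` does — one output row per element of each record's `data` array — might look like this (the sample records below are made up, not from the trips dataset):

```python
def explode(rows, col):
    """Flatten rows so each element of rows[i][col] becomes its own row."""
    out = []
    for row in rows:
        for element in row[col]:
            new_row = dict(row)      # copy the other columns unchanged
            new_row[col] = element   # replace the array with one element
            out.append(new_row)
    return out
```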
@lieuzhenghong
lieuzhenghong / not_any_slower.py
Last active February 3, 2020 18:23
why is this code running slower and slower, but very similar code isn't?
def convert_point_distances_to_tract_distances(pttm, DM_PATH, SAVE_FILE_TO):
    '''Returns a JSON file giving driving durations from all points in the Census Tract
    to all other points in every other Census Tract.
    Takes in three things:
    1. A mapping from points to tracts (Dict[PointId: TractId])
    2. The distance matrix file path
    3. The path to save the output tract distance file to
    Reads line by line
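The rest of the gist is truncated, so purely as an illustration, the aggregation the docstring describes — streaming the distance matrix row by row and averaging point-to-point durations into tract-to-tract durations — could be sketched like this (the row format and function name here are assumptions, not the gist's code):

```python
from collections import defaultdict

def tract_durations(rows, pttm):
    """Average point-to-point durations up to tract-to-tract durations.

    rows: iterable of (point_id, [duration_to_point_0, duration_to_point_1, ...])
    pttm: mapping point_id -> tract_id
    Streaming one row at a time keeps memory use flat regardless of matrix size.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for point_id, durations in rows:
        src_tract = pttm[point_id]
        for other_id, duration in enumerate(durations):
            key = (src_tract, pttm[other_id])
            sums[key] += duration
            counts[key] += 1
    # Mean duration for each (source tract, destination tract) pair
    return {k: sums[k] / counts[k] for k in sums}
```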
cut_edges
213
204
219
220
289
294
253
236
274
# On a new EC2 instance:
# Setup locale
sudo locale-gen en_US en_US.UTF-8
sudo update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8
export LANG=en_US.UTF-8
# Setup sources
sudo apt update && sudo apt install curl gnupg2 lsb-release
curl -s https://raw.githubusercontent.com/ros/rosdistro/master/ros.asc | sudo apt-key add -