Jinchao abrocod

  • Google
  • New York, NY
@jobliz
jobliz / RedisPythonPubSub1.py
Created May 4, 2012 17:58
A short script exploring Redis pubsub functions in Python
import redis
import threading

class Listener(threading.Thread):
    """Listen on the given Redis pub/sub channels in a background thread."""
    def __init__(self, r, channels):
        threading.Thread.__init__(self)
        self.redis = r
        self.pubsub = self.redis.pubsub()
        self.pubsub.subscribe(channels)

    def run(self):
        # The gist is truncated here; this loop is a minimal completion that
        # blocks on the subscription and prints each incoming message.
        for item in self.pubsub.listen():
            print(item['channel'], item['data'])
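As a usage sketch that is not part of the original gist: connect to a local Redis server, start the listener on a channel, and publish a couple of messages. The host, port, and channel name 'test' are assumptions.

# Usage sketch (assumes a Redis server on localhost:6379 and the Listener class above).
if __name__ == '__main__':
    r = redis.Redis(host='localhost', port=6379, db=0)
    client = Listener(r, ['test'])   # 'test' is an arbitrary channel name
    client.start()                   # runs Listener.run() in a background thread

    r.publish('test', 'hello')       # the listener thread prints each message
    r.publish('test', 'world')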
@bonzanini
bonzanini / create_index.sh
Last active May 25, 2023 23:35
Elasticsearch/Python test
curl -XPOST http://localhost:9200/test/articles/1 -d '{
"content": "The quick brown fox"
}'
curl -XPOST http://localhost:9200/test/articles/2 -d '{
"content": "What does the fox say?"
}'
curl -XPOST http://localhost:9200/test/articles/3 -d '{
"content": "The quick brown fox jumped over the lazy dog"
}'
curl -XPOST http://localhost:9200/test/articles/4 -d '{
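The fourth request is cut off in the listing above. As a minimal follow-up sketch (not part of the gist): query the index from Python, assuming Elasticsearch is reachable on localhost:9200 and using the requests library rather than an official client.

# es_search_test.py -- sketch only; assumes Elasticsearch on localhost:9200
# and the documents created by create_index.sh above
import requests

query = {"query": {"match": {"content": "fox"}}}
resp = requests.post("http://localhost:9200/test/articles/_search", json=query)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["content"])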
@snowindy
snowindy / spark-create-rdd-from-s3-parallel.scala
Last active September 24, 2020 12:25
This code allows parallel loading of data from S3 into a Spark RDD. It supports multiple input paths. Based on http://tech.kinja.com/how-not-to-pull-from-s3-using-apache-spark-1704509219
import com.amazonaws.services.s3._, model._
import com.amazonaws.auth.BasicAWSCredentials
import com.amazonaws.services.s3.model.ObjectListing
import scala.collection.JavaConverters._
import scala.io.Source

// Comma-separated list of S3 files and/or directories to load in parallel.
val s3Paths = "s3://yourbucket/path/to/file1.txt,s3://yourbucket/path/to/directory"
// Page size (max keys per S3 listing request).
val pageLength = 100
// AWS credentials (placeholders).
val key = "YOURKEY"
val secret = "YOUR_SECRET"
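The gist continues beyond what is shown here. As a hedged sketch of the same idea in Python (PySpark plus boto3), not the gist author's code: list the object keys on the driver, then spread the reads across the cluster with parallelize. The bucket and prefix names and the helper read_key are illustrative assumptions.

# PySpark sketch of parallel S3 loading (assumes boto3 credentials and a SparkContext `sc`).
import boto3

def list_keys(bucket, prefix):
    # Collect all object keys under a prefix (paginated listing, runs on the driver).
    s3 = boto3.client('s3')
    keys = []
    for page in s3.get_paginator('list_objects_v2').paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj['Key'] for obj in page.get('Contents', []))
    return keys

def read_key(bucket, key):
    # Read one object's text; runs on the executors.
    s3 = boto3.client('s3')
    body = s3.get_object(Bucket=bucket, Key=key)['Body'].read().decode('utf-8')
    return body.splitlines()

bucket = 'yourbucket'                          # illustrative name
keys = list_keys(bucket, 'path/to/directory')  # listing happens on the driver
rdd = sc.parallelize(keys, max(1, len(keys))).flatMap(lambda k: read_key(bucket, k))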
@evenv
evenv / Spark Dataframe Cheat Sheet.py
Last active June 24, 2022 23:45
Cheat sheet for Spark Dataframes (using Python)
# A simple cheat sheet of Spark Dataframe syntax
# Current for Spark 1.6.1
# import statements
from pyspark.sql import SQLContext
from pyspark.sql.types import *
from pyspark.sql.functions import *
# creating dataframes
df = sqlContext.createDataFrame([(1, 4), (2, 5), (3, 6)], ["A", "B"]) # from manual data
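A few follow-up entries, added here as a sketch rather than part of the original cheat sheet: they operate on the df defined above (columns A and B) and use standard Spark 1.6 DataFrame methods; avg comes from the wildcard import of pyspark.sql.functions.

# basic operations on the df defined above (sketch, Spark 1.6-era API)
df.show()                                   # print the first rows
df.printSchema()                            # inspect column names and types
df.select("A").show()                       # project a column
df.filter(df.A > 1).show()                  # filter rows
df.withColumn("C", df.A + df.B).show()      # derive a new column
df.groupBy("A").agg(avg("B")).show()        # aggregate with a grouping key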