val hdfs: org.apache.hadoop.fs.FileSystem =
  org.apache.hadoop.fs.FileSystem.get(
    new org.apache.hadoop.conf.Configuration())
val hadoopPath = new org.apache.hadoop.fs.Path("hdfs://localhost:9000/tmp")
val recursive = false
val ri = hdfs.listFiles(hadoopPath, recursive)
// Wrap Hadoop's RemoteIterator[LocatedFileStatus] in a Scala Iterator so the
// standard collection operations apply.
val it = new Iterator[org.apache.hadoop.fs.LocatedFileStatus]() {
  override def hasNext = ri.hasNext
  override def next() = ri.next()
}
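A quick usage sketch (my addition, assuming the HDFS URI above is reachable): once the RemoteIterator is wrapped, the listing can be consumed with the ordinary Scala collection API.

// Hypothetical usage of the wrapped iterator: print each file's path and size.
it.foreach { status =>
  println(s"${status.getPath} (${status.getLen} bytes)")
}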
@jaceklaskowski
jaceklaskowski / spark-jobserver-docker-macos.md
Last active August 1, 2018 11:28
How to run spark-jobserver on Docker and Mac OS (using docker-machine)
@mp911de
mp911de / Client.java
Created September 24, 2015 08:18
JMX Monitoring Demo of lettuce 3.4-SNAPSHOT
package com.lambdaworks.redis.experimental.mbean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import javax.management.JMException;
import javax.management.MBeanServer;
@MLnick
MLnick / HyperLogLogStoreUDAF.scala
Last active March 16, 2022 05:31
Experimenting with Spark SQL UDAF: a HyperLogLog UDAF for distinct counts that stores the actual HLL for each row to allow further aggregation
class HyperLogLogStoreUDAF extends UserDefinedAggregateFunction {

  override def inputSchema = new StructType()
    .add("stringInput", BinaryType)

  override def update(buffer: MutableAggregationBuffer, input: Row) = {
    // This input Row only has a single column storing the input value in String (or other Binary data).
    // We only update the buffer when the input value is not null.
    if (!input.isNullAt(0)) {
      if (buffer.isNullAt(0)) {
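The preview cuts off inside update. As a rough sketch of how such a UDAF is wired up (assuming the full class from the gist compiles, a SQLContext, and a DataFrame df with a grouping column key and a binary column value; the registration name hllStore is my assumption, not the gist's):

// Sketch only: register the UDAF and use it in an aggregation.
import org.apache.spark.sql.functions.callUDF

sqlContext.udf.register("hllStore", new HyperLogLogStoreUDAF)

// One serialized HLL per group; the stored HLLs can be merged again in a later aggregation.
val hllPerKey = df.groupBy("key")
  .agg(callUDF("hllStore", df("value")).as("hll"))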
package testGeneric

import scala.language.higherKinds
import scalaz.Functor

object TestLabelledGeneric {
  case class Ahoy(name: String, y: Int, l: Int)
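Only the header of this gist survives in the preview. As a minimal sketch of what LabelledGeneric typically provides for a case class like Ahoy (shapeless assumed on the classpath; this usage is mine, not necessarily the gist's):

import shapeless.LabelledGeneric

// Sketch only: derive the labelled generic (record) representation of Ahoy.
val gen  = LabelledGeneric[Ahoy]
val rec  = gen.to(Ahoy("ship", 1, 2))   // record HList keyed by the field names
val ahoy = gen.from(rec)                // back to Ahoy("ship", 1, 2)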
@mp911de
mp911de / JedisCluster.java
Last active November 27, 2015 16:50
Connecting a Redis Cluster
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;
import java.util.HashSet;
import java.util.Set;
public class JedisCluster {

    public static void main(String[] args) {
        Set<HostAndPort> connectionPoints = new HashSet<HostAndPort>();
@ezhulenev
ezhulenev / spark-thred-safe.scala
Created August 11, 2015 22:16
Thread-safe Spark SQL context
object ServerSparkContext {
  private[this] lazy val _sqlContext = {
    val conf = new SparkConf()
      .setAppName("....")
    val sc = new SparkContext(conf)
    // TODO: Bug in Spark: http://stackoverflow.com/questions/30323212
    val ctx = new HiveContext(sc)
    ctx.setConf("spark.sql.hive.convertMetastoreParquet", "false")
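The preview stops before the context is exposed. A minimal self-contained sketch of the pattern, where the accessor name and app name are my assumptions:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ServerSparkContextSketch {
  // `lazy val` initialisation is synchronised by the Scala runtime, so concurrent
  // first accesses from different server threads still create exactly one context.
  private[this] lazy val _sqlContext: HiveContext = {
    val conf = new SparkConf().setAppName("server")   // app name assumed
    val sc = new SparkContext(conf)
    // Workaround noted in the gist: http://stackoverflow.com/questions/30323212
    val ctx = new HiveContext(sc)
    ctx.setConf("spark.sql.hive.convertMetastoreParquet", "false")
    ctx
  }

  // Accessor name is an assumption; callers on any thread share the same HiveContext.
  def sqlContext: HiveContext = _sqlContext
}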
@mp911de
mp911de / MyExtendedRedisClient.java
Created August 11, 2015 06:11
Implementing "cancel commands while disconnected" for https://github.com/mp911de/lettuce/issues/115
import javax.enterprise.inject.Alternative;
import com.lambdaworks.redis.RedisClient;
import com.lambdaworks.redis.RedisURI;
import com.lambdaworks.redis.StatefulRedisConnectionImpl;
import com.lambdaworks.redis.codec.RedisCodec;
import com.lambdaworks.redis.protocol.CommandHandler;
import com.lambdaworks.redis.pubsub.PubSubCommandHandler;
import com.lambdaworks.redis.pubsub.StatefulRedisPubSubConnectionImpl;
@mbostock
mbostock / .block
Last active September 13, 2018 09:12
Smallest Enclosing Circle
license: gpl-3.0
@harlow
harlow / golang_job_queue.md
Last active April 24, 2024 10:21
Job queues in Golang