Umberto Griffo umbertogriffo
@umbertogriffo
umbertogriffo / HBaseRestore.rb
Created February 19, 2016 15:04
This code restores the snapshots of all HBase tables saved using the script HBaseBackup.rb (https://gist.github.com/umbertogriffo/fe1bce24f8e9ee68c75f). Tested on CDH-5.4.4-1.
# To run this script, launch the following command in a shell: hbase shell HBaseRestore.rb
include Java
java_import org.apache.hadoop.hbase.HBaseConfiguration
java_import org.apache.hadoop.hbase.client.HBaseAdmin
java_import org.apache.hadoop.hbase.snapshot.ExportSnapshot
java_import org.apache.hadoop.hbase.TableExistsException
java_import org.apache.hadoop.util.ToolRunner
@umbertogriffo
umbertogriffo / ObjectPool.java
Created June 28, 2016 08:01
Generic Java object pool with minimalistic code
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
/**
* @param <T>
*/
public abstract class ObjectPool<T> {
@umbertogriffo
umbertogriffo / Method1.java
Last active January 22, 2017 14:21
How to make the method run() of class NoThreadSafe thread-safe in Java
public class Method1 {
/*
Adding synchronized to this method makes it thread-safe.
When synchronized is added to a static method, the Class object is the object which is locked.
*/
public static void main(String[] args) throws InterruptedException {
ProcessingThreadS pt = new ProcessingThreadS();
Thread t1 = new Thread(pt, "t1");
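
The comment in the preview above says that a static synchronized method locks on the Class object. A minimal, self-contained sketch of that idea follows. The class name `SynchronizedCounter` and its methods are hypothetical and are not taken from the gist; they only illustrate the locking rule.

```java
public class SynchronizedCounter {

    // Shared state touched by both threads (hypothetical example field).
    private static int count = 0;

    // static + synchronized: the monitor that is locked is SynchronizedCounter.class,
    // so only one thread at a time can run this method.
    static synchronized void increment() {
        count++;
    }

    // Starts two threads that each call increment() perThread times
    // and returns the final counter value.
    static int runTwoThreads(int perThread) throws InterruptedException {
        count = 0;
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                increment();
            }
        };
        Thread t1 = new Thread(task, "t1");
        Thread t2 = new Thread(task, "t2");
        t1.start();
        t2.start();
        // join() guarantees that the writes of both threads are visible here.
        t1.join();
        t2.join();
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTwoThreads(100_000));
    }
}
```

Without the `synchronized` keyword, `count++` is a read-modify-write sequence that two threads can interleave, so updates may be lost. With it, both threads contend for the same class-level lock and every increment is applied.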
@umbertogriffo
umbertogriffo / TestPerformance.scala
Last active April 13, 2017 09:33
This Scala code compares the performance of Euclidean distance computations implemented with the map-reduce pattern, treeReduce, and treeAggregate.
import org.apache.commons.lang.SystemUtils
import org.apache.spark.mllib.random.RandomRDDs._
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}
import scala.math.sqrt
/**
* Created by Umberto on 08/02/2017.
*/
import java.util.*;
import java.util.Map.Entry;
import java.util.stream.Collectors;
/**
* Created by Umberto on 16/05/2017.
*/
public class HashMapUtils {
@umbertogriffo
umbertogriffo / JavaRddAPI.java
Created February 23, 2018 11:50
This is a collection of examples of Apache Spark's JavaRDD API. These examples aim to help me test the JavaRDD functionality.
package test.idlike.spark.datastructure;
import org.apache.commons.lang3.SystemUtils;
import org.apache.spark.api.java.JavaDoubleRDD;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.VoidFunction;
import scala.Tuple2;
@umbertogriffo
umbertogriffo / Transpose.scala
Created October 26, 2016 08:05
Utility methods to transpose an org.apache.spark.mllib.linalg.distributed.RowMatrix
def transposeRowMatrix(m: RowMatrix): RowMatrix = {
val transposedRowsRDD = m.rows.zipWithIndex.map{case (row, rowIndex) => rowToTransposedTriplet(row, rowIndex)}
.flatMap(x => x) // now we have triplets (newRowIndex, (newColIndex, value))
.groupByKey
.sortByKey().map(_._2) // sort rows and remove row indexes
.map(buildRow) // restore order of elements in each row and remove column indexes
new RowMatrix(transposedRowsRDD)
}
def rowToTransposedTriplet(row: Vector, rowIndex: Long): Array[(Long, (Long, Double))] = {
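
The comments in the preview above describe the transposition as: emit a triplet for every cell, group the triplets by their new row index, sort, and rebuild each row. The following is a rough plain-Java sketch of the same idea on an in-memory `double[][]`, with no Spark involved. The class `TransposeSketch` and its method are hypothetical and assume a rectangular, dense matrix; this is not the gist's implementation.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class TransposeSketch {

    // Transposes a rectangular matrix by grouping cells under their new row index.
    static double[][] transpose(double[][] rows) {
        // Key: new row index (the old column index).
        // Value: new column index (the old row index) -> cell value.
        // TreeMap keeps both levels sorted, which plays the role of
        // sortByKey and of restoring the element order inside each row.
        SortedMap<Integer, SortedMap<Integer, Double>> grouped = new TreeMap<>();
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < rows[r].length; c++) {
                grouped.computeIfAbsent(c, k -> new TreeMap<>()).put(r, rows[r][c]);
            }
        }

        // Drop the indexes and keep only the values, one array per new row.
        double[][] result = new double[grouped.size()][];
        int i = 0;
        for (SortedMap<Integer, Double> newRow : grouped.values()) {
            result[i++] = newRow.values().stream().mapToDouble(Double::doubleValue).toArray();
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] t = transpose(new double[][] {{1, 2, 3}, {4, 5, 6}});
        System.out.println(java.util.Arrays.deepToString(t));
    }
}
```

In the distributed version the grouping step is a shuffle, which is why the triplets carry both indexes: the original position of a value cannot be recovered from partition order alone.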
@umbertogriffo
umbertogriffo / falsehoods-programming-time-list.md
Created August 6, 2019 10:03 — forked from timvisee/falsehoods-programming-time-list.md
Falsehoods programmers believe about time, in a single list

Falsehoods programmers believe about time

This is a compiled list of falsehoods programmers tend to believe about working with time.

Don't re-invent a date time library yourself. If you think you understand everything about time, you're probably doing it wrong.

Falsehoods

  • There are always 24 hours in a day.
  • February is always 28 days long.
  • Any 24-hour period will always begin and end in the same day (or week, or month).
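
The falsehoods shown in this preview can be checked with `java.time`. The sketch below is only an illustration: the class `TimeFalsehoods`, its helper methods, and the choice of the `Europe/Rome` zone and of the sample dates are assumptions made for the example, not part of the forked list.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.Month;
import java.time.YearMonth;
import java.time.ZoneId;

public class TimeFalsehoods {

    // Number of hours between local midnight of the given date and the next local midnight.
    static long hoursInDay(LocalDate date, ZoneId zone) {
        return Duration.between(
                date.atStartOfDay(zone),
                date.plusDays(1).atStartOfDay(zone)).toHours();
    }

    // Number of days in February of the given year.
    static int daysInFebruary(int year) {
        return YearMonth.of(year, Month.FEBRUARY).lengthOfMonth();
    }

    // True if a 24-hour period starting at the given moment ends on the same calendar date.
    static boolean endsOnSameDate(LocalDateTime start) {
        return start.plusHours(24).toLocalDate().equals(start.toLocalDate());
    }

    public static void main(String[] args) {
        ZoneId rome = ZoneId.of("Europe/Rome");
        System.out.println(hoursInDay(LocalDate.of(2019, 3, 31), rome));  // 23: clocks go forward
        System.out.println(hoursInDay(LocalDate.of(2019, 10, 27), rome)); // 25: clocks go back
        System.out.println(daysInFebruary(2020));                         // 29: leap year
        System.out.println(endsOnSameDate(LocalDateTime.of(2019, 8, 6, 18, 0))); // false
    }
}
```

The helpers delegate all calendar rules to the standard library, in line with the advice above not to re-invent a date-time library.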
@umbertogriffo
umbertogriffo / Winner.java
Created February 15, 2017 09:02
Java 8 Streams Cookbook
package knowledgebase.java.stream;
import java.time.Duration;
import java.util.*;
import static java.util.stream.Collectors.*;
/**
* Created by Umberto on 15/02/2017.
* https://dzone.com/articles/a-java-8-streams-cookbook
@umbertogriffo
umbertogriffo / RddAPI.scala
Last active January 29, 2020 12:57
This is a collection of examples of Apache Spark's RDD API. These examples aim to help me test the RDD functionality.
/*
This is a collection of examples of Apache Spark's RDD API. These examples aim to help me test the RDD functionality.
References:
http://spark.apache.org/docs/latest/programming-guide.html
http://homepage.cs.latrobe.edu.au/zhe/ZhenHeSparkRDDAPIExamples.html
*/
object RddAPI {