David Portabella dportabella

  • Lausanne, Switzerland
@dportabella
dportabella / ExampleExecuteScalaFuturesInSerial.scala
Created September 13, 2016 20:16
Example of how to execute Scala futures serially, one after the other, without collecting their results
/*
Example of how to execute Scala futures serially, one after the other, without collecting their results.
See this gist instead if you need to collect the results of the futures (it also explains how foldLeft works here):
https://gist.github.com/dportabella/4e7569643ad693433ec6b86968f589b8
*/
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
dportabella / TestJKS.java
Last active January 26, 2024 19:34
Test your JKS file easily with java -Djavax.net.ssl.trustStore=your_trust_store.jks TestJKS <url> [<user> <password>]
/*
Test your JKS file easily.
You have created a Java JKS trust store file to access a web service with a certificate, and you want to test whether it works?
Some colleagues test this by deploying the JKS to the application server (Tomcat, WebLogic...), restarting the server and manually running tests,
and repeating this procedure until the JKS is properly created.
You can speed up this test by using this simple Java program:
> javac TestJKS.java
dportabella / ExampleExecuteScalaFuturesInSerial.scala
Last active June 1, 2023 20:40
Explanation of how to execute Scala futures serially, one after the other
/*
Execute Scala futures serially, one after the other.
This gist explains the solution given in
http://www.michaelpollmeier.com/execute-scala-futures-in-serial-one-after-the-other-non-blocking
The three examples produce the same result:
---
done: 10
done: 20
done: 30
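
The preview above is truncated, so the following is only a minimal sketch of the general technique, not the gist's actual code. It assumes a hypothetical `work` function and a hypothetical `log` callback, and chains the futures with `foldLeft` and `flatMap` so that each one starts only after the previous one has completed.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object SerialFutures {
  // Hypothetical unit of work: sleeps briefly, then reports the item via `log`.
  def work(n: Int, log: Int => Unit): Future[Unit] = Future {
    Thread.sleep(50)
    log(n)
  }

  // Start from an already-completed future and chain each item onto it.
  // flatMap creates the next future only after the previous one succeeds,
  // so the items run one after the other and no results are collected.
  def runInSerial(items: List[Int], log: Int => Unit): Future[Unit] =
    items.foldLeft(Future.successful(())) { (previous, item) =>
      previous.flatMap(_ => work(item, log))
    }

  def main(args: Array[String]): Unit =
    // Blocking here is only for the demo, so the JVM does not exit early.
    Await.result(runInSerial(List(10, 20, 30), n => println(s"done: $n")), Duration.Inf)
}
```

Because `Future { ... }` is only invoked inside `flatMap`, no future is created eagerly; building the list of futures up front would start them all at once.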
dportabella / PomDependenciesToSbt
Last active May 7, 2022 16:25
Script to convert Maven dependencies (and exclusions) from a pom.xml to sbt dependencies. Or run it online at http://goo.gl/wnHCjE
#!/usr/bin/env amm
// This script converts Maven dependencies from a pom.xml to sbt dependencies.
// It is based on the answers of George Pligor and Mike Slinn on http://stackoverflow.com/questions/15430346/
// - install https://github.com/lihaoyi/Ammonite
// - make this script executable: chmod +x PomDependenciesToSbt
// - run it from your shell (e.g. bash):
// $ ./PomDependenciesToSbt /path/to/pom.xml
import scala.xml._
dportabella / ExampleScalaAck.scala
Created August 22, 2014 22:46
This example Scala script applies a regex to all files recursively. It uses the Apache Tika UniversalEncodingDetector to keep only text files, and a regex to find all lines containing the word "super", except where it is part of the larger word "superstition" or "supernatural".
import java.io.File
import org.apache.tika.detect._
import org.apache.tika.metadata._
import org.apache.tika.mime._
import org.apache.tika.io._
import org.apache.tika.parser.txt._
import resource._
def recursiveListFiles(f: File): List[File] = {
  val these = f.listFiles.toList
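
The matching rule in the description can be expressed with a negative lookahead. The sketch below is an assumption about how such a regex might look, not the pattern used in the gist (the preview is cut off before the regex appears); `SuperMatcher` and `matches` are hypothetical names.

```scala
object SuperMatcher {
  // "super" that is NOT immediately followed by "stition" or "natural".
  private val pattern = "super(?!stition|natural)".r

  // True if the line contains at least one qualifying occurrence of "super".
  def matches(line: String): Boolean = pattern.findFirstIn(line).isDefined

  def main(args: Array[String]): Unit = {
    val lines = List(
      "a superb day",
      "an old superstition",
      "supernatural and superhuman"
    )
    // Print only the lines that qualify.
    lines.filter(matches).foreach(println)
  }
}
```

A line that mentions "supernatural" still matches if it also contains another qualifying "super", since the lookahead is evaluated per occurrence rather than per line.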
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>david.rundeck</string>
<key>ProgramArguments</key>
<array>
<string>/Users/david/bin/rundeck/server/sbin/rundeck_launchd</string>
</array>
dportabella / dist.scala
Last active March 24, 2017 19:14
compute distance in km between two postal codes
// using build.sbt: libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided"
// using Ammonite: import $ivy.`org.apache.sis.core:sis-referencing:0.7`, org.apache.sis.distance.DistanceUtils
case class Coordinates(lat: Double, lon: Double)
def readCoordinates(file: String): Map[String, Coordinates] = {
  def parseLine(line: String): (String, Coordinates) = {
    val c = line.split("\t")
    (c(0) + "-" + c(1), Coordinates(c(9).toDouble, c(10).toDouble))
  }
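
The header comment suggests the gist relies on Apache SIS for the distance itself, and the preview is truncated before that call. As a dependency-free illustration only, the sketch below swaps in the haversine formula on a sphere; the `Haversine` object, `distanceKm`, and the mean Earth radius of 6371 km are assumptions, not part of the original.

```scala
object Haversine {
  // Same shape as the Coordinates case class in the gist, in degrees.
  final case class Coordinates(lat: Double, lon: Double)

  // Assumed mean Earth radius in kilometres (spherical approximation).
  val EarthRadiusKm = 6371.0

  // Great-circle distance in km between two points, using the haversine formula.
  def distanceKm(a: Coordinates, b: Coordinates): Double = {
    val dLat = math.toRadians(b.lat - a.lat)
    val dLon = math.toRadians(b.lon - a.lon)
    val h = math.pow(math.sin(dLat / 2), 2) +
      math.cos(math.toRadians(a.lat)) * math.cos(math.toRadians(b.lat)) *
        math.pow(math.sin(dLon / 2), 2)
    2 * EarthRadiusKm * math.asin(math.sqrt(h))
  }

  def main(args: Array[String]): Unit = {
    // Approximate coordinates, used purely as sample input.
    val lausanne = Coordinates(46.5197, 6.6323)
    val geneva   = Coordinates(46.2044, 6.1432)
    println(f"${distanceKm(lausanne, geneva)}%.1f km")
  }
}
```

With a map like the one `readCoordinates` returns, the distance between two postal codes would be a lookup of both keys followed by one `distanceKm` call. A spherical model can differ from an ellipsoidal one by a few tenths of a percent.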
dportabella / FilterArchive.scala
Created February 8, 2017 09:39
Example to filter a WARC archive using Spark and storing the result back to a WARC archive
package application
import java.io._
import java.util
import org.apache.spark.rdd.RDD
import org.archive.format.warc.WARCConstants.WARCRecordType
import org.archive.io.warc.WARCRecordInfo
import org.warcbase.spark.archive.io.ArchiveRecord
import org.warcbase.spark.matchbox.RecordLoader
dportabella / DeserializeHadoopSequenceFileWithoutClassDeclaration.scala
Last active November 8, 2016 22:32
How to deserialize a Hadoop result sequence file outside Hadoop (or the output of a Spark saveAsObjectFile outside Spark) without having the class declaration
// resolvers += "dportabella-3rd-party-mvn-repo-releases" at "https://github.com/dportabella/3rd-party-mvn-repo/raw/master/releases/"
// libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.7.3"
// libraryDependencies += "com.github.dportabella.3rd-party-mvn-repo" % "jdeserialize" % "1.0.0",
import java.io._
import org.apache.hadoop.conf._
import org.apache.hadoop.fs._
import org.apache.hadoop.io._
import org.unsynchronized.jdeserialize
dportabella / deserialize_hadoop_sequence_file.scala
Last active November 8, 2016 21:42
How to deserialize a Hadoop result sequence file outside Hadoop (or the output of a Spark saveAsObjectFile outside Spark)
// libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.7.3"
import java.io.{ByteArrayInputStream, ObjectInputStream}
import org.apache.hadoop.conf._
import org.apache.hadoop.fs._
import org.apache.hadoop.io._
val f = "/path/to/part-00000"
val reader = new SequenceFile.Reader(new Configuration(), SequenceFile.Reader.file(new Path(f)))