Andrea Rota andrearota

@andrearota
andrearota / main.py
Last active March 22, 2019 10:56
Apache Spark issues with optimizer when using PySpark UDF and complex types
from pyspark.sql import SparkSession, functions as f
from pyspark.sql.types import IntegerType
import sys
if __name__ == '__main__':
    spark = SparkSession.builder.getOrCreate()
    print('Spark version:', spark.version)
    print('Python version:', sys.version)
andrearota / start_docker_vm.bat
Created December 15, 2017 14:37
Autostart Docker Machine and a Docker container in Windows <10
@echo on
REM Script that boots the Docker machine and starts the container at startup
REM This file should be placed in the Windows automatic execution (Startup) folder
REM Set the path of Docker tools
set PATH=%PATH%;"C:\Program Files\Docker Toolbox\"
REM Set the name of the VM configuration where Docker daemon will be hosted
REM The docker machine below should exist!
andrearota / example.scala
Created October 18, 2016 08:40
Creating Spark UDF with extra parameters via currying
// Problem: creating a Spark UDF that takes extra parameters at invocation time.
// Solution: using currying
// http://stackoverflow.com/questions/35546576/how-can-i-pass-extra-parameters-to-udfs-in-sparksql
// We want to create hideTabooValues, a Spark UDF that sets to -1 any field containing one of the given taboo values.
// E.g. forbiddenValues = [1, 2, 3]
// dataframe = [1, 2, 3, 4, 5, 6]
// dataframe.select(hideTabooValues(forbiddenValues)) :> [-1, -1, -1, 4, 5, 6]
//
// Implementing this in Spark, we find two major issues:
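Setting the Spark-specific issues aside, the currying idea described in the comments above can be sketched in plain Python. This is only an illustration of the technique, not the gist's Scala code: there is no Spark involved, and the names `hide_taboo_values`, `hide`, and `demo` are hypothetical. The outer function fixes the extra parameter (the forbidden values) once and returns a one-argument function that is then applied to each value, the way a UDF is applied to each row.

```python
from typing import Callable, Iterable, List


def hide_taboo_values(forbidden_values: Iterable[int]) -> Callable[[int], int]:
    """Return a one-argument function that maps any forbidden value to -1."""
    # The extra parameter is captured once, when the function is built.
    taboo = frozenset(forbidden_values)

    def hide(value: int) -> int:
        # Applied per value, the way a UDF is applied per row.
        return -1 if value in taboo else value

    return hide


def demo() -> List[int]:
    # Same data as in the comments: forbiddenValues = [1, 2, 3]
    hide = hide_taboo_values([1, 2, 3])
    return [hide(v) for v in [1, 2, 3, 4, 5, 6]]


print(demo())  # [-1, -1, -1, 4, 5, 6]
```

Splitting the arguments this way keeps the per-value function at a single parameter, which is the shape a column-wise UDF expects.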
andrearota / Sqoop.md
Last active October 11, 2016 07:41
Examples for CCA175

Handling null values

For string columns:

--null-string <null-string>	The string to be written for a null value for string columns

For non-string columns:

--null-non-string <null-string>	The string to be written for a null value for non-string columns

If neither option is specified, null values are written as the string "null".
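As a rough sketch of how the two options fit into an import command, the Python helper below assembles a `sqoop import` argument list. The helper name `sqoop_import_args`, the JDBC URL, the table name, the target directory, and the placeholder values `NA` and `0` are all assumptions made for illustration; only the Sqoop flags themselves come from the notes above and the standard import options.

```python
import shlex
from typing import List


def sqoop_import_args(connect: str, table: str, target_dir: str,
                      null_string: str, null_non_string: str) -> List[str]:
    """Build the argument list for a Sqoop import with explicit null handling."""
    return [
        "sqoop", "import",
        "--connect", connect,
        "--table", table,
        "--target-dir", target_dir,
        # Written in place of NULL for string columns.
        "--null-string", null_string,
        # Written in place of NULL for non-string columns.
        "--null-non-string", null_non_string,
    ]


# Hypothetical connection details and placeholder null markers.
args = sqoop_import_args(
    connect="jdbc:mysql://localhost/example_db",
    table="customers",
    target_dir="/user/example/customers",
    null_string="NA",
    null_non_string="0",
)

# Print a shell-quoted command line for inspection (requires Python 3.8+).
print(shlex.join(args))
```

Building the command as a list and quoting it only for display avoids shell-escaping mistakes when the null markers contain special characters.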

andrearota / gist:82873d2d6e5ae21a03fd79fc7ebaf170
Created September 22, 2016 22:11
How to extract srt subtitles from an mkv file
# Find all the .mkv files in the current folder and, for each one, extract the srt subtitles with ffmpeg
# We assume that the srt track is stream 0:2; change it according to your file
find . -name "*.mkv" -exec ffmpeg -i {} -map 0:2 {}.srt \;
# Now rename the .mkv.srt to .srt, in order to have them automatically loaded by VLC and other players
for j in *.mkv.srt; do mv -v -- "$j" "${j%.mkv.srt}.srt"; done
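The rename step relies on the shell expansion `${j%.mkv.srt}`, which strips the `.mkv.srt` suffix before `.srt` is appended. A minimal Python sketch of the same renaming rule follows; the function names `srt_name` and `rename_all` are hypothetical, and `rename_all` is defined but deliberately not called here so that the example touches no files.

```python
from pathlib import Path


def srt_name(filename: str) -> str:
    """Map 'movie.mkv.srt' to 'movie.srt'; leave other names unchanged."""
    suffix = ".mkv.srt"
    if filename.endswith(suffix):
        return filename[: -len(suffix)] + ".srt"
    return filename


def rename_all(folder: str = ".") -> None:
    """Rename every *.mkv.srt file in the folder (not in subfolders)."""
    for path in Path(folder).glob("*.mkv.srt"):
        path.rename(path.with_name(srt_name(path.name)))


print(srt_name("movie.mkv.srt"))  # movie.srt
```

Like the shell loop, `rename_all` only looks at the given folder, whereas the `find` command above also descends into subfolders.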
andrearota / ParsingJson.scala
Last active September 2, 2016 08:42
Using Lift Json libraries to parse Json
import net.liftweb.json._
case class Entry(name: String, job: String, scores: Array[Double], weights: Array[Double])
object ParsingJson {
  def main(args: Array[String]) = {
    implicit val formats = DefaultFormats
andrearota / keybase.md
Created August 18, 2016 22:34
My verification snippet for keybase.io

Keybase proof

I hereby claim:

  • I am andrearota on github.
  • I am arota (https://keybase.io/arota) on keybase.
  • I have a public key whose fingerprint is AE75 DD6A 531D 4FFB 4042 6E81 28E3 8D6F 7239 5237

To claim this, I am signing this object:

andrearota / asound.conf
Last active November 13, 2016 09:38
Shairport and alsa configuration to make a Raspberry Pi able to receive AirPlay sound streams, equalize the sound with ALSA equal plugin and play them either on HDMI audio interface or Raspberry PWM analog output
# File: /etc/asound.conf
# .alsaequal.bin is created automatically when tuning the equalizer with "alsamixer -D equal"
ctl.equal {
    type equal;
    controls "/home/pi/.alsaequal.bin"
}
pcm.plugequal {
    type equal;
andrearota / gist:8211153
Created January 1, 2014 20:18
Script to be run inside a Chrome JS console to delete all the messages and conversations from your Facebook inbox.
Instructions:
1 - Open your Facebook inbox folder https://www.facebook.com/messages/
2 - Open the Chrome JS console (ALT+CMD+J on Mac OS X)
3 - Paste the script and press Enter
The script will start to open every conversation in your…
setInterval(function() {
  document.getElementsByClassName('_k_')[0].click();
  document.getElementsByClassName('uiSelectorButton uiButton uiButtonOverlay')[0].click();
  document.getElementsByClassName('itemAnchor')[7].click();
  document.getElementsByName('delete_conversation')[0].click();
}, 500);