Ruben Berenguel rberenguel
@rberenguel
rberenguel / schemaAsScala.scala
Last active March 8, 2018 14:57
Converts a DataFrame schema output (StructType) into something you can paste as a schema, as valid Scala code. Useful when working with Apache Zeppelin or any other REPL.
import org.apache.spark.sql.types._ // You'll need this to evaluate its output

object StructFmt {
  def asScala(field: StructField): String = field.dataType match {
    case struct: StructType => s"""StructField("${field.name}",""" + asScala(struct) + s", ${field.nullable})"
    case _ => s"""StructField("${field.name}", ${field.dataType}, ${field.nullable})"""
  }

  def asScala(struct: StructType): String =
    "StructType(Seq(" + (for (field <- struct) yield asScala(field)).mkString(",") + "))"
}
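For readers coming from PySpark, the same idea can be sketched in plain Python over the dictionary form of a schema (the shape `df.schema.jsonValue()` returns). This is only an illustration, not part of the gist: `PRIMITIVES`, `type_as_scala`, `field_as_scala` and `as_scala` are hypothetical names, and only a few primitive types plus nested structs are covered.

```python
# Hypothetical helpers: render a Spark schema, given as a JSON-style dict,
# as Scala code. Only a few primitive types and nested structs are handled.
PRIMITIVES = {
    "string": "StringType",
    "integer": "IntegerType",
    "long": "LongType",
    "double": "DoubleType",
    "boolean": "BooleanType",
}


def type_as_scala(data_type):
    """Render one data type; nested structs recurse into as_scala."""
    if isinstance(data_type, dict) and data_type.get("type") == "struct":
        return as_scala(data_type)
    if isinstance(data_type, str) and data_type in PRIMITIVES:
        return PRIMITIVES[data_type]
    raise ValueError(f"type not covered by this sketch: {data_type!r}")


def field_as_scala(field):
    """Render a single field as a StructField(...) expression."""
    nullable = "true" if field["nullable"] else "false"
    return f'StructField("{field["name"]}", {type_as_scala(field["type"])}, {nullable})'


def as_scala(struct):
    """Render a struct as StructType(Seq(...)), fields joined with commas."""
    fields = ",".join(field_as_scala(f) for f in struct["fields"])
    return f"StructType(Seq({fields}))"


schema = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "string", "nullable": False, "metadata": {}},
        {"name": "value", "type": "integer", "nullable": True, "metadata": {}},
    ],
}
print(as_scala(schema))
```

Types outside the table raise an error instead of producing code that might not compile, which seemed the safer choice for a sketch.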

Keybase proof

I hereby claim:

  • I am rberenguel on github.
  • I am rberenguel (https://keybase.io/rberenguel) on keybase.
  • I have a public key ASDoKc_uTbL4lQBaJ4bNh9G-5OkiK2gSxIFF6h0xvuMy8wo

To claim this, I am signing this object:

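import org.apache.spark.sql.{DataFrame, Dataset}
import spark.implicits._ // for .toDS and $"..."; assumes a SparkSession named `spark` (spark-shell and Zeppelin provide one)
case class Foo(id: String, value: Int)
case class Bar(theId: String, value: Int)
val ds = List(Foo("Alice", 42), Foo("Bob", 43)).toDS
// Rename the column so the schema matches Bar, then convert to a typed Dataset
val renamedDF: DataFrame = ds.select($"id".as("theId"), $"value")
val renamedDS: Dataset[Bar] = renamedDF.toDF("theId", "value").as[Bar]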
rberenguel / primes.py
Created September 24, 2018 22:04 — forked from vegard/primes.py
Prime factorisation diagram
# -*- coding: utf-8 -*-
#
# Author: Vegard Nossum <vegard.nossum@gmail.com>
import math
import os
import sys
import cairo
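The preview above stops at the imports, so the drawing code is not visible here. As a rough illustration of the arithmetic a prime factorisation diagram rests on, here is a minimal trial-division sketch. It is not Vegard Nossum's code, and `prime_factors` is a hypothetical name.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with multiplicity."""
    if n < 2:
        return []
    factors = []
    d = 2
    # Trial division: any composite n has a factor no larger than sqrt(n).
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains after dividing out the small factors is prime.
        factors.append(n)
    return factors


for n in (12, 97, 360):
    print(n, prime_factors(n))
```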
rberenguel / wat.scala
Created March 8, 2019 09:58
Weird companion object issue in Scala 2.11 vs 2.12
case class Country(name: String) {
  private val ThreeLetterValidCountries = List("FOO", "BAR")

  def valid: Option[Country] =
    if (ThreeLetterValidCountries.contains(name.toUpperCase)) Some(this) else None
}

object Country {
  def apply(name: String): Country = new Country(name.toUpperCase)
}
rberenguel / mand.ps
Created May 31, 2019 14:23
Simple Mandelbrot set generator in PostScript
%!PS-Adobe-2.0
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% %
% Mandelbrot set via PostScript code. Not optimized %
% in any way. Centered in A4 paper. Escape time, B&W %
% %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
/fun {
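The header comment names the method: escape time, rendered in black and white. The PostScript body is cut off in this preview, so the following is a small Python sketch of that general technique rather than a translation of the gist. The function names, the 60x24 character grid and the iteration cap of 50 are all assumptions chosen for illustration.

```python
def escape_time(c, max_iter=50):
    """Count iterations of z -> z*z + c from z = 0 until |z| > 2.

    Returns max_iter if the orbit has not escaped by then.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render(width=60, height=24, max_iter=50):
    """Return text rows: '#' where the orbit never escaped, ' ' elsewhere."""
    rows = []
    for j in range(height):
        y = 1.2 - 2.4 * j / (height - 1)       # imaginary part, top to bottom
        row = ""
        for i in range(width):
            x = -2.0 + 3.0 * i / (width - 1)   # real part, left to right
            inside = escape_time(complex(x, y), max_iter) == max_iter
            row += "#" if inside else " "
        rows.append(row)
    return rows


print("\n".join(render()))
```

Points whose orbit stays bounded for all `max_iter` steps are treated as inside the set, which is the usual black-and-white convention.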
rberenguel / postcard2008.ps
Created May 31, 2019 14:25
Christmas Postcard 2008 in PostScript
%!PS-Adobe-3.0
%%BeginFeature: *PageSize A4
<< /PageSize [595 842] >> setpagedevice
%%EndFeature
%/PageSize A4
/CoordX 595 def
/CoordY 842 def
/RadiMax 95 def
rberenguel / Koch.ps
Created August 24, 2019 18:14
Koch snowflake in PostScript
%!PS-Adobe-2.0
%%% Start of L-system definition
/STARTK { FK +K +K FK +K +K FK} def
/FK {
dup 0 eq
{ DK } % if the recursion order ends, draw forward
{
1 sub % recurse
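The preview cuts off before the body of the recursive rule, so here is a Python sketch of how such an L-system can unfold. `startk` mirrors the visible `STARTK` definition (forward, two turns, forward, two turns, forward), and `fk` bottoms out by drawing forward at order 0 as the comment says. The rewriting rule `F-F++F-F` and the 60 degree turn are the textbook Koch values and are assumptions, not read from the gist. All Python names are hypothetical.

```python
import math


def fk(order):
    """One side of the curve as a command string."""
    if order == 0:
        return "F"  # recursion order exhausted: draw forward
    side = fk(order - 1)
    # Assumed textbook Koch rule: F -> F-F++F-F
    return side + "-" + side + "++" + side + "-" + side


def startk(order):
    """Three sides separated by double turns, following STARTK."""
    side = fk(order)
    return side + "++" + side + "++" + side


def vertices(commands, step=1.0, turn_deg=60.0):
    """Trace the commands with a simple turtle and return the visited points."""
    x, y, heading = 0.0, 0.0, 0.0
    points = [(x, y)]
    for ch in commands:
        if ch == "F":
            x += step * math.cos(math.radians(heading))
            y += step * math.sin(math.radians(heading))
            points.append((x, y))
        elif ch == "+":
            heading -= turn_deg  # turn right
        elif ch == "-":
            heading += turn_deg  # turn left
    return points


pts = vertices(startk(2))
print(len(pts) - 1, "segments; end point:", pts[-1])
```

Each extra order multiplies the segment count by four, and the path should end back where it started, up to floating point error.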
rberenguel / Lavaurs.lsp
Created August 24, 2019 19:30
Code I wrote around 2009 for rendering Lavaurs chords (an abstract Mandelbrot set). Blame my old self for my poor Lisp
;; Copyright 2009 Rubén Berenguel
;; ruben /at/ maia /dot/ ub /dot/ es
;; This program is free software: you can redistribute it and/or
;; modify it under the terms of the GNU General Public License as
;; published by the Free Software Foundation, either version 3 of the
;; License, or (at your option) any later version.
;; This program is distributed in the hope that it will be useful,
rberenguel / pyspark-workshop-requirements.md
Last active November 14, 2019 09:22
Quick writeup of the requirements for my PySpark workshop at PyDay 2019, Barcelona (https://pybcn.org/pyday-bcn-2019/)

To take full advantage of the workshop you'll need:

  • PySpark installed (anything more recent than 2.3 should be fine)
  • Jupyter installed
  • Pandas and Arrow installed
  • All able to talk to each other
  • One or more datasets
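A quick way to see whether the packages in the list above are visible to the Python interpreter you plan to use is a check along these lines. It is a minimal sketch: the module names (`pyspark`, `pandas`, `pyarrow` for Arrow, `jupyter_core` for Jupyter) are assumptions about how each requirement is usually installed, and `check_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the requirements; adjust to your own setup.
REQUIRED = ["pyspark", "pandas", "pyarrow", "jupyter_core"]


def check_modules(names=REQUIRED):
    """Map each top-level module name to True if this interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in check_modules().items():
    print(f"{name:<14} {'found' if found else 'MISSING'}")
```

This only shows that the modules can be located, not that they work together. Running it from inside a Jupyter notebook at least confirms that the notebook kernel sees the same installation.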

You can clone this repository to get the notebook and slides. Some things may still change before Saturday, such as uploading and updating the compiled slides, but the notebook is essentially finished.