
Quick Tips for Fast Code on the JVM

I was talking to a coworker recently about general techniques that almost always form the core of any effort to write very fast, down-to-the-metal hot path code on the JVM, and they pointed out that there really isn't a particularly good place to go for this information. It occurred to me that, really, I had more or less picked up all of it by word of mouth and experience, and there just aren't any good reference sources on the topic. So… here's my word of mouth.

This is by no means a comprehensive gist. It's also important to understand that the techniques I outline here are not 100% absolute either. Performance on the JVM is an incredibly complicated subject, and while there are rules that almost always hold true, the "almost" remains very salient. Also, for many or even most applications, there will be other techniques that I'm not mentioning which will have a greater impact. JMH, Java Flight Recorder, and a good profiler are your very best friends!
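
As a hedged illustration of that advice (the benchmark below is my own sketch and assumes the sbt-jmh plugin; none of it comes from the gist), a minimal JMH micro-benchmark looks roughly like this:

import org.openjdk.jmh.annotations._
import org.openjdk.jmh.infra.Blackhole

@State(Scope.Thread)
class SumBench {
  val xs: Array[Int] = Array.tabulate(1024)(identity)

  @Benchmark
  def whileLoop(bh: Blackhole): Unit = {
    var i = 0
    var acc = 0
    while (i < xs.length) { acc += xs(i); i += 1 }
    bh.consume(acc) // keep the JIT from dead-code-eliminating the loop
  }
}

Run it (e.g. jmh:run under sbt-jmh) and compare variants before committing to any "optimization".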

@yukoff
yukoff / GitHub Wiki Subtree Storage.markdown
Created September 3, 2017 13:01 — forked from joshuajabbour/GitHub Wiki Subtree Storage.markdown
Store and edit GitHub wikis within the main project repository.

Project documentation

The project documentation (stored in the docs directory) is a git subtree of the project wiki. This allows for the documentation to be referenced and edited from within the main project.

Initial local setup

When cloning the main project repository for the first time, the wiki repository must be added as a remote.

git remote add wiki https://github.com/<username>/<repository>.wiki.git

@johnhw
johnhw / umap_sparse.py
Last active January 6, 2024 16:09
1 million prime UMAP layout
### JHW 2018
import numpy as np
import umap
# This code comes from the excellent module at:
# https://stackoverflow.com/questions/4643647/fast-prime-factorization-module
import random

Introduction

I was recently asked to explain why I felt disappointed by Haskell as a language. And, well: if I'm going to be crucified either way, I might as well criticise Haskell publicly.

First, though, I need to make it explicit that I claim no particular skill with the language - I will in fact vehemently (and convincingly!) argue that I'm a terrible Haskell programmer. And what I'm about to explain is not meant as The Truth, but as my current understanding - potentially flawed, incomplete, or flat-out incorrect. I welcome any attempt at proving me wrong, because when I dislike something that so many clever people worship, it's usually because I missed an important detail.

Another important point is that this is not meant to convey the idea that Haskell is a bad language. I do feel, however, that the vocal, and sometimes aggressive, reverence in which it's held might lead people to have unreasonable expectations. That was certainly true in my case, and it's the reason I'm writing this.

Type classes

I love the concept of type class

@kmader
kmader / README.md
Last active October 31, 2023 14:21
Beating Serialization in Spark

Serialization

As all objects must be Serializable to be used as part of RDD operations in Spark, it can be difficult to work with libraries that do not implement it.

Java Solutions

Simple Classes

For simple classes, it is easiest to make a wrapper interface that extends Serializable. This means that even though UnserializableObject cannot be serialized, we can pass in the following object without any issue:

public interface UnserializableWrapper extends Serializable {
    public UnserializableObject create(String parm1, String parm2);
}
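
To show why the wrapper helps, here is a hedged usage sketch in Scala (not from the gist: the two-argument constructor, the rdd value and the process method are hypothetical). Only the Serializable wrapper is captured by the closure and shipped to the executors; the UnserializableObject itself is built on the executor and never serialized:

val wrapper: UnserializableWrapper = new UnserializableWrapper {
  def create(parm1: String, parm2: String): UnserializableObject =
    new UnserializableObject(parm1, parm2) // hypothetical constructor
}

rdd.mapPartitions { rows =>
  val obj = wrapper.create("parm1", "parm2")  // constructed on the executor
  rows.map(row => obj.process(row))           // hypothetical method, for illustration
}
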
@cb372
cb372 / io-and-tf.md
Last active June 5, 2023 16:16
IO and tagless final

TL;DR

We should use a type parameter with a context bound (e.g. F[_]: Sync) in library code so users can choose their IO monad, but we should use a concrete IO monad in application code.

Abstracting over IO

If you're writing a library that makes use of effects, it makes sense to use the cats-effect type classes so users can choose their IO monad (IO, ZIO, Monix Task, etc.).

So instead of pinning your library's functions to a concrete IO, abstract over the effect type.
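
The preview cuts the gist's own example off here, so as a hedged sketch of the contrast it is setting up (the object and method names are mine):

import cats.effect.{IO, Sync}

object ConsoleLib {
  // Pinned to a concrete IO monad: every caller is forced onto cats-effect IO.
  def putStrLn(s: String): IO[Unit] = IO(println(s))

  // Context bound instead: callers pick their own F (IO, Monix Task, ZIO via interop, ...).
  def putStrLnF[F[_]: Sync](s: String): F[Unit] = Sync[F].delay(println(s))
}

In application code, by contrast, you simply commit to one concrete monad (say IO) and use it directly.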

@benjamineskola
benjamineskola / evening.applescript
Last active November 30, 2022 09:57
Automatically set repeating tasks tagged ‘Evening’ to be done this evening, in Things 3 — updated versions here: https://github.com/benjamineskola/things-scripts/blob/master/evening.applescript
-- run first thing in the morning, e.g., from cron
tell application "Things3"
    set theToken to "your-auth-token"
    set theTodos to to dos of list "Today"
    repeat with aTodo in theTodos
        set tagList to tags of aTodo
        repeat with aTag in tagList
            if (name of aTag as text) is "Evening" then
                -- the gist is truncated here; presumably it reschedules the matching
                -- to-do to this evening via the Things URL scheme (hence the auth token)
                open location "things:///update?auth-token=" & theToken & "&id=" & (id of aTodo) & "&when=evening"
            end if
        end repeat
    end repeat
end tell
@vegard
vegard / primes.py
Created September 21, 2018 07:51
Prime factorisation diagram
# -*- coding: utf-8 -*-
#
# Author: Vegard Nossum <vegard.nossum@gmail.com>
import math
import os
import sys
import cairo
@vil1
vil1 / 🏳️.md
Last active September 8, 2019 00:05

This is a witch hunt.

When you exclude a relentless innovator from a conference, when this exclusion results in excluding the young woman from North Africa who was supposed to share the stage with him, it has nothing to do with promoting innovation and inclusivity. It is a witch hunt.

When you bar someone from contributing to a FLOSS project based on allegedly aggressive communication, without providing any concrete example of said behavior or explaining what you did to make that behavior stop before taking such an extreme decision, it has nothing to do with making your community a better place. It is a witch hunt.

Witch hunts are bad. Not because they burn people without a fair trial, or because some of those burnt may not have been witches in the first place.

Witch hunts are bad because they burn people. Period.

@HeartSaVioR
HeartSaVioR / a-bit-tricky-spark-sql.scala
Last active August 20, 2018 22:00
A bit tricky result from a Spark SQL query (tested with 2.3.0)
/////////////////////////////////////////////////////////////////////////////////////////////
// 1. select with swapping columns, and apply where
/////////////////////////////////////////////////////////////////////////////////////////////
import spark.implicits._
import org.apache.spark.sql.{DataFrame, Dataset}
case class Hello(id: Int, name1: String, name2: String)
val ds = List(Hello(1, "Alice", "Bob"), Hello(2, "Bob", "Alice")).toDS
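
The preview stops here. As a hedged sketch of the kind of query the header above announces (swap the two name columns in a select, then apply a where); the exact statements and the surprising result are only in the full gist:

val swapped = ds.select($"name2".as("name1"), $"name1".as("name2"))
// The tricky part is which column the filter resolves against after the swap:
// the alias "name1" (originally name2) or the underlying name1 attribute.
swapped.where($"name1" === "Alice").show()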