@jboner
jboner / latency.txt
Last active June 25, 2024 12:58
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
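
The L1-relative multipliers and the ns-to-us conversions in the table above can be reproduced with a few lines of arithmetic. This is only a minimal sketch: the dictionary copies a handful of rows from the table, and the helper names `times_slower` and `ns_to_us` are made up for illustration.

```python
# Latencies in nanoseconds, copied from a few rows of the table above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "L2 cache reference": 7,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Read 4K randomly from SSD": 150_000,
}


def times_slower(operation, baseline="L1 cache reference"):
    """Return how many times slower `operation` is than `baseline`."""
    return LATENCY_NS[operation] / LATENCY_NS[baseline]


def ns_to_us(nanoseconds):
    """Convert nanoseconds to microseconds (1 us = 1,000 ns)."""
    return nanoseconds / 1_000


if __name__ == "__main__":
    for name in ("L2 cache reference", "Main memory reference"):
        print(f"{name}: {times_slower(name):g}x L1 cache")
    print(f"SSD random read: {ns_to_us(LATENCY_NS['Read 4K randomly from SSD']):g} us")
```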
@kleem
kleem / README.md
Last active February 22, 2023 09:52
WordNet noun graph

This experiment converts an SQL version of WordNet 3.0 into a graph, using the python library graph-tool. In order to create a taxonomical structure, only noun synsets, hyponym links and hypernym links are considered.

The result of the conversion is saved as GraphML, then rendered as the following hairball:

WordNet 3.0 taxonomy as a graph

Since the graph can be considered a tangled tree, i.e. a tree in which some nodes have multiple parents, two untangled versions (using longest and shortest paths) are also provided as GraphML. Only a few links are lost (about 2%), making the tree a good approximation of the noun taxonomy graph.
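
One way to picture the untangling step is the sketch below. It is not the gist's graph-tool code: it uses only the Python standard library, assumes the graph is acyclic with a single root from which every node is reachable, and the function name `untangle` and the toy node names are hypothetical. For each node it keeps only the parent that lies on the longest (or shortest) path from the root, so every node ends up with exactly one parent.

```python
from collections import defaultdict, deque


def untangle(edges, root, longest=True):
    """Reduce a tangled tree (a DAG) to a tree by keeping one parent per node.

    `edges` is an iterable of (parent, child) pairs. The parent kept for a
    node is the one on the longest path from `root` (or the shortest path
    when `longest` is False). Returns a dict mapping child -> kept parent.
    """
    children = defaultdict(list)
    indegree = defaultdict(int)
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    depth = {root: 0}
    parent_of = {}
    queue = deque([root])
    # Visit nodes in topological order, so a node's depth is final
    # by the time its own children are examined.
    while queue:
        node = queue.popleft()
        for child in children[node]:
            candidate = depth[node] + 1
            if child not in depth:
                better = True
            elif longest:
                better = candidate > depth[child]
            else:
                better = candidate < depth[child]
            if better:
                depth[child] = candidate
                parent_of[child] = node
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return parent_of


if __name__ == "__main__":
    # Toy taxonomy: "animal" has two parents, so the graph is tangled.
    toy_edges = [
        ("entity", "object"),
        ("entity", "animal"),
        ("object", "animal"),
        ("animal", "dog"),
    ]
    print(untangle(toy_edges, "entity", longest=True))
    print(untangle(toy_edges, "entity", longest=False))
```

Every edge that is not kept in `parent_of` is a lost link, which is the quantity the text above puts at about 2% for the WordNet noun graph.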

@rampage644
rampage644 / spark_etl_resume.md
Created September 15, 2015 18:02
Spark ETL resume

Introduction

This document describes a sample implementation of part of the existing Dim_Instance ETL.

To simplify and speed up the process, I took only the Cloud Block Storage source. I also ignored the creation of extended tables (specific to this particular ETL process). Below are the code and some final thoughts about using Spark as the primary ETL tool.

TL;DR

Implementation

The basic ETL implementation is really straightforward. The only real problem (and I mean a real problem) is finding a correct and comprehensive mapping document (a description of which source fields go where).
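
To make the idea of a mapping document concrete, here is a minimal plain-Python sketch (not Spark, and not the author's code). The field names in `FIELD_MAPPING` and the helper `apply_mapping` are invented for illustration; a real mapping would come from the ETL's own documentation.

```python
# Hypothetical mapping document: source field name -> target column name.
FIELD_MAPPING = {
    "volume_id": "instance_key",
    "created_at": "create_date",
}


def apply_mapping(record, mapping):
    """Project one source record onto the target columns named in `mapping`.

    Source fields that the mapping does not mention are dropped. A KeyError
    is raised if the record lacks a field that the mapping requires.
    """
    missing = [field for field in mapping if field not in record]
    if missing:
        raise KeyError(f"source fields missing from record: {missing}")
    return {target: record[source] for source, target in mapping.items()}


if __name__ == "__main__":
    source_row = {"volume_id": "v-1", "created_at": "2015-09-15", "extra": 1}
    print(apply_mapping(source_row, FIELD_MAPPING))
```

Failing loudly on a missing source field is a deliberate choice here: an incomplete mapping is exactly the kind of problem that is better caught early than discovered in the target table.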

@paulp
paulp / oddity.txt
Created January 11, 2016 22:22
Whitespace Oddity
WHITESPACE ODDITY
by Paul Phillips, in eternal admiration of David Bowie, RIP
Bound Ctrl to Major mode
Bound Ctrl to Major mode
Read inputrc and set extdebug on
Bound Ctrl to Major mode (Ten, Nine, Eight, Seven, Six)
Connecting readline, options on (Five, Four, Three)
Check the syntax, may terminfo be with you (Two, One, Exec)
@Alanaktion
Alanaktion / pacman.md
Last active April 21, 2020 14:49
Useful pacman commands and packages

Basic usage

pacman -S <package> # Install a package
pacman -Sy # Update package list
pacman -Su # Update installed packages
pacman -Ss <query> # Search packages
pacman -R <package> # Remove a package
pacman -Rs <package> # Remove a package and its unneeded dependencies
@longcao
longcao / SparkCopyPostgres.scala
Last active December 26, 2023 14:47
COPY Spark DataFrame rows to PostgreSQL (via JDBC)
import java.io.InputStream
import org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils
import org.apache.spark.sql.{ DataFrame, Row }
import org.postgresql.copy.CopyManager
import org.postgresql.core.BaseConnection
val jdbcUrl = s"jdbc:postgresql://..." // db credentials elided
val connectionProperties = {
@yoyama
yoyama / Schema2CaseClass.scala
Created January 20, 2017 07:36
Generate case class from spark DataFrame/Dataset schema.
/**
* Generate Case class from DataFrame.schema
*
* val df:DataFrame = ...
*
* val s2cc = new Schema2CaseClass
* import s2cc.implicit._
*
* println(s2cc.schemaToCaseClass(df.schema, "MyClass"))
*
@allquest
allquest / leboncoin_rss.user.js
Last active August 2, 2023 11:00
Greasemonkey script for LeBonCoin: a kind of RSS feed for the website Le bon coin based on your query. Each time you reload a page, a GET request is sent to LBC to check your query. If a new offer is available, its link is shown at the top of the page.
// ==UserScript==
// @name Leboncoin RSS
// @namespace http://gist.github.com/fb7b790fb6548bdec3ec5259bebd20c0
// @author Tegomass
// @description A kind of RSS for LeBonCoin with your personal search
// @include *
// @require https://cdnjs.cloudflare.com/ajax/libs/jquery/3.1.1/jquery.min.js
// @version 1.1
// @grant GM_addStyle
// @grant GM_setValue
@max-mapper
max-mapper / bibtex.png
Last active March 10, 2024 21:53
How to make a scientific looking PDF from markdown (with bibliography)
@CesarCapillas
CesarCapillas / add-by-id.sh
Last active April 23, 2024 21:14
SOLR bash recipes for creating, deleting or truncating collections, monitoring and searching.
#!/bin/bash
COLLECTION=${2:-zylk}
SERVER=${3:-localhost}
PORT=${4:-8983}
if [ -z "$1" ]; then
# Usage
echo 'Usage: add-by-id.sh <id> [<collection> <solr-server=localhost> <port=8983>]'
else
curl -X POST "http://${SERVER}:${PORT}/solr/${COLLECTION}/update?commit=true" -H "Content-Type: text/xml" --data-binary "<add><doc><field name='id'>$1</field><field name='url'>$1</field></doc></add>"