Gaarv Gaarv

  • Oslo, Norway
import socket
from contextlib import closing

def find_free_port():
    with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as s:
        # Binding to port 0 lets the OS pick an unused port
        s.bind(("", 0))
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        return s.getsockname()[1]
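from threading import Thread

class Worker(Thread):
    def run(self):
        # download() is not defined in this snippet; it stands for the background work
        download()

# The class must be defined before it is instantiated
worker = Worker()
worker.daemon = True  # a daemon thread does not block interpreter exit
worker.start()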
CREATE OR REPLACE FUNCTION truncate_tables(username IN VARCHAR) RETURNS void AS $$
DECLARE
    statements CURSOR FOR
        SELECT tablename FROM pg_tables
        WHERE tableowner = username AND schemaname = 'public';
BEGIN
    FOR stmt IN statements LOOP
        EXECUTE 'TRUNCATE TABLE ' || quote_ident(stmt.tablename) || ' CASCADE;';
    END LOOP;
END;
$$ LANGUAGE plpgsql;
class SwishActivation(layers.Activation):
    def __init__(self, activation, **kwargs):
        super(SwishActivation, self).__init__(activation, **kwargs)
        self.__name__ = "swish_act"

def swish_act(x, beta=1):
    return x * sigmoid(beta * x)
docker images --no-trunc --format '{{.ID}} {{.CreatedSince}}' \
| grep ' months' | awk '{ print $1 }' \
| xargs --no-run-if-empty docker rmi
import asyncio
import smtplib
from threading import Thread

def send_notification(email):
    """Generate and send the notification email"""
    # Do some work to get email body
    message = ...
    # Connect to the server
    server = smtplib.SMTP("smtp.gmail.com:587")

Generating Flame Graphs for Apache Spark

Flame graphs are a nifty debugging tool for determining where CPU time is being spent. Using Java Flight Recorder, you can generate them for Java processes without adding significant runtime overhead.
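
As a rough illustration of the idea (not how Java Flight Recorder or any particular flame graph tool is implemented), the sketch below aggregates sampled call stacks into per-stack counts and computes the share of samples that contain a given frame. The function names and the sample data are hypothetical.

```python
from collections import Counter

def fold_stacks(samples):
    """Count identical call stacks; each sample is a list of frames, root first."""
    return Counter(";".join(frames) for frames in samples)

def share_matching(folded, term):
    """Return the fraction of samples with at least one frame containing `term`."""
    total = sum(folded.values())
    if total == 0:
        return 0.0
    matched = sum(
        count
        for stack, count in folded.items()
        if any(term in frame for frame in stack.split(";"))
    )
    return matched / total

# Hypothetical samples, not real Spark profiling data.
samples = [
    ["run", "schedule", "serializeTask"],
    ["run", "schedule", "serializeTask"],
    ["run", "schedule", "serializeTask"],
    ["run", "launch"],
]
folded = fold_stacks(samples)
print(folded["run;schedule;serializeTask"])  # 3
print(share_matching(folded, "serialize"))   # 0.75
```

A wider bar in a flame graph corresponds to a larger sample count for that stack prefix, which is why searching for a frame name can report a percentage of sampled CPU time.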

When are flame graphs useful?

Shivaram Venkataraman and I have found these flame recordings to be useful for diagnosing coarse-grained performance problems. We started using them at the suggestion of Josh Rosen, who quickly made one for the Spark scheduler when we were talking to him about why the scheduler caps out at a throughput of a few thousand tasks per second. Josh generated a graph similar to the one below, which illustrates that a significant amount of time is spent in serialization (if you click in the top right-hand corner and search for "serialize", you can see that 78.6% of the sampled CPU time was spent in serialization). We used this insight to spee

Gaarv / serialization.sc
Created December 28, 2018 09:10 — forked from laughedelic/serialization.sc
Shows how to serialize-deserialize an object in Scala to a String
import java.io._
import java.util.Base64
import java.nio.charset.StandardCharsets.UTF_8

def serialise(value: Any): String = {
  val stream: ByteArrayOutputStream = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(stream)
  oos.writeObject(value)
  oos.close
  new String(
# 1 - change in submit.py from:
def load_input_data(file_location):
    with open(file_location, 'r') as input_data_file:
        input_data = ''.join(input_data_file.readlines())
    return input_data

# to:
def load_input_data(file_location):
    return file_location
Gaarv / Graph
Created January 8, 2018 12:51 — forked from printminion/Graph
Status: not working
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;