💭
building DataOS®

Devendra Singh devendrasr

@FabioBatSilva
FabioBatSilva / SymlinkManifestWriter.java
Last active December 6, 2019 18:54
Delta lake SymlinkTextInputFormat Manifest Generation
package com.a3k.dw.tracking.driver;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.delta.DeltaLog;
import org.apache.spark.sql.delta.Snapshot;
@animeshtrivedi
animeshtrivedi / spark-default
Last active August 11, 2023 17:02
A spark configuration for some performance knobs
# RDD settings
spark.rdd.compress false
# Shuffle settings
spark.shuffle.manager sort
spark.shuffle.compress false
spark.shuffle.spill false
spark.shuffle.spill.compress false
spark.shuffle.sort.initialBufferSize 4194304
spark.shuffle.sort.bypassMergeThreshold 200
@BretFisher
BretFisher / docker-for-mac.md
Last active November 9, 2025 22:48
Getting a Shell in the Docker Desktop Mac VM

2021 Update: Easiest option is Justin's repo and image

Just run this from your Mac terminal and it'll drop you into a container with full permissions on the Docker VM. This also works on Docker for Windows to get into the Moby Linux VM (it doesn't work for Windows Containers).

docker run -it --rm --privileged --pid=host justincormack/nsenter1

more info: https://github.com/justincormack/nsenter1


@devendrasr
devendrasr / binary_safe_to_binary_string.go
Last active January 19, 2017 19:16
Convert - Redis binary safe string to binary string
// Usage - toBinaryString("\xd4!") ==> 1101010000100001
func toBinaryString(str string) string {
	bitstring := ""
	for i := 0; i < len(str); i++ {
		for bit := 7; bit >= 0; bit-- {
			set := (str[i]>>uint(bit))&1 == 1
			if set {
				bitstring += "1"
			} else {
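The preview above is cut off at the `else` branch. A complete version might look like the sketch below, assuming the missing lines simply append "0" and return the accumulated string. It swaps the `+=` concatenation for `strings.Builder`, which is a substitution made here and not part of the original gist.

```go
package main

import (
	"fmt"
	"strings"
)

// toBinaryString renders every byte of str as eight '0'/'1' characters,
// most significant bit first.
func toBinaryString(str string) string {
	var b strings.Builder
	b.Grow(len(str) * 8) // eight output characters per input byte
	for i := 0; i < len(str); i++ {
		for bit := 7; bit >= 0; bit-- {
			if (str[i]>>uint(bit))&1 == 1 {
				b.WriteByte('1')
			} else {
				b.WriteByte('0')
			}
		}
	}
	return b.String()
}

func main() {
	// "\xd4!" is two bytes: 0xd4 (11010100) and 0x21 (00100001).
	fmt.Println(toBinaryString("\xd4!"))
}
```

Indexing the string with `str[i]` walks bytes, not runes, which is what a binary-safe Redis value needs.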
@dusenberrymw
dusenberrymw / spark_tips_and_tricks.md
Last active October 23, 2025 02:15
Tips and tricks for Apache Spark.

Spark Tips & Tricks

Misc. Tips & Tricks

  • If values are integers in [0, 255], Parquet will automatically compress them to 1-byte unsigned integers, decreasing the size of the saved DataFrame by a factor of 8 compared with 8-byte values.
  • Partition DataFrames so that partitions are evenly distributed and roughly 128 MB each (an empirical finding). When in doubt, err on the side of more partitions.
  • Pay particular attention to the number of partitions when using flatMap, especially if the following operation will result in high memory usage. The flatMap op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions remains the same. Thus, if a subsequent op causes a large expansion of memory usage (e.g. converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output of flatMap to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
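The sizing rule in the tips above comes down to a ceiling division. The sketch below is only the arithmetic, written in Go for consistency with the other snippets on this page; `partitionsFor` and the 10 GiB estimate are hypothetical names and figures, and in Spark the result would be the argument passed to `repartition`.

```go
package main

import "fmt"

// targetPartitionBytes is the ~128 MB empirical target from the tips above.
const targetPartitionBytes = 128 << 20

// partitionsFor returns how many partitions are needed so that each one
// holds at most targetBytes, rounding up. It never returns less than 1.
func partitionsFor(totalBytes, targetBytes int64) int64 {
	if totalBytes <= 0 || targetBytes <= 0 {
		return 1
	}
	return (totalBytes + targetBytes - 1) / targetBytes
}

func main() {
	// Hypothetical figure: a flatMap output estimated at 10 GiB.
	estimated := int64(10) << 30
	fmt.Println(partitionsFor(estimated, targetPartitionBytes)) // 80
}
```

Rounding up follows the advice to err on the side of more partitions. The estimate should describe the data after the memory-expanding step, not before it.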
@yuksbg
yuksbg / golang-fcm-xmpp-server.go
Created June 15, 2016 17:00
Quick and dirty example of an FCM XMPP server that uses channels for communication
package main
import (
	"github.com/titan-x/gcm/ccs"
	"log"
	"time"
	"encoding/json"
)
func sendMe(sendM chan string) {
@bastman
bastman / docker-cleanup-resources.md
Created March 31, 2016 05:55
docker cleanup guide: containers, images, volumes, networks

Docker - How to clean up (unused) resources

Once in a while, you may need to clean up resources (containers, volumes, images, networks) ...

delete volumes

// see: https://github.com/chadoe/docker-cleanup-volumes

$ docker volume rm $(docker volume ls -qf dangling=true)

$ docker volume ls -qf dangling=true | xargs -r docker volume rm

@transitive-bullshit
transitive-bullshit / logger.js
Last active July 9, 2024 04:09
winston logger with filename:linenumber
// NOTE: this adds a filename and line number to winston's output
// Example output: 'info (routes/index.js:34) GET 200 /index'
var winston = require('winston')
var path = require('path')
var PROJECT_ROOT = path.join(__dirname, '..')
var logger = new winston.Logger({ ... })
// this allows winston to handle output from express' morgan middleware
@lolzballs
lolzballs / HelloWorld.java
Created March 22, 2015 00:21
Hello World Enterprise Edition
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
public class HelloWorld{
    private static HelloWorld instance;
    public static void main(String[] args){
        instantiateHelloWorldMainClassAndRun();
@hassansin
hassansin / aspell-add-words.sh
Last active February 8, 2019 11:47
aspell add words to dictionary
touch ~/aspell.personal.txt
vi ~/aspell.personal.txt # add one word per line
aspell --lang=en create master /tmp/en-personal.pws < ~/aspell.personal.txt
cp /tmp/en-personal.pws /usr/lib/aspell
vim /usr/lib/aspell/en_US.multi # add the line: 'add en-personal.pws'