Animesh Trivedi animeshtrivedi

@animeshtrivedi
animeshtrivedi / CrailStreamExample.java
Last active December 22, 2017 10:28
Example of the Crail streaming API
/*
This is just a rough sketch of how the code should look. While
implementing the actual logic you may find that new classes or abstractions
are required.
*/
class CrailStreamProducerExample {
    public static void main(String[] args) {
        /* you need to implement CrailBroker, CrailProducer, CrailStreamWriter classes */
        BrokerProducer broker = new BrokerProducer();
        broker.connect("hostname", port, someThingMore);
// https://webcache.googleusercontent.com/search?q=cache:YymJdLRJt40J:https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/file/TestArrowFile.java+&cd=1&hl=en&ct=clnk&gl=ch&client=ubuntu
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
animeshtrivedi / Main.java
Created November 20, 2017 18:57
Example of parquet vectorized reading
// http://www.jofre.de/?p=1459
package de.jofre.test;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.column.page.PageReadStore;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.convert.GroupRecordConverter;
import org.apache.parquet.format.converter.ParquetMetadataConverter;
animeshtrivedi / Notes.md
Created November 15, 2017 10:30 — forked from chrisvest/Notes.md
PrintCompilation on different versions of HotSpot VM

About PrintCompilation

This note tries to document the output of the PrintCompilation flag in the HotSpot VM. It was originally intended as a reply to a blog post on PrintCompilation from Stephen Colebourne. It has grown too big to fit as a reply, so I'm putting it here.

Written by: Kris Mok rednaxelafx@gmail.com

Most of the contents of this note are based on my reading of the HotSpot source code from OpenJDK and on experimenting with the VM flags; others come from the HotSpot mailing lists and other reading materials listed in the "References" section.

This

animeshtrivedi / MemoryStore.scala
Created September 21, 2017 11:01
MemoryStore that shows all the inserted entries and their size in the showEntries function
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
/**
* Created by atr on 19.09.17.
*/
import java.io.{BufferedWriter, File, FileWriter}
import com.ibm.crail.CrailFS
import com.ibm.crail.conf.CrailConfiguration
import org.apache.spark.SparkContext
import scala.util.Random
animeshtrivedi / AtrParquetReadBenchmark.scala
Last active June 27, 2017 11:31
An example of parquet read benchmark where we can benchmark different stages of the parquet materialization in the Spark stack. It is compiled as a unit test in the spark source code.
package org.apache.spark.sql.execution.datasources.parquet
/**
* Created by atr on 23.06.17.
*/
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
animeshtrivedi / SparkCrailFileWriteBroadcast.scala
Last active May 18, 2017 11:16
This file has two functions: one writes raw 128 byte[] arrays, the other writes a serialized array.
import java.nio.ByteBuffer
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger, AtomicLong}
import java.util.concurrent.{ConcurrentHashMap, Future, LinkedBlockingQueue, TimeUnit}
import com.ibm.crail._
import com.ibm.crail.conf.CrailConfiguration
import com.ibm.crail.utils.CrailImmediateOperation
import org.apache.spark._
import org.apache.spark.common._
import org.apache.spark.executor.ShuffleWriteMetrics
animeshtrivedi / spark-default
Last active August 11, 2023 17:02
A Spark configuration with some performance knobs
# RDD settings
spark.rdd.compress false
# Shuffle settings
spark.shuffle.manager sort
spark.shuffle.compress false
spark.shuffle.spill false
spark.shuffle.spill.compress false
spark.shuffle.sort.initialBufferSize 4194304
spark.shuffle.sort.bypassMergeThreshold 200
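The settings above use the whitespace-separated `key value` layout, which `java.util.Properties` can also read. The following is a minimal sketch for sanity-checking such a file before handing it to Spark. The class name `SparkKnobsSketch` and the method `parse` are made up for illustration and are not part of the gist or of Spark; the sketch only parses text and never contacts a Spark cluster.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper, not part of the gist: parses "key value" lines.
final class SparkKnobsSketch {

    // Properties.load treats '#' lines as comments and accepts
    // whitespace as the separator between key and value.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String conf = String.join("\n",
                "# Shuffle settings",
                "spark.shuffle.manager sort",
                "spark.shuffle.compress false",
                "spark.shuffle.sort.bypassMergeThreshold 200");
        Properties props = parse(conf);
        // Print the parsed pairs in sorted key order.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            System.out.println(key + " = " + props.getProperty(key).trim());
        }
    }
}
```

Parsing the values into typed form (for example `Integer.parseInt` on the threshold) is a cheap way to catch a typo in a numeric knob before a job is submitted.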
animeshtrivedi / UnsafeCopy.java
Created May 10, 2017 07:38
Sample code for an Unsafe copy from byte[] to double[] (and vice versa)
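// Requires: import java.lang.reflect.Field; import sun.misc.Unsafe;
public static void copyTest(String[] args) {
    Unsafe us = null;
    try {
        // Unsafe has no public constructor; read its private static singleton reflectively.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        us = (Unsafe) f.get(null);
        System.err.println(" oh my, I have it : " + us);
    } catch (Exception e) {
        e.printStackTrace();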
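The preview stops before the copy itself. As a rough illustration of what a byte[] to double[] copy (and back) with `Unsafe.copyMemory` might look like, here is a minimal sketch. The class name `UnsafeCopySketch` and the methods `loadUnsafe`, `bytesToDoubles`, and `doublesToBytes` are assumptions made for this example and are not taken from the gist. Note that `sun.misc.Unsafe` is an unsupported internal API, and recent JDK releases deprecate its memory-access methods, so treat this as a sketch rather than a recommended approach.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import sun.misc.Unsafe;

// Hypothetical helper class, not part of the gist.
final class UnsafeCopySketch {

    // Obtain the Unsafe singleton through its private static field.
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Reinterpret the raw bytes as doubles in the platform's native byte order.
    static double[] bytesToDoubles(Unsafe us, byte[] src) {
        if (src.length % Double.BYTES != 0) {
            throw new IllegalArgumentException("length must be a multiple of 8");
        }
        double[] dst = new double[src.length / Double.BYTES];
        us.copyMemory(src, Unsafe.ARRAY_BYTE_BASE_OFFSET,
                dst, Unsafe.ARRAY_DOUBLE_BASE_OFFSET, src.length);
        return dst;
    }

    // The reverse direction: dump the doubles' in-memory representation as bytes.
    static byte[] doublesToBytes(Unsafe us, double[] src) {
        byte[] dst = new byte[src.length * Double.BYTES];
        us.copyMemory(src, Unsafe.ARRAY_DOUBLE_BASE_OFFSET,
                dst, Unsafe.ARRAY_BYTE_BASE_OFFSET, dst.length);
        return dst;
    }

    public static void main(String[] args) {
        Unsafe us = loadUnsafe();
        double[] original = {1.5, -2.25, Math.PI};
        byte[] raw = doublesToBytes(us, original);
        double[] restored = bytesToDoubles(us, raw);
        System.out.println("round trip ok: " + Arrays.equals(original, restored));
    }
}
```

Because the copy is a plain memory move, the byte layout follows the platform's native byte order; a reader using `ByteBuffer` would need `ByteOrder.nativeOrder()` to interpret the same bytes.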