Jeff Wartes randomstatistic

@randomstatistic
randomstatistic / gist:a7a026798880e1003777
Created May 7, 2014 18:38
Microsecond timing of DNS resolution
for x in `seq 1 20`; do
  strace -f -tt -o /tmp/st dig www.google.com > /dev/null &&
    grep -P '(send|recv)msg\(20' /tmp/st |
    grep -v EAGAIN |
    awk '/sendmsg/ {f= substr($2,7)} /recvmsg/ {e= substr($2,7)} END {print e-f}'
done | sort -n
randomstatistic / FutureBuffer
Created July 17, 2015 18:03
Block simultaneous future creation beyond a threshold
import java.util.concurrent.{TimeUnit, LinkedBlockingQueue}
import scala.annotation.tailrec
import scala.collection.JavaConverters._
import scala.concurrent.Future
/**
 * Thread-safe lock on Future generation. The put() method accepts futures without blocking so long as there are fewer than
 * $size futures that have been added via put() that are still alive. If more than $size futures are still running,
 * calling put() *blocks* the calling thread until some of the current futures finish.
randomstatistic / solr_garbage_analysis.txt
Last active March 22, 2016 00:00
Solr Garbage Analysis
This was against a pretty specific index, using a very specialized query corpus, with lots of caveats. Be careful about comparisons.
Index stats: 85M docs/shard, 3 shards, 1 node.
Query stats: 142k queries. 78k use a simple facet.query, 27k do geospatial radius, 115 use CollapseQParser
CollapsingQParserPlugin:
22% of garbage by size
These were huge, perhaps a half-dozen allocations
Lines 510,512 CollapsingQParserPlugin
SolrIndexSearcher: (getDocListAndSetNC)
56% of garbage by size
randomstatistic / FixedBitSetPool.java
Created April 29, 2016 15:48
FixedBitSet Pooler
import java.util.concurrent.ArrayBlockingQueue;
import org.apache.lucene.util.FixedBitSet;
public class FixedBitSetPool {
    public static final int poolSize = 10;
    private static ArrayBlockingQueue<FixedBitSet> pool = new ArrayBlockingQueue<FixedBitSet>(poolSize);
    // Ask for a FBS
    public static FixedBitSet request(int size) {
        FixedBitSet next = pool.poll();
        if (next == null || next.length() < size) {
            // if the size doesn't match, throw it away and return a new one of the requested size
randomstatistic / solr_or_match.txt
Last active May 24, 2016 16:39
Which OR'ed field matched a given solr doc?
The idea is to add calculated fields to the “fl” of your query that match the components of your query.
Solr 4.0+ allows calculated functions in your field list using a <field name>:<function> syntax, and one of the available functions is “query”.
This seems like a vaguely unpleasant way to determine which clauses matched, but I could see it working. I think it would go something like:
q=colA:foo OR colB:bar&fl=*,score,matchedColA:query({!standard q="colA:foo"},0),matchedColB:query({!standard q="colB:bar"},0)
Presumably the field matchedColA would be non-zero if colA:foo matched on that document, and matchedColB would be non-zero
if colB:bar matched.
(I’m actually not sure if “standard” works as the name of the default query parser, but whatever, the idea is that it needs to match the relevant bit of your query.)
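As a rough illustration of the idea above, here is a sketch in plain Java that assembles the `q` and `fl` parameters from a set of OR'ed clauses. The class `OrMatchQuery` and its helpers are hypothetical names, not part of Solr or SolrJ. The sketch also swaps in the `{!lucene v='...'}` form for the nested query (the `lucene` parser name and the `v` local param), which differs from the line in the note; treat that as an assumption and check it against your Solr version.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class OrMatchQuery {
    // Builds one pseudo-field of the form alias:query({!lucene v='clause'},0).
    // Assumption: the clause contains no single quotes; no escaping is done here.
    public static String matchField(String alias, String clause) {
        return alias + ":query({!lucene v='" + clause + "'},0)";
    }

    // Joins the clauses with OR for q, and adds one pseudo-field per clause to fl.
    public static Map<String, String> buildParams(Map<String, String> clausesByAlias) {
        StringJoiner q = new StringJoiner(" OR ");
        StringJoiner fl = new StringJoiner(",");
        fl.add("*").add("score");
        for (Map.Entry<String, String> e : clausesByAlias.entrySet()) {
            q.add(e.getValue());
            fl.add(matchField(e.getKey(), e.getValue()));
        }
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q.toString());
        params.put("fl", fl.toString());
        return params;
    }

    // URL-encodes the parameters into a query string suitable for a select request.
    public static String toQueryString(Map<String, String> params) {
        StringJoiner qs = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            qs.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return qs.toString();
    }

    public static void main(String[] args) {
        Map<String, String> clauses = new LinkedHashMap<>();
        clauses.put("matchedColA", "colA:foo");
        clauses.put("matchedColB", "colB:bar");
        Map<String, String> params = buildParams(clauses);
        System.out.println(params.get("q"));
        System.out.println(params.get("fl"));
        System.out.println(toQueryString(params));
    }
}
```

Generating the pseudo-fields from the same clause strings that make up `q` keeps the two in sync, which is the part of this approach that is easiest to get wrong by hand.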
randomstatistic / LeakyBucket.scala
Created June 15, 2016 17:38
Leaky bucket implementation in scala
import java.util.concurrent.locks.ReentrantLock
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._
class LeakyBucket(dripEvery: FiniteDuration, maxSize: Int) {
  require(maxSize > 0, "A bucket must have a size > 0")
  private val dripEveryNanos = dripEvery.toNanos
  private val lock = new ReentrantLock()
randomstatistic / ClasspathReader.scala
Created April 24, 2017 16:57
Resource file reader
import java.io.FileNotFoundException
import java.util.zip.GZIPInputStream
import scala.io.Codec
object ClasspathReader {
  def getResourceAsStream(filename: String) = {
    // why doesn't the more scala-like getClass.getResourceAsStream work?
    val inputStream = Thread.currentThread().getContextClassLoader().getResourceAsStream(filename)
    if (inputStream == null) throw new FileNotFoundException(s"Couldn't find $filename on the classpath")
import org.apache.zookeeper.server.{NIOServerCnxn, ZooKeeperServer}
import java.net.{ServerSocket, InetSocketAddress}
import kafka.server.{KafkaServer, KafkaConfig}
import kafka.producer.{ProducerConfig, Producer}
import java.util.Properties
import kafka.serializer.{DefaultEncoder, StringEncoder}
import java.io.File
import scala.util.Random
import kafka.admin.AdminUtils
randomstatistic / cached_by_extension.sh
Last active January 23, 2018 17:13
Investigate linux filesystem cache usage
#!/bin/bash
dir=$1
if [ "$1" == "" ]; then
  echo "Must provide a directory as an argument"
  exit 1
fi
ftypes=$(find "$1" -type f -size +10c | grep -E ".*\.[a-zA-Z0-9]*$" | sed -e 's/.*\(\.[a-zA-Z0-9]*\)$/\1/' | sort | uniq)
randomstatistic / TryWithResources.scala
Created July 13, 2018 18:21
Scala try-with-resources equivalent
import java.io.Closeable
import scala.util.control.NonFatal
import scala.util.{ Success, Try }
// This should really be scala-standard-lib.
object TryWithResources {
  def withClose[T <: Closeable, V](closable: T)(f: T => V): V = {
    (Try(f(closable)), Try(closable.close())) match {
      case (Success(v), Success(_)) => v
      case (a, b) => throw preferFirstException(a, b).get