This is now an actual repo:
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                         0.5 ns
Branch mispredict                            5 ns
L2 cache reference                           7 ns     14x L1 cache
Mutex lock/unlock                           25 ns
Main memory reference                      100 ns     20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000 ns     3 us
Send 1K bytes over 1 Gbps network       10,000 ns    10 us
Read 4K randomly from SSD*             150,000 ns   150 us    ~1GB/sec SSD
Every application ever written can be viewed as some sort of transformation on data. Data can come from different sources, such as a network or a file or user input or the Large Hadron Collider. It can come from many sources all at once to be merged and aggregated in interesting ways, and it can be produced into many different output sinks, such as a network or files or graphical user interfaces. You might produce your output all at once, as a big data dump at the end of the world (right before your program shuts down), or you might produce it more incrementally. Every application fits into this model.
The scalaz-stream project is an attempt to make it easy to construct, test and scale programs that fit within this model (which is to say, everything). It does this by providing an abstraction around a "stream" of data, which is really just the notion of a sequence of data elements being pulled, one after another, out of some unspecified data source. On top of this abstraction, scalaz-stream supplies a library of combinators for transforming, merging and consuming these streams.
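As a rough sketch of what a program in this model looks like (scalaz-stream 0.x API; the file name and the particular transformations are hypothetical):

import scalaz.stream._
import scalaz.concurrent.Task

// A complete pipeline: pull lines from a file, transform them, push to stdout.
val program: Task[Unit] =
  io.linesR("testdata/input.txt") // source: Process[Task, String] (file name hypothetical)
    .filter(_.nonEmpty)           // pure transformations compose on the stream itself
    .map(_.toUpperCase)
    .to(io.stdOutLines)           // sink: emit each line to the console
    .run                          // compile the whole stream down to a single Task

program.run // nothing happens until the Task is actually executed

Note that building the stream is pure; all effects are deferred until the final Task is run, which is what makes these programs easy to construct and test.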
/**
 * Part Zero : 10:15 Saturday Night
 *
 * (In which we will see how to let the type system help you handle failure)...
 *
 * First let's define a domain. (All the following requires scala 2.9.x and scalaz 6.0)
 */
import scalaz._
import Scalaz._
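The tutorial goes on to build out its domain; as a hedged taste of what scalaz 6's Validation gives you (the validAge example below is illustrative, not the tutorial's own):

// Hypothetical example, just to show the shape: failure lives in the types,
// not in exceptions, so callers are forced to deal with both cases.
def validAge(age: Int): Validation[String, Int] =
  if (age >= 0) age.success[String]
  else "Age must be non-negative".fail[Int]

validAge(42) //Success(42)
validAge(-1) //Failure("Age must be non-negative")
validAge(-1).fold(err => 0, identity) //0: both branches must be handled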
/*
Copyright 2012-2021 Viktor Klang
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
import spark.streaming.{Seconds, StreamingContext}
import spark.storage.StorageLevel
import spark.streaming.examples.twitter.TwitterInputDStream
import com.twitter.algebird._
import spark.streaming.StreamingContext._
import spark.SparkContext._

/**
 * Example of using the CountMinSketch monoid from Twitter's Algebird together with Spark
 * Streaming's TwitterInputDStream
 */
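Setting the Spark plumbing aside, here is a hedged sketch of the Algebird side this example leans on (0.x-era API; the parameter values and ids are hypothetical):

import com.twitter.algebird._

// Sketch accuracy eps, confidence delta, hash seed, and the fraction of the
// total count above which an item counts as a "heavy hitter" (values hypothetical).
val cms = new CountMinSketchMonoid(0.001, 1E-10, 1, 0.01)

// One tiny sketch per observed user id, combined associatively by the monoid.
// That associativity is exactly what lets Spark reduce sketches across partitions.
val userIds = Seq(1L, 2L, 2L, 3L, 3L, 3L)
val sketch  = userIds.map(id => cms.create(id)).reduce(cms.plus)

sketch.frequency(3L).estimate // approximate count for id 3 (here: 3)
sketch.heavyHitters           // ids whose frequency exceeds the threshold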
/**
 * To get started:
 * git clone https://github.com/twitter/algebird
 * cd algebird
 * ./sbt algebird-core/console
 */

/**
 * Let's get some data. Here is Alice in Wonderland, line by line
 */
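A hedged sketch of that step (reading a hypothetical local copy rather than fetching the text over HTTP), together with the classic first monoid trick: word counts as a sum of singleton maps.

import com.twitter.algebird.Operators._ // gives + on any Map whose values have a Semigroup

// Hypothetical local copy of the text; any Iterator[String] of lines will do.
val alice: Iterator[String] = scala.io.Source.fromFile("alice.txt").getLines

// Each word becomes a one-entry map; the map monoid adds the counts pointwise.
val wordCounts: Map[String, Int] =
  alice.flatMap(_.toLowerCase.split("\\W+"))
       .filter(_.nonEmpty)
       .map(word => Map(word -> 1))
       .foldLeft(Map.empty[String, Int])(_ + _)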
Too much for teh twitterz :)
JVM + invokedynamic is in a completely different class than CLR + DLR, for the same reasons that JVM is in a different class than CLR to begin with.
CLR can only do its optimization up-front, before executing code. This is a large part of the reason why C# is designed the way it is: methods are non-virtual by default so they can be statically inlined, types can be specified as value-based so their allocation can be elided, and so on. But even with those language features CLR simply cannot optimize code to the level of a good, warmed-up JVM.
The JVM, on the other hand, optimizes and reoptimizes code while it runs. Regardless of whether methods are virtual/interface-dispatched, whether objects are transient, whether exception-handling is used heavily... the JVM sees through the surface and optimizes code to suit how it actually runs. This gives it optimization opportunities that CLR will never have without adding a comparable profiling JIT.
So how does this affect dynamic languages? In exactly the same way: invokedynamic lets a dynamic language's call sites feed into that same profile-and-reoptimize cycle, so they get inlined and optimized like any static call, while the DLR's call-site caching still sits atop a runtime that only optimizes up front.
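To make that concrete, a minimal sketch (all names illustrative) of the shape of code where a profiling JIT pulls ahead:

// An interface-dispatched call in a hot loop. An ahead-of-time optimizer must
// emit a true virtual call here; the JVM profiles the call site at runtime,
// observes that only Circle ever arrives, devirtualizes and inlines area --
// and deoptimizes and recompiles if a second implementation shows up later.
trait Shape { def area: Double }
final class Circle(r: Double) extends Shape { def area = math.Pi * r * r }

def total(shapes: Array[Shape]): Double = {
  var sum = 0.0
  var i = 0
  while (i < shapes.length) { sum += shapes(i).area; i += 1 }
  sum
}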
trait Enum { //DIY enum type
  import java.util.concurrent.atomic.AtomicReference //Concurrency paranoia
  type EnumVal <: Value //This is a type that needs to be found in the implementing class
  private val _values = new AtomicReference(Vector[EnumVal]()) //Stores our enum values
  //Adds an EnumVal to our storage, uses CCAS to make sure it's thread safe, returns the ordinal
  private final def addEnumVal(newVal: EnumVal): Int = { import _values.{get, compareAndSet => CAS}
    val oldVec = get
    val newVec = oldVec :+ newVal
    if ((get eq oldVec) && CAS(oldVec, newVec)) newVec.indexWhere(_ eq newVal) else addEnumVal(newVal) //Lost a race: retry
  }
  //Mix this into each enum value; it registers the value and records its ordinal
  protected trait Value { self: EnumVal => final val ordinal = addEnumVal(this)
    def name: String; override def toString = name //Every enum value needs a name
  }
}
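With the trait completed as above, a usage sketch (the Color enum and its members are hypothetical, not from the original):

object Color extends Enum {
  sealed trait EnumVal extends Value //Pins down the abstract EnumVal type
  val Red   = new EnumVal { val name = "Red" }
  val Green = new EnumVal { val name = "Green" }
  val Blue  = new EnumVal { val name = "Blue" }
}

Color.Blue.ordinal // 2 -- ordinals follow definition order, even under concurrent initialization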