@JoshRosen
JoshRosen / scala-lambda-serialization-with-lifted-local-defs.md
Last active June 12, 2021 16:35
Serialization of Scala closures that contain local defs

Several Apache Spark APIs rely on the ability to serialize Scala closures. A closure that references a non-Serializable object cannot itself be serialized. In some cases (SI-1419 and others), however, these references are unnecessary and can be nulled out, allowing otherwise-unserializable closures to be serialized. In Spark, this nulling is performed by the ClosureCleaner.

Scala 2.12's use of Java 8 lambdas for implementing closures appears to have broken our ability to serialize closures that contain local defs. If we cannot resolve this problem, Spark will be unable to support Scala 2.12 and will be stuck on 2.10 and 2.11 forever.

To illustrate the problem, the following closure has a nested localDef and is defined inside a non-serializable class:

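A minimal sketch of a closure of this shape, offered as an illustration rather than the original example. The names `NonSerializableOuter`, `makeClosure`, `ClosureSerializationDemo`, and `trySerialize` are hypothetical. Whether serialization succeeds depends on the Scala version and on how the compiler lifts the local def, so the sketch reports the outcome instead of assuming one.

```scala
import java.io.{ByteArrayOutputStream, IOException, ObjectOutputStream}

// Hypothetical enclosing class; it deliberately does NOT extend Serializable.
class NonSerializableOuter {
  // Returns a closure whose body declares and calls a nested local def.
  def makeClosure(): Int => Int = { (x: Int) =>
    def localDef(y: Int): Int = y + 1 // local def nested inside the closure
    localDef(x)
  }
}

object ClosureSerializationDemo {
  // Attempts Java serialization; returns the serialized size or the failure.
  def trySerialize(obj: AnyRef): Either[IOException, Int] =
    try {
      val bytes = new ByteArrayOutputStream()
      val out = new ObjectOutputStream(bytes)
      try out.writeObject(obj) finally out.close()
      Right(bytes.size())
    } catch {
      case e: IOException => Left(e) // includes NotSerializableException
    }

  def main(args: Array[String]): Unit = {
    val closure = new NonSerializableOuter().makeClosure()
    println(s"closure(1) = ${closure(1)}")
    trySerialize(closure) match {
      case Right(n) => println(s"serialized to $n bytes")
      case Left(e)  => println(s"serialization failed: $e")
    }
  }
}
```

If the lifted `localDef` ends up as an instance method of `NonSerializableOuter`, the closure has to capture the enclosing instance, and that captured reference is what would make serialization fail.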

@apangin
apangin / HotSpot JVM intrinsics
Last active May 11, 2023 18:32
HotSpot JVM intrinsics
_hashCode java/lang/Object.hashCode()I
_getClass java/lang/Object.getClass()Ljava/lang/Class;
_clone java/lang/Object.clone()Ljava/lang/Object;
_dabs java/lang/Math.abs(D)D
_dsin java/lang/Math.sin(D)D
_dcos java/lang/Math.cos(D)D
_dtan java/lang/Math.tan(D)D
_datan2 java/lang/Math.atan2(DD)D
_dsqrt java/lang/Math.sqrt(D)D
_dlog java/lang/Math.log(D)D
@JoshRosen
JoshRosen / SnappyBenchmark.scala
Created April 9, 2015 02:10
Exploring whether changes to snappy-java have resulted in worse compression in newer versions
import java.io._
import org.xerial.snappy.SnappyOutputStream

object Main {
  def main(args: Array[String]): Unit = {
    // Tunables are read from environment variables, with defaults.
    val blockSize = sys.env.getOrElse("BLOCK_SIZE", "32768").toInt
    val resetFrequency = sys.env.getOrElse("RESET_FREQUENCY", "100").toInt
    val byteArrayOutputstream = new ByteArrayOutputStream()

Make it real

Ideas are cheap. Make a prototype, sketch a CLI session, draw a wireframe. Discuss around concrete examples, not hand-waving abstractions. Don't say you did something; provide a URL that proves it.

Ship it

Nothing is real until it's being used by a real user. This doesn't mean you make a prototype in the morning and blog about it in the evening. It means you find one person you believe your product will help and try to get them to use it.

Do it with style

@doryokujin
doryokujin / 0.basics.sql
Created July 22, 2012 09:04
Analysis of membership cancellations (churn)
-- 0.1 Distribution of login intervals --
-- Users who joined and quit on the same day also get an interval of 1, so restrict to users who have played at least once --
td query -w -d your_app -f csv -o dist_of_login_interval.csv "
SELECT ROUND((datediff(latest_login, registered_day)+1)/login_times) AS login_interval, COUNT(*) AS cnt
FROM
(
  SELECT v['uid'] AS uid, from_unixtime(MAX(time), 'yyyy-MM-dd') AS latest_login
  FROM login
  GROUP BY v['uid']
) t1
@kmizu
kmizu / GetEnumerationClassInfo.scala.repl
Created January 11, 2012 16:00
Get runtime type information of an Enumeration's subclass from a Value instance.
Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_26).
Type in expressions to have them evaluated.
Type :help for more information.
scala> object W extends Enumeration { val A = Value }
defined module W
scala> val V: Enumeration#Value = W.A
V: Enumeration#Value = A