NDJSON is a convenient format for storing or streaming structured data that may be processed one record at a time.
- Each line is a valid JSON value
- The line separator is '\n'
To convert a JSON file whose top-level value is an array into NDJSON with jq:
jq -c '.[]' test.json > testNDJSON.json
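The same conversion can be sketched in Python for cases where jq is not available. This is a minimal illustration using only the standard `json` module; the helper names `to_ndjson` and `iter_ndjson` are hypothetical and do not come from any library, and the sample records are made up for the example.

```python
import json
from typing import Any, Iterable, Iterator


def to_ndjson(records: Iterable[Any]) -> str:
    """Serialize each record as compact JSON on its own newline-terminated line."""
    # json.dumps escapes newlines inside strings, so every record stays on one line.
    return "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)


def iter_ndjson(text: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line."""
    for line in text.split("\n"):
        if line.strip():
            yield json.loads(line)


if __name__ == "__main__":
    # Hypothetical input: a JSON document whose top-level value is an array.
    array_doc = '[{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]'
    ndjson = to_ndjson(json.loads(array_doc))
    print(ndjson, end="")
    # Records can now be processed one at a time.
    for record in iter_ndjson(ndjson):
        print(record["id"])
```

For large files you would read and parse line by line from the file handle instead of holding the whole text in memory, which is the main reason to prefer NDJSON over a single JSON array.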
jq is great for examining JSON objects, and you can extend its functionality with custom functions. The following function gives a high-level view of the structure of arbitrary JSON, which is useful when trying to understand new data sources.
Requires jq version 1.5.
Add the following function to your ~/.jq:
Flame graphs are a nifty debugging tool for determining where CPU time is being spent. Using the Java Flight Recorder, you can generate them for Java processes without adding significant runtime overhead.
Shivaram Venkataraman and I have found these flame recordings to be useful for diagnosing coarse-grained performance problems. We started using them at the suggestion of Josh Rosen, who quickly made one for the Spark scheduler when we were talking to him about why the scheduler caps out at a throughput of a few thousand tasks per second. Josh generated a graph similar to the one below, which illustrates that a significant amount of time is spent in serialization (if you click in the top right hand corner and search for "serialize", you can see that 78.6% of the sampled CPU time was spent in serialization). We used this insight to spee
System-wide daemons provided by Mac OS X are described by launchd preference files, which can be listed with the command:
$ sudo ls -all /System/Library/LaunchDaemons/
System-wide third-party daemons provided by the administrator are described by preference files, which can be listed with the command:
$ sudo ls -all /Library/LaunchDaemons/
Per-user Launch Agents provided by Mac OS X are usually loaded when the user logs in. Those provided by the system can be found with:
$ sudo ls -all /System/Library/LaunchAgents/
Per-user Launch Agents provided by the administrator are also usually loaded when the user logs in. Those provided by the administrator can be found with:
import scala.xml._
// To convert a Maven pom.xml to build.sbt:
// 1) Place this code into a file called PomToSbt.scala next to pom.xml
// 2) Type scala PomToSbt.scala > build.sbt
// The dependencies from pom.xml will be extracted and placed into a complete build.sbt file
// Because most pom.xml files only reference non-Scala dependencies, I did not use %%
val lines = (XML.load("pom.xml") \\ "dependencies") \ "dependency" map { dependency =>
  val groupId = (dependency \ "groupId").text
  val artifactId = (dependency \ "artifactId").text
Latency Comparison Numbers
--------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD