Goals: Add links to reasonable, well-written explanations of how things work. No hype and, if possible, no vendor content. Practical first-hand accounts of models in prod are eagerly sought.
import time
# Make sure you have duckdb==0.7.0. Earlier versions might fail with GIL problems ( https://twitter.com/mr_le_fox/status/1620535141675433986 )
import duckdb
import s3fs
from fsspec.implementations.cached import SimpleCacheFileSystem
# Create the s3 file system. This one does not have caching
app = "mastiff"
primary_region = "iad"
kill_signal = "SIGINT"
kill_timeout = 5
processes = []
[build]
image = "tootsuite/mastodon:v3.5.5"
Mute these words in your settings here: https://twitter.com/settings/muted_keywords
ActivityTweet
generic_activity_highlights
generic_activity_momentsbreaking
RankedOrganicTweet
suggest_activity
suggest_activity_feed
suggest_activity_highlights
suggest_activity_tweet
I was talking to a coworker recently about general techniques that almost always form the core of any effort to write very fast, down-to-the-metal hot path code on the JVM, and they pointed out that there really isn't a particularly good place to go for this information. It occurred to me that, really, I had more or less picked up all of it by word of mouth and experience, and there just aren't any good reference sources on the topic. So… here's my word of mouth.
This is by no means a comprehensive gist. It's also important to understand that the techniques I outline here are not 100% absolute either. Performance on the JVM is an incredibly complicated subject, and while there are rules that almost always hold true, the "almost" remains very salient. Also, for many or even most applications, there will be other techniques that I'm not mentioning which will have a greater impact. JMH, Java Flight Recorder, and a good profiler are your very best friends! Mea
/**
* Generate Case class from DataFrame.schema
*
* val df:DataFrame = ...
*
* val s2cc = new Schema2CaseClass
* import s2cc.implicit._
*
* println(s2cc.schemaToCaseClass(df.schema, "MyClass"))
*
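The idea in the comment above, turning a schema into case class source text, can be sketched without Spark. The following is a minimal Python illustration, not the gist's `Schema2CaseClass` implementation: the `(name, type name, nullable)` field tuples, the `SCALA_TYPES` mapping, and the function name `schema_to_case_class` are all assumptions made for this example, and the mapping is deliberately incomplete.

```python
from typing import List, Tuple

# Hypothetical, non-exhaustive mapping from schema type names to Scala types.
SCALA_TYPES = {
    "string": "String",
    "integer": "Int",
    "long": "Long",
    "double": "Double",
    "float": "Float",
    "boolean": "Boolean",
}

# A field is (name, type name, nullable); this shape is an assumption.
Field = Tuple[str, str, bool]


def schema_to_case_class(fields: List[Field], class_name: str) -> str:
    """Render Scala case class source text for the given fields."""
    params = []
    for name, type_name, nullable in fields:
        if type_name not in SCALA_TYPES:
            raise ValueError(f"no Scala mapping for type: {type_name!r}")
        scala_type = SCALA_TYPES[type_name]
        # Nullable columns are wrapped in Option so a missing value is explicit.
        if nullable:
            scala_type = f"Option[{scala_type}]"
        params.append(f"  {name}: {scala_type}")
    body = ",\n".join(params)
    return f"case class {class_name}(\n{body}\n)"


if __name__ == "__main__":
    print(schema_to_case_class([("id", "long", False), ("name", "string", True)], "MyClass"))
```

Wrapping nullable fields in `Option` is one common design choice; a generator could equally emit bare types and leave null handling to the caller.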
import io.circe._
import io.circe.generic.auto._
import io.circe.parser._
val jsonPoint = "{ \"type\": \"Point\", \"coordinates\": [100.0, 0.0] }"
val jsonLineString = "{ \"type\": \"LineString\",\n \"coordinates\": [ [100.0, 0.0], [101.0, 1.0] ]\n }"
val jsonPolygon = "{ \"type\": \"Polygon\",\n \"coordinates\": [\n [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ]\n ]\n }"
val jsonPolygonWithHoles = "{ \"type\": \"Polygon\",\n \"coordinates\": [\n [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ],\n [ [100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2] ]\n ]\n }"
val jsonGC = "{ \"type\": \"GeometryCollection\",\n \"geometries\": [\n { \"type\": \"Point\",\n \"coordinates\": [100.0, 0.0]\n },\n { \"type\": \"LineString\",\n \"coordinates\": [ [101.0, 0.0], [102.0, 1.0] ]\n }\n ]\n }"
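The sample documents above share one shape: a `type` field that selects the geometry, plus either `coordinates` or nested `geometries`. As a language-neutral sketch of decoding by that discriminator, here is a small Python version using only the standard library. It is not the circe decoder the snippet goes on to build (that part is not shown here); the class names and the `decode_geometry` and `parse_geometry` helpers are assumptions for illustration, and only the four geometry types that appear above are handled.

```python
import json
from dataclasses import dataclass
from typing import List, Union

Position = List[float]


@dataclass
class Point:
    coordinates: Position


@dataclass
class LineString:
    coordinates: List[Position]


@dataclass
class Polygon:
    # First ring is the exterior boundary; any further rings are holes.
    coordinates: List[List[Position]]


@dataclass
class GeometryCollection:
    geometries: List["Geometry"]


Geometry = Union[Point, LineString, Polygon, GeometryCollection]

# Geometry types whose payload is a plain "coordinates" array.
_COORDINATE_TYPES = {"Point": Point, "LineString": LineString, "Polygon": Polygon}


def decode_geometry(obj: dict) -> Geometry:
    """Pick the target class from the "type" field, recursing into collections."""
    kind = obj.get("type")
    if kind in _COORDINATE_TYPES:
        return _COORDINATE_TYPES[kind](obj["coordinates"])
    if kind == "GeometryCollection":
        return GeometryCollection([decode_geometry(g) for g in obj["geometries"]])
    raise ValueError(f"unsupported GeoJSON type: {kind!r}")


def parse_geometry(text: str) -> Geometry:
    """Parse a JSON string and decode it as a geometry."""
    return decode_geometry(json.loads(text))
```

Calling `parse_geometry` on the `jsonGC` document would yield a `GeometryCollection` holding a `Point` and a `LineString`. No validation of coordinate arity or ring closure is attempted in this sketch.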
- Web Server: Play (framework) or http4s (library)
- Actors: akka
- Asynchronous Programming: monix (for tasks, reactors, observables, scheduler etc)
- Authentication: Silhouette
- Authorization: Deadbolt
- Command-line option parsing: case-app
- CSV Parsing: kantan.csv
- DB: doobie (for PostgreSQL)
Result: 1
Items {
TemplateId: "BADGE_BATTLE_ATTACK_WON"
Badge {
BadgeType: BADGE_BATTLE_ATTACK_WON
BadgeRanks: 4
Targets: "\nd\350\007"
}
}
Items {