@frosforever
frosforever / spark_alb.scala
Last active February 21, 2024 14:35
Reading ALB logs using spark
// See https://docs.aws.amazon.com/athena/latest/ug/application-load-balancer-logs.html for regex pattern and column names
val raw = spark.read.text("s3://some_s3_path/albname/AWSLogs/accountId/elasticloadbalancing/region/year/month/day/")
val regex = """([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ ]*)[:-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) ([-0-9]*) ([-0-9]*) \"([^ ]*) (.*) (- |[^ ]*)\" \"([^\"]*)\" ([A-Z0-9-_]+) ([A-Za-z0-9.-]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^\"]*)\" ([-.0-9]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^ ]*)\" \"([^\s]+?)\" \"([^\s]+)\" \"([^ ]*)\" \"([^ ]*)\""""
val albLogs = raw.select(
regexp_extract($"value", regex, 1).as("type"),
regexp_extract($"value", regex, 2).as("time"),
regexp_extract($"value", regex, 3).as("elb"),
frosforever / IncrementingServer.scala
Created September 18, 2019 18:24
Incrementing server using a different route for each request based on updating a `Ref`
import org.http4s.circe._
import org.http4s.dsl.io._
import org.http4s.implicits._
import org.http4s.server.blaze.BlazeServerBuilder
import org.http4s.{ HttpApp, HttpRoutes, Request }
import cats.data.Kleisli
import cats.effect._
import cats.effect.concurrent.Ref
import cats.implicits._
frosforever / BreakingPipe.scala
Last active August 23, 2019 15:38
Scope breaking fs2.Pipe
import fs2.{ Pipe, Pull } // Version 1.0.5
import cats.effect.{ ExitCode, IO, IOApp }
object BreakingPipe extends IOApp {
def onlyOnNonEmpty[F[_], A, B](innerPipe: fs2.Pipe[F, A, B]): fs2.Pipe[F, A, B] = { s =>
s.pull.peek.flatMap {
case None => Pull.pure(None)
case Some((_, ss)) => ss.through(innerPipe).pull.echo
import shapeless._
import shapeless.ops.hlist
object mapper extends Poly1 {
implicit def caseOpt[T] = at[Option[T]](_ => Option.empty[T])
}
def foo[S, SRep <: HList, O <: HList](s: S)(implicit
labelledGeneric: Generic.Aux[S, SRep],
map: hlist.Mapper.Aux[mapper.type, SRep, O],
frosforever / manifest.sh
Created October 18, 2017 13:14
create Redshift manifest from prefix ignoring `SUCCESS` files
aws s3api list-objects --bucket BUCKET_NAME --prefix WTVR/PREFIX/ --query 'Contents[?!contains(Key,`SUCCESS`)].{Key:Key}' --output json | jq '[.[] | .["url"] = "s3://BUCKET_NAME/" + .Key | .["mandatory"] = true | del(.Key)] | { entries: .}'
frosforever / prefix_size.sh
Created October 18, 2017 13:11
recursive size of contents in bucket prefix
aws s3api list-objects --bucket BUCKET_NAME --prefix "whateverprefix/" --output json --query "[sum(Contents[].Size), length(Contents[])]"
frosforever / gist:db39283d8beecfb101cd84ca35fef2ca
Created February 6, 2017 15:05
Docker for Mac: access the host (not the Docker xhyve host) from a Docker container
Alias for loopback that container can see:
`sudo ifconfig lo0 alias 10.0.2.2`
The container can now access services running on the host at `10.0.2.2`.
frosforever / gist:8a7dbd7a8c952d8a46b9
Created May 4, 2015 20:19
SBT scalaTest testOnly regex
testOnly *MySuite -- -z foo
Runs only the tests whose names include the substring "foo". For an exact match rather than a substring, use -t instead of -z.
frosforever / GitPath.txt
Created April 20, 2015 15:09
Git Patching
Ref: https://ariejan.net/2009/10/26/how-to-create-and-apply-a-patch-with-git/
git format-patch master --stdout > whatever.patch
git apply --check whatever.patch
git am --signoff < whatever.patch
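A minimal end-to-end sketch of the three commands above, run against throwaway repositories. The directory names, the `feature` branch, `file.txt`, and the commit messages are all made up for illustration; only the `format-patch` / `apply --check` / `am --signoff` sequence comes from the notes.

```shell
#!/bin/sh
# Round-trip one commit through a patch file (hypothetical names throughout).
set -eu

work=$(mktemp -d)

# Source repo: one base commit on master, one extra commit on a feature branch.
git init -q "$work/src"
cd "$work/src"
git config user.email "you@example.com"
git config user.name "Example"
echo "one" > file.txt
git add file.txt
git commit -q -m "base"
git branch -M master            # make sure the base branch is called master
git checkout -q -b feature
echo "two" >> file.txt
git commit -q -am "add line two"

# Everything on feature that is not on master goes into the patch.
git format-patch master --stdout > "$work/whatever.patch"

# Destination repo: a clone that only has the master state checked out.
git clone -q --branch master "$work/src" "$work/dst"
cd "$work/dst"
git config user.email "you@example.com"
git config user.name "Example"

git apply --check "$work/whatever.patch"   # dry run; exits non-zero on conflict
git am --signoff < "$work/whatever.patch"  # apply as a commit, add Signed-off-by

subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%B)
echo "$subject"
```

`git apply --check` changes nothing in the working tree, so it is safe to run before `git am`.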
frosforever / gitGrep.sh
Created April 3, 2015 16:16
git grep on directory with multiple repos
for i in */; do ( cd "$i" && git grep foo HEAD ); done
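A variant sketch of the same loop using `git -C` instead of a subshell `cd`, skipping directories that are not git repositories and printing a header per repo. The repo names, `notes.txt`, and the search term are placeholders created only for this demo.

```shell
#!/bin/sh
# Search several sibling repos for a pattern (hypothetical repo names).
set -eu

root=$(mktemp -d)

# Demo setup: two small repos plus one plain directory.
for name in repo_a repo_b; do
  git init -q "$root/$name"
  echo "foo in $name" > "$root/$name/notes.txt"
  git -C "$root/$name" add notes.txt
  git -C "$root/$name" -c user.email=you@example.com -c user.name=Example \
    commit -q -m "init"
done
mkdir "$root/not_a_repo"

cd "$root"
matches=$(
  for d in */; do
    # Skip anything that is not a git work tree.
    git -C "$d" rev-parse --git-dir >/dev/null 2>&1 || continue
    echo "== $d"
    # git grep exits 1 when nothing matches; do not treat that as an error.
    git -C "$d" grep foo HEAD || true
  done
)
printf '%s\n' "$matches"
```

Matches are reported as `HEAD:<path>:<line>`, so the `== dir/` header is what tells you which repo a hit came from.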