Alanna Stone (Daenyth)

Daenyth / debug_requests.py
Created August 27, 2015 14:35
Enable debug logging for Python `requests`
import requests
import logging
import httplib  # Python 2 stdlib; renamed http.client in Python 3

# Debug logging: print raw request/response headers to stdout
httplib.HTTPConnection.debuglevel = 1
logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)
# requests vendored urllib3 at the time; its logger carries the wire-level detail
req_log = logging.getLogger('requests.packages.urllib3')
req_log.setLevel(logging.DEBUG)
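The gist above targets Python 2 (`httplib`). A minimal Python 3 sketch of the same idea, assuming a modern `requests` release that logs through the top-level `urllib3` logger rather than the vendored `requests.packages.urllib3`:

```python
import http.client
import logging

# http.client replaces Python 2's httplib; debuglevel=1 prints
# request/response headers to stdout for every connection.
http.client.HTTPConnection.debuglevel = 1

# Route all loggers to stderr at DEBUG.
logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)

# Modern requests no longer vendors urllib3; it logs under its own name.
req_log = logging.getLogger("urllib3")
req_log.setLevel(logging.DEBUG)
```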
Daenyth / Pull.md
Last active December 8, 2024 00:27
Designing an fs2 `Pull` from scratch

The problem

I have some data with adjacent entries that I want to group together and perform actions on. I know roughly that fs2.Pull can be used to "step" through a stream and do more complicated logic than the built-in combinators allow. I don't know how to write one though!

In the end we should have something like

def combineAdjacent[F[_], A](
 shouldCombine: (A, A) => Boolean,
Daenyth / nonunit.sbt
Created April 30, 2024 15:37
Allow scalatest/scalamock to not ruin -Wnonunit-statement
Test / scalacOptions ++= Seq(
// Allow using -Wnonunit-statement to find bugs in tests without exploding from scalatest assertions
"-Wconf:msg=unused value of type org.scalatest.Assertion:s",
"-Wconf:msg=unused value of type org.scalamock:s"
)
Daenyth / parEvalMap.md
Created November 28, 2023 16:44
fs2 parEvalMap vs parEvalMapUnordered

Unordered means we don't care about the order of results from the evalMap action.

It allows for higher throughput because it never waits to start a new job: it holds N permits (backed by a semaphore), and the moment any job finishes, it pulls the next element from the stream and begins running it.

With ordered parEvalMap, by contrast, a new job can begin only once the oldest input's job has emitted its result.

This matters when individual elements take a variable amount of time to complete - and that's the case here, because backfill can take more or less time depending on how many transactions are present within the time window.

Suppose we have 4 jobs we want to run with up to 2 at a time. Job 1 takes 60 seconds to complete, and all the rest take 10 seconds. Using parEvalMap, the entire set of inputs takes ~70 seconds: job 2 finishes at t=10 but cannot emit (or free its permit) until job 1 emits at t=60, so jobs 3 and 4 only start then. With parEvalMapUnordered, job 2's completion immediately frees a permit, so the whole batch finishes in ~60 seconds, bounded by the single long job.
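The arithmetic above can be checked with a small scheduler simulation - a sketch in Python, not fs2 code, modeling the simplified rule described here that an ordered permit frees only when the oldest job emits:

```python
import heapq

def makespan(durations, n, ordered):
    """Total time to run all jobs with `n` permits.

    ordered=True models parEvalMap as described above: a permit is
    returned only when the oldest outstanding job emits its result.
    ordered=False models parEvalMapUnordered: a permit is returned
    the moment any job finishes.
    """
    running = []   # min-heap of (finish_time, job_index)
    done = {}      # finished but not yet emitted (ordered mode only)
    next_job = 0
    emit_next = 0  # next index allowed to emit, in input order
    last = 0

    def start(at):
        nonlocal next_job
        if next_job < len(durations):
            heapq.heappush(running, (at + durations[next_job], next_job))
            next_job += 1

    for _ in range(min(n, len(durations))):
        start(0)

    while running:
        t, i = heapq.heappop(running)
        last = max(last, t)
        if not ordered:
            start(t)                  # permit freed immediately
        else:
            done[i] = t
            while emit_next in done:  # emit strictly in input order
                del done[emit_next]
                emit_next += 1
                start(t)              # permit freed on emission

    return last

makespan([60, 10, 10, 10], 2, ordered=True)   # 70, as computed above
makespan([60, 10, 10, 10], 2, ordered=False)  # 60
```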

Daenyth / OutputStreamSteam.scala
Last active July 11, 2024 20:05
OutputStreamSteam [fs2 0.10] - write into a java.io.OutputStream, read from fs2.Stream.
import java.io.OutputStream
import java.util.concurrent.Executors
import cats.effect.{Async, Effect, IO, Timer}
import cats.implicits._
import fs2.async.mutable.Queue
import fs2.{Chunk, Stream}
import scala.annotation.tailrec
import scala.concurrent.{ExecutionContext, SyncVar}
Daenyth / 1-MapTraverse.md
Last active June 25, 2024 13:05
Scala (cats) map/traverse parallels

Parallels between map and similar functions

map          :: F[A] => (A =>     B)   => F[B]
flatMap      :: F[A] => (A =>   F[B])  => F[B]
traverse     :: G[A] => (A =>   F[B])  => F[G[B]]
flatTraverse :: G[A] => (A => F[G[B]]) => F[G[B]]
traverse_    :: F[A] => (A =>   F[B])  => F[Unit]
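The `traverse` shape is the one that usually trips people up. A minimal Python analogue (illustrative only, not the cats API), with `List` playing `G` and `Optional` playing `F`:

```python
from typing import Callable, List, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# traverse :: G[A] => (A => F[B]) => F[G[B]]
# Here G = List (the container) and F = Optional (the effect):
# apply an effectful function to every element, and if any call
# "fails" (returns None), the whole result is None.
def traverse(xs: List[A], f: Callable[[A], Optional[B]]) -> Optional[List[B]]:
    out: List[B] = []
    for x in xs:
        fb = f(x)
        if fb is None:
            return None
        out.append(fb)
    return out

def half(x: int) -> Optional[int]:
    return x // 2 if x % 2 == 0 else None

list(map(half, [2, 4, 5]))  # [1, 2, None] -- map gives G[F[B]]
traverse([2, 4], half)      # [1, 2]       -- traverse flips to F[G[B]]
traverse([2, 4, 5], half)   # None         -- one failure fails the whole thing
```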
Daenyth / MonadAndFs2Ops.md
Last active June 25, 2024 13:04
Cheat sheet for common cats monad and fs2 operation shapes
| Operation | Input | Result | Notes |
| --- | --- | --- | --- |
| `map` | `F[A]`, `A => B` | `F[B]` | Functor |
| `apply` | `F[A]`, `F[A => B]` | `F[B]` | Applicative |
| `(fa, fb, ...).mapN` | `(F[A], F[B], ...)`, `(A, B, ...) => C` | `F[C]` | Applicative |
| `(fa, fb, ...).tupled` | `(F[A], F[B], ...)` | `F[(A, B, ...)]` | Applicative |
| `flatMap` | `F[A]`, `A => F[B]` | `F[B]` | Monad |
| `traverse` | `F[A]`, `A => G[B]` | `G[F[B]]` | Traverse; `fa.traverse(f) == fa.map(f).sequence`; "foreach with effects" |
| `sequence` | `F[G[A]]` | `G[F[A]]` | Same as `fga.traverse(identity)` |
| `attempt` | `F[A]` | `F[Either[E, A]]` | Given `ApplicativeError[F, E]` |
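The `sequence` row can be made concrete with a small Python analogue (illustrative only, not the cats API), using `List` for `F` and `Optional` for `G`:

```python
from typing import List, Optional, TypeVar

A = TypeVar("A")

# sequence :: F[G[A]] => G[F[A]] -- here F = List, G = Optional.
# Flip a list of optionals into an optional list; any None fails it.
# Equivalent to traversing with the identity function.
def sequence(fga: List[Optional[A]]) -> Optional[List[A]]:
    out: List[A] = []
    for ga in fga:
        if ga is None:
            return None
        out.append(ga)
    return out

sequence([1, 2, 3])     # [1, 2, 3]
sequence([1, None, 3])  # None
```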
Daenyth / prep_gcp_local.bash
Created June 12, 2024 17:31
Create a topic and subscription for gcp pubsub local emulator, idempotently
#!/usr/bin/env bash
# Exit immediately if a command exits with a non-zero status.
set -e
# Treat unset variables as an error and exit immediately.
set -u
# Return the exit status of the last command in the pipeline that failed.
set -o pipefail
PROJECT_ID="MYPROJECT"
Daenyth / torrpen
Created February 21, 2010 18:55
torrpen - automated torrent management by size
#!/bin/bash
rtorrent_watch_dir=~/torrent
torrent_holding_pen=~/torrent/pending
buffer=$((600 * 1024)) # Free space to keep in reserve, in KB (600 MB)
space_left=$(($(df "$rtorrent_watch_dir" | awk '/dev/ {print $4}') - $buffer))
tmpdir=$(mktemp -d /tmp/torrpen.XXXX)
get_torrent_size () {
Daenyth / SlickUpsert.scala
Created February 26, 2018 20:59
A slick profile extension to allow native postgres batch upsert
import com.github.tminglei.slickpg.ExPostgresProfile
import slick.SlickException
import slick.ast.ColumnOption.PrimaryKey
import slick.ast.{ColumnOption, FieldSymbol, Insert, Node, Select}
import slick.compiler.{InsertCompiler, Phase, QueryCompiler}
import slick.dbio.{Effect, NoStream}
import slick.jdbc.InsertBuilderResult
import slick.lifted.Query
// format: off