Owner teamtv

koenvo
@koenvo
koenvo / read.py
Last active January 1, 2024 21:11
"The One Billion Row Challenge" python submission
import os
from multiprocessing import Pool
# Notes:
# a) Let every process handle a single chunk.
# b) Use as many processes as cores
CHUNK_COUNT = 10
CONCURRENCY = 10
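The two notes above (one chunk per process, as many processes as cores) could be sketched as follows. This is a minimal illustration, not the gist's actual implementation: the helper names `chunk_boundaries`, `process_chunk`, `merge` and `run` are hypothetical, and the input is assumed to consist of `station;value` lines.

```python
import os
from multiprocessing import Pool

CHUNK_COUNT = 10
CONCURRENCY = os.cpu_count() or 1  # assumption: one worker per core


def chunk_boundaries(path, chunk_count):
    """Split a file into about chunk_count byte ranges that each end on a newline."""
    size = os.path.getsize(path)
    step = max(1, size // chunk_count)
    boundaries = []
    start = 0
    with open(path, "rb") as f:
        while start < size:
            f.seek(min(start + step, size))
            f.readline()  # move forward to the end of a line
            end = f.tell()
            boundaries.append((path, start, end))
            start = end
    return boundaries


def process_chunk(chunk):
    """Return {station: [min, max, sum, count]} for one byte range."""
    path, start, end = chunk
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(end - start)
    stats = {}
    for line in data.splitlines():
        name, _, raw = line.partition(b";")
        value = float(raw.decode("ascii"))
        entry = stats.get(name)
        if entry is None:
            stats[name] = [value, value, value, 1]
        else:
            entry[0] = min(entry[0], value)
            entry[1] = max(entry[1], value)
            entry[2] += value
            entry[3] += 1
    return stats


def merge(results):
    """Combine the per-chunk dictionaries into one."""
    total = {}
    for stats in results:
        for name, (low, high, acc, count) in stats.items():
            if name in total:
                t = total[name]
                total[name] = [min(t[0], low), max(t[1], high), t[2] + acc, t[3] + count]
            else:
                total[name] = [low, high, acc, count]
    return total


def run(path):
    """Hand each chunk to a worker process and merge the partial results."""
    chunks = chunk_boundaries(path, CHUNK_COUNT)
    with Pool(CONCURRENCY) as pool:
        return merge(pool.map(process_chunk, chunks))
```

Because every range ends on a newline, no line is split between two workers, so the per-chunk results can be merged without any coordination while the chunks are being processed.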
import time
# Make sure you have duckdb==0.7.0. Earlier versions might fail with GIL problems ( https://twitter.com/mr_le_fox/status/1620535141675433986 )
import duckdb
import s3fs
from fsspec.implementations.cached import SimpleCacheFileSystem
# Create the s3 file system. This one does not have caching
@koenvo
koenvo / README.md
Last active November 3, 2022 08:15

Cronjob and docker?

Some applications consist of multiple services. In those cases it can be useful to use a tool like docker-compose. With docker-compose you can define all your services in a single docker-compose YAML file and add that file to your repository. This way you keep all service definitions in a single place, under version control.

But what if you want to run a cronjob?

Cron on the host

One solution is to add a call to docker-compose to the crontab on the host system. In this example we would like to run a script called cleanup-old-files.py.
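As a rough sketch of what such a crontab entry could look like, the helper below assembles one line. The function name `crontab_line`, the schedule, the compose file path `/srv/app/docker-compose.yml` and the service name `app` are all assumptions made for illustration; only the script name comes from the text above.

```python
import shlex


def crontab_line(schedule, compose_file, service, script):
    """Build a crontab entry that runs a script in a one-off service container."""
    command = [
        "docker-compose",
        "-f", compose_file,  # point at the compose file explicitly; cron does not start in the project directory
        "run", "--rm",       # one-off container, removed after the command exits
        service,
        "python", script,
    ]
    return f"{schedule} {shlex.join(command)}"


# Hypothetical values: run the cleanup every night at 03:00.
print(crontab_line("0 3 * * *", "/srv/app/docker-compose.yml", "app", "cleanup-old-files.py"))
# 0 3 * * * docker-compose -f /srv/app/docker-compose.yml run --rm app python cleanup-old-files.py
```

The printed line would then be added to the host's crontab, for example with `crontab -e`.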

@koenvo
koenvo / codevember-11-bike-gsap-svg-stroke-animation.markdown
Created November 8, 2021 19:45
Codevember 11 #bike gsap svg stroke animation

Codevember 11 #bike gsap svg stroke animation

Starting from an SVG of a cyclist drawn with multiple strokes, I animate the dash offset of all of them at the same time using the GSAP DrawSVG plugin and TimelineLite.

A Pen by Alaric Baraou on CodePen.

License.

@koenvo
koenvo / pass_chain.py
Created November 13, 2020 20:29
Pass chain
from collections import defaultdict
import streamlit as st
from kloppy import datasets, event_pattern_matching as pm
from mplsoccer.pitch import Pitch
@st.cache(allow_output_mutation=True)
def load_dataset(match_id):
    return datasets.load(
@koenvo
koenvo / requirements.txt
Last active February 4, 2022 09:08
Soccer analytics open-source demo
kloppy
mplsoccer
streamlit
natsort
@koenvo
koenvo / csvprocessor.py
Created December 2, 2015 14:07 — forked from miku/csvprocessor.py
CSV processor examples for luigi. Can serialize *args to CSV. Can deserialize CSV rows into namedtuples if requested. -- "works on my machine".
from luigi.format import Format
import csvkit
class CSVOutputProcessor(object):
    """
    A simple CSV output processor to be hooked into Format's
    `pipe_writer`.
    If `cols` are given, the names are used as CSV header, otherwise no
    explicit header is written.
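The behaviour described above (serialize `*args` to CSV, optionally read rows back as namedtuples) can be approximated with the standard library alone. The sketch below is not the gist's code and uses neither luigi nor csvkit; `write_row`, `read_rows` and the `Row` type are hypothetical names.

```python
import csv
import io
from collections import namedtuple


def write_row(stream, *args):
    """Serialize the positional arguments as one CSV row."""
    csv.writer(stream).writerow(args)


def read_rows(stream, cols=None):
    """Yield rows as plain tuples, or as namedtuples when column names are given."""
    reader = csv.reader(stream)
    if cols is None:
        for row in reader:
            yield tuple(row)
        return
    Row = namedtuple("Row", cols)
    for row in reader:
        yield Row(*row)


buffer = io.StringIO()
write_row(buffer, "ajax", 3)
buffer.seek(0)
for row in read_rows(buffer, cols=["team", "goals"]):
    print(row.team, row.goals)
```

Note that values come back as strings: the `csv` module does not restore the original types, so `row.goals` is `"3"` rather than `3`.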
@koenvo
koenvo / goal_dict.py
Last active January 4, 2016 16:59 — forked from dtheodor/goal_dict.py
from collections import namedtuple

GaTransactionConf = namedtuple("GaTransactionConf", ["sku", "name", "category"])

class GaTransactionsDict(dict):
    def __getitem__(self, key):
        if isinstance(key, GaTransactionConf):
            updated = False
            for mapping in self.keys():
                if cmp_ga_conf(key, mapping):
                    key = mapping  # update the key