Daniel Kapitan dkapitan

@dkapitan
dkapitan / make_polynomial.py
Created November 21, 2021 19:10
Polynomial
def make_polynomial(dataframe, degree=MAX_DEGREE):
    """Create higher-order polynomial features from a dataframe.

    The dataframe should be laid out as [Y, X1, X2, .. Xi].
    Returns a dataframe with polynomial features of X1 ... Xi up to the given degree.
    """
    df = dataframe.copy()
    cols = df.columns[1:]
    for i in range(2, degree + 1):
        for col in cols:
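The preview stops inside the inner loop. A minimal sketch of how such a function could be completed, assuming each new column is simply the original feature raised to the power `i`. The value of `MAX_DEGREE` and the `<col>^<i>` column naming are hypothetical; the gist preview shows neither.

```python
import pandas as pd

MAX_DEGREE = 3  # assumption: the preview does not show how MAX_DEGREE is defined


def make_polynomial(dataframe, degree=MAX_DEGREE):
    """Return a copy of `dataframe` ([Y, X1, ..., Xi]) with extra columns
    X^2 ... X^degree for every feature column X."""
    df = dataframe.copy()
    cols = df.columns[1:]  # skip the target column Y
    for i in range(2, degree + 1):
        for col in cols:
            # assumption: plain powers only, named "<col>^<i>" (hypothetical naming)
            df[f"{col}^{i}"] = df[col] ** i
    return df


example = pd.DataFrame({"Y": [1, 2], "X1": [2, 3], "X2": [1, 4]})
result = make_polynomial(example, degree=3)
```

This sketch adds pure powers only; interaction terms such as X1*X2 are not generated, and the input dataframe is left unchanged.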
dkapitan / stelselcatalogus.ttl
Created September 30, 2021 06:59
Stelselcatalogus
This file has been truncated, but you can view the full file.
# baseURI: http://opendata.stelselcatalogus.nl/id/dataset/sc
# imports: http://purl.org/dc/elements/1.1/
# imports: http://rdfs.org/ns/void
# imports: http://www.w3.org/2004/02/skos/core
@prefix adms: <http://www.w3.org/ns/adms#> .
@prefix begrip_banken: <http://opendata.stelselcatalogus.nl/banken/id/begrip/> .
@prefix begrip_bgt: <http://opendata.stelselcatalogus.nl/bgt/id/begrip/> .
@prefix begrip_bri: <http://opendata.stelselcatalogus.nl/bri/id/begrip/> .
@prefix begrip_brk: <http://opendata.stelselcatalogus.nl/brk/id/begrip/> .
dkapitan / ISSUE_TEMPLATE.md
Created September 24, 2021 08:46
GitHub templates

Title

When released, this story will ENTER TEXT HERE.

Description

As a ENTER ROLE I want ENTER GOAL, so that ENTER REASON(S).

Requirements

dkapitan / kwb-datasets.py
Created July 29, 2021 09:44
cbs statline
KWB = {
    2016: "83487NED",
    2017: "83765NED",
    2018: "84286NED",
    2019: "84583NED",
    2020: "84799NED",
}
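A mapping like this is typically used to pick the table identifier for a given year. A small sketch of such a lookup follows; the helper name `kwb_table_id` is hypothetical and not part of the gist, and fetching the table itself is left out.

```python
KWB = {
    2016: "83487NED",
    2017: "83765NED",
    2018: "84286NED",
    2019: "84583NED",
    2020: "84799NED",
}


def kwb_table_id(year):
    """Return the table identifier listed for `year` (hypothetical helper)."""
    try:
        return KWB[year]
    except KeyError:
        # Report which years are available instead of a bare KeyError
        available = ", ".join(str(y) for y in sorted(KWB))
        raise ValueError(
            f"No KWB table listed for {year}; available years: {available}"
        ) from None


table_2018 = kwb_table_id(2018)
```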
dkapitan / python-versions.md
Last active May 11, 2021 10:52
Python 3 main versions - highlights
  • 3.6: f-strings
  • 3.7: async and await become reserved keywords; dataclasses
  • 3.8: assignment expressions (:=)
  • 3.9: type hinting generics in standard collections (e.g. list[int])
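The highlights above can be illustrated in a few lines. This is only a sketch: the `Release` class and the sample values are made up for illustration, and the block needs Python 3.9 or later to run.

```python
from dataclasses import dataclass


# 3.7: dataclasses generate __init__, __repr__ and __eq__ from the annotations
@dataclass
class Release:
    version: str
    year: int


release = Release("3.9", 2020)

# 3.6: f-strings interpolate expressions directly inside string literals
label = f"Python {release.version} ({release.year})"

# 3.8: an assignment expression binds a name inside a larger expression
numbers = [4, 8, 15, 16, 23, 42]
if (count := len(numbers)) > 5:
    summary = f"{count} items"


# 3.9: built-in collections accept type parameters, no typing.List needed
def squares(values: list[int]) -> dict[int, int]:
    return {v: v * v for v in values}
```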
dkapitan / blog-post.md
Created January 30, 2021 10:27
comet-chart-flight-delays-post

Zan Armstrong's comet chart has been on my list of hobby projects for a while now. I think it is an elegant way to visualize statistical mix effects and address Simpson's paradox, and it is particularly useful when working with longitudinal data involving different sub-populations. Recently I found a good excuse to spend some time actually using it as part of an exploratory data analysis on a project.
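To make the mix effect concrete, here is a small worked example in plain Python. The numbers are invented for illustration and are not taken from any flight data: every segment's mean delay goes down, yet the overall mean goes up because the share of flights shifts towards the segment with longer delays.

```python
# Illustrative numbers only: segment -> (number of flights, mean delay in minutes)
before = {"A": (90, 10.0), "B": (10, 30.0)}
after = {"A": (50, 9.0), "B": (50, 28.0)}


def overall_mean(segments):
    """Mean delay over all flights, weighting each segment by its flight count."""
    total = sum(n for n, _ in segments.values())
    return sum(n * mean for n, mean in segments.values()) / total


overall_before = overall_mean(before)
overall_after = overall_mean(after)

# Each segment on its own improved between the two periods...
every_segment_improved = all(after[s][1] < before[s][1] for s in before)

# ...but the aggregate got worse, purely because the mix changed.
aggregate_got_worse = overall_after > overall_before
```

A comet chart shows both movements at once: the change in each segment's weight and the change in its value.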

Since I mostly work in Python and have recently fallen in love with Altair - for the same reasons as Fernando explains here - I wondered how the comet chart could be implemented using the grammar of interactive graphics. It took me a while to figure out how to actually plot the c

dkapitan / comet-chart-flight-delays.py
Last active January 30, 2021 10:26
Comet charts in Python: visualizing statistical mix effects and Simpson's paradox in Altair
import altair as alt
import pandas as pd
import vega_datasets

# Use airline data to assess statistical mix effects of delays
flights = vega_datasets.data.flights_20k()
aggregation = dict(
    number_of_flights=("destination", "count"),
    mean_delay=("delay", "mean"),
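The preview ends inside the `aggregation` dict. As a rough sketch of how such a dict can be used, pandas named aggregation accepts it as keyword arguments to `groupby(...).agg`. The toy dataframe below stands in for the flights data, and grouping by `origin` is an assumption; the preview does not show the grouping key.

```python
import pandas as pd

# Toy stand-in for the flights data; "origin" as grouping key is an assumption
flights = pd.DataFrame(
    {
        "origin": ["AMS", "AMS", "EIN", "EIN", "EIN"],
        "destination": ["LHR", "CDG", "LHR", "LHR", "CDG"],
        "delay": [10, 20, 0, 30, 60],
    }
)

aggregation = dict(
    number_of_flights=("destination", "count"),
    mean_delay=("delay", "mean"),
)

# Named aggregation: each keyword becomes an output column,
# each tuple is (input column, aggregation function)
summary = flights.groupby("origin").agg(**aggregation)
```

The result has one row per group with the flight count (the weight) and the mean delay (the value), which are the two quantities a comet chart compares across periods.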
dkapitan / tree-of-machine-learning-algorithms.md
Created January 12, 2021 08:19
Medium article: Tree of Machine Learning Algorithms

The Tree of Machine Learning Algorithms

Inevitably, when I teach introductory courses on machine learning at the Jheronimus Academy of Data Science, I get questions like:

  • Which algorithms are most suitable for my task/project/problem?
  • Which open source libraries should I use?
  • What is the intuition behind algorithm X, and how does that compare with algorithm Y?

Notwithstanding the wealth of information that is available in the creative commons, it is hard to see the forest for the trees. I found myself doing the same web searches over and over again, and I thought: 'Surely there must be a better way?'

dkapitan / install-pop_os.md
Last active January 5, 2021 22:12
Migrating from OSX to Pop!_OS

Installing POP!_OS 20.10 with dual boot

Dell XPS 15 (2020)

  • Disable Secure Boot.
  • Make a bootable USB with POP!_OS using Rufus.
  • Shrink the Windows 10 partition to make room for POP!_OS.
  • Turn RAID off and switch to AHCI. This breaks Windows if you do it directly in the BIOS! You can follow the steps from this Medium article.
dkapitan / Makefile
Last active August 25, 2020 16:52
dephell and poetry
poetry: ## generate setup.py, environment.yml and requirements.txt from poetry
	dephell deps convert
	dephell deps convert --env pip
	dephell deps convert --env conda

kernel:
	poetry run ipython kernel install --user --name=your-project-name