Markus Konrad internaut

@internaut
internaut / cooc.py
Last active November 8, 2019 14:16
Function to calculate word co-occurrence from document-term matrix and a test using the hypothesis package
import numpy as np
def word_cooccurrence(dtm):
"""
Calculate the co-document frequency (aka word co-occurrence) matrix for a document-term matrix `dtm`, i.e. how often
each pair of tokens occurs together at least once in the same document.
:param dtm: (sparse) document-term-matrix of size NxM (N docs, M is vocab size) with raw term counts.
:return: co-document frequency (aka word co-occurrence) matrix with shape MxM
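The preview above is cut off before the function body, so the following is only a minimal sketch of one common way to compute such a matrix, not necessarily the approach taken in the gist. It assumes a dense NumPy array as input (the gist also mentions sparse matrices), and the name `cooccurrence_sketch` is hypothetical. The idea is to binarize the counts and multiply the transposed matrix with itself.

```python
import numpy as np


def cooccurrence_sketch(dtm):
    """Hypothetical helper: co-document frequency for a dense NxM count matrix."""
    dtm = np.asarray(dtm)
    # 1 where a term occurs at least once in a document, 0 otherwise
    present = (dtm > 0).astype(int)
    # entry (i, j) counts the documents that contain both term i and term j
    return present.T @ present


# three documents, three vocabulary terms, raw counts
dtm = np.array([[2, 0, 1],
                [0, 3, 1],
                [1, 1, 0]])
cooc = cooccurrence_sketch(dtm)
```

In this sketch the diagonal holds each term's plain document frequency; whether the original function keeps or zeroes the diagonal cannot be told from the truncated preview.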
internaut / sponscraper_v1.py
Created December 1, 2020 13:36
Sample scripts for blog post "Robust data collection via web scraping and web APIs".
"""
Sample scripts for blog post "Robust data collection via web scraping and web APIs"
(https://datascience.blog.wzb.eu/2020/12/01/robust-data-collection-via-web-scraping-and-web-apis/).
Script 1. Starting point – baseline (unreliable) web scraping script.
December 2020, Markus Konrad <markus.konrad@wzb.eu>
"""
from datetime import datetime, timedelta
internaut / voronoize.py
Created February 10, 2021 19:25
Voronoi regions of schools in East Germany. An example using the geovoronoi package (https://pypi.org/project/geovoronoi/).
"""
Voronoi regions of schools in East Germany.
An example using the geovoronoi package (https://pypi.org/project/geovoronoi/).
Feb. 2021
Markus Konrad <markus.konrad@wzb.eu>
"""
import os
internaut / transfer.py
Created February 22, 2022 09:33
Transfer a user's GitLab projects to a new group.
"""
Transfer all GitLab projects from the user authenticated with a supplied personal access token (PAT) to a new
namespace (i.e. a group with a group ID).
To generate a PAT, log in to your GitLab account and go to "User settings > Access tokens".
To find out the ID of a group to which you want to transfer the projects, go to the group's page. The group ID is shown
under the title of the group.
Requirements: Python 3 with requests package installed (tested with Python 3.8 and requests 2.27.1).
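The script body is not shown in this preview. As a rough sketch of the kind of requests such a transfer involves, the block below builds (but does not send) the two GitLab REST API v4 calls that would typically be needed: listing the projects owned by the authenticated user, and transferring one project to a namespace. It uses only the standard library rather than the requests package named above, and the helper names, host, project ID, group ID, and token are all hypothetical.

```python
import urllib.parse
import urllib.request

API_PREFIX = "/api/v4"  # path prefix of the GitLab REST API, version 4


def owned_projects_url(base_url, page=1, per_page=100):
    """Hypothetical helper: URL listing projects owned by the authenticated user."""
    query = urllib.parse.urlencode({"owned": "true", "page": page, "per_page": per_page})
    return f"{base_url.rstrip('/')}{API_PREFIX}/projects?{query}"


def build_transfer_request(base_url, project_id, namespace_id, token):
    """Hypothetical helper: prepare a PUT request that moves a project to a namespace."""
    url = f"{base_url.rstrip('/')}{API_PREFIX}/projects/{project_id}/transfer"
    body = urllib.parse.urlencode({"namespace": namespace_id}).encode("ascii")
    # the PAT is passed in the PRIVATE-TOKEN header
    return urllib.request.Request(
        url, data=body, method="PUT", headers={"PRIVATE-TOKEN": token}
    )


# all values below are placeholders, nothing is sent over the network
list_url = owned_projects_url("https://gitlab.example.com")
req = build_transfer_request("https://gitlab.example.com", 123, 42, "hypothetical-token")
```

Passing `req` to `urllib.request.urlopen` would actually perform the transfer, so a real script would probably loop over the listed projects and handle errors for each one.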
internaut / xsfpcopy.py
Created April 28, 2022 16:15
Copy the contents of an XSPF music playlist to a target folder
#!/usr/bin/env python3
# Copy the contents of an XSPF music playlist to a target folder
#
# requires two arguments: path to XSPF file, target path
# requires Python >= 3.8
#
# author: Markus Konrad <post@mkonrad.net>
import os.path
import sys
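The rest of the script is not visible in this preview. The sketch below shows one plausible way to do what the header comment describes, under the assumption that the playlist stores its tracks as `file://` URIs or plain paths in `<location>` elements. The function names `playlist_paths` and `copy_tracks` are hypothetical and not taken from the gist.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import unquote, urlparse

# XML namespace used by XSPF playlists
XSPF_NS = {"xspf": "http://xspf.org/ns/0/"}


def playlist_paths(xspf_text):
    """Return the local file paths referenced by the tracks of an XSPF document."""
    root = ET.fromstring(xspf_text)
    paths = []
    for loc in root.iterfind(".//xspf:track/xspf:location", XSPF_NS):
        if not loc.text:
            continue
        parsed = urlparse(loc.text.strip())
        # keep file:// URIs and plain paths, skip remote streams
        if parsed.scheme in ("", "file"):
            paths.append(Path(unquote(parsed.path)))
    return paths


def copy_tracks(paths, target_dir):
    """Copy every existing file in `paths` into `target_dir`; return the new paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in paths:
        if src.is_file():  # silently skip missing files in this sketch
            copied.append(Path(shutil.copy2(src, target / src.name)))
    return copied


sample = """<playlist version="1" xmlns="http://xspf.org/ns/0/">
  <trackList>
    <track><location>file:///music/My%20Song.mp3</location></track>
    <track><location>http://example.com/stream.mp3</location></track>
  </trackList>
</playlist>"""
paths = playlist_paths(sample)
```

A command-line wrapper in the spirit of the header comment would read the playlist path and the target path from `sys.argv` and pass them to these two functions.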