Quick'n'dirty Jekyll plugin for sorted cycle.
This fork fixes two issues:
- problems when specifying sort fields in quotes, like sort_by:'weight' (with ' or " characters)
- problems when a collection entry does not have the specified sort field
### generate questionnaire data
library(triangle)
set.seed(0)
q1_d1 <- round(rtriangle(1000, 1, 7, 5))
q1_d2 <- round(rtriangle(1000, 1, 7, 6))
q1_d3 <- round(rtriangle(1000, 1, 7, 2))
"""
Runtime optimization through vectorization and parallelization.
Script 3: Parallel and vectorized calculation of haversine distance.
Please note that this might be slower than the single-core vectorized version because of the overhead that is caused
by multiprocessing.
January 2018
Markus Konrad <markus.konrad@wzb.eu>
"""
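The vectorized half of the calculation described above can be sketched with NumPy as follows. This is an illustration only, not the script's actual code: the function name `haversine_vectorized` and the Earth radius constant are assumptions, and the multiprocessing part is deliberately left out.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius in kilometres


def haversine_vectorized(lat1, lon1, lat2, lon2):
    """
    Great-circle distance in km between pairs of points given in degrees.
    All arguments may be scalars or equally shaped array-likes.
    """
    # convert every input to a float array in radians
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lat1, lon1, lat2, lon2)
    )
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # haversine formula, evaluated element-wise on whole arrays at once
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# two point pairs on the equator: identical points, and points 90 degrees apart
distances = haversine_vectorized([0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 90.0])
```

Because the formula runs on whole arrays, there is no Python-level loop over point pairs. A parallel variant would additionally split the arrays into chunks and hand them to worker processes, which is where the overhead mentioned above comes from.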
def str_multisplit(s, sep):
    """
    Split string `s` by all characters/strings in `sep`.
    :param s: a string to split
    :param sep: sequence or set of characters to use for splitting
    :return: list of split string parts
    """
    if not isinstance(s, (str, bytes)):
        raise ValueError('`s` must be of type `str` or `bytes`')
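The preview above is cut off after the type check. One way the splitting described in the docstring could work is sketched below. The name `str_multisplit_sketch` is hypothetical, and the loop is an assumption about the approach, not the author's actual implementation.

```python
def str_multisplit_sketch(s, sep):
    """
    Split `s` by every separator in `sep`, one separator after another.
    Hypothetical sketch; behaviour for edge cases is an assumption.
    """
    if not isinstance(s, (str, bytes)):
        raise ValueError('`s` must be of type `str` or `bytes`')

    parts = [s]
    for c in sep:
        # split every part obtained so far by the current separator
        parts = [piece for p in parts for piece in p.split(c)]
    return parts


example_parts = str_multisplit_sketch('a,b;c', [',', ';'])
```

As with `str.split`, adjacent separators produce empty strings in the result, and an empty separator raises a `ValueError`.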
import numpy as np
def word_cooccurrence(dtm):
    """
    Calculate the co-document frequency (aka word co-occurrence) matrix for a document-term matrix `dtm`, i.e. how often
    each pair of tokens occurs together at least once in the same document.
    :param dtm: (sparse) document-term-matrix of size NxM (N docs, M is vocab size) with raw term counts.
    :return: co-document frequency (aka word co-occurrence) matrix with shape MxM
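The preview ends before the function body. The calculation the docstring describes can be sketched for a dense matrix as shown below. The name `word_cooccurrence_sketch` is hypothetical, and sparse input is not handled here. This is an assumption about one workable approach, not the author's code.

```python
import numpy as np


def word_cooccurrence_sketch(dtm):
    """
    Co-document frequency matrix (M x M) for a dense N x M document-term matrix.
    Hypothetical sketch; sparse matrices are not supported.
    """
    dtm = np.asarray(dtm)
    # 1 where a term occurs at least once in a document, 0 otherwise
    binary = (dtm > 0).astype(int)
    # entry (i, j) counts the documents that contain both term i and term j
    return binary.T @ binary


# 3 documents, vocabulary of 3 terms, raw counts
toy_dtm = [[1, 0, 2],
           [0, 1, 1],
           [3, 1, 0]]
cooc = word_cooccurrence_sketch(toy_dtm)
```

In this sketch the diagonal holds each term's document frequency, since every term trivially co-occurs with itself in each document that contains it.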
"""
Shows how to do a cross join (i.e. cartesian product) between two pandas DataFrames using an example on
calculating the distances between origin and destination cities.
Tested with pandas 0.17.1 and 0.18 on Python 3.4 and Python 3.5
Best run this with Spyder (see https://github.com/spyder-ide/spyder)
Author: Markus Konrad <post@mkonrad.net>
April 2016
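A cross join as described above can be sketched by merging on a constant helper column. The function name `cross_join`, the helper column `_key` and the toy data are assumptions made for illustration and are not taken from the original script.

```python
import pandas as pd


def cross_join(left, right):
    """Return the cartesian product of the rows of `left` and `right`."""
    # give every row the same key so that each left row matches each right row
    left_k = left.assign(_key=1)
    right_k = right.assign(_key=1)
    return left_k.merge(right_k, on='_key').drop(columns='_key')


# toy data with made-up coordinates, for illustration only
origins = pd.DataFrame({'origin': ['A', 'B'], 'lat_o': [10.0, 20.0]})
destinations = pd.DataFrame({'destination': ['X', 'Y', 'Z'], 'lat_d': [1.0, 2.0, 3.0]})

pairs = cross_join(origins, destinations)
```

The result has one row per origin/destination combination, so a distance can then be computed row by row or in vectorized form. Newer pandas releases also accept `how='cross'` in `merge`, which makes the helper column unnecessary.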
"""
Sample scripts for blog post "Robust data collection via web scraping and web APIs"
(https://datascience.blog.wzb.eu/2020/12/01/robust-data-collection-via-web-scraping-and-web-apis/).
Script 1. Starting point – baseline (unreliable) web scraping script.
December 2020, Markus Konrad <markus.konrad@wzb.eu>
"""
from datetime import datetime, timedelta
This is a simple Liquid tag that makes it easy to embed images, videos or slides from OEmbed-enabled providers. It uses Magnus Holm's great oembed gem, which connects to the OEmbed endpoint of the link's provider and retrieves the HTML code needed to embed the content properly (e.g. an in-place YouTube player, an image tag for Flickr, an in-place SlideShare viewer). By default it supports the following OEmbed providers (but can fall back to Embed.ly or OoEmbed for other providers):
# Create a "balloon plot" as alternative to a heatmap with ggplot2
#
# January 2017
# Author: Markus Konrad <markus.konrad@wzb.eu>, WZB Berlin Social Science Center
library(dplyr)
library(tidyr)
library(ggplot2)
# define the variables that will be displayed in the columns