Mikhail Popov bearloga

bearloga / wdqs-cocktails.R
Created Apr 21, 2017
Fetches cocktails and their ingredients from Wikidata using SPARQL and Wikidata Query Service
cocktails <- WikidataQueryServiceR::query_wikidata('
SELECT DISTINCT ?cocktailLabel ?ingredientLabel ?instanceOfLabel ?subclassLabel
WHERE
{
?cocktail wdt:P31/wdt:P279* wd:Q134768 .
?cocktail wdt:P186 ?ingredient .
OPTIONAL {
?ingredient wdt:P279 ?subclass .
}
OPTIONAL {
bearloga / mkrproj.sh
Last active Jan 3, 2018
A bash shell script that can be used to turn the current directory into an RStudio project, opening the project in RStudio after creating it.
#!/bin/bash
# Usage: mkproj [projectname]
# projectname defaults to name of current directory
template="Version: 1.0\nRestoreWorkspace: Default\nSaveWorkspace: Default\nAlwaysSaveHistory: Default\n\nEnableCodeIndexing: Yes\nUseSpacesForTab: Yes\nNumSpacesForTab: 4\nEncoding: UTF-8\n\nRnwWeave: knitr\nLaTeX: pdfLaTeX"
wd=$(basename "$(pwd)")
if [ -z "$1" ]; then
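The preview of mkrproj.sh is cut off at this point. As a rough sketch of the idea in its description (not the gist's actual code), the function below writes a .Rproj file named after its argument, or after the current directory when no argument is given. The function name make_rproj is hypothetical, and the template is shortened to a few of the fields shown in the preview.

```shell
#!/bin/bash
# Sketch only: make_rproj is a hypothetical helper, not the gist's code.
# It writes "<name>.Rproj" in the current directory and prints the file name.
make_rproj() (
  # Project name: first argument, or the current directory's name if omitted.
  name="$1"
  if [ -z "$name" ]; then
    name=$(basename "$PWD")
  fi
  # Shortened template; the "\n" sequences are expanded by printf's %b.
  template="Version: 1.0\nRestoreWorkspace: Default\nSaveWorkspace: Default\nEncoding: UTF-8"
  printf '%b\n' "$template" > "${name}.Rproj"
  printf '%s\n' "${name}.Rproj"
)
```

Calling make_rproj with no argument inside a directory named analysis would create analysis.Rproj there. Opening the file afterwards depends on the platform; on macOS, for example, `open analysis.Rproj` typically hands it to RStudio if RStudio is the registered handler for .Rproj files.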
bearloga / dl2csv.R
Created Nov 9, 2017
Some code for converting an HTML description list into an R data.frame that can then be exported as a CSV.
library(rvest)
x <- "<dl>
<dt>Coffee</dt>
<dd>Black hot drink</dd>
<dt>Milk</dt>
<dd>White cold drink</dd>
</dl>"
y <- read_html(x)
bearloga / druid-csv-spec_country-all.json
Last active Jun 25, 2019
Druid ingestion spec for gzipped CSV data
{
"type": "index_hadoop",
"spec": {
"ioConfig": {
"type": "hadoop",
"inputSpec": {
"paths": "hdfs://analytics-hadoop/tmp/gsc-all.csv.gz",
"type": "static"
}
},
View pavement.ipynb
View pavement-r.ipynb
View logarithmic-time.R
# daily_stats has 5 columns used by this code: date, time_spent_10/25/50/75/90
ggplot(daily_stats) +
geom_segment(aes(x = date, xend = date, y = time_spent_10, yend = time_spent_90),
size = 1, color = "#00af89") +
geom_segment(aes(x = date, xend = date, y = time_spent_25, yend = time_spent_75),
size = 2, color = "#14866d") +
# geom_ribbon(aes(x = date, ymin = time_spent_lower, ymax = time_spent_upper), alpha = 0.3) +
# geom_line(aes(x = date, y = time_spent_middle)) +
geom_label(
bearloga / sql-murder-mystery-solution.md
Created Oct 13, 2019
A walkthrough of the solution to SQL Murder Mystery by Northwestern University Knight Lab. Solution by Mikhail Popov (@bearloga)

Solution to SQL Murder Mystery

A walkthrough of the solution to SQL Murder Mystery by Northwestern University Knight Lab. Solution by Mikhail Popov

Prompt

A crime has taken place and the detective needs your help. The detective gave you the crime scene report, but you somehow lost it. You vaguely remember that the crime was a murder that occurred sometime on Jan. 15, 2018 and that it took place in SQL City. Start by retrieving the corresponding crime scene report from the police department’s database.

Witness reports

bearloga / engines.Rmd
Last active Mar 11, 2020
Automatically printing chunk engine in R Markdown
---
title: "Printing chunk engine via hook"
output: github_document
---
```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
print_engine_hook <- function(before, options, envir) {
bearloga / waxer-demo.ipynb
Created Jul 23, 2020
Demo of using {waxer} R package in a Jupyter Notebook to fetch different Wikipedia languages' pageviews with different access methods