Jan Vlnas jnv

@jnv
jnv / utmstrip.user.js
Last active September 5, 2015 20:55 — forked from paulirish/utmstrip.user.js
UTM parameters stripper – modified for Scriptish regex matching to not run on every page. https://greasyfork.org/cs/scripts/4056-utm-param-stripper
// ==UserScript==
// @name UTM param stripper
// @author Paul Irish
// @namespace http://github.com/paulirish
// @version 1.1.2
// @description Drop the UTM params from a URL when the page loads.
// @extra Cuz you know they're all ugly n shit.
// @include /^https?:\/\/.*[\?#&]utm_.*/
// @grant none
// ==/UserScript==
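The preview above shows only the userscript header, so the script body is not reproduced here. As a rough illustration of the transformation the description names (dropping `utm_*` parameters from a URL), here is a minimal sketch in Ruby, chosen to match the other scripts on this page. The helper name `strip_utm` is hypothetical, and the sketch only handles the query string, not `utm_` values placed after `#`.

```ruby
require "uri"

# Hypothetical helper, not the userscript's code: removes every query
# parameter whose name starts with "utm_" and leaves the rest intact.
def strip_utm(url)
  uri = URI.parse(url)
  return url if uri.query.nil?

  kept = URI.decode_www_form(uri.query).reject { |key, _value| key.start_with?("utm_") }
  # Drop the "?" entirely when no parameters remain.
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

puts strip_utm("https://example.com/post?id=7&utm_source=feed&utm_medium=rss")
# prints https://example.com/post?id=7
```

In a browser the same idea would be applied to the current address and written back without reloading; that part is specific to the userscript environment and is left out of this sketch.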
@jnv
jnv / 02-building_stones.ipynb
Created June 4, 2014 13:27
Example IPython notebook with Ruby
@jnv
jnv / data.tsv
Last active August 29, 2015 13:57
(WIP) Comparison of caloric value and carbohydrates
type	name	price	kcal	sacharides	sugar	qty
d	Cappy Jablko	39.9	47	11.3	11.3	1000
d	Hello Jablko	37.9	45	11	9.6	1000
d	Hello Čerstvě vylisovaná jablečná šťáva	39.9	40	9.8	8.4	1000
d	Relax Jablko	39.9	42	10.2	10.2	1000
o	Kubík Multivitamín	16.9	52.10333333	12.2	11.9	300
o	Jupík Jahoda	16.5	24	6	6	330
m	Rubín Jablko	40.9	40	9.83	0	1000
d	Pfanner Jablko	47.9	44	10.3	9.9	1000
d	Relax Pomeranč	42.9	44	10	10	1000
@jnv
jnv / shotdetect2csv.rb
Created March 14, 2014 22:35
Converts Shotdetect's results.xml file to CSV
#!/usr/bin/env ruby
# Converts shotdetect's results.xml file to CSV,
# adding the relative position of each shot in the movie.
# Writes the converted data to results.csv.
#
# Usage: ./shotdetect2csv.rb results.xml
require "nokogiri"
require "csv"
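The preview stops after the `require` lines, so the conversion itself is not shown. The following is a minimal sketch of the idea in the header comment, under stated assumptions: the `<shot>` element and its `msbegin` / `msduration` attributes are assumptions about the layout of results.xml, the movie length is inferred from the end of the last shot, and REXML is used in place of Nokogiri so the example has no external gem dependency. The function name `shots_to_csv` and the sample data are hypothetical.

```ruby
require "rexml/document"
require "csv"

# Hypothetical sample input; element and attribute names are assumptions.
SAMPLE_XML = <<~XML
  <shotdetect>
    <body>
      <shots>
        <shot id="0" msbegin="0" msduration="4000"/>
        <shot id="1" msbegin="4000" msduration="6000"/>
        <shot id="2" msbegin="10000" msduration="10000"/>
      </shots>
    </body>
  </shotdetect>
XML

HEADER = %w[id begin_ms duration_ms relative_position].freeze

# Returns CSV text with one row per shot and its start position
# expressed as a fraction (0.0 to 1.0) of the inferred movie length.
def shots_to_csv(xml)
  doc = REXML::Document.new(xml)
  shots = REXML::XPath.match(doc, "//shot").map do |shot|
    { id: shot.attributes["id"],
      begin_ms: shot.attributes["msbegin"].to_i,
      duration_ms: shot.attributes["msduration"].to_i }
  end

  # Assumption: the movie ends where the last shot ends.
  total_ms = shots.map { |s| s[:begin_ms] + s[:duration_ms] }.max.to_i

  CSV.generate do |csv|
    csv << HEADER
    next if total_ms.zero? # no shots, or no usable durations

    shots.each do |s|
      relative = (s[:begin_ms].to_f / total_ms).round(4)
      csv << [s[:id], s[:begin_ms], s[:duration_ms], relative]
    end
  end
end

puts shots_to_csv(SAMPLE_XML)
```

To write a file as the header comment describes, the returned string could be passed to `File.write("results.csv", ...)`.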
@jnv
jnv / effective_tld_names.dat
Created December 23, 2013 22:01
Public Suffix List minus the private part
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.
// ===BEGIN ICANN DOMAINS===
// ac : http://en.wikipedia.org/wiki/.ac
ac
com.ac
edu.ac
@jnv
jnv / 11a-dsl.ipynb
Last active December 29, 2015 23:59
library(tm)
dirname <- "episodes"
rawCorpus <- Corpus(DirSource(dirname, recursive=TRUE), readerControl=list(language="en"))
my.corpus <- rawCorpus
my.stopwords <- c(stopwords("english"),"ain't","just","can","get","got","will")
my.stopwords <- rev(my.stopwords) # Hack to apply i'll etc. before i
my.stopwords <- my.stopwords[my.stopwords != "who"] # Not a stopword. Not here.
@jnv
jnv / srt2txt.rb
Last active November 20, 2022 16:11
Converts SRT subtitles to plain text, strips irrelevant parts. Requires gems: srt, sanitize. Created for Doctor Who text analysis project.
#!/usr/bin/env ruby
require "srt"
require "sanitize"
REJECT_LINES = [/Best watched using Open Subtitles MKV Player/,
/Subtitles downloaded from www.OpenSubtitles.org/, /^Subtitles by/,
/www.tvsubtitles.net/, /subtitling@bbc.co.uk/, /addic7ed/, /allsubs.org/,
/www.seriessub.com/, /www.transcripts.subtitle.me.uk/, /~ Bad Wolf Team/,
/^Transcript by/, /^Update by /, /UKsubtitles.ru/
]
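The preview ends at the pattern list, so how the script applies it is not shown. One plausible use, sketched below in plain Ruby without the `srt` and `sanitize` gems, is to strip markup from each subtitle line and drop any line that matches a reject pattern. The helper `clean_lines`, the constant `REJECT_PATTERNS`, and the sample lines are hypothetical; the patterns are a small subset of the list above, with dots escaped so they match literally.

```ruby
# Illustrative subset of reject patterns (assumption: credit lines like
# these should be dropped from the plain-text output).
REJECT_PATTERNS = [
  /^Subtitles by/,
  /www\.OpenSubtitles\.org/,
  /^Transcript by/
].freeze

# Hypothetical helper: removes simple HTML-like tags, trims whitespace,
# then discards empty lines and lines matching any reject pattern.
def clean_lines(lines, reject = REJECT_PATTERNS)
  lines.map { |line| line.gsub(/<[^>]+>/, "").strip }
       .reject { |line| line.empty? || reject.any? { |re| re.match?(line) } }
end

sample = [
  "<i>Hello, Doctor.</i>",
  "Subtitles by someone",
  "Subtitles downloaded from www.OpenSubtitles.org",
  "",
  "Run!"
]

puts clean_lines(sample)
```

The tag-stripping regex here is a simplification; a sanitizing library is the safer choice for real subtitle files with nested or malformed markup.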
@jnv
jnv / dh-ngram-cloud.R
Created November 25, 2013 17:02
N-grams into a word cloud
# Based on https://gist.github.com/josefslerka/4148592
## Load libraries
library(textcat)
library(tau)
library(wordcloud)
library(tm) # provides Corpus() and DirSource()
## Create the corpus
# For texts in Windows encoding, use encoding="cp1250"
mujKorpus <- Corpus(DirSource("klaus", encoding="UTF-8"), readerControl = list(language = "cz"))