Jonas Lekevicius lekevicius

View keybase.md

Keybase proof

I hereby claim:

  • I am lekevicius on github.
  • I am lekevicius (https://keybase.io/lekevicius) on keybase.
  • I have a public key whose fingerprint is 8CD7 B7E7 19FD 0DB3 0604 785E 2F1F B4F3 6100 6889

To claim this, I am signing this object:

@lekevicius
lekevicius / data_to_csv.rb
Created Mar 14, 2013
Converts the text feed to CSV.
require 'json'
require 'csv'
data_txt = File.read('data_itf.txt')
words = ['sir', 'thank', 'love', 'god', 'life', 'night', 'shit', 'boy', 'girl', 'fuck', 'car', 'money', 'father', 'mother', 'hell', 'son', 'kill', 'dead', 'call', 'friend', 'stay', 'leave', 'baby', 'home', 'world']
results = {}
words.each do |word|
  years = {}
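The preview stops before any CSV is written. As a rough Python sketch of that last step, assuming `results` ends up mapping each word to a hash of year to count (the preview only shows the two empty hashes, so this shape is a guess) and using `results_to_csv` as a hypothetical helper name:

```python
import csv
import io


def results_to_csv(results):
    """Render {word: {year: count}} as CSV text, one row per word."""
    # Collect every year seen for any word so all rows share the same columns.
    years = sorted({year for per_year in results.values() for year in per_year})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["word"] + years)
    for word, per_year in results.items():
        # Assumption: a missing year means the word was not counted, so use 0.
        writer.writerow([word] + [per_year.get(year, 0) for year in years])
    return buffer.getvalue()


print(results_to_csv({"love": {1962: 3, 1963: 5}, "sir": {1963: 1}}))
```

One row per word with one column per year keeps the output easy to chart; the real script may lay the table out differently.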
lekevicius / corpus_analysis.py
Created Mar 14, 2013
Output word frequencies.
from pattern.vector import *
import glob
from string import *
import operator
import json
words = ['sir', 'thank', 'love', 'god', 'life', 'night', 'shit', 'boy', 'girl', 'fuck', 'car', 'money', 'father', 'mother', 'hell', 'son', 'kill', 'dead', 'call', 'friend', 'stay', 'leave', 'baby', 'home', 'world']
documents = []
corpus = Corpus.load('years.corpus')
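The preview ends right after the corpus is loaded, so the counting itself is not shown. A minimal sketch of per-word relative frequency, using plain `collections.Counter` rather than `pattern.vector` (a swapped-in technique, not the gist's method); `word_frequencies` is a hypothetical name:

```python
import re
from collections import Counter


def word_frequencies(text, words):
    """Return each tracked word's share of all alphabetic tokens in text."""
    # Assumption: lowercase the text and treat runs of a-z as tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {word: (counts[word] / total if total else 0.0) for word in words}


freqs = word_frequencies("Love, love me. Sir!", ["love", "sir", "god"])
print(freqs)
```

Dividing by the token total makes years with different amounts of subtitle text comparable, which raw counts would not be.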
lekevicius / create_corpus.py
Created Mar 14, 2013
Creates corpus for all movie years.
from pattern.vector import *
import glob
from string import *
import operator
documents = []
def create_document(script, name):
    document = Document(script,
        filter = lambda w: w.isalpha() and len(w) > 1,
lekevicius / clean_merge.rb
Created Mar 14, 2013
Clean subtitles and merge by year.
drop_words = ['subtitle', 'cd1', 'cd2', 'kbps', 'transc', 'subed', 'distrib', 'synched', '.com']
(1962..2012).each do |year|
  year_string = ""
  Dir.glob("Scripts/#{ year }-*") do |file|
    contents = File.read(file)
    year_string += contents
    year_string += "\n\n"
  end
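The preview shows the per-year merge but not how `drop_words` is applied. A Python sketch of both steps, assuming that any line containing a drop word is discarded (a guess, since the cleaning code is cut off); `merge_year` and `drop_credit_lines` are hypothetical names:

```python
from pathlib import Path

DROP_WORDS = ['subtitle', 'cd1', 'cd2', 'kbps', 'transc', 'subed',
              'distrib', 'synched', '.com']


def merge_year(scripts_dir, year):
    """Concatenate every file named '<year>-*' in scripts_dir."""
    merged = ""
    for path in sorted(Path(scripts_dir).glob(f"{year}-*")):
        # Mirror the Ruby loop: file contents followed by a blank line.
        merged += path.read_text(encoding="utf-8", errors="replace")
        merged += "\n\n"
    return merged


def drop_credit_lines(text, drop_words=DROP_WORDS):
    """Remove lines that mention any drop word (case-insensitive)."""
    kept = [line for line in text.splitlines()
            if not any(word in line.lower() for word in drop_words)]
    return "\n".join(kept)
```

Calling `drop_credit_lines(merge_year("Scripts", 1962))` would give the cleaned text for one year. Filtering whole lines is coarse and can drop dialogue that happens to contain a drop word.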
lekevicius / download_subtitles.rb
Created Mar 14, 2013
Downloads subtitles from OpenSubtitles by ID and cleans them up a little.
# encoding: UTF-8
require 'iconv'
require 'json'
require 'nokogiri'
# getsub -s i -l eng -t srt tt0055928
def download_movie movie_data
lekevicius / top_imdb_movies.coffee
Created Mar 14, 2013
Crawls IMDb to find the most popular movies of the last 50 years, with their IDs and genres. Run with CasperJS.
listTopMovies = (year) ->
  url = "http://www.imdb.com/year/#{ year }/"
  casper.then ->
    casper.open(url).then ->
      # console.log '# ' + url
      console.log ''
      console.log "========= #{ year } ========="
      # console.log url
      pageMovies = @evaluate ->
        movies = []
lekevicius / colors.coffee
Created Feb 23, 2013
HSL and HSV conversion to RGB in CoffeeScript
hue2rgb = (p, q, t) ->
  t += 1 if t < 0
  t -= 1 if t > 1
  return p + (q - p) * 6 * t if t < 1/6
  return q if t < 1/2
  return p + (q - p) * (2/3 - t) * 6 if t < 2/3
  return p
hslToRgb = (h, s, l) ->
  if s is 0
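The preview cuts off inside `hslToRgb`. For reference, a Python sketch of the standard HSL to RGB conversion built on the same `hue2rgb` helper; the body of `hsl_to_rgb` follows the common textbook formulation and is an assumption about what the rest of the gist does, as are the 0..1 input ranges and the 0..255 integer output:

```python
def hue_to_rgb(p, q, t):
    """Port of the hue2rgb helper shown in the preview."""
    if t < 0:
        t += 1
    if t > 1:
        t -= 1
    if t < 1 / 6:
        return p + (q - p) * 6 * t
    if t < 1 / 2:
        return q
    if t < 2 / 3:
        return p + (q - p) * (2 / 3 - t) * 6
    return p


def hsl_to_rgb(h, s, l):
    """Convert h, s, l in [0, 1] to an (r, g, b) tuple of 0..255 integers."""
    if s == 0:
        # No saturation: every channel equals the lightness (a shade of gray).
        r = g = b = l
    else:
        q = l * (1 + s) if l < 0.5 else l + s - l * s
        p = 2 * l - q
        r = hue_to_rgb(p, q, h + 1 / 3)
        g = hue_to_rgb(p, q, h)
        b = hue_to_rgb(p, q, h - 1 / 3)
    return tuple(round(channel * 255) for channel in (r, g, b))


print(hsl_to_rgb(0, 1, 0.5))  # pure red
```

Python's built-in `colorsys.hls_to_rgb` offers a similar conversion (note its h, l, s argument order) and can serve as a cross-check.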
lekevicius / iconMonster.coffee
Created Jan 6, 2013
Download all IconMonstr icons (previous website version; might need modification to work on new design).
loadPage = (url) ->
  casper.open(url).then ->
    pageIcons = @evaluate ->
      links = []
      $('a.thumbnail_link').each -> links.push $(@).attr('href')
      links
    # console.log pageIcons
    allIcons.push icon for icon in pageIcons
    hasNextPage = @evaluate -> $('.navigation a.next').length
    # console.log hasNextPage