Dmitry Chaplinsky dchaplinsky

class Solution1(object):
    def straight(self, rest, pile1, pile2):
        self.call_cnt += 1
        if not rest:
            return (abs(sum(pile1) - sum(pile2)), pile1, pile2)
        if sum(pile1) > self.upper_bound or sum(pile2) > self.upper_bound:
            return [1E100]
dchaplinsky / exception
Last active August 29, 2015 14:07 — forked from z4y4ts/exception
Traceback (most recent call last):
  File "metrics.py", line 32, in <module>
    test()
  File "metrics.py", line 27, in test
    distance_type='jaccard')
  File "/Users/ai/.virtualenvs/unshred/lib/python2.7/site-packages/mongoengine/queryset/base.py", line 201, in create
    return self._document(**kwargs).save()
  File "/Users/ai/.virtualenvs/unshred/lib/python2.7/site-packages/mongoengine/document.py", line 241, in save
    object_id = collection.save(doc, **write_concern)
  File "/Users/ai/.virtualenvs/unshred/lib/python2.7/site-packages/pymongo/collection.py", line 266, in save
import MySQLdb
import json
import html2text
con = MySQLdb.connect('localhost', 'root', '', 'nashig', use_unicode=True)
with con:
    h = html2text.HTML2Text()
    h.ignore_links = True
<span class="ner-popup" data-slots="{{ data.slots|tojson }}" data-refs="{{ data.refs|tojson }}" title="{{ data.type }}">
dchaplinsky / unshred_gtd_stats.js
Created November 11, 2014 22:37
Some interesting stats that we obtained during the beta test on the GTD dataset.
>> db.tagging_speed.aggregate(
    {
        $match:
        {
            msec: {$gte: 2 * 1000, $lte: 180 * 1000}
        }
    },
    {
        $group:
        {
dchaplinsky / titleua.py
Last active August 29, 2015 14:10
Simple implementation of Python's title that works well with Ukrainian surnames and names (including compound ones and names with apostrophes)
# -*- coding: utf-8 -*-
from string import capwords
def title(s):
    chunks = s.split()
    chunks = map(lambda x: capwords(x, u"-"), chunks)
    return u" ".join(chunks)
if __name__ == '__main__':
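The preview above is cut off before its usage section, so here is a minimal Python 3 sketch of the same idea, not the gist's own test code. The name `title_ua` and the sample names are assumptions made for illustration. It shows why `capwords` with a hyphen separator behaves better than the built-in `str.title`, which treats an apostrophe as a word boundary.

```python
from string import capwords


def title_ua(s):
    """Capitalize each whitespace-separated chunk, and each hyphen-separated
    part inside a chunk, without treating apostrophes as word boundaries.

    `title_ua` is a hypothetical name; the logic mirrors the gist's `title`.
    """
    # capwords(chunk, "-") splits on "-", capitalizes each part, re-joins with "-".
    return " ".join(capwords(chunk, "-") for chunk in s.split())


if __name__ == "__main__":
    # Compound surname: both halves are capitalized.
    print(title_ua("іван нечуй-левицький"))
    # Name with an apostrophe: compare with the built-in str.title.
    print(title_ua("мар'яна"), "vs", "мар'яна".title())
```

The built-in `str.title` uppercases the letter after the apostrophe, which is wrong for names such as the one above; splitting only on whitespace and hyphens avoids that.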
INFO 2014-12-21 17:20:27 simplifying tags: looking for tag spellings
INFO 2014-12-21 17:20:41 simplifying tags: looking for spelling duplicates (skip_space_ambiguity: True)
DEBUG 2014-12-21 17:20:41 313 duplicate tags will be removed
INFO 2014-12-21 17:20:41 simplifying tags: fixing
INFO 2014-12-21 17:20:46 inlining lexeme derivational rules...
INFO 2014-12-21 17:20:47 building paradigms...
DEBUG 2014-12-21 17:20:47 word len(gramtab) len(words) len(paradigms)
DEBUG 2014-12-21 17:20:47 пообклеювати 15 15 1
DEBUG 2014-12-21 17:20:50 кричавши 1186 133287 667
DEBUG 2014-12-21 17:20:51 димувати 1417 269033 985
INFO 2014-12-22 12:55:53 simplifying tags: looking for tag spellings
INFO 2014-12-22 12:56:06 simplifying tags: looking for spelling duplicates (skip_space_ambiguity: True)
DEBUG 2014-12-22 12:56:06 290 duplicate tags will be removed
INFO 2014-12-22 12:56:06 simplifying tags: fixing
INFO 2014-12-22 12:56:11 inlining lexeme derivational rules...
INFO 2014-12-22 12:56:12 building paradigms...
DEBUG 2014-12-22 12:56:12 word len(gramtab) len(words) len(paradigms)
DEBUG 2014-12-22 12:56:12 пообклеювати 15 15 1
DEBUG 2014-12-22 12:56:15 кричавши 1042 133287 667
DEBUG 2014-12-22 12:56:16 димувати 1249 269033 985
import re
import os.path
import requests
from random import sample, random
from collections import Counter
from pymongo import MongoClient
from glob2 import glob
client = MongoClient()
db = client.decl