Ivan Puzyrevskiy sandello

@sandello
sandello / Makefile
Created September 17, 2012 16:44
Snippets for EMA-course in Yandex Data Analysis School
STXXL_ROOT ?= ../stxxl
STXXL_CONFIG ?= stxxl_boost.mk
include $(STXXL_ROOT)/$(STXXL_CONFIG)
CXX = $(STXXL_CXX)
CPPFLAGS += $(STXXL_CPPFLAGS)
LDLIBS += $(STXXL_LDLIBS)
CPPFLAGS += -O3 -Wall -g
sandello / gist:2351138
Created April 10, 2012 12:49
Earley algorithm for NLP-course in Yandex Data Analysis School
#!/usr/bin/python
################################################################################
# * 10.04 - Fixed QTree printing. Thanks to Igor Shalyminov.
# * 10.04 - Implemented proper backtracking and forest restoration. Thanks to Pavel Sergeev.
# * 20.03 - Initial version.
################################################################################
# GLOSSARY
################################################################################
# * Term
# Either terminal or non-terminal symbol.
sandello / sifp.cpp
Created February 29, 2012 23:47
Seemingly Impossible Functional Program in C++
// Ivan Pouzyrevsky, EWSCS'12.
#include <iostream>
#include <functional>
#include <memory>
#include <utility>
// This is a C++ implementation of the Seemingly Impossible Functional Program, which tests in finite time whether
// a computable predicate on binary strings yields True for some binary string. It does so by exhaustive search
// over the whole space. Some nasty tricks like call-by-need can make this algorithm run fast.
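The idea in the comment above can be illustrated with a minimal Python sketch. It is not the gist's C++ code, whose body is not shown here. It assumes a binary string is modelled as a function from an index to a bit, and the names `cons`, `find`, `forsome` and `forevery` are chosen for this illustration only. A small cache stands in for call-by-need.

```python
def cons(bit, seq):
    """Prepend one bit to a sequence (a function from index to bit)."""
    return lambda n: bit if n == 0 else seq(n - 1)


def find(p):
    """Return a lazy sequence that satisfies p whenever some sequence does."""
    cache = {}  # filled on first access; a stand-in for call-by-need

    def force():
        if not cache:
            # Try 0 as the first bit; fall back to 1 if no witness starts with 0.
            bit = 0 if forsome(lambda a: p(cons(0, a))) else 1
            cache["bit"] = bit
            cache["tail"] = find(lambda a: p(cons(bit, a)))
        return cache["bit"], cache["tail"]

    def seq(n):
        bit, tail = force()
        return bit if n == 0 else tail(n - 1)

    return seq


def forsome(p):
    """True if p holds for at least one sequence."""
    return p(find(p))


def forevery(p):
    """True if p holds for every sequence."""
    return not forsome(lambda a: not p(a))
```

For example, `forsome(lambda a: a(0) == 1 and a(2) == 1)` evaluates to `True`, and `forsome(lambda a: a(1) != a(1))` evaluates to `False`. The search terminates because a total predicate can only inspect finitely many positions, although the running time grows quickly with the largest index inspected.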
sandello / mediawiki_extract_pages_to_files.py
Created September 10, 2010 23:15
Snippets for IR-course in Yandex Data Analysis School
#!/usr/bin/python
# For Yandex Data Analysis School
"""Takes MediaWiki XML dump and extracts pages to separate files."""
SUBDIRECTORY_SPREAD = 512
import sys
import os
import os.path
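
The preview stops at the imports, so the rest of the script is not shown. A rough Python 3 sketch of what the docstring describes might look like the following. It is an illustration under assumptions, not the author's code: the function `extract_pages`, the helper `local_name`, the output file naming, and the use of `SUBDIRECTORY_SPREAD` as the number of bucket directories are all hypothetical.

```python
import os
import zlib
import xml.etree.ElementTree as ET

SUBDIRECTORY_SPREAD = 512  # assumed here to be the number of bucket directories


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_pages(source, out_dir, spread=SUBDIRECTORY_SPREAD):
    """Write every <page> of a MediaWiki XML dump to its own file.

    `source` is a file name or a binary file object. Returns the page count.
    """
    written = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        if title is not None:
            # Hypothetical layout: spread files over `spread` directories
            # using a stable hash of the page title.
            bucket = zlib.crc32(title.encode("utf-8")) % spread
            bucket_dir = os.path.join(out_dir, "%03d" % bucket)
            os.makedirs(bucket_dir, exist_ok=True)
            path = os.path.join(bucket_dir, "%d.txt" % written)
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(title + "\n" + text)
            written += 1
        elem.clear()  # drop the page's children to limit memory use
    return written
```

Streaming with `iterparse` avoids loading the whole dump into memory, and spreading files across subdirectories keeps any single directory from growing too large.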