Sam Zhang samzhang111

View suffix_tree.py, and suffix_tree_test.py
from collections import defaultdict

class SuffixTree(object):
    def __init__(self, key=None):
        self.key = key
        self.dict = {
            'count': 0,
            'children': {}
        }
View gist:c79767172759b5e3b918
Please tag commit messages with one of the following:
API: an (incompatible) API change
BLD: change related to building numpy
BUG: bug fix
DEP: deprecate something, or remove a deprecated object
DEV: development tool or utility
DOC: documentation
ENH: enhancement
MAINT: maintenance commit (refactoring, typos, etc.)
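A tag convention like the one above can be checked mechanically. The following is a minimal sketch, not part of the original gist: the function name `has_valid_tag` and the exact pattern (tag, colon, one space, then text) are assumptions made for illustration, and only the tags visible in the list above are included, so the set may be incomplete.

```python
import re

# Tags copied from the list above. The list shown may be partial,
# so extend this tuple if the project defines more tags.
COMMIT_TAGS = ("API", "BLD", "BUG", "DEP", "DEV", "DOC", "ENH", "MAINT")

# Assumed format: "<TAG>: <summary>" on the first line of the message.
_TAG_RE = re.compile(r"^(%s): \S" % "|".join(COMMIT_TAGS))


def has_valid_tag(message):
    """Return True if the first line starts with a known tag, a colon,
    a single space, and at least one non-space character."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    return _TAG_RE.match(first_line) is not None
```

A check like this could be called from a `commit-msg` hook, passing in the contents of the commit message file and rejecting the commit when it returns `False`.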
@samzhang111
samzhang111 / nutch-site.xml
Created Jan 10, 2015
Deduplication with Nutch
View nutch-site.xml
<property>
  <name>db.signature.class</name>
  <value>org.apache.nutch.crawl.TextProfileSignature</value>
  <description>The default implementation of a page signature. Signatures
  created with this implementation will be used for duplicate detection
  and removal.</description>
</property>
<property>
  <name>db.signature.text_profile.min_token_len</name>
@samzhang111
samzhang111 / dbus.sh
Last active Aug 29, 2015
Dbus Buildpack for Heroku
View dbus.sh
#!/bin/sh
# forked from https://gist.github.com/ddollar/07d579a6621b3ddd7b6b/
# capture root dir
root=$(pwd)
# change into subdir of archive (quote $root so paths with spaces work)
cd "$root"/dbus-*
View _.md