@adayone
adayone / ranksvm
Created April 3, 2014 06:22
ranksvm
<blockquote>
<p>Written by haoyuan hu</p>
</blockquote>
<h1 id="ranksvm">ranksvm</h1>
<ul>
<li>@author: 本华 (Benhua)</li>
<li>@mail: haoyuan.huhy@tmall.com</li>
</ul>
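The gist body is not shown here, but RankSVM's core idea is a reduction from ranking to binary classification: every pair of items with different relevance labels becomes one difference-vector example. A minimal sketch of that pairwise transform (function name and toy data are illustrative, not from the gist):

```python
import numpy as np

def pairwise_transform(X, y):
    """RankSVM reduction: each pair (i, j) with y[i] != y[j] becomes one
    binary example, the difference vector X[i] - X[j] labeled
    sign(y[i] - y[j]).  A linear SVM trained on these pairs then gives a
    ranking function via its weight vector."""
    Xp, yp = [], []
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if y[i] == y[j]:
                continue  # equal relevance gives no preference
            Xp.append(X[i] - X[j])
            yp.append(np.sign(y[i] - y[j]))
    return np.asarray(Xp), np.asarray(yp)

X = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
y = np.array([2, 1, 3])  # relevance scores
Xp, yp = pairwise_transform(X, y)
```

Any linear binary classifier (e.g. a hinge-loss SVM) can then be trained on `(Xp, yp)`.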
@langmore
langmore / gist:6820351
Created October 4, 2013 03:04
Using gensim, pandas, and some helper classes to analyze data
{
"metadata": {
"name": "filter_with_meta_ian"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
@bigsnarfdude
bigsnarfdude / gist:b8999f1ac7da3dbd7d18
Created November 22, 2014 21:14
Algebird BloomFilter Add example
import com.twitter.algebird._

val NUM_HASHES = 6
val WIDTH = 32
val SEED = 1
val bfMonoid1 = new BloomFilterMonoid(NUM_HASHES, WIDTH, SEED)
val bfMonoid2 = new BloomFilterMonoid(NUM_HASHES, WIDTH, SEED)
val bf1 = bfMonoid1.create("1", "2", "3", "4", "100")
val bf2 = bfMonoid2.create("11", "22", "33", "44", "1001")
val approxBool1 = bf1.contains("1")   // ApproximateBoolean: may be a false positive, never a false negative
val approxBool2 = bf2.contains("22")
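The same add/contains semantics can be sketched in plain Python (a toy standalone Bloom filter, not Algebird's implementation; the hashing scheme here is an assumption for illustration):

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter mirroring the Algebird example: k hash functions
    over a bit array of the given width.  contains() may report a false
    positive, but never a false negative."""
    def __init__(self, num_hashes=6, width=32, seed=1):
        self.k, self.m, self.seed = num_hashes, width, seed
        self.bits = 0  # bit array packed into one int

    def _positions(self, item):
        # Derive k bit positions from k independent keyed hashes.
        for i in range(self.k):
            h = hashlib.sha256(f"{self.seed}:{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, *items):
        for item in items:
            for pos in self._positions(item):
                self.bits |= 1 << pos
        return self

    def contains(self, item):
        # True only if every one of the item's k bits is set.
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf1 = BloomFilter().add("1", "2", "3", "4", "100")
bf2 = BloomFilter().add("11", "22", "33", "44", "1001")
```

Note that a 32-bit width, as in the Scala snippet, is far too small for real use; it is only convenient for a demo.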
@mblondel
mblondel / logsum.py
Created April 13, 2010 06:39
Logarithm of a sum without underflow
import numpy as np

def _logsum(logx, logy):
    """
    Return log(x+y), avoiding arithmetic underflow/overflow.

    logx: log(x)
    logy: log(y)

    Rationale: factor out the larger term, so the exponential is only
    ever taken of a non-positive number and cannot overflow.
    """
    if logx > logy:
        logx, logy = logy, logx
    return logy + np.log(1.0 + np.exp(logx - logy))
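NumPy ships this exact trick as `np.logaddexp`; a quick demonstration of why the log-space form matters:

```python
import numpy as np

# Both inputs are so small that exp() underflows to 0.0, so the naive
# formula collapses to log(0) = -inf; the log-space version stays exact.
a = b = -1000.0
naive = np.log(np.exp(a) + np.exp(b))  # underflows to -inf
safe = np.logaddexp(a, b)              # exactly -1000 + log(2)
```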
@kevindavenport
kevindavenport / Regularized_Logistic_Regression_Intuition.ipynb
Last active February 18, 2019 01:11
Regularized_Logistic_Regression_Intuition.ipynb
@waleking
waleking / SparkGibbsLDA.scala
Last active January 31, 2020 11:15
We implement Gibbs sampling for LDA in Spark. This version performs much better than the alpha version, and can now handle 3196204 words, 100 topics, and 1000 sampling iterations on the server in 161.7 minutes. To fix the long-running collect() step in the alpha version, we use the cache() method, as at line 261 and line 262. We also solve a pile o…
package topic
import spark.broadcast._
import spark.SparkContext
import spark.SparkContext._
import spark.RDD
import spark.storage.StorageLevel
import scala.util.Random
import scala.math.{ sqrt, log, pow, abs, exp, min, max }
import scala.collection.mutable.HashMap
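The body of the Scala implementation is not shown, but the per-token update of collapsed Gibbs sampling for LDA can be sketched in Python (all names and the toy corpus below are illustrative, not from the gist):

```python
import numpy as np

def gibbs_sweep(docs, assignments, doc_topic, topic_word, topic_sum,
                alpha=0.1, beta=0.01, rng=None):
    """One collapsed-Gibbs sweep: resample every token's topic from its
    full conditional while keeping the count matrices consistent."""
    rng = rng or np.random.default_rng(0)
    K, V = topic_word.shape
    for d, words in enumerate(docs):
        for n, w in enumerate(words):
            k = assignments[d][n]
            # Take the current token out of the counts.
            doc_topic[d, k] -= 1; topic_word[k, w] -= 1; topic_sum[k] -= 1
            # p(z=k | rest) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta)
            p = (doc_topic[d] + alpha) * (topic_word[:, w] + beta) \
                / (topic_sum + V * beta)
            k = rng.choice(K, p=p / p.sum())
            # Put the token back under its freshly sampled topic.
            assignments[d][n] = k
            doc_topic[d, k] += 1; topic_word[k, w] += 1; topic_sum[k] += 1

# Tiny corpus: word ids over a 4-word vocabulary, 2 topics.
docs = [[0, 1, 2], [1, 2, 3]]
K, V = 2, 4
assignments = [[0, 1, 0], [1, 0, 1]]
doc_topic = np.zeros((len(docs), K))
topic_word = np.zeros((K, V))
topic_sum = np.zeros(K)
for d, words in enumerate(docs):
    for n, w in enumerate(words):
        k = assignments[d][n]
        doc_topic[d, k] += 1; topic_word[k, w] += 1; topic_sum[k] += 1
gibbs_sweep(docs, assignments, doc_topic, topic_word, topic_sum)
```

The Spark version distributes the documents as an RDD and keeps the topic-word counts broadcast, but the conditional distribution being sampled is the same.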
@neubig
neubig / crf.py
Created November 7, 2013 10:59
This is a script to train conditional random fields. It is written to minimize the number of lines of code, with no regard for efficiency.
#!/usr/bin/python
# crf.py (by Graham Neubig)
# This script trains conditional random fields (CRFs)
# stdin: A corpus of WORD_POS WORD_POS WORD_POS sentences
# stdout: Feature vectors for emission and transition properties
from collections import defaultdict
from math import log, exp
import sys
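Only the header of crf.py is shown; the feature extraction it describes (emission and transition counts from `WORD_POS` sentences) can be sketched like this (function name and feature-string format are assumptions, not the script's own):

```python
from collections import defaultdict

def count_features(sentence):
    """Count emission (tag -> word) and transition (tag -> tag) features
    from one 'WORD_POS WORD_POS ...' line, with sentence boundaries."""
    phi = defaultdict(int)
    prev = "<S>"
    for token in sentence.split():
        word, tag = token.rsplit("_", 1)  # rsplit keeps underscores in words
        phi[f"E,{tag},{word}"] += 1       # emission feature
        phi[f"T,{prev},{tag}"] += 1       # transition feature
        prev = tag
    phi[f"T,{prev},</S>"] += 1            # transition into end-of-sentence
    return phi

phi = count_features("the_DT dog_NN barks_VB")
```

Training then adjusts one weight per feature string to maximize conditional likelihood over the corpus.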
@devdazed
devdazed / lp_counters.py
Created October 11, 2012 16:14
Simple Linear Probabilistic Counters
"""
Simple Linear Probabilistic Counters
Credit for idea goes to:
http://highscalability.com/blog/2012/4/5/big-data-counting-how-to-count-a-billion-distinct-objects-us.html
http://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/
Installation:
pip install smhasher
pip install bitarray
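The idea behind linear probabilistic counting: hash each item to one of m bits, then estimate the distinct count from the fraction of bits still unset, as -m * ln(unset/m). A stdlib-only sketch (the gist itself uses smhasher and bitarray; this toy version is not its code):

```python
import hashlib
from math import log

class LinearCounter:
    """Toy linear probabilistic counter.  Each distinct item sets one of
    m bits; the count estimate is -m * ln(fraction of unset bits)."""
    def __init__(self, m=4096):
        self.m = m
        self.bits = [0] * m

    def add(self, item):
        h = int.from_bytes(
            hashlib.sha256(str(item).encode()).digest()[:8], "big")
        self.bits[h % self.m] = 1  # duplicates hit the same bit

    def count(self):
        unset = self.bits.count(0)
        if unset == 0:
            return float("inf")  # counter saturated; m was too small
        return -self.m * log(unset / self.m)

c = LinearCounter()
for i in range(100):
    c.add(i)
est = c.count()
```

With m much larger than the cardinality the estimate is close to exact, and adding duplicates leaves it unchanged, which is the property that makes these counters mergeable across shards.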
@danijar
danijar / blog_tensorflow_sequence_classification.py
Last active December 24, 2021 03:53
TensorFlow Sequence Classification
# Example for my blog post at:
# https://danijar.com/introduction-to-recurrent-networks-in-tensorflow/
import functools
import sets
import tensorflow as tf

def lazy_property(function):
    attribute = '_' + function.__name__
    @functools.wraps(function)
    def wrapper(self):
        if not hasattr(self, attribute):
            setattr(self, attribute, function(self))
        return getattr(self, attribute)
    return property(wrapper)
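The decorator the snippet begins is the standard memoizing-property pattern: compute on first access, cache on the instance, reuse afterwards. A self-contained usage example (the pattern restated in full so it runs standalone; the `Model` class is hypothetical):

```python
import functools

def lazy_property(function):
    # Cache the result of `function` on first access, then reuse it.
    attribute = '_' + function.__name__
    @functools.wraps(function)
    def wrapper(self):
        if not hasattr(self, attribute):
            setattr(self, attribute, function(self))
        return getattr(self, attribute)
    return property(wrapper)

class Model:
    calls = 0
    @lazy_property
    def prediction(self):
        Model.calls += 1  # body runs once; later accesses hit the cache
        return 42

m = Model()
```

In the blog's TensorFlow code this ensures each graph-building method (prediction, loss, optimize) adds its ops to the graph exactly once, no matter how often it is accessed.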