@dound
dound / postgresql_upsert.py
Created January 10, 2011 00:19
Python implementation of UPSERT for use with PostgreSQL.
def upsert(db_cur, table, pk_fields, schema=None, **kwargs):
    """Updates the specified relation with the key-value pairs in kwargs if a
    row matching the primary key value(s) already exists. Otherwise, a new row
    is inserted. Returns True if a new row was inserted.

    schema     the schema to use, if any (not sanitized)
    table      the table to use (not sanitized)
    pk_fields  tuple of field names which are part of the primary key
    kwargs     all key-value pairs which should be set in the row
    """
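The listing only shows the function's opening. This gist predates native upserts; on PostgreSQL 9.5+ the same helper can be built on `INSERT ... ON CONFLICT`. A minimal sketch under that assumption (splitting out the SQL construction is my addition, not part of the original gist, and names are illustrative):

```python
def build_upsert_sql(table, pk_fields, schema=None, **kwargs):
    """Build an INSERT ... ON CONFLICT statement plus its parameter list.
    As in the original gist, table/schema/field names are NOT sanitized."""
    relation = '{}.{}'.format(schema, table) if schema else table
    fields = sorted(kwargs)  # deterministic column order
    non_pk = [f for f in fields if f not in pk_fields]
    if non_pk:
        action = 'DO UPDATE SET ' + ', '.join(
            '{0} = EXCLUDED.{0}'.format(f) for f in non_pk)
    else:
        action = 'DO NOTHING'  # nothing to update beyond the key itself
    sql = 'INSERT INTO {} ({}) VALUES ({}) ON CONFLICT ({}) {}'.format(
        relation, ', '.join(fields), ', '.join(['%s'] * len(fields)),
        ', '.join(pk_fields), action)
    return sql, [kwargs[f] for f in fields]


def upsert(db_cur, table, pk_fields, schema=None, **kwargs):
    """Insert or update a single row via the given DB-API cursor."""
    sql, params = build_upsert_sql(table, pk_fields, schema, **kwargs)
    db_cur.execute(sql, params)
```

Unlike the 2011 original's update-then-insert dance, the `ON CONFLICT` form is atomic and race-free.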
@fabianp
fabianp / ranking.py
Last active February 1, 2024 10:02
Pairwise ranking using scikit-learn LinearSVC
"""
Implementation of pairwise ranking using scikit-learn LinearSVC

Reference:
    "Large Margin Rank Boundaries for Ordinal Regression", R. Herbrich,
    T. Graepel, K. Obermayer 1999

    "Learning to rank from medical imaging data." Pedregosa, Fabian, et al.,
    Machine Learning in Medical Imaging 2012.
"""
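The key idea in this gist is the pairwise transform: turn a ranking problem into binary classification on difference vectors. A minimal numpy sketch of that transform, independent of scikit-learn (the function name is mine):

```python
import itertools
import numpy as np

def pairwise_transform(X, y):
    """For every pair of samples with different relevance labels, emit the
    difference vector X[i] - X[j] labeled +1 if y[i] > y[j], else -1.
    A linear classifier (e.g. LinearSVC) trained on these pairs learns a
    weight vector whose dot product with X ranks the original samples."""
    X_new, y_new = [], []
    for i, j in itertools.combinations(range(len(y)), 2):
        if y[i] == y[j]:
            continue  # equal labels carry no ranking information
        X_new.append(X[i] - X[j])
        y_new.append(np.sign(y[i] - y[j]))
    return np.asarray(X_new), np.asarray(y_new)
```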
@agramfort
agramfort / ranking.py
Created March 18, 2012 13:10 — forked from fabianp/ranking.py
Pairwise ranking using scikit-learn LinearSVC
"""
Implementation of pairwise ranking using scikit-learn LinearSVC
Reference: "Large Margin Rank Boundaries for Ordinal Regression", R. Herbrich,
T. Graepel, K. Obermayer.
Authors: Fabian Pedregosa <fabian@fseoane.net>
Alexandre Gramfort <alexandre.gramfort@inria.fr>
"""
@jeetsukumaran
jeetsukumaran / pymc_multinomial_propoptions.py
Created May 31, 2012 01:09
Using PyMC to Estimate the Proportions of a Multinomial Distribution
#! /usr/bin/env python
import sys
import random
import pymc
import numpy
from dendropy.mathlib import probability as prob
from dendropy.mathlib import statistics as stats
rng = random.Random()
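For this simple model the quantity the gist's MCMC run estimates also has a closed form: a Dirichlet prior on the proportions is conjugate to the multinomial likelihood. A numpy sketch of the exact posterior mean the sampler approximates (function name is mine):

```python
import numpy as np

def dirichlet_posterior_mean(counts, alpha=1.0):
    """Posterior mean of multinomial proportions under a symmetric
    Dirichlet(alpha) prior: (counts + alpha) / sum(counts + alpha).
    A long enough PyMC chain for this model converges to this value."""
    counts = np.asarray(counts, dtype=float)
    post = counts + alpha
    return post / post.sum()
```

Sampling still earns its keep once the model grows (hierarchies, missing data); for the plain multinomial, the conjugate update is the whole answer.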
#!/usr/bin/env python
import json
import urllib

def estimated_count_for(search_term):
    url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % urllib.urlencode({'q': search_term})
    results = json.loads(urllib.urlopen(url).read())
    try:
        return results['responseData']['cursor']['estimatedResultCount']
    except KeyError:
        return None
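The snippet above is Python 2 (`urllib.urlencode`, `urllib.urlopen`) and targets the long-deprecated Google AJAX Search API. A Python 3 sketch of the same shape, with the URL construction split out so it can be checked without a network call (the endpoint is taken from the gist as-is):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&'

def build_search_url(search_term):
    """Build the request URL; percent-encodes the query term."""
    return ENDPOINT + urlencode({'q': search_term})

def estimated_count_for(search_term):
    """Fetch the estimated result count, or None if the response lacks it."""
    results = json.loads(urlopen(build_search_url(search_term)).read())
    try:
        return results['responseData']['cursor']['estimatedResultCount']
    except (KeyError, TypeError):
        return None
```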
@kevin-smets
kevin-smets / iterm2-solarized.md
Last active July 31, 2024 06:33
iTerm2 + Oh My Zsh + Solarized color scheme + Source Code Pro Powerline + Font Awesome + [Powerlevel10k] - (macOS)

[Screenshots: the Default and Powerlevel10k prompt themes]

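The gist itself is a step-by-step setup guide; from memory, the core commands reduce to roughly the following (a sketch, not copied from the gist -- the gist remains the authoritative source for the full font and color-scheme steps):

```shell
# Install Oh My Zsh via its official installer
sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"

# Clone Powerlevel10k into Oh My Zsh's custom themes directory
git clone --depth=1 https://github.com/romkatv/powerlevel10k.git \
  "${ZSH_CUSTOM:-$HOME/.oh-my-zsh/custom}/themes/powerlevel10k"

# Then set ZSH_THEME="powerlevel10k/powerlevel10k" in ~/.zshrc and restart zsh
```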
@sshopov
sshopov / run_in_new_thread_decorator.py
Last active February 24, 2018 21:34
A handy decorator to run a function in a new thread. It came from https://arcpy.wordpress.com/2013/10/25/using-os-startfile-and-webbrowser-open-in-arcgis-for-desktop/ Tags: arcpy python arcgis
import functools
import threading

# A decorator that will run its wrapped function in a new thread
def run_in_new_thread(function):
    # functools.wraps will copy over the docstring and some other metadata
    # from the original function
    @functools.wraps(function)
    def fn_(*args, **kwargs):
        thread = threading.Thread(target=function, args=args, kwargs=kwargs)
        thread.start()
        return thread  # returned so callers can join() if they need to
    return fn_
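A self-contained usage sketch (repeating a completed version of the decorator so the example runs on its own; returning the thread so the caller can `join()` is my addition):

```python
import functools
import threading

def run_in_new_thread(function):
    @functools.wraps(function)
    def fn_(*args, **kwargs):
        thread = threading.Thread(target=function, args=args, kwargs=kwargs)
        thread.start()
        return thread  # hand back the thread so callers can join()
    return fn_

results = []

@run_in_new_thread
def background_append(value):
    results.append(value)

# The call returns immediately; join() waits for the work to finish.
background_append(42).join()
```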
@erikerlandson
erikerlandson / xval_ALS.scala
Created June 26, 2014 20:42
Demonstrate a function that abstracts cross validation for an MLLib model - in this case org.apache.spark.mllib.recommendation.MatrixFactorizationModel
import java.lang.Math
import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.recommendation.Rating
import org.apache.spark.mllib.recommendation.ALS
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel
import org.apache.spark.mllib.util.MLUtils.kFold
// Preload some Rating data for my own convenience
val txt = sc.textFile("/home/eje/git/ratorade/data/bgr.dat")
val ratings = txt.map(_.split('\t') match { case Array(user, item, rating, _, _) => Rating(user.toInt, item.toInt, rating.toDouble / 100.0)})
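The gist wraps MLlib's `kFold` utility; the underlying mechanics are small enough to sketch in plain Python (this mirrors the idea, not MLlib's RDD-based API, and the function names are mine):

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly partition indices 0..n-1 into k disjoint folds, yielding
    (train, test) index lists where each fold serves once as the test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

def cross_validate(data, k, train_fn, score_fn, seed=0):
    """Average score of models trained on k-1 folds, scored on the held-out fold."""
    scores = []
    for train, test in k_fold_indices(len(data), k, seed):
        model = train_fn([data[i] for i in train])
        scores.append(score_fn(model, [data[i] for i in test]))
    return sum(scores) / len(scores)
```

Abstracting over `train_fn` and `score_fn` is the same move the Scala version makes: the cross-validation loop never needs to know it is training an ALS `MatrixFactorizationModel`.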
@cigrainger
cigrainger / gist:62910e58db46b7397de2
Created July 11, 2014 18:28
Arun et al measure with NPR data
from urllib2 import urlopen
from json import load
import re, nltk
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.corpus import wordnet, stopwords
import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',
                    level=logging.INFO)
from gensim import corpora, models, similarities, matutils
import numpy as np
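The Arun et al. (2010) measure scores a candidate LDA topic count K by the symmetric KL divergence between two K-length distributions: the singular values of the K x V topic-word matrix, and the document-length-weighted column sums of the N x K document-topic matrix. A numpy sketch of that computation (variable names are mine; this follows my reading of the paper, not the gist's exact code):

```python
import numpy as np

def arun_measure(topic_word, doc_topic, doc_lengths):
    """Symmetric KL divergence between (a) the normalized singular values of
    the K x V topic-word matrix and (b) normalized doc-length-weighted topic
    proportions from the N x K doc-topic matrix. Lower values suggest a
    better-supported choice of K (assumes K <= V so both vectors have K entries)."""
    cm1 = np.linalg.svd(np.asarray(topic_word), compute_uv=False)
    cm1 = cm1 / cm1.sum()
    cm2 = np.asarray(doc_lengths, dtype=float) @ np.asarray(doc_topic)
    cm2 = cm2 / cm2.sum()
    return float(np.sum(cm1 * np.log(cm1 / cm2)) +
                 np.sum(cm2 * np.log(cm2 / cm1)))
```

In practice one computes this for a range of K values (the matrices coming from, e.g., gensim's LdaModel) and looks for the minimum.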