cheekybastard

@dbrgn
dbrgn / queryset_generators.py
Created April 1, 2011 08:41
queryset_generator and queryset_list_generator
def queryset_generator(queryset, chunksize=1000):
    """
    Iterate over a Django QuerySet ordered by the primary key
    This method loads a maximum of chunksize (default: 1000) rows in its
    memory at the same time while Django normally would load all rows in its
    memory. Using the iterator() method only causes it to not preload all the
    classes.
    Note that the implementation of the generator does not support ordered query sets.
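The preview above is cut off before the function body, so the following is only a minimal sketch of the chunking idea the docstring describes, not the gist's actual code. The names `iterate_in_chunks`, `fetch_page`, `Row`, and `fetch_from_list` are hypothetical. The sketch pages through rows by primary key, which would also explain why a custom ordering cannot be supported with this approach.

```python
from collections import namedtuple


def iterate_in_chunks(fetch_page, chunksize=1000):
    """Yield rows one by one while holding at most `chunksize` rows at a time.

    `fetch_page(last_pk, limit)` is a hypothetical callable that returns up to
    `limit` rows whose pk is greater than `last_pk` (or the first rows when
    `last_pk` is None), sorted by pk ascending. With Django it could be, as an
    assumption:

        def fetch_page(last_pk, limit):
            qs = queryset.order_by('pk')
            if last_pk is not None:
                qs = qs.filter(pk__gt=last_pk)
            return list(qs[:limit])
    """
    last_pk = None
    while True:
        page = fetch_page(last_pk, chunksize)
        if not page:
            break  # no rows left
        for row in page:
            yield row
        # Remember where this chunk ended so the next one starts after it.
        last_pk = page[-1].pk


# In-memory stand-in for a database table, for demonstration only.
Row = namedtuple("Row", "pk")
ROWS = [Row(pk) for pk in range(1, 8)]


def fetch_from_list(last_pk, limit):
    matching = [r for r in ROWS if last_pk is None or r.pk > last_pk]
    return matching[:limit]


pks = [row.pk for row in iterate_in_chunks(fetch_from_list, chunksize=3)]
```

Because each chunk is located by "pk greater than the last one seen", the rows must be sorted by primary key; any other ordering would make the chunk boundaries meaningless.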
@jonathanmorgan
jonathanmorgan / queryset_iterators.py
Created May 6, 2011 05:40 — forked from dbrgn/queryset_generators.py
queryset_generator and queryset_list_generator
'''
queryset_generator and queryset_list_generator based on:
https://gist.github.com/897894
'''
#===============================================================================
# imports (in alphabetical order by package, then by name)
#===============================================================================
# python standard libraries
@karmi
karmi / elastic_search_ngram_analyzer_for_urls.sh
Created May 24, 2011 15:32
NGram Analyzer in ElasticSearch
# ========================================
# Testing n-gram analysis in ElasticSearch
# ========================================
curl -X DELETE localhost:9200/ngram_test
curl -X PUT localhost:9200/ngram_test -d '
{
  "settings" : {
    "index" : {
      "analysis" : {
@madebyjazz
madebyjazz / gist:1090663
Created July 18, 2011 21:10 — forked from saidimu/gist:1024207
Generating URLs to crawl from outside a Scrapy spider
from scrapy import log
from scrapy.item import Item
from scrapy.http import Request
from scrapy.contrib.spiders import XMLFeedSpider
def NextURL():
    """
    Generate a list of URLs to crawl. You can query a database or come up with some other means
    Note that if you generate URLs to crawl from a scraped URL then you're better off using a
@mikeyk
mikeyk / gist:1329319
Created October 31, 2011 22:56
Testing storage of millions of keys in Redis
#! /usr/bin/env python
import redis
import random
import pylibmc
import sys
r = redis.Redis(host='localhost', port=6389)
mc = pylibmc.Client(['localhost:11222'])
# -*- coding: utf-8 -*-
def load_dataset():
    "Load the sample dataset."
    return [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]
def createC1(dataset):
    "Create a list of candidate item sets of size one."
@mattweber
mattweber / README.txt
Created March 1, 2012 04:09
ElasticSearch Multi-Select Faceting Example
This is an example of how to perform multi-select faceting in ElasticSearch.
Selecting multiple values from the same facet will result in an OR filter between each of the values:
(facet1.value1 OR facet1.value2)
Faceting on more than one facet will result in an AND filter between each facet:
(facet1.value1 OR facet1.value2) AND (facet2.value1)
I have chosen to update the counts for each facet the selected value DOES NOT belong to since we are performing an AND between each facet. I have included an example that shows how to keep the counts if you don't want to do this (filter0.sh).
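The combination rule described above (OR within a facet, AND across facets) can be sketched in a few lines of Python. This is an illustration only, not code from the gist: the function names and the `selections` mapping are hypothetical, and the first function merely builds the boolean expression in the notation used above rather than an actual ElasticSearch query. The second function reflects one reading of the count behaviour described above, namely that a facet's counts are filtered by the selections made in every other facet but not by its own.

```python
def build_filter_expression(selections):
    """Combine selected facet values: OR within a facet, AND across facets.

    `selections` maps a facet name to the list of values selected for it
    (a hypothetical structure chosen for this sketch).
    """
    groups = []
    for facet, values in selections.items():
        if not values:
            continue  # a facet with nothing selected adds no constraint
        terms = [f"{facet}.{value}" for value in values]
        groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


def selections_for_facet_counts(selections, facet):
    """Return the selections to apply when counting `facet`.

    Assumption: counts for a facet are restricted by the other facets'
    selections only, so its own selections are left out.
    """
    return {name: values for name, values in selections.items() if name != facet}


expression = build_filter_expression(
    {"facet1": ["value1", "value2"], "facet2": ["value1"]}
)
```

With the selections shown, `expression` matches the second example above, and counting `facet1` would apply only the `facet2` selection.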
@mlissner
mlissner / queryset_generators.py
Created March 10, 2012 04:13 — forked from dbrgn/queryset_generators.py
Adds a date-based queryset generator
from datetime import datetime
from datetime import timedelta
def queryset_generator(queryset, chunksize=1000):
    """
    Iterate over a Django QuerySet ordered by the primary key
    This method loads a maximum of chunksize (default: 1000) rows in its
    memory at the same time while Django normally would load all rows in its
    memory. Using the iterator() method only causes it to not preload all the
@thanos
thanos / kombu_example.py
Created June 14, 2012 23:15
A simple example of a kombu fanout exchange using python generators and coroutines
from kombu import Exchange
from kombu import Queue
from kombu import BrokerConnection
class ProduceConsume(object):
    def __init__(self, exchange_name, **options):
        exchange = Exchange(exchange_name, type='fanout', durable=False)
        queue_name = options.get('queue', exchange_name + '_queue')
        self.queue = Queue(queue_name, exchange)
@catawbasam
catawbasam / pandas_dbms.py
Last active May 26, 2024 05:32
Python PANDAS : load and save Dataframes to sqlite, MySQL, Oracle, Postgres
# -*- coding: utf-8 -*-
"""
LICENSE: BSD (same as pandas)
example use of pandas with oracle mysql postgresql sqlite
- updated 9/18/2012 with better column name handling; couple of bug fixes.
- used ~20 times for various ETL jobs. Mostly MySQL, but some Oracle.
to do:
save/restore index (how to check table existence? just do select count(*)?),
finish odbc,