@erogol
erogol / crawle_google.py
Created September 22, 2013 10:57
Scrape Google Images for a given target query
#!/usr/bin/env python
'''
Query Google Image Search and download the resulting images by scraping.
To use this script, install the mechanize and BeautifulSoup packages with:
easy_install mechanize
easy_install Beautiful
@erogol
erogol / scrap_bing.py
Last active December 23, 2015 15:59
Crawl and scrape Bing Image Search images
#!/usr/bin/env python
'''
Query Google Image Search and download the resulting images by scraping.
To use this script, install the mechanize and BeautifulSoup packages with:
easy_install mechanize
easy_install Beautiful
@erogol
erogol / neg_samples.m
Created September 24, 2013 13:32
Sample negative instances from a real folder structure, given the root folder of the paths of interest
function [] = sample_neg_examples(ROOT_PATH, num_sample, OUTPUT_PATH)
SEARCH_PATH = fullfile(ROOT_PATH,'**','*');
SAVE_PATH = 'neg_examples';
paths = rdir(SEARCH_PATH);
if exist('OUTPUT_PATH','var')
    SAVE_PATH = OUTPUT_PATH;
end
@erogol
erogol / 0_reuse_code.js
Created September 25, 2013 19:31
Here are some things you can do with Gists in GistBox.
// Use Gists to store code you would like to remember later on
console.log(window); // log the "window" object to the console
if ~exist('vlfeat', 'dir')
    from = 'http://www.vlfeat.org/download/vlfeat-0.9.13-bin.tar.gz' ;
    fprintf('Downloading vlfeat from %s\n', from) ;
    untar(from, 'data') ;
    movefile('data/vlfeat-0.9.13', 'vlfeat') ;
end
function [ res ] = ie505_hw1( n,a,b )
%İE505_HW1 Summary of this function goes here
% Detailed explanation goes here
[~,bins]= hist([a,b],1000);
r =unique( round( n*bins));
res = arrayfun(@(x)nchoosek(n,x)*double(1/(2^n)),r);
'''
Split a file into two randomly, line by line.
Usage: split.py <input file> <output file 1> <output file 2> [<probability of writing to the first file>] [<random seed>]
'''
import csv
import sys
import random
input_file = sys.argv[1]
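The preview above is cut off after the first argument is read. As a rough illustration of the line-by-line random split that the usage string describes, here is a minimal sketch. The function name `split_lines`, the default probability of 0.9, and returning lists instead of writing files are all assumptions made for this example, not taken from the gist.

```python
import random


def split_lines(lines, probability=0.9, seed=None):
    """Randomly assign each line to one of two lists.

    Hypothetical helper: `probability` is the chance that a line goes to the
    first list (the 0.9 default is an assumption); `seed` makes the split
    reproducible.
    """
    rng = random.Random(seed)  # private generator, so global state is untouched
    first, second = [], []
    for line in lines:
        # rng.random() is uniform on [0, 1), so this branch is taken
        # with the requested probability.
        if rng.random() < probability:
            first.append(line)
        else:
            second.append(line)
    return first, second


if __name__ == "__main__":
    sample = ["line %d\n" % i for i in range(10)]
    part_a, part_b = split_lines(sample, probability=0.5, seed=42)
    print(len(part_a), len(part_b))
```

Passing the same seed twice yields the same split, which is presumably what the optional `<random seed>` argument is for.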
@erogol
erogol / logistic_ensemble.py
Last active February 8, 2018 20:28
Logistic regression ensembles with feature selection. Requires the sklearn Python library.
def linear_model_ensemble(X, y, X_test, fold_num, fold_num_sec, grid_search_range, oobe=True, x_val=True):
    '''
    X - Train set
    y - Train set labels. Labels are 1 for pos instances and -1 for neg instances
    fold_num - Fold size for the first step X-validation to set the hyper-params
               and feature selectors
@erogol
erogol / HtmlStripper.py
Created November 4, 2013 00:53
HTML stripper in Python
from HTMLParser import HTMLParser

class MLStripper(HTMLParser):
    def __init__(self):
        self.reset()
        self.fed = []

    def handle_data(self, d):
        self.fed.append(d)

    def get_data(self):
        return ''.join(self.fed)
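The snippet above targets Python 2, where the parser lives in the `HTMLParser` module. A minimal Python 3 sketch of the same idea might look like the following. The `strip_tags` helper is a hypothetical addition for this example and is not shown in the gist preview.

```python
from html.parser import HTMLParser  # Python 3 location of the parser


class MLStripper(HTMLParser):
    """Collects only the text nodes of an HTML document."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters
        # before they reach handle_data.
        super().__init__(convert_charrefs=True)
        self.fed = []

    def handle_data(self, d):
        self.fed.append(d)

    def get_data(self):
        return "".join(self.fed)


def strip_tags(html):
    """Hypothetical convenience wrapper: return `html` with markup removed."""
    stripper = MLStripper()
    stripper.feed(html)
    stripper.close()  # flush any text still buffered by the parser
    return stripper.get_data()


if __name__ == "__main__":
    print(strip_tags("<p>Hello <b>world</b></p>"))
```

Calling `close()` matters here because, with `convert_charrefs` enabled, trailing text can stay buffered until the parser is told the input has ended.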
@erogol
erogol / cluster.py
Created December 13, 2013 15:46
clustering with theano
import numpy as np
import numpy
import theano
import theano.tensor as T
from theano import function, config, shared, sandbox
from theano import ProfileMode
from sklearn import cluster, datasets
import matplotlib.pyplot as plt
def rsom(data, cluster_num, alpha, epochs = -1, batch = 1, verbose = False):