@seaslee
seaslee / gist:5274998
Created March 30, 2013 02:18
select images for codebook generation
# -*- coding: utf8 -*-
import csv
import os
def listimgs(metafilepath, imgfilepath, n):
    # metafilepath: the path of the meta info of result
    # n: the number of a query to generate the codebook
seaslee / gist:5276281
Created March 30, 2013 10:44
walk a two-level directory tree laid out as follows: dir / subdir / files, subdir / files, ...
import os
def imgs(imgsdir):
    for dirname, subdirnames, subfiles in os.walk(imgsdir):
        # one subdirectory per image query
        for subdir in subdirnames:
            for dirname1, subdirnames1, subfiles1 in os.walk(os.path.join(dirname, subdir)):
                # each image in the directory of one query
                for img in subfiles1:
                    imgpath = os.path.join(dirname1, img)
                    yield imgpath
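For a tree that is exactly two levels deep, the same traversal can also be written with `os.listdir`, which avoids nesting one `os.walk` inside another. The following is only a sketch of that alternative, not the gist author's code: the function name `list_images`, the `query_a`/`query_b` directory names, and the file names are hypothetical, and the demo builds a throwaway tree in a temporary directory.

```python
import os
import tempfile


def list_images(imgs_dir):
    """Yield paths of files exactly one directory level below imgs_dir."""
    for subdir in sorted(os.listdir(imgs_dir)):
        subdir_path = os.path.join(imgs_dir, subdir)
        if not os.path.isdir(subdir_path):
            continue  # skip stray files at the top level
        for name in sorted(os.listdir(subdir_path)):
            path = os.path.join(subdir_path, name)
            if os.path.isfile(path):
                yield path


# Build a small two-level tree (hypothetical names) and list it.
with tempfile.TemporaryDirectory() as root:
    for query in ("query_a", "query_b"):
        os.makedirs(os.path.join(root, query))
        for name in ("1.jpg", "2.jpg"):
            open(os.path.join(root, query, name), "w").close()
    found = [os.path.relpath(p, root) for p in list_images(root)]
```

Unlike the nested `os.walk` version, this sketch ignores anything deeper than the second level, so it fits only the layout described in the gist title.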
seaslee / gist:6435194
Created September 4, 2013 10:17
linear model example using scikit-learn
#! /usr/bin/env python
# -*- coding:utf-8 -*-
from sklearn import datasets
from sklearn import linear_model
from sklearn.metrics import mean_squared_error
##### load data and split into train and test ####
# note: load_boston was removed in scikit-learn 1.2, so this line needs an older release
data_boston = datasets.load_boston()
data = data_boston.data
target = data_boston.target
train_ratio = 0.8
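The preview stops right after `train_ratio` is set, so the rest of the gist is not visible. As a rough illustration of the workflow the imports suggest (split by ratio, fit a linear model, score with mean squared error), here is a dependency-free sketch on toy data. It assumes a positional split with no shuffling, and the helper names `split_by_ratio`, `fit_line`, and `mse` are hypothetical; it is not the scikit-learn code from the gist.

```python
def split_by_ratio(data, target, train_ratio):
    """Split two parallel sequences into train and test parts by position."""
    train_num = int(len(data) * train_ratio)
    return data[:train_num], target[:train_num], data[train_num:], target[train_num:]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data lying exactly on y = 2x + 1 (made up for this example).
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]

train_x, train_y, test_x, test_y = split_by_ratio(xs, ys, 0.8)
slope, intercept = fit_line(train_x, train_y)
test_mse = mse(test_y, [slope * x + intercept for x in test_x])
```

With real data the rows would normally be shuffled before a positional split like this one, otherwise any ordering in the dataset leaks into the train/test partition.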
seaslee / gist:6436522
Last active August 13, 2018 04:09
logistic regression examples using scikit-learn
# -*- coding:utf-8 -*-
from sklearn import datasets
from sklearn import linear_model
from sklearn.metrics import f1_score
##### load data and split into train and test ####
data_digits = datasets.load_digits()
data = data_digits.data
target = data_digits.target
train_ratio = 0.8
data_num = data.shape[0]
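The preview imports `f1_score` but ends before it is called, so how the gist averages the score over the ten digit classes is not visible. As a reminder of what the metric measures, here is a small pure-Python sketch of per-class F1 and its unweighted (macro) mean. The function names and the toy labels are hypothetical, and macro averaging is an assumption chosen for illustration.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Toy labels (made up): one sample of class 0 is misclassified as class 1.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]

f1_class0 = f1_for_label(y_true, y_pred, 0)
f1_class1 = f1_for_label(y_true, y_pred, 1)
score = macro_f1(y_true, y_pred)
```

Here class 0 has perfect precision but only half its samples recalled, while class 1 has perfect recall but one false positive, which is why the two per-class values differ.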
seaslee / r_resources
Last active December 24, 2015 08:49
R resources
#### Getting started
1. [A good basic introduction to the R language by John Cook](http://www.johndcook.com/R_language_for_programmers.html)
2. [The official R introductory manual](http://www.r-project.org/)
3. [E-book "The R Inferno"]
#### ggplot
1. [A good introduction by Edwin Chen covering very simple uses of qplot](http://blog.echen.me/2012/01/17/quick-introduction-to-ggplot2/)
2. [ggplot2 documentation](http://docs.ggplot2.org/current/)
#### Style guides
1. [Google's R Style Guide](http://google-styleguide.googlecode.com/svn/trunk/Rguide.xml#filenames)
2. [Another concise R coding style guide, by the author of ggplot2](http://stat405.had.co.nz/r-style.html)
library(tm)
## read from txt file
path = 'd:/sigir_full.txt'
f <- file(path,open='rt')
con <- readLines(f)
close(f)
## get the title from the content
paperTitles <- con[grepl("^Title: ",con)]
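For comparison with the Python gists above, the same line filter can be sketched in Python. `str.startswith` plays the role of the anchored pattern `^Title: `. The sample lines are made up, and stripping the prefix is an extra step that the R snippet does not perform.

```python
# Made-up sample of what lines read from the text file might look like.
lines = [
    "Title: A",
    "Authors: X",
    "Title: B",
    "Abstract: Title: not a title",
]

prefix = "Title: "

# Same result as con[grepl("^Title: ", con)]: keep the whole matching lines.
title_lines = [line for line in lines if line.startswith(prefix)]

# Extra step, not in the R snippet: drop the prefix to get the bare titles.
titles = [line[len(prefix):] for line in title_lines]
```

The last sample line contains `Title: ` in the middle and is skipped, matching the behavior of the `^` anchor.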
seaslee / zh.r
Last active August 29, 2015 14:21
simple stats
## read the monthly values and plot them as a time series starting in 2015
d <- read.csv('a.csv', header=TRUE, sep=',')
p <- unlist(d)
pt <- ts(p, frequency=12, start=c(2015))
plot(pt)
library(forecast)
fit <- auto.arima(pt)
forecast(fit, h=2)