rongzhe Azure-rong

## Hellow latex.tex
%hello_world.tex
\documentclass{book}
\usepackage{ctex}
\begin{document}
你好\LaTeX
\end{document}

## helpfulness prediction.py
#! /usr/bin/env python2.7
#coding=utf-8

"""
Use scikit-learn to compare the review helpfulness prediction performance of different classifiers and of different feature subsets.
This module is the last part of the review helpfulness prediction research.

"""


## pos neg(machine learning) feature.py
#! /usr/bin/env python2.7
#coding=utf-8

"""
Use a stored sentiment classifier to identify the probability that a review is positive or negative.
This module aims to extract review sentiment probabilities as review helpfulness features.

"""


## store sentiment classifier.py
#! /usr/bin/env python2.7
#coding=utf-8

"""
Use a set of positive and negative reviews as the corpus to train a sentiment classifier.
This module uses labeled positive and negative reviews as the training set, then uses the nltk scikit-learn API to do the classification task.
The aim is to train a classifier that automatically identifies a review's positive or negative sentiment, and to use the probability as a review helpfulness feature.

"""

## pos neg(senti dict) feature.py
#! /usr/bin/env python2.7
#coding=utf-8

"""
Compute a review's positive and negative scores, their average scores and their standard deviations.
This module aims to extract a review's positive/negative score, average score and standard deviation features (6 features in all).
The sentiment analysis is based on a sentiment dictionary.

"""
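
A minimal sketch of how the six dictionary-based features described above could be computed. The toy word sets `POS_WORDS` and `NEG_WORDS`, the function names, and the choice to average per sentence are all assumptions made for illustration; the real sentiment dictionary and scoring rules are not shown in this listing.

```python
import math

# Hypothetical toy lexicons; the real sentiment dictionary is not shown here.
POS_WORDS = {"good", "great", "fast"}
NEG_WORDS = {"bad", "slow", "broken"}


def sentence_scores(tokens):
    """Return (positive, negative) hit counts for one tokenized sentence."""
    pos = sum(1 for t in tokens if t in POS_WORDS)
    neg = sum(1 for t in tokens if t in NEG_WORDS)
    return pos, neg


def mean_and_std(values):
    """Population mean and standard deviation of a non-empty list."""
    mean = sum(values) / float(len(values))
    var = sum((v - mean) ** 2 for v in values) / float(len(values))
    return mean, math.sqrt(var)


def senti_dict_features(sentences):
    """Return [pos, neg, pos_avg, neg_avg, pos_std, neg_std] for one review.

    `sentences` is a non-empty list of tokenized sentences. Averages and
    standard deviations are taken over sentences (an assumption).
    """
    scores = [sentence_scores(s) for s in sentences]
    pos = [p for p, _ in scores]
    neg = [n for _, n in scores]
    pos_avg, pos_std = mean_and_std(pos)
    neg_avg, neg_std = mean_and_std(neg)
    return [sum(pos), sum(neg), pos_avg, neg_avg, pos_std, neg_std]
```

Calling `senti_dict_features([["good", "great", "phone"], ["slow", "screen"]])` yields one six-element feature vector for that review.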

## adj adv v feature.py
#! /usr/bin/env python2.7
#coding=utf-8

"""
Count the number of adjectives, adverbs and verbs in the review.
This module aims to extract the adjective, adverb and verb count features.

"""
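
A minimal sketch of the counting step, assuming the review has already been part-of-speech tagged into `(word, tag)` pairs. The tag names `"a"`, `"d"` and `"v"` and the function name `count_adj_adv_v` are assumptions; the tagger actually used is not shown in this listing.

```python
from collections import Counter


def count_adj_adv_v(tagged_tokens):
    """Return [adjective_count, adverb_count, verb_count] for one review.

    `tagged_tokens` is a list of (word, tag) pairs. The tags "a", "d" and
    "v" are assumed labels for adjectives, adverbs and verbs.
    """
    counts = Counter(tag for _, tag in tagged_tokens)
    # Counter returns 0 for tags that never occur.
    return [counts["a"], counts["d"], counts["v"]]
```

Only exact tag matches are counted here; a tagger with finer-grained tags (for example separate tags for verb forms) would need a mapping step first.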


## entropy perplexity feature.py
#! /usr/bin/env python2.7
#coding=utf-8

"""
Compute a review's entropy and perplexity.
This module aims to build an n-gram language model of the reviews, then compute review entropy and perplexity as features.

"""
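
A minimal sketch of the idea, using the simplest case (a unigram model with add-one smoothing) rather than whatever n-gram order the module actually uses. The function names and the smoothing choice are assumptions. "Entropy" here means the per-token cross-entropy of the review under the model, and perplexity is 2 raised to that value.

```python
import math
from collections import Counter


def train_unigram(corpus):
    """Count tokens over a corpus given as a list of tokenized reviews."""
    counts = Counter(tok for review in corpus for tok in review)
    return counts, sum(counts.values())


def entropy_and_perplexity(tokens, counts, total):
    """Return (entropy in bits per token, perplexity) for one review."""
    if not tokens:
        raise ValueError("review must contain at least one token")
    # Add-one smoothing with one extra vocabulary slot for unseen words.
    vocab_size = len(counts) + 1
    log_prob = 0.0
    for tok in tokens:
        p = (counts[tok] + 1.0) / (total + vocab_size)
        log_prob += math.log(p, 2)
    entropy = -log_prob / len(tokens)
    return entropy, 2 ** entropy
```

A review made of frequent words gets a low perplexity, while a review full of rare or unseen words gets a high one, which is what makes the pair usable as features.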


## name brand attribute feature.py
#! /usr/bin/env python2.7
#coding=utf-8

"""
Count how many times the product name, product brand and product attributes appear in the review.
This module aims to extract the product name, brand and attribute features.

"""

import textprocessing as tp
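
A minimal sketch of the counting described in the docstring, assuming the review is already tokenized and that the product names, brands and attributes are available as word lists. It does not use the `textprocessing` helper, whose contents are not shown in this listing; the function names below are hypothetical.

```python
def count_mentions(tokens, vocabulary):
    """Count how many tokens of the review appear in the given word list."""
    vocab = set(vocabulary)
    return sum(1 for t in tokens if t in vocab)


def name_brand_attribute_features(tokens, names, brands, attributes):
    """Return [name_count, brand_count, attribute_count] for one review."""
    return [
        count_mentions(tokens, names),
        count_mentions(tokens, brands),
        count_mentions(tokens, attributes),
    ]
```

Repeated mentions are counted each time they occur, so a review that names the brand twice contributes 2 to the brand feature.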

## word sentence length feature.py
#! /usr/bin/env python2.7
#coding=utf-8

"""
Count a review's number of words, number of sentences and length.
This module aims to extract the word count, sentence count and review length features.

"""

import textprocessing as tp
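
A minimal sketch of the three counts, under loud assumptions: words are split on whitespace, sentences end at `.`, `!` or `?`, and "review length" is taken to mean the number of characters. The real module presumably relies on the `textprocessing` helper for segmentation, which is not shown here, so the function name and splitting rules are hypothetical.

```python
import re


def word_sentence_length(review):
    """Return [word_count, sentence_count, character_length] for one review."""
    words = review.split()
    # Split on runs of sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    return [len(words), len(sentences), len(review)]
```

Text without whitespace between words (for example Chinese reviews) would need a proper word segmenter in place of `review.split()`.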

## centroid feature.py
#! /usr/bin/env python2.7
#coding=utf-8

"""
Compute a review's centroid score by combining every word's tf-idf score.
This module uses filtered review data in a txt file and a gensim tf-idf model to extract this review feature.

"""

import textprocessing as tp
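
A minimal sketch of a centroid-style score, written in plain Python as a stand-in for the gensim tf-idf model the docstring mentions. The weighting below (raw term count times log2 of inverse document frequency, summed over the review) is an assumption and is not claimed to match gensim's normalization; the function names are hypothetical.

```python
import math
from collections import Counter


def build_idf(corpus):
    """Map each token to log2(N / document frequency) over tokenized reviews."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for review in corpus for tok in set(review))
    return {tok: math.log(float(n_docs) / df, 2) for tok, df in doc_freq.items()}


def centroid_score(tokens, idf):
    """Sum tf * idf over the distinct tokens of one review.

    Tokens missing from `idf` contribute 0 (an assumption).
    """
    tf = Counter(tokens)
    return sum(count * idf.get(tok, 0.0) for tok, count in tf.items())
```

Words that occur in every review get an idf of 0 and so add nothing to the score, while words concentrated in few reviews dominate it.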