EmergentOrder/gist:21132e0b9fddb7272a62

## gistfile1.txt
k = 3 for all SPPMI runs unless a different k is noted. Training data is 16k tweets sampled from the Twitter140 corpus.
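
These notes do not define SPPMI. Assuming it refers to shifted positive PMI, i.e. max(PMI(w, c) - log k, 0), a minimal sketch of how k enters the computation might look like the following. The function name `sppmi` and the dense co-occurrence matrix input are illustrative assumptions, not the code used for these runs.

```python
import numpy as np

def sppmi(cooc, k=3):
    """Shifted positive PMI: max(PMI(w, c) - log(k), 0).

    `cooc` is a (words x contexts) co-occurrence count matrix.
    Hypothetical helper for illustration only.
    """
    cooc = np.asarray(cooc, dtype=float)
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)   # word marginal counts
    col = cooc.sum(axis=0, keepdims=True)   # context marginal counts
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(cooc * total / (row * col))
    # Zero counts (and empty rows/columns) carry no association.
    pmi[~np.isfinite(pmi)] = -np.inf
    # Shift by log(k), then clip negatives to zero.
    return np.maximum(pmi - np.log(k), 0.0)
```

Under this definition a larger k subtracts a larger shift before clipping, so more entries of the matrix become zero.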

SPPMI, Vector size: 500, additive
Train set accuracy: 0.58786

SPPMI, Vector size: 2500, additive
Train set accuracy: 0.67468

SPPMI, Vector size: 7500, additive
Train set accuracy: 0.76

SPPMI, Vector size: 7500, additive [exception, k = 10]
Train set accuracy: 0.79109

SPPMI, Vector size: 500, appending
Train set accuracy: 0.81155

SPPMI, Vector size: 2000, appending [exception, k = 10]
Train set accuracy: 0.91094

SPPMI, Vector size: 5000, appending [exception, k = 10]
Train set accuracy: 0.92674

SPPMI, Vector size: 7500, appending [exception, k = 10]
Train set accuracy: 0.93356


[Higher vector sizes not possible for appending due to memory limitations]
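
The notes do not spell out what "additive" and "appending" mean. One plausible reading, offered here only as an assumption, is that "additive" sums the word vectors of a tweet into a single vector, while "appending" concatenates them (zero-padded to a fixed number of tokens). Under that reading the appended representation grows with the token count, which would be consistent with the memory limit noted above. The helper names and the `max_tokens` parameter are hypothetical.

```python
import numpy as np

def additive(word_vecs):
    """Sum the word vectors; output length equals the word-vector size."""
    return np.sum(word_vecs, axis=0)

def appending(word_vecs, max_tokens):
    """Concatenate word vectors, truncated or zero-padded to max_tokens words.

    Output length is max_tokens * word-vector size.
    """
    dim = word_vecs.shape[1]
    out = np.zeros(max_tokens * dim)
    flat = word_vecs[:max_tokens].ravel()
    out[:flat.size] = flat
    return out

# Two tokens with 2-dimensional word vectors (toy values).
vecs = np.array([[1.0, 2.0], [3.0, 4.0]])
summed = additive(vecs)          # length 2
stacked = appending(vecs, 3)     # length 3 * 2 = 6
```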

TF-IDF, Vector size: 7500
Train set accuracy: 0.87812

TF-IDF, Vector size: 12500
Train set accuracy: 0.91451

TF-IDF, Vector size: default
Train set accuracy: 0.93623

For comparison:

VW, raw text
Train set accuracy: 0.9351875

VW, SPPMI, Vector size: 2500, additive, k=3
Train set accuracy: 0.7213125

VW, SPPMI, Vector size: 500, appending, k=10, plus original text
Train set accuracy: 0.9434375