Last active
September 16, 2015 19:17
BIDMach LR (SPPMI vs TF-IDF)
k = 3 for all SPPMI runs unless marked as an exception. Training data is 16k tweets sampled from the Twitter140 corpus.
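For reference, a minimal sketch of how the shift k enters SPPMI, assuming the standard definition SPPMI_k(w, c) = max(PMI(w, c) - log k, 0). The function name `sppmi` and the input format (a list of (word, context) pairs) are illustrative assumptions, not the code used for these runs.

```python
import math
from collections import Counter

def sppmi(pairs, k=3):
    """Return {(word, context): value} with the positive SPPMI entries.

    `pairs` is a list of observed (word, context) co-occurrences.
    Hypothetical helper for illustration only.
    """
    pair_counts = Counter(pairs)
    word_counts = Counter(w for w, _ in pairs)
    ctx_counts = Counter(c for _, c in pairs)
    total = len(pairs)
    out = {}
    for (w, c), n in pair_counts.items():
        # PMI = log( P(w, c) / (P(w) * P(c)) ), estimated from raw counts
        pmi = math.log(n * total / (word_counts[w] * ctx_counts[c]))
        # Shift by log k, then clip at zero; zero entries are left out (sparse)
        value = max(pmi - math.log(k), 0.0)
        if value > 0.0:
            out[(w, c)] = value
    return out
```

A larger k subtracts more from every PMI value, so fewer entries survive the clipping and the resulting matrix is sparser.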
SPPMI, Vector size: 500, additive
Train set accuracy: 0.58786
SPPMI, Vector size: 2500, additive
Train set accuracy: 0.67468
SPPMI, Vector size: 7500, additive
Train set accuracy: 0.76
SPPMI, Vector size: 7500, additive [exception, k = 10]
Train set accuracy: 0.79109
SPPMI, Vector size: 500, appending
Train set accuracy: 0.81155
SPPMI, Vector size: 2000, appending [exception, k = 10]
Train set accuracy: 0.91094
SPPMI, Vector size: 5000, appending [exception, k = 10]
Train set accuracy: 0.92674
SPPMI, Vector size: 7500, appending [exception, k = 10]
Train set accuracy: 0.93356
[Higher vector sizes not possible for appending due to memory limitations]
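The notes do not define "additive" and "appending". One plausible reading, offered here only as an assumption, is that "additive" sums the word vectors of a tweet element-wise and "appending" concatenates them into one long, zero-padded vector. The sketch below illustrates that reading; the function names and the `dim` and `max_words` parameters are hypothetical.

```python
def additive(word_vectors):
    """Element-wise sum of word vectors: output length equals the vector size."""
    return [sum(col) for col in zip(*word_vectors)]

def appending(word_vectors, dim, max_words):
    """Concatenate up to `max_words` vectors of length `dim`, zero-padding the rest.

    Output length is max_words * dim, so it grows with both parameters.
    """
    flat = [x for vec in word_vectors[:max_words] for x in vec]
    flat.extend([0.0] * (max_words * dim - len(flat)))
    return flat
```

Under this reading the appended representation is `max_words` times larger than the additive one for the same vector size, which would be consistent with the memory limitation noted above.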
TF-IDF, Vector size: 7500
Train set accuracy: 0.87812
TF-IDF, Vector size: 12500
Train set accuracy: 0.91451
TF-IDF, Vector size: default
Train set accuracy: 0.93623
For comparison:
VW, raw text
Train set accuracy: 0.9351875
VW, SPPMI, Vector size: 2500, additive, k=3
Train set accuracy: 0.7213125
VW, SPPMI, Vector size: 500, appending, k=10, plus original text
Train set accuracy: 0.9434375