Last active
December 2, 2015 03:11
From installing Mahout to building a model. ref: http://qiita.com/ma2k8/items/10d44097607525db9893
おはようございます!プログラマーの神様!(T_T)
おはよう プログラマー 代表 神様 t t
$ hadoop dfs -ls /data/
Found 2 items
drwxr-xr-x - hdfs hadoop 0 2014-03-13 15:39 /data/ng-text
drwxr-xr-x - hdfs hadoop 0 2014-03-13 15:40 /data/ok-text
$ mahout seqdumper -i /monitoring/labelindex
Running on hadoop, using /usr/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.5.0-job.jar
14/03/13 19:57:48 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --input=[/monitoring/labelindex], --startPhase=[0], --tempDir=[temp]}
Input Path: /monitoring/labelindex
Key class: class org.apache.hadoop.io.Text Value Class: class org.apache.hadoop.io.IntWritable
Key: ng: Value: 0
Key: ok: Value: 1
Count: 2
14/03/13 19:57:49 INFO driver.MahoutDriver: Program took 1735 ms (Minutes: 0.028916666666666667)
$ mahout testnb -i /monitoring/test-vectors/tfidf-vectors -o /monitoring/test1 -m /monitoring/test-model -l /monitoring/labelindex
Summary
-------------------------------------------------------
Correctly Classified Instances   : 3083   87.0167%
Incorrectly Classified Instances : 460    12.9833%
Total Classified Instances       : 3543
=======================================================
Confusion Matrix
-------------------------------------------------------
a     b     <--Classified as
1215  442   | 1657  a = ng
18    1868  | 1886  b = ok
14/03/13 20:21:09 INFO driver.MahoutDriver: Program took 19632 ms (Minutes: 0.32721666666666666)
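The summary figures in the testnb output can be re-derived from the confusion matrix. The sketch below is not part of the original gist: it copies the four counts from the output above and assumes that rows are the actual labels and columns are the predicted labels, which is consistent with the row totals 1657 and 1886. The helper names `accuracy`, `recall`, and `precision` are illustrative only.

```python
# Counts copied from the testnb confusion matrix.
# Assumption: outer key = actual label, inner key = predicted label.
confusion = {
    "ng": {"ng": 1215, "ok": 442},
    "ok": {"ng": 18, "ok": 1868},
}


def accuracy(matrix):
    """Fraction of all instances whose predicted label equals the actual label."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


def recall(matrix, label):
    """Of the instances that really are `label`, the fraction classified as `label`."""
    return matrix[label][label] / sum(matrix[label].values())


def precision(matrix, label):
    """Of the instances classified as `label`, the fraction that really are `label`."""
    predicted = sum(row[label] for row in matrix.values())
    return matrix[label][label] / predicted


if __name__ == "__main__":
    # (1215 + 1868) / 3543, the "Correctly Classified Instances" ratio
    print(f"accuracy: {accuracy(confusion):.6f}")
    for label in confusion:
        print(f"{label}: recall={recall(confusion, label):.4f} "
              f"precision={precision(confusion, label):.4f}")
```

Read this way, most of the 460 errors are ng documents classified as ok (442), while only 18 ok documents are classified as ng.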
$ hadoop dfs -ls /data/ok-text/ | more
Found 1897 items
-rw-r--r-- 3 matsukawa_tsubasa hadoop 24 2014-03-13 18:36 /data/ok-text/1311.txt
-rw-r--r-- 3 matsukawa_tsubasa hadoop 136 2014-03-13 18:36 /data/ok-text/1312.txt
-rw-r--r-- 3 matsukawa_tsubasa hadoop 115 2014-03-13 18:36 /data/ok-text/1313.txt
-rw-r--r-- 3 matsukawa_tsubasa hadoop 24 2014
・
・
・
$ mahout seqdirectory -i /data/ -o /data-seq |
$ mahout seqdumper -i /data-seq/chunk-0 |
$ mahout seq2sparse -i /data-seq -o /data-vectors -a org.apache.lucene.analysis.core.WhitespaceAnalyzer |
org.apache.lucene.analysis.core.WhitespaceAnalyzer
org.apache.lucene.analysis.WhitespaceAnalyzer
$ mahout vectordump -i /monitoring/kerberos-vectors/tfidf-vectors |
$ mahout seqdumper -i /monitoring/kerberos-vectors/wordcount | sort -nrk4 |
$ mahout trainnb -i /monitoring/test-vectors/tfidf-vectors -o /monitoring/test-model -el -li /monitoring/labelindex |