@jwf-zz
jwf-zz / imdb-sentiment-vw.sh
Last active March 5, 2019 00:20
Sentiment analysis on an IMDB dataset using Vowpal Wabbit
#!/bin/bash
# Requires vw (https://github.com/JohnLangford/vowpal_wabbit/wiki/),
# the IMDB dataset (http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz),
# and the perf utility from http://osmot.cs.cornell.edu/kddcup/software.html.
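# Keep only the positive training reviews (ratings 7-10), drop the leading
# rating, and write VW examples: label 1, tag 'pos_<n>, namespace "features".
# Note: \| and \s in the sed pattern are GNU sed extensions.
sed -n 's/^\([7-9]\|10\)\s//p' aclImdb/train/labeledBow.feat | \
awk '{ print "1 '"'"'pos_" (NR-1) " |features " $0}' > train.vw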
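# A minimal sketch, not part of the original script: the same positive-review
# filter wrapped in a helper that reads from stdin, so the output format can be
# checked on a few lines before running on the full dataset. The name
# positive_to_vw and the sample lines below are hypothetical; GNU sed is assumed.
# Defining the function has no effect on train.vw unless it is called.
#
# Try it with:
#   printf '9 0:9 1:1\n3 0:2 5:1\n10 2:4\n' | positive_to_vw
# which prints:
#   1 'pos_0 |features 0:9 1:1
#   1 'pos_1 |features 2:4
positive_to_vw() {
  # Drop lines whose rating is not 7-10, strip the rating from the rest,
  # then prepend the VW label, a sequential tag, and the namespace.
  sed -n 's/^\([7-9]\|10\)\s//p' | \
  awk '{ print "1 '"'"'pos_" (NR-1) " |features " $0}'
}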