Accumulo Installation and Configuration Steps on a Ubuntu VirtualBox Instance
Oct 17, 2012 - See https://github.com/medined/accumulo_stackscript for an even better script. Really ignore the stuff below. Go. Scoot.
Aug 28, 2012 - See http://affy.blogspot.com/2012/08/stackscript-for-accumulo-on-linode.html for a more concise method of configuring Accumulo. I'll leave this gist unchanged for fans of history.
My goal was to get Accumulo running on a VirtualBox Ubuntu instance. I was successful using the following steps. If a line starts with $ then it is a command-line to execute. Note that you'll need to have sudo privilege. My username was 'ubuntu'. If you are using a different username, you'll need to change the process a little bit. I'll try to point out where.
https://issues.apache.org/jira/browse/ACCUMULO
##########
# Start a new VirtualBox instance using the Ubuntu 11.10
# Desktop ISO with at least 4G RAM and at least 10G of
# disk space.
##########
##########
# For verification, you can display the OS release.
##########
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=11.10
DISTRIB_CODENAME=oneiric
DISTRIB_DESCRIPTION="Ubuntu 11.10"
##########
# Download all of the packages you'll need. Hopefully,
# you have a fast download connection.
##########
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install curl
$ sudo apt-get install git
$ sudo apt-get install maven2
$ sudo apt-get install openssh-server openssh-client
$ sudo apt-get install openjdk-7-jdk
##########
# Switch to the new Java. On my system, it was
# the third option (marked '2' naturally)
##########
$ sudo update-alternatives --config java
##########
# Set the JAVA_HOME variable. I took the
# time to update my .bashrc script.
##########
$ export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
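To make the variable survive new shells, you can append it to .bashrc idempotently. A minimal sketch; the add_export helper is hypothetical, and it demonstrates on a scratch file so it is safe to run anywhere (point PROFILE at ~/.bashrc for real use):

```shell
# Hypothetical helper: append a line to a profile file only if it is
# not already present, so re-running these setup notes never
# duplicates the export.
add_export() {
  grep -qxF "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demonstrate on a scratch file; use PROFILE=$HOME/.bashrc for real.
PROFILE=$(mktemp)
add_export 'export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386' "$PROFILE"
add_export 'export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386' "$PROFILE"
grep -c 'JAVA_HOME' "$PROFILE"   # prints 1: the duplicate append was skipped
```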
##########
# I stored the Accumulo source code into
# ~/workspace/accumulo. After compilation, you'll
# be working with a second Accumulo directory. By
# placing this 'original source' version into
# workspace it is nicely segregated.
##########
$ mkdir -p ~/workspace
$ cd ~/workspace
$ git clone https://github.com/apache/accumulo.git
$ cd accumulo
##########
# Now we can compile Accumulo which creates the
# accumulo-assemble-1.5.0-incubating-SNAPSHOT-dist.tar.gz
# file in the src/assemble/target directory.
#
# This step confused me because the Accumulo README
# mentions mvn assembly:single and I tried to use
# that Maven command. It is not needed, at least not
# in this situation.
##########
$ mvn package
##########
# Now we can download Cloudera's version of Hadoop. The
# first step is adding the repository. Note that oneiric
# is not explicitly supported as of 2011-Dec-20. So I am
# using the 'maverick' repository.
##########
# Create a repository list file. Add the two indented lines
# to the new file.
$ sudo vi /etc/apt/sources.list.d/cloudera.list
deb http://archive.cloudera.com/debian maverick-cdh3 contrib
deb-src http://archive.cloudera.com/debian maverick-cdh3 contrib
# Add public key
$ curl -s http://archive.cloudera.com/debian/archive.key | sudo apt-key add -
$ sudo apt-get update
# Install all of the Hadoop components.
$ sudo apt-get install hadoop-0.20
$ sudo apt-get install hadoop-0.20-namenode
$ sudo apt-get install hadoop-0.20-datanode
$ sudo apt-get install hadoop-0.20-secondarynamenode
$ sudo apt-get install hadoop-0.20-jobtracker
$ sudo apt-get install hadoop-0.20-tasktracker
# Install zookeeper. It will automatically
# start.
$ sudo apt-get install hadoop-zookeeper-server
##########
# As an aside, you can use Ubuntu's service
# command to control zookeeper like this:
# sudo service hadoop-zookeeper-server start
##########
##########
# Now we can configure Pseudo-Distributed hadoop
# These steps were borrowed from
# http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html
##########
# Set some environment variables. I added these to my
# .bashrc file.
$ export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
$ export HADOOP_HOME=/usr/lib/hadoop-0.20
$ export ZOOKEEPER_HOME=/usr/lib/zookeeper
$ cd $HADOOP_HOME/conf
# Create the hadoop temp directory. It should not be
# in /tmp because that directory disappears after each
# system restart, something that happens a lot with
# virtual machines.
$ sudo mkdir /hadoop_tmp_dir
$ sudo chmod 777 /hadoop_tmp_dir
# Replace the existing file with the indented lines.
$ sudo vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop_tmp_dir</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
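If you prefer not to hand-edit the file in vi, the same content can be written non-interactively with a here-document. A sketch that writes to a scratch file so it is runnable anywhere; for real use, pipe the here-document through `sudo tee $HADOOP_HOME/conf/core-site.xml` instead:

```shell
# Sketch: generate core-site.xml with a here-document instead of vi.
# Values are exactly those from the notes above.
CORE_SITE=$(mktemp)
cat > "$CORE_SITE" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop_tmp_dir</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
grep -c '<property>' "$CORE_SITE"   # prints 2
```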
##########
# Notice that the dfs secondary http address is not
# the default in the XML below. I don't know what
# process was using the default, but I needed to
# change it to avoid the 'port already in use' message.
##########
# Replace the existing file with the indented lines.
$ sudo vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.secondary.http.address</name>
<value>0.0.0.0:8002</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
# Replace the existing file with the indented lines.
$ sudo vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
# format the hadoop filesystem
$ hadoop namenode -format
##########
# Time to setup password-less ssh to localhost
##########
$ cd ~
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
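If `ssh localhost` still prompts for a password, the usual cause is that sshd ignores authorized_keys when the file or the .ssh directory is group- or world-writable. These are the standard permission fixes, demonstrated on a scratch directory so the sketch is runnable; run the same chmods against ~/.ssh for real:

```shell
# sshd (with StrictModes, the default) rejects keys whose files have
# loose permissions. Tighten them: 700 on the directory, 600 on the
# key file. Demonstrated here on a scratch directory.
SSH_DIR=$(mktemp -d)
touch "$SSH_DIR/authorized_keys"
chmod 700 "$SSH_DIR"
chmod 600 "$SSH_DIR/authorized_keys"
ls -ld "$SSH_DIR" "$SSH_DIR/authorized_keys"
```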
# If you want to test that the ssh works, do this. Then exit.
$ ssh localhost
# Do some zookeeper configuration.
$ echo "maxClientCnxns=100" | sudo tee -a $ZOOKEEPER_HOME/conf/zoo.cfg
$ cd ~
$ export TAR_DIR=~/workspace/accumulo/src/assemble/target
$ tar xvzf $TAR_DIR/accumulo-1.5.0-incubating-SNAPSHOT-dist.tar.gz
# Add the following to your .bashrc file.
$ export ACCUMULO_HOME=~/accumulo-1.5.0-incubating-SNAPSHOT
$ cd $ACCUMULO_HOME/conf
###########
# I didn't see the metrics file mentioned in the README file but
# there was a complaint in a log file about its being missing.
###########
$ cp slaves.example slaves
$ cp masters.example masters
$ cp accumulo-env.sh.example accumulo-env.sh
$ cp accumulo-site.xml.example accumulo-site.xml
$ cp accumulo-metrics.xml.example accumulo-metrics.xml
# create the write-ahead log directory.
$ cd ..
$ mkdir walogs
###########
# Configure for 4Gb RAM. I definitely recommend using more RAM
# if you have it. Since I am using a VirtualBox instance, I don't
# have much memory to play with.
###########
# Change these two parameters to reduce memory usage.
$ vi conf/accumulo-site.xml
tserver.memory.maps.max=256M
tserver.cache.index.size=128M
# Change (or add) the trace.password entry if the root password is
# not the default of "secret"
<property>
<name>trace.password</name>
<value>mypassword_for_root_user</value>
</property>
# Reduce the JVM memory. I have no real idea what these should be but these
# settings work. I consider them a magic formula. :)
$ vi conf/accumulo-env.sh
test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx512m -Xms512m -Xss128k"
test -z "$ACCUMULO_MASTER_OPTS" && export ACCUMULO_MASTER_OPTS="${POLICY} -Xmx512m -Xms128m"
test -z "$ACCUMULO_MONITOR_OPTS" && export ACCUMULO_MONITOR_OPTS="${POLICY} -Xmx256m -Xms128m"
test -z "$ACCUMULO_GC_OPTS" && export ACCUMULO_GC_OPTS="-Xmx256m -Xms128m"
test -z "$ACCUMULO_LOGGER_OPTS" && export ACCUMULO_LOGGER_OPTS="-Xmx128m -Xms64m"
test -z "$ACCUMULO_GENERAL_OPTS" && export ACCUMULO_GENERAL_OPTS="-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75"
test -z "$ACCUMULO_OTHER_OPTS" && export ACCUMULO_OTHER_OPTS="-Xmx256m -Xms128m"
#######
#######
#######
#######
# REPEAT FOR EACH RESTART
#
# Since we are working inside a virtual machine, I found that
# some settings did not survive a shutdown or reboot. From this
# point on, repeat these command for each instance startup.
#######
# hadoop was installed as root. Therefore we need to
# change the ownership so that your username can
# write. IF YOU ARE NOT USING 'ubuntu', CHANGE THE
# COMMAND ACCORDINGLY.
$ sudo chown -R ubuntu:ubuntu /usr/lib/hadoop-0.20
$ sudo chown -R ubuntu:ubuntu /var/run/hadoop-0.20
$ sudo chown -R ubuntu:ubuntu /var/log/hadoop-0.20
# Start hadoop. I remove the logs so that I can find errors
# faster when I iterate through configuration settings.
$ cd $HADOOP_HOME
$ rm -rf logs/*
$ bin/start-dfs.sh
$ bin/start-mapred.sh
# If desired, look at the running hadoop processes. Your output should look
# something like the indented lines.
$ jps
4017 JobTracker
4254 TaskTracker
30279 Main
9808 Jps
3517 NameNode
3737 DataNode
##########
# This is an optional step to prove that the NameNode is running.
# Use a web browser like Firefox if you can.
##########
$ wget http://localhost:50070/
$ cat index.html
$ rm index.html
##########
# This is an optional step to prove that the JobTracker is running.
# Use a web browser like Firefox if you can.
##########
$ wget http://localhost:50030/
$ cat index.html
$ rm index.html
##########
# This is an optional step to prove that a map-reduce job
# can be run. In other words, that hadoop is working.
##########
$ hadoop dfs -rmr input
$ hadoop fs -put $HADOOP_HOME/conf input
$ hadoop jar $HADOOP_HOME/hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
$ hadoop fs -cat output/*
###########
# And now, the payoff. Let's get Accumulo to run.
###########
# Provide an instance (development) name and password (password) when asked.
$ cd $ACCUMULO_HOME
$ bin/accumulo init
# I remove the logs to make debugging easier.
$ rm -rf logs/*
$ bin/start-all.sh
##########
# This is an optional step to prove that Accumulo is running.
# Use a web browser like Firefox if you can.
##########
$ wget http://localhost:50095/
$ cat index.html
$ rm index.html
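The three wget/cat/rm verification cycles above can be folded into one pass with a small helper. The check_ui function is hypothetical; the ports come from the notes, and the daemons must actually be running for the "up" branch to fire:

```shell
# Hypothetical helper: probe each daemon's web UI once instead of
# running wget, cat, and rm by hand for every port.
check_ui() {
  if wget -q -O /dev/null "http://localhost:$1/"; then
    echo "$2 UI up on port $1"
  else
    echo "$2 UI not reachable on port $1"
  fi
}
check_ui 50070 NameNode
check_ui 50030 JobTracker
check_ui 50095 Accumulo
```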
# Check the logs directory.
$ cd logs
# Look for content in .err or .out files. The file sizes should all be zero.
$ ls -l *.err *.out
# Look for error messages. Ignore messages about the missing libNativeMap file.
$ grep ERROR *
# Start the Accumulo shell. If this works, see the README file for an example
# how to use the shell.
$ bin/accumulo shell -u root -p password
###########
# Do a little victory dance. You're now an Accumulo user!
###########
##########
# Building Accumulo Documentation
##########
$ sudo apt-get install texlive-latex-base
$ sudo apt-get install texlive-latex-extra
$ rm ./docs/accumulo_user_manual.pdf
$ mvn -Dmaven.test.skip=true prepare-package
$ cd docs/src/developer_manual
$ pdflatex developer_manual && pdflatex developer_manual && pdflatex developer_manual && pdflatex developer_manual
##########
# Reading Documentation
##########
http://incubator.apache.org/accumulo/user_manual_1.4-incubating
docs/src/developer_manual/developer_manual.pdf
$ ls -l docs/examples
##########
# Things to Try
##########
bin/accumulo org.apache.accumulo.server.util.ListInstances
# WHY DOES THIS NPE?
bin/accumulo org.apache.accumulo.server.util.DumpTable batchtest1
##########
# Running Accumulo Examples
##########
export EXAMPLE_JAR=lib/examples-simple-1.5.0-incubating-SNAPSHOT.jar
export EXAMPLE_PACKAGE=org.apache.accumulo.examples.simple
cd $ACCUMULO_HOME
export AINSTANCE=development
export AZOOKEEPERS=localhost
export AUSER=root
export APASSWORD=password
export AOPTIONS="$AINSTANCE $AZOOKEEPERS $AUSER $APASSWORD"
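Before pasting $AOPTIONS into the longer example commands, it is worth a quick sanity check that the bundled arguments expand in the order the example programs expect (instance, zookeepers, user, password):

```shell
# Sanity check: the connection arguments must expand in the order
# instance, zookeepers, user, password.
AINSTANCE=development
AZOOKEEPERS=localhost
AUSER=root
APASSWORD=password
AOPTIONS="$AINSTANCE $AZOOKEEPERS $AUSER $APASSWORD"
echo "$AOPTIONS"   # prints: development localhost root password
```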
# ---------------------------
# Examples from README.batch
# ---------------------------
# start the command-line shell.
bin/accumulo shell -u root -p password
> setauths -u root -s exampleVis
> createtable batchtest1
> exit
export TABLE=batchtest1
export START=0
export NUM=10000
export VALUE_SIZE=50
export MAX_MEMORY=20000000
export MAX_LATENCY=500
export NUM_THREADS=20
export COLUMN_VISIBILITY=exampleVis
bin/accumulo $EXAMPLE_PACKAGE.client.SequentialBatchWriter $AOPTIONS $TABLE $START $NUM $VALUE_SIZE $MAX_MEMORY $MAX_LATENCY $NUM_THREADS $COLUMN_VISIBILITY
export NUM=1000
export MIN=0
export MAX=10000
export EXPECTED_VALUE_SIZE=50
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchScanner $AOPTIONS $TABLE $NUM $MIN $MAX $EXPECTED_VALUE_SIZE $NUM_THREADS $COLUMN_VISIBILITY
# ----------------------------
# Examples from README.bloom
# ----------------------------
### create table without bloom filter.
bin/accumulo shell -u $AUSER -p $APASSWORD
> setauths -u root -s exampleVis
> createtable bloom_test1
bloom_test1> config -t bloom_test1 -s table.compaction.major.ratio=7
bloom_test1> exit
export TABLE=bloom_test1
export NUM=1000000
export MIN=0
export MAX=1000000000
export VALUE_SIZE=50
export MAX_MEMORY=2000000
export MAX_LATENCY=60000
export NUM_THREADS=20
export COLUMN_VISIBILITY=exampleVis
# create a million records
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchWriter -s 7 $AOPTIONS $TABLE $NUM $MIN $MAX $VALUE_SIZE $MAX_MEMORY $MAX_LATENCY $NUM_THREADS $COLUMN_VISIBILITY
bin/accumulo shell -u $AUSER -p $APASSWORD -e 'flush -t bloom_test1 -w'
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchWriter -s 8 $AOPTIONS $TABLE $NUM $MIN $MAX $VALUE_SIZE $MAX_MEMORY $MAX_LATENCY $NUM_THREADS $COLUMN_VISIBILITY
bin/accumulo shell -u $AUSER -p $APASSWORD -e 'flush -t bloom_test1 -w'
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchWriter -s 9 $AOPTIONS $TABLE $NUM $MIN $MAX $VALUE_SIZE $MAX_MEMORY $MAX_LATENCY $NUM_THREADS $COLUMN_VISIBILITY
### create table with bloom filter.
bin/accumulo shell -u $AUSER -p $APASSWORD
> createtable bloom_test2
bloom_test2> config -t bloom_test2 -s table.compaction.major.ratio=7
bloom_test2> config -t bloom_test2 -s table.bloom.enabled=true
bloom_test2> exit
export TABLE=bloom_test2
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchWriter -s 7 $AOPTIONS $TABLE $NUM $MIN $MAX $VALUE_SIZE $MAX_MEMORY $MAX_LATENCY $NUM_THREADS $COLUMN_VISIBILITY
bin/accumulo shell -u $AUSER -p $APASSWORD -e 'flush -t bloom_test2 -w'
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchWriter -s 8 $AOPTIONS $TABLE $NUM $MIN $MAX $VALUE_SIZE $MAX_MEMORY $MAX_LATENCY $NUM_THREADS $COLUMN_VISIBILITY
bin/accumulo shell -u $AUSER -p $APASSWORD -e 'flush -t bloom_test2 -w'
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchWriter -s 9 $AOPTIONS $TABLE $NUM $MIN $MAX $VALUE_SIZE $MAX_MEMORY $MAX_LATENCY $NUM_THREADS $COLUMN_VISIBILITY
bin/accumulo shell -u $AUSER -p $APASSWORD -e 'flush -t bloom_test2 -w'
### read table without bloom filter.
export TABLE=bloom_test1
export NUM=500
# same seed, records are found.
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchScanner -s 7 $AOPTIONS $TABLE $NUM $MIN $MAX $EXPECTED_VALUE_SIZE $NUM_THREADS $COLUMN_VISIBILITY
# different seed, no results
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchScanner -s 8 $AOPTIONS $TABLE $NUM $MIN $MAX $EXPECTED_VALUE_SIZE $NUM_THREADS $COLUMN_VISIBILITY
### read table with bloom filter.
export TABLE=bloom_test2
bin/accumulo $EXAMPLE_PACKAGE.client.RandomBatchScanner -s 7 $AOPTIONS $TABLE $NUM $MIN $MAX $EXPECTED_VALUE_SIZE $NUM_THREADS $COLUMN_VISIBILITY
### verify the map tables
# display the table ids.
bin/accumulo shell -u $AUSER -p $APASSWORD -e 'tables -l'
# display the hdfs files associated with the table id.
hadoop fs -lsr /accumulo/tables/3
# use PrintInfo to show that the file has a bloom filter.
bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo /accumulo/tables/4/default_tablet/F000000e.rf
# ----------------------------
# Examples from README.bulkIngest
# ----------------------------
export TABLE=test_bulk
export FIRST_SPLIT=row_00000333
export SECOND_SPLIT=row_00000666
bin/accumulo $EXAMPLE_PACKAGE.mapreduce.bulk.SetupTable $AOPTIONS $TABLE $FIRST_SPLIT $SECOND_SPLIT
export START=0
export END=1000
export BULK_FILE=bulk/test_1.txt
bin/accumulo $EXAMPLE_PACKAGE.mapreduce.bulk.GenerateTestData $START $END $BULK_FILE
#
# see the file that was just created
#
hadoop fs -cat $BULK_FILE
export INPUT=bulk
export OUTPUT=tmp/bulkWork
bin/tool.sh lib/accumulo-examples-*[^c].jar $EXAMPLE_PACKAGE.mapreduce.bulk.BulkIngestExample $AOPTIONS $TABLE $INPUT $OUTPUT
bin/accumulo $EXAMPLE_PACKAGE.mapreduce.bulk.VerifyIngest $AOPTIONS $TABLE $START $END
# -------------------------------
# Examples from README.combiner
# -------------------------------
bin/accumulo shell -u $AUSER -p $APASSWORD
>createtable runners
# enter 'stat' and '10' when asked
runners> setiter -t runners -p 10 -scan -minc -majc -n decStats -class org.apache.accumulo.examples.combiner.StatsCombiner
runners> setiter -t runners -p 11 -scan -minc -majc -n hexStats -class org.apache.accumulo.examples.combiner.StatsCombiner
runners> insert 123456 name first Joe
runners> insert 123456 stat marathon 240
runners> scan
runners> insert 123456 stat marathon 230
runners> insert 123456 stat marathon 220
#
# The next scan will show the min, max, sum, and count for the 123456:stat:marathon row.
#
runners> scan
runners> insert 123456 hstat virtualMarathon 6a
runners> insert 123456 hstat virtualMarathon 6b
#
# The next scan will show the min, max, sum, and count (in hexadecimal) for the 123456:hstat:marathon row.
#
runners> scan
runners> exit
# -------------------------------
# Examples from README.constraints
# -------------------------------
bin/accumulo shell -u $AUSER -p $APASSWORD
> createtable testConstraints
testConstraints> config -t testConstraints -s table.constraint.1=org.apache.accumulo.examples.constraints.NumericValueConstraint
testConstraints> config -t testConstraints -s table.constraint.2=org.apache.accumulo.examples.constraints.AlphaNumKeyConstraint
testConstraints> insert r1 cf1 cq1 1111
testConstraints> insert r1 cf1 cq1 ABC
Constraint Failures:
ConstraintViolationSummary(...NumericValueConstraint, ..., violationDescription:Value is not numeric...)
testConstraints> insert r1! cf1 cq1 ABC
Constraint Failures:
ConstraintViolationSummary(...NumericValueConstraint, ..., violationDescription:Value is not numeric...)
ConstraintViolationSummary(...AlphaNumKeyConstraint, ..., violationDescription:Row was not alpha numeric...)
testConstraints> scan
r1 cf1:cq1 [] 1111
testConstraints> exit
# -------------------------------
# Examples from README.dirlist
# -------------------------------
export DIR_TABLE=dirTable
export INDEX_TABLE=indexTable
export DATA_TABLE=dataTable
export AUTHORIZATION=exampleVis
export COLUMN_VISIBILITY=exampleVis
export DATA_CHUNK_SIZE=100000
export DIR_TO_INDEX=/home/$USER/workspace
# index the directory on local disk
bin/accumulo $EXAMPLE_PACKAGE.dirlist.Ingest $AOPTIONS $DIR_TABLE $INDEX_TABLE $DATA_TABLE $COLUMN_VISIBILITY $DATA_CHUNK_SIZE $DIR_TO_INDEX
export DIR_TO_VIEW=/home/$USER/workspace/accumulo/conf
bin/accumulo $EXAMPLE_PACKAGE.dirlist.Viewer $AOPTIONS $DIR_TABLE $DATA_TABLE $AUTHORIZATION $DIR_TO_VIEW
# display information about a directory.
export DIR_TO_VIEW=/home/$USER/workspace/accumulo/conf
bin/accumulo org.apache.accumulo.examples.dirlist.QueryUtil $AOPTIONS $DIR_TABLE $COLUMN_VISIBILITY $DIR_TO_VIEW
# find files
export FILE_TO_FIND=masters.example
bin/accumulo $EXAMPLE_PACKAGE.dirlist.QueryUtil $AOPTIONS $INDEX_TABLE $COLUMN_VISIBILITY $FILE_TO_FIND -search
export TRAILING_WILDCARD="masters*"
bin/accumulo $EXAMPLE_PACKAGE.dirlist.QueryUtil $AOPTIONS $INDEX_TABLE $COLUMN_VISIBILITY $TRAILING_WILDCARD -search
export LEADING_WILDCARD="*.jar"
bin/accumulo $EXAMPLE_PACKAGE.dirlist.QueryUtil $AOPTIONS $INDEX_TABLE $COLUMN_VISIBILITY $LEADING_WILDCARD -search
export WILDCARD="commons*.jar"
bin/accumulo $EXAMPLE_PACKAGE.dirlist.QueryUtil $AOPTIONS $INDEX_TABLE $COLUMN_VISIBILITY $WILDCARD -search
# count files
export AUTHORIZATION=exampleVis
export COLUMN_VISIBILITY=exampleVis
bin/accumulo $EXAMPLE_PACKAGE.dirlist.FileCount $AOPTIONS $DIR_TABLE $AUTHORIZATION $COLUMN_VISIBILITY
# -------------------------------
# Examples from README.filedata
# -------------------------------
How is FileDataIngest used?
* FileDataIngest - Takes a list of files and archives them into Accumulo keyed on the SHA1 hashes of the files.
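Since FileDataIngest keys each archived file on its SHA1 hash, you can preview the key a given file would get with plain sha1sum. This is an illustration of the keying idea only, not the example's own code:

```shell
# Illustration only (not FileDataIngest itself): compute the SHA1
# hash a file would be keyed under, using sha1sum from coreutils.
SAMPLE=$(mktemp)
printf 'hello' > "$SAMPLE"
sha1sum "$SAMPLE" | cut -d' ' -f1   # prints aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
```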
# -------------------------------
# Examples from README.filter
# -------------------------------
bin/accumulo shell -u $AUSER -p $APASSWORD
> createtable filtertest
filtertest> setiter -t filtertest -scan -p 10 -n myfilter -filter
# WAITING FOR JIRA TICKET RESOLUTION.
# -------------------------------
# Examples from README.helloworld
# -------------------------------
bin/accumulo shell -u $AUSER -p $APASSWORD
> createtable hellotable
hellotable> exit
export TABLE=hellotable
bin/accumulo $EXAMPLE_PACKAGE.helloworld.InsertWithBatchWriter $AINSTANCE $AZOOKEEPERS $TABLE $AUSER $APASSWORD
# insert via map-reduce
bin/accumulo $EXAMPLE_PACKAGE.helloworld.InsertWithOutputFormat $AINSTANCE $AZOOKEEPERS $TABLE $AUSER $APASSWORD
# display the records using the shell
bin/accumulo shell -u $AUSER -p $APASSWORD
> table hellotable
> scan
> exit
# display the records
bin/accumulo $EXAMPLE_PACKAGE.helloworld.ReadData $AINSTANCE $AZOOKEEPERS $TABLE $AUSER $APASSWORD
# -------------------------------
# Examples from README.mapred
# -------------------------------
hadoop fs -copyFromLocal $ACCUMULO_HOME/README wc/Accumulo.README
hadoop fs -ls wc
bin/accumulo shell -u $AUSER -p $APASSWORD
> createtable wordCount -a count=org.apache.accumulo.core.iterators.aggregation.StringSummation
> exit
export INPUT=wc
export OUTPUT=wordCount
bin/tool.sh lib/accumulo-examples-*[^c].jar $EXAMPLE_PACKAGE.mapreduce.WordCount $AINSTANCE $AZOOKEEPERS $INPUT $OUTPUT -u $AUSER -p $APASSWORD
# read the count from the accumulo table.
bin/accumulo shell -u $AUSER -p $APASSWORD
> table wordCount
wordCount> scan -b the
wordCount> exit
# -------------------------------
# Examples from README.shard
# -------------------------------
# create accumulo tables
bin/accumulo shell -u $AUSER -p $APASSWORD
> createtable shard
shard> createtable doc2term
doc2term> exit
# index some files
cd $ACCUMULO_HOME
export SHARD_TABLE=shard
export NUM_PARTITIONS=30
find src -name "*.java" | xargs bin/accumulo $EXAMPLE_PACKAGE.shard.Index $AINSTANCE $AZOOKEEPERS $SHARD_TABLE $AUSER $APASSWORD $NUM_PARTITIONS
export TERMS_TO_FIND="foo bar"
bin/accumulo $EXAMPLE_PACKAGE.shard.Query $AINSTANCE $AZOOKEEPERS $SHARD_TABLE $AUSER $APASSWORD $TERMS_TO_FIND
# populate doc2term
export DOC2TERM_TABLE=doc2term
bin/accumulo $EXAMPLE_PACKAGE.shard.Reverse $AINSTANCE $AZOOKEEPERS $SHARD_TABLE $DOC2TERM_TABLE $AUSER $APASSWORD
export NUM_TERMS=5
export ITERATION_COUNT=5
bin/accumulo org.apache.accumulo.examples.shard.ContinuousQuery $AINSTANCE $AZOOKEEPERS $SHARD_TABLE $DOC2TERM_TABLE $AUSER $APASSWORD $NUM_TERMS $ITERATION_COUNT
#####################################################################################
#####################################################################################
#####################################################################################
# ---------------------------------------------
# Other programs in client package
# ---------------------------------------------
bin/accumulo $EXAMPLE_PACKAGE.client.Flush $AOPTIONS $TABLE
# To see all options.
bin/accumulo $EXAMPLE_PACKAGE.client.ReadWriteExample
bin/accumulo $EXAMPLE_PACKAGE.client.ReadWriteExample -i $AINSTANCE -z $AZOOKEEPERS -u $AUSER -p $APASSWORD -t $TABLE -s $COLUMN_VISIBILITY --read
bin/accumulo $EXAMPLE_PACKAGE.client.RowOperations $AOPTIONS
./src/main/java/org/apache/accumulo/examples/constraints/MaxMutationSize.java
./src/main/java/org/apache/accumulo/examples/isolation/InterferenceTest.java
@tariqmislam

This was a huge help and saved me a ton of time. Thank you.

edit - glad to see that the update shows that it's not necessary to reformat the HDFS each time also.

edit - one small typo... the command to chmod 777 on hadoop_tmp_dir, you forgot the '/' in front of it.

@crigano
crigano commented Apr 13, 2012

Is it possible to use sun Java JDK 6 and not 7 with accumulo?

This is what I am using with Hadoop under Ubuntu 11.10.

thanks Mate!

Chris

@tariqmislam

I believe Accumulo requires Java JDK 6 at a minimum, so you should be fine.

@crigano
crigano commented Apr 14, 2012

Thanks Mate! Chris

@crigano
crigano commented Apr 25, 2012

How would I set up Eclipse for interactive debugging?

thanks Mate! Chris

@crigano
crigano commented Apr 26, 2012

Thanks! I have a new problem: I can't run tar xvzf $TAR_DIR/accumulo-1.5.0-incubating-SNAPSHOT-dist.tar.gz because ~/workspace/accumulo/src/assemble/target is never created.

I execute mvn package in my home directory workspace.
I find that ~/workspace/accumulo/src/assemble/target does not exist;
I only have ~/workspace/accumulo/src/site,
therefore I can't run tar xvzf $TAR_DIR/accumulo-1.5.0-incubating-SNAPSHOT-dist.tar.gz.

Only indication I get is "[INFO] skip non existing resource Directory /home/crigano/workspace/accumulo/server/src/test/resources"

I run git clone https://github.com/apache/accumulo.git
Cloning into accumulo...
remote: Counting objects: 25215, done.
remote: Compressing objects: 100% (7308/7308), done.
remote: Total 25215 (delta 14315), reused 24513 (delta 13621)
Receiving objects: 100% (25215/25215), 8.51 MiB | 332 KiB/s, done.
Resolving deltas: 100% (14315/14315), done.

Then cd accumulo

Then mvn package:
12/04/20 20:13:54 INFO compress.CodecPool: Got brand-new decompressor
12/04/20 20:13:54 INFO log.SortedLogRecovery: Looking at mutations from workdir/entries2 for table<<
12/04/20 20:13:54 INFO log.SortedLogRecovery: Looking at mutations from workdir/entries3 for table<<
12/04/20 20:13:54 INFO log.SortedLogRecovery: Looking at mutations from workdir/entries4 for table<<
12/04/20 20:13:54 INFO log.SortedLogRecovery: Looking at mutations from workdir/entries5 for table<<
12/04/20 20:13:54 INFO log.SortedLogRecovery: Scanning for mutations starting at sequence number 5 for tid 1
12/04/20 20:13:54 INFO log.SortedLogRecovery: Recovery complete for workdir/entries
12/04/20 20:13:54 INFO log.SortedLogRecovery: Scanning for mutations starting at sequence number 5 for tid 1
12/04/20 20:13:54 INFO log.SortedLogRecovery: Recovery complete for workdir/entries2
12/04/20 20:13:54 INFO log.SortedLogRecovery: Scanning for mutations starting at sequence number 5 for tid 2
12/04/20 20:13:54 INFO log.SortedLogRecovery: Recovery complete for workdir/entries3
12/04/20 20:13:54 INFO log.SortedLogRecovery: Scanning for mutations starting at sequence number 5 for tid 3
12/04/20 20:13:54 INFO log.SortedLogRecovery: Recovery complete for workdir/entries4
12/04/20 20:13:54 INFO log.SortedLogRecovery: Scanning for mutations starting at sequence number 5 for tid 4
12/04/20 20:13:54 INFO log.SortedLogRecovery: Recovery complete for workdir/entries5
[... many more similar SortedLogRecovery "Looking at mutations" / "Scanning for mutations" / "Recovery complete" lines omitted ...]
12/04/20 20:13:54 INFO log.SortedLogRecovery: Scanning for mutations starting at sequence number 2 for tid 1
12/04/20 20:13:54 INFO log.SortedLogRecovery: Recovery complete for workdir/entries2
12/04/20 20:13:54 INFO log.SortedLogRecovery: Looking at mutations from workdir/entries for table<<
12/04/20 20:13:54 INFO log.SortedLogRecovery: Scanning for mutations starting at sequence number 3 for tid 2
12/04/20 20:13:54 INFO log.SortedLogRecovery: Recovery complete for workdir/entries
12/04/20 20:13:54 INFO log.SortedLogRecovery: Looking at mutations from workdir/entries for table<<
12/04/20 20:13:54 INFO log.SortedLogRecovery: Scanning for mutations starting at sequence number 3 for tid 2
12/04/20 20:13:54 INFO log.SortedLogRecovery: Recovery complete for workdir/entries
12/04/20 20:13:54 INFO log.SortedLogRecovery: Looking at mutations from workdir/entries for table<<
12/04/20 20:13:55 INFO log.SortedLogRecovery: Scanning for mutations starting at sequence number 3 for tid 2
12/04/20 20:13:55 INFO log.SortedLogRecovery: Recovery complete for workdir/entries
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.982 sec
Running org.apache.accumulo.server.tabletserver.log.MultiReaderTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.124 sec
Running org.apache.accumulo.server.tabletserver.InMemoryMapTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.247 sec
Running org.apache.accumulo.server.constraints.MetadataConstraintsTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 sec
Running org.apache.accumulo.server.client.BulkImporterTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.051 sec

Results :

Tests run: 83, Failures: 0, Errors: 0, Skipped: 0

[INFO] [jar:jar {execution: default-jar}]
[INFO] Building jar: /home/crigano/workspace/accumulo/lib/accumulo-server-1.5.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] Building accumulo-examples
[INFO] task-segment: [package]
[INFO] ------------------------------------------------------------------------
[INFO] [enforcer:enforce {execution: enforce-mvn}]
[INFO] [remote-resources:process {execution: default}]
[INFO] [dependency:copy-dependencies {execution: copy-dependencies}]
[INFO] [site:attach-descriptor {execution: default-attach-descriptor}]
[INFO] [site:attach-descriptor {execution: attach-descriptor}]
[INFO] ------------------------------------------------------------------------
[INFO] Building examples-simple
[INFO] task-segment: [package]
[INFO] ------------------------------------------------------------------------
[INFO] [enforcer:enforce {execution: enforce-mvn}]
[INFO] [remote-resources:process {execution: default}]
[debug] execute contextualize
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/crigano/workspace/accumulo/examples/simple/src/main/resources
[INFO] Copying 3 resources
[INFO] [dependency:copy-dependencies {execution: copy-dependencies}]
[INFO] [compiler:compile {execution: default-compile}]
[INFO] Compiling 40 source files to /home/crigano/workspace/accumulo/examples/simple/target/classes
[debug] execute contextualize
[INFO] [resources:testResources {execution: default-testResources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/crigano/workspace/accumulo/examples/simple/src/test/resources
[INFO] Copying 3 resources
[INFO] [compiler:testCompile {execution: default-testCompile}]
[INFO] Compiling 5 source files to /home/crigano/workspace/accumulo/examples/simple/target/test-classes
[INFO] [surefire:test {execution: default-test}]
[INFO] Surefire report directory: /home/crigano/workspace/accumulo/examples/simple/target/surefire-reports


T E S T S

Running org.apache.accumulo.examples.simple.filedata.ChunkInputFormatTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.169 sec
Running org.apache.accumulo.examples.simple.filedata.ChunkCombinerTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.034 sec
Running org.apache.accumulo.examples.simple.filedata.KeyUtilTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec
Running org.apache.accumulo.examples.simple.filedata.ChunkInputStreamTest
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.096 sec
Running org.apache.accumulo.examples.simple.dirlist.CountTest
Max depth : 3
Time to find max depth : 2 ms
Time to compute counts : 4 ms
Entries scanned : 30
Counts inserted : 4
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.053 sec

Results :

Tests run: 15, Failures: 0, Errors: 0, Skipped: 0

[INFO] [jar:jar {execution: default-jar}]
[INFO] Building jar: /home/crigano/workspace/accumulo/lib/examples-simple-1.5.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] Building accumulo-wikisearch
[INFO] task-segment: [package]
[INFO] ------------------------------------------------------------------------
[INFO] [enforcer:enforce {execution: enforce-mvn}]
[INFO] [remote-resources:process {execution: default}]
[INFO] [dependency:copy-dependencies {execution: copy-dependencies}]
[INFO] [site:attach-descriptor {execution: default-attach-descriptor}]
[INFO] [site:attach-descriptor {execution: attach-descriptor}]
[INFO] ------------------------------------------------------------------------
[INFO] Building wikisearch-ingest
[INFO] task-segment: [package]
[INFO] ------------------------------------------------------------------------
[INFO] [enforcer:enforce {execution: enforce-mvn}]
[INFO] [remote-resources:process {execution: default}]
[debug] execute contextualize
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/crigano/workspace/accumulo/examples/wikisearch/ingest/src/main/resources
[INFO] Copying 3 resources
[INFO] [dependency:copy-dependencies {execution: copy-dependencies}]
[INFO] Copying hadoop-core-0.20.2.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/hadoop-core-0.20.2.jar
[INFO] Copying commons-codec-1.5.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/commons-codec-1.5.jar
[INFO] Copying commons-lang-2.4.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/commons-lang-2.4.jar
[INFO] Copying lucene-wikipedia-3.0.2.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/lucene-wikipedia-3.0.2.jar
[INFO] Copying google-collections-1.0.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/google-collections-1.0.jar
[INFO] Copying lucene-analyzers-3.0.2.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/lucene-analyzers-3.0.2.jar
[INFO] Copying zookeeper-3.3.1.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/zookeeper-3.3.1.jar
[INFO] Copying cloudtrace-1.5.0-SNAPSHOT.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/cloudtrace-1.5.0-SNAPSHOT.jar
[INFO] Copying lucene-core-3.0.2.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/lucene-core-3.0.2.jar
[INFO] Copying protobuf-java-2.3.0.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/protobuf-java-2.3.0.jar
[INFO] Copying accumulo-core-1.5.0-SNAPSHOT.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/accumulo-core-1.5.0-SNAPSHOT.jar
[INFO] Copying libthrift-0.6.1.jar to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/libthrift-0.6.1.jar
[INFO] [compiler:compile {execution: default-compile}]
[INFO] Compiling 23 source files to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/target/classes
[debug] execute contextualize
[INFO] [resources:testResources {execution: default-testResources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] [compiler:testCompile {execution: default-testCompile}]
[INFO] Compiling 7 source files to /home/crigano/workspace/accumulo/examples/wikisearch/ingest/target/test-classes
[INFO] [surefire:test {execution: default-test}]
[INFO] Surefire report directory: /home/crigano/workspace/accumulo/examples/wikisearch/ingest/target/surefire-reports


T E S T S

Running org.apache.accumulo.examples.wikisearch.iterator.TextIndexTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.982 sec
Running org.apache.accumulo.examples.wikisearch.iterator.GlobalIndexUidTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.104 sec
Running org.apache.accumulo.examples.wikisearch.ingest.WikipediaInputSplitTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.037 sec
Running org.apache.accumulo.examples.wikisearch.reader.AggregatingRecordReaderTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.251 sec

Results :

Tests run: 18, Failures: 0, Errors: 0, Skipped: 0

[INFO] [jar:jar {execution: default-jar}]
[INFO] Building jar: /home/crigano/workspace/accumulo/examples/wikisearch/ingest/lib/wikisearch-ingest-1.5.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] Building wikisearch-query
[INFO] task-segment: [package]
[INFO] ------------------------------------------------------------------------
[INFO] [enforcer:enforce {execution: enforce-mvn}]
[INFO] [remote-resources:process {execution: default}]
[debug] execute contextualize
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 2 resources
[INFO] Copying 3 resources
[INFO] [dependency:copy-dependencies {execution: copy-dependencies}]
[INFO] Copying commons-codec-1.5.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/commons-codec-1.5.jar
[INFO] Copying hadoop-core-0.20.2.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/hadoop-core-0.20.2.jar
[INFO] Copying accumulo-core-1.5.0-SNAPSHOT.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/accumulo-core-1.5.0-SNAPSHOT.jar
[INFO] Copying libthrift-0.6.1.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/libthrift-0.6.1.jar
[INFO] Copying commons-lang-2.4.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/commons-lang-2.4.jar
[INFO] Copying wikisearch-ingest-1.5.0-SNAPSHOT.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/wikisearch-ingest-1.5.0-SNAPSHOT.jar
[INFO] Copying google-collections-1.0.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/google-collections-1.0.jar
[INFO] Copying zookeeper-3.3.1.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/zookeeper-3.3.1.jar
[INFO] Copying kryo-1.04.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/kryo-1.04.jar
[INFO] Copying commons-jexl-2.0.1.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/commons-jexl-2.0.1.jar
[INFO] Copying cloudtrace-1.5.0-SNAPSHOT.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/cloudtrace-1.5.0-SNAPSHOT.jar
[INFO] Copying minlog-1.2.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/minlog-1.2.jar
[INFO] Copying protobuf-java-2.3.0.jar to /home/crigano/workspace/accumulo/examples/wikisearch/query/lib/protobuf-java-2.3.0.jar
[INFO] [compiler:compile {execution: default-compile}]
[INFO] Compiling 32 source files to /home/crigano/workspace/accumulo/examples/wikisearch/query/target/classes
[debug] execute contextualize
[INFO] [resources:testResources {execution: default-testResources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] [compiler:testCompile {execution: default-testCompile}]
[INFO] Compiling 2 source files to /home/crigano/workspace/accumulo/examples/wikisearch/query/target/test-classes
[INFO] [surefire:test {execution: default-test}]
[INFO] Surefire report directory: /home/crigano/workspace/accumulo/examples/wikisearch/query/target/surefire-reports


T E S T S

Running org.apache.accumulo.examples.wikisearch.logic.TestQueryLogic
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.034 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

[INFO] [ejb:ejb {execution: default-ejb}]
[INFO] Building ejb wikisearch-query-1.5.0-SNAPSHOT with ejbVersion 3.1
[INFO] Building jar: /home/crigano/workspace/accumulo/examples/wikisearch/query/target/wikisearch-query-1.5.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] Building wikisearch-query-war
[INFO] task-segment: [package]
[INFO] ------------------------------------------------------------------------
[INFO] [enforcer:enforce {execution: enforce-mvn}]
[INFO] [remote-resources:process {execution: default}]
[debug] execute contextualize
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /home/crigano/workspace/accumulo/examples/wikisearch/query-war/src/main/resources
[INFO] Copying 3 resources
[INFO] [compiler:compile {execution: default-compile}]
[INFO] No sources to compile
[debug] execute contextualize
[INFO] [resources:testResources {execution: default-testResources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] [compiler:testCompile {execution: default-testCompile}]
[INFO] No sources to compile
[INFO] [surefire:test {execution: default-test}]
[INFO] Surefire report directory: /home/crigano/workspace/accumulo/examples/wikisearch/query-war/target/surefire-reports


T E S T S

Results :

Tests run: 0, Failures: 0, Errors: 0, Skipped: 0

[INFO] [war:war {execution: default-war}]
[INFO] Packaging webapp
[INFO] Assembling webapp [wikisearch-query-war] in [/home/crigano/workspace/accumulo/examples/wikisearch/query-war/target/wikisearch-query-war-1.5.0-SNAPSHOT]
[INFO] Processing war project
[INFO] Copying webapp resources [/home/crigano/workspace/accumulo/examples/wikisearch/query-war/src/main/webapp]
[INFO] Webapp assembled in [75 msecs]
[INFO] Building war: /home/crigano/workspace/accumulo/examples/wikisearch/query-war/target/wikisearch-query-war-1.5.0-SNAPSHOT.war
[INFO] WEB-INF/web.xml already added, skipping
[INFO] ------------------------------------------------------------------------
[INFO] Building accumulo-assemble
[INFO] task-segment: [package]
[INFO] ------------------------------------------------------------------------
[INFO] [enforcer:enforce {execution: enforce-mvn}]
[INFO] [remote-resources:process {execution: default}]
[INFO] [dependency:copy-dependencies {execution: copy-dependencies}]
[INFO] [exec:exec {execution: user-manual}]
Missing pdflatex command. Please install.
[INFO] [site:attach-descriptor {execution: default-attach-descriptor}]
[INFO] [exec:exec {execution: config webpage}]
[INFO] [site:attach-descriptor {execution: attach-descriptor}]
[INFO]
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] ------------------------------------------------------------------------
[INFO] accumulo .............................................. SUCCESS [6.799s]
[INFO] cloudtrace ............................................ SUCCESS [7.879s]
[INFO] accumulo-start ........................................ SUCCESS [21.564s]
[INFO] accumulo-core ......................................... SUCCESS [1:23.868s]
[INFO] accumulo-server ....................................... SUCCESS [1:21.495s]
[INFO] accumulo-examples ..................................... SUCCESS [0.079s]
[INFO] examples-simple ....................................... SUCCESS [36.891s]
[INFO] accumulo-wikisearch ................................... SUCCESS [0.060s]
[INFO] wikisearch-ingest ..................................... SUCCESS [18.040s]
[INFO] wikisearch-query ...................................... SUCCESS [19.979s]
[INFO] wikisearch-query-war .................................. SUCCESS [1.350s]
[INFO] accumulo-assemble ..................................... SUCCESS [0.677s]
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4 minutes 40 seconds
[INFO] Finished at: Fri Apr 20 20:15:14 EDT 2012
[INFO] Final Memory: 115M/241M
[INFO] ------------------------------------------------------------------------
crigano@crigano-VirtualBox:~/workspace/accumulo$

Thanks Mate! Chris

@tariqmislam

I'm a little confused by your question. The above set of commands is for building Accumulo, not installing it through apt-get, which I believe is what you're looking for.

@javawebdeveloper

My team had some issues utilizing the SNAPSHOT on line 201; the file was not there. The Accumulo dev guide on apache.org says to build it using mvn package -P assemble. This worked to create the artifact, but it ended up in workspace/accumulo/assemble/target and not workspace/accumulo/src/assemble/target. Other than that, awesome guide. Thanks!
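Since the dist tarball's location has moved between revisions (src/assemble/target vs. assemble/target, as noted above), searching for it beats hard-coding a path. A minimal sketch, assuming the checkout lives under $ACCUMULO_SRC (a variable introduced just for this example):

```shell
# Search the whole checkout for the built dist tarball instead of
# guessing which target directory this revision of the build uses.
ACCUMULO_SRC="${ACCUMULO_SRC:-$HOME/workspace/accumulo}"
find "$ACCUMULO_SRC" -name 'accumulo-*dist.tar.gz' 2>/dev/null || true
```

The pattern matches both the `-incubating-` and plain `-SNAPSHOT-` naming variants mentioned in the comments below.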

@tjsears
tjsears commented May 11, 2012

Having trouble with lines 200 & 201. In line 200, the directory /workspace/accumulo/src/assemble/target does not exist. In line 201, accumulo-1.5.0-incubating-SNAPSHOT-dist.tar.gz does not exist; the file is named accumulo-1.5.0-SNAPSHOT-dist.tar.gz instead.

@tjsears
tjsears commented May 14, 2012

Getting a weird error:

Starting tablet servers and loggers .... done
localhost : tablet server already running (2961)
Starting logger on localhost
14 13:53:52,408 [server.Accumulo] INFO : Attempting to talk to zookeeper
14 13:53:52,664 [server.Accumulo] INFO : Zookeeper connected and initialized, attemping to talk to HDFS
14 13:53:52,666 [server.Accumulo] INFO : Connected to HDFS
Thread "org.apache.accumulo.server.master.state.SetGoalState" died null
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.accumulo.start.Main$1.run(Main.java:89)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /accumulo/34d09d85-40ee-433d-a705-404cb5e24bb9/masters/goal_state
at org.apache.zookeeper.KeeperException.create(KeeperException.java:104)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:643)
at org.apache.accumulo.core.zookeeper.ZooUtil.putData(ZooUtil.java:146)
at org.apache.accumulo.core.zookeeper.ZooUtil.putPersistentData(ZooUtil.java:126)
at org.apache.accumulo.server.zookeeper.ZooReaderWriter.putPersistentData(ZooReaderWriter.java:82)
at org.apache.accumulo.server.master.state.SetGoalState.main(SetGoalState.java:46)
... 6 more
localhost : master already running (3139)
localhost : garbage collector already running (3225)
localhost : monitor already running (3313)
localhost : tracer already running (3404)

Any ideas?

@tjsears
tjsears commented May 14, 2012

Here is the tserver_ubuntu_debug log:

14 10:53:13,199 [server.Accumulo] INFO : tserver.logger.timeout = 30s
14 10:53:13,199 [server.Accumulo] INFO : tserver.memory.lock = false
14 10:53:13,200 [server.Accumulo] INFO : tserver.memory.manager = org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager
14 10:53:13,200 [server.Accumulo] INFO : tserver.memory.maps.max = 512M
14 10:53:13,200 [server.Accumulo] INFO : tserver.memory.maps.native.enabled = true
14 10:53:13,200 [server.Accumulo] INFO : tserver.metadata.readahead.concurrent.max = 8
14 10:53:13,200 [server.Accumulo] INFO : tserver.migrations.concurrent.max = 1
14 10:53:13,200 [server.Accumulo] INFO : tserver.monitor.fs = true
14 10:53:13,200 [server.Accumulo] INFO : tserver.mutation.queue.max = 256K
14 10:53:13,201 [server.Accumulo] INFO : tserver.port.client = 9997
14 10:53:13,201 [server.Accumulo] INFO : tserver.port.search = false
14 10:53:13,201 [server.Accumulo] INFO : tserver.readahead.concurrent.max = 16
14 10:53:13,203 [server.Accumulo] INFO : tserver.scan.files.open.max = 100
14 10:53:13,204 [server.Accumulo] INFO : tserver.server.threadcheck.time = 1s
14 10:53:13,204 [server.Accumulo] INFO : tserver.server.threads.minimum = 2
14 10:53:13,204 [server.Accumulo] INFO : tserver.session.idle.max = 1m
14 10:53:13,205 [server.Accumulo] INFO : tserver.tablet.split.midpoint.files.max = 30
14 10:53:13,205 [server.Accumulo] INFO : tserver.walog.max.size = 256M
14 10:53:13,861 [tabletserver.TabletServer] INFO : Tablet server starting on localhost
14 10:53:13,998 [util.FileSystemMonitor] INFO : Filesystem monitor started
14 10:53:14,121 [tabletserver.NativeMap] ERROR: Failed to load native map library /home/ubuntu/accumulo-1.5.0-SNAPSHOT/lib/native/map/libNativeMap-Linux-i386-32.so
java.lang.UnsatisfiedLinkError: Can't load library: /home/ubuntu/accumulo-1.5.0-SNAPSHOT/lib/native/map/libNativeMap-Linux-i386-32.so
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:792)
at java.lang.System.load(System.java:1059)
at org.apache.accumulo.server.tabletserver.NativeMap.loadNativeLib(NativeMap.java:144)
at org.apache.accumulo.server.tabletserver.NativeMap.(NativeMap.java:156)
at org.apache.accumulo.server.tabletserver.TabletServerResourceManager.(TabletServerResourceManager.java:148)
at org.apache.accumulo.server.tabletserver.TabletServer.config(TabletServer.java:2968)
at org.apache.accumulo.server.tabletserver.TabletServer.main(TabletServer.java:3103)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.accumulo.start.Main$1.run(Main.java:89)
at java.lang.Thread.run(Thread.java:722)
14 10:53:14,216 [server.Accumulo] WARN : System swappiness setting is greater than zero (60) which can cause time-sensitive operation
s to be delayed. Accumulo is time sensitive because it needs to maintain distributed lock agreement.
14 10:53:14,251 [tabletserver.TabletServer] ERROR: Uncaught exception in TabletServer.main, exiting
java.lang.IllegalArgumentException: Maximum tablet server map memory 536,870,912 and block cache sizes 149,946,368 is too large for this JVM configuration 530,186,240
at org.apache.accumulo.server.tabletserver.TabletServerResourceManager.(TabletServerResourceManager.java:159)
at org.apache.accumulo.server.tabletserver.TabletServer.config(TabletServer.java:2968)
at org.apache.accumulo.server.tabletserver.TabletServer.main(TabletServer.java:3103)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.accumulo.start.Main$1.run(Main.java:89)
at java.lang.Thread.run(Thread.java:722)
14 10:53:24,215 [server.Accumulo] WARN : System swappiness setting is greater than zero (60) which can cause time-sensitive operations to be delayed. Accumulo is time sensitive because it needs to maintain distributed lock agreement.
14 10:53:34,216 [server.Accumulo] WARN : System swappiness setting is greater than zero (60) which can cause time-sensitive operations to be delayed. Accumulo is time sensitive because it needs to maintain distributed lock agreement.
14 10:53:44,220 [server.Accumulo] WARN : System swappiness setting is greater than zero (60) which can cause time-sensitive operations to be delayed. Accumulo is time sensitive because it needs to maintain distributed lock agreement.
14 10:53:54,226 [server.Accumulo] WARN : System swappiness setting is greater than zero (60) which can cause time-sensitive operations to be delayed. Accumulo is time sensitive because it needs to maintain distributed lock agreement.
14 10:54:04,226 [server.Accumulo] WARN : System swappiness setting is greater than zero (60) which can cause time-sensitive operations to be delayed. Accumulo is time sensitive because it needs to maintain distributed lock agreement.
14 10:54:14,226 [server.Accumulo] WARN : System swappiness setting is greater than zero (60) which can cause time-sensitive operations to be delayed. Accumulo is time sensitive because it needs to maintain distributed lock agreement.
14 10:54:24,228 [server.Accumulo] WARN : System swappiness setting is greater than zero (60) which can cause time-sensitive operations to be delayed. Accumulo is time sensitive because it needs to maintain distributed lock agreement.
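The repeated swappiness warnings above can be addressed on the host OS. A sketch, assuming a Linux guest; the commented steps need root, and 0 is just an example value:

```shell
# Show the current swappiness setting that Accumulo is warning about.
cat /proc/sys/vm/swappiness
# Lower it for the running session (requires root):
#   sudo sysctl vm.swappiness=0
# And persist it across reboots:
#   echo 'vm.swappiness = 0' | sudo tee -a /etc/sysctl.conf
```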

@javawebdeveloper

Tim,
How much memory do you have allocated to your virtual machine? From your error:

Maximum tablet server map memory 536,870,912 and block cache sizes 149,946,368 is too large for this JVM configuration 530,186,240

It looks like you are using the 512MB example configuration without enough memory allocated to your JVM. Can you increase your memory?

Hope that helps.
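One way to act on that advice, instead of (or in addition to) growing the JVM heap, is to shrink the tserver's in-memory map so it fits the roughly 530 MB heap reported in the error. A hypothetical excerpt for conf/accumulo-site.xml, using the tserver.memory.maps.max property shown in the debug log above (256M is an illustrative value, not a recommendation):

```xml
<!-- conf/accumulo-site.xml: lower the in-memory map cap so map memory
     plus block cache fits inside the configured JVM heap -->
<property>
  <name>tserver.memory.maps.max</name>
  <value>256M</value>
</property>
```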

@tjsears
tjsears commented May 15, 2012

Memory is 2GB for the virtual environment.

@telvis07
telvis07 commented Jun 1, 2012

two thumbs up!

@cloudnewbie

I downloaded Accumulo VM from this URL: http://blog.sqrrl.com/post/40578606670/quick-accumulo-install

When I bring it up in VirtualBox, it asks for an Ubuntu username and password.

Does anybody know what the username and password are?

Thanks

@beedaan
beedaan commented Jul 30, 2013

A quick note: I had to use Maven 3 instead of Maven 2.
