Keith Turner keith-turner

/*
* Copyright 2013 Morphism LLC (www.morphism.com)
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
keith-turner / SplitTimeTest.java
Last active August 29, 2015 14:03
A utility that times creating splits and balancing
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
keith-turner / pom.xml
Created March 10, 2015 19:04
Example Checkstyle rule for Accumulo Public API
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-checkstyle-plugin</artifactId>
<version>2.14</version>
</plugin>
</plugins>
</pluginManagement>
keith-turner / hadoop apilyzer.txt
Created September 1, 2015 03:13
Output from using APILyzer to analyze Hadoop and HBase, produced by an experimental version of APILyzer with annotation support. It looks for Public+Stable APIs that use non-Public+Stable types. Only part of the Hadoop API was analyzed.
Includes: []
IncludeAnnotations: [@org.apache.hadoop.classification.InterfaceAudience$Public]
ExcludesAnnotations: [@org.apache.hadoop.classification.InterfaceStability$Evolving, @org.apache.hadoop.classification.InterfaceStability$Unstable]
Excludes: []
Allowed: []
Public API:
org.apache.hadoop.HadoopIllegalArgumentException
org.apache.hadoop.conf.Configurable
org.apache.hadoop.conf.Configuration

Basic Sampling Example

Accumulo supports building a set of sample data that scanners can access efficiently. Which data is included in the sample set is configurable. The commands below insert some data representing documents.

root@instance sampex> createtable sampex
root@instance sampex> insert 9255 doc content 'abcde'

root@instance sampex> insert 9255 doc url file://foo.txt
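
To make the idea of a configurable sample set more concrete, here is a minimal, self-contained sketch of one possible approach: a row is placed in the sample when a hash of its row ID is divisible by a chosen modulus. This is an assumption made for illustration, not Accumulo's own sampler code. The class `RowSampleSketch`, its methods, the use of CRC32, and the row IDs other than 9255 are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Conceptual sketch only: illustrates hash-and-modulus row sampling.
// It does not use or reproduce any Accumulo API.
public class RowSampleSketch {

  // Hypothetical helper: hash the row ID bytes. CRC32 is used only
  // because it ships with the JDK; any stable hash would do.
  public static long hashRow(String row) {
    CRC32 crc = new CRC32();
    crc.update(row.getBytes(StandardCharsets.UTF_8));
    return crc.getValue(); // always in the range 0 .. 2^32 - 1
  }

  // A row is in the sample when its hash is divisible by the modulus.
  // A larger modulus yields a smaller sample; a modulus of 1 keeps every row.
  public static boolean inSample(String row, int modulus) {
    if (modulus <= 0) {
      throw new IllegalArgumentException("modulus must be positive");
    }
    return hashRow(row) % modulus == 0;
  }

  // Return the rows that fall into the sample, preserving input order.
  public static List<String> sample(List<String> rows, int modulus) {
    List<String> selected = new ArrayList<>();
    for (String row : rows) {
      if (inSample(row, modulus)) {
        selected.add(row);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    // Illustrative row IDs; only 9255 appears in the example above.
    List<String> rows = Arrays.asList("9255", "8934", "2317", "5642");
    System.out.println("sampled rows: " + sample(rows, 3));
  }
}
```

Because membership depends only on the row ID, every column of a given document is either entirely in the sample or entirely out of it, which keeps sampled documents complete.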

if [[ -z $HADOOP_HOME ]] ; then
test -z "$HADOOP_PREFIX" && export HADOOP_PREFIX=/home/fluo/git/fluo-dev/install/hadoop-2.7.2
else
HADOOP_PREFIX="$HADOOP_HOME"
unset HADOOP_HOME
fi
# hadoop-2.0:
test -z "$HADOOP_CONF_DIR" && export HADOOP_CONF_DIR="$HADOOP_PREFIX/etc/hadoop"
keith-turner / ContentObserver.java
Last active June 27, 2017 20:47
A solution to exercise 1 of the Fluo Tour http://fluo.apache.org/tour/exercise-1/
package ft;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import com.google.common.collect.Collections2;
keith-turner / ExternalIndex.java
Last active November 30, 2016 20:35
Modify http://fluo.apache.org/tour/exercise-1/ part 3 to create an inverted index in an external table.
package ft;
import java.util.Optional;
import java.util.function.Consumer;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Mutation;
import org.apache.fluo.api.client.TransactionBase;
keith-turner / LocGroupPerfTest.java
Last active July 21, 2017 18:36
Performance experiment for apache/accumulo#275 and ACCUMULO-4667
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.util.LinkedHashSet;
import java.util.Map.Entry;
import java.util.Set;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.rfile.RFile;
import org.apache.accumulo.core.client.rfile.RFileWriter;
package test.rfile;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Map.Entry;
import java.util.Random;
import java.util.function.Function;