Mark H. Butler butlermh

  • Santa Clara, United States
@butlermh
butlermh / CorpusGenerator.java
Created June 1, 2011 19:11
Version of CorpusGenerator that supports FileFilter, recursive directory traversal, and NIO for reading files from the local filesystem.
/**
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
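The preview above is cut off before any code appears, so here is a minimal sketch of the three features the description names (a FileFilter, recursion into subdirectories, and NIO-based reading). It is not the gist's actual CorpusGenerator: the class name `CorpusFileWalker` and its methods are hypothetical, and the use of `java.nio.file.Files` for the NIO part is an assumption.

```java
import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the gist's CorpusGenerator.
class CorpusFileWalker {

    /** Recursively collects the files under root that the filter accepts. */
    static List<File> collect(File root, FileFilter filter) {
        List<File> result = new ArrayList<>();
        // listFiles() returns null if root is not a directory or cannot be read.
        File[] children = root.listFiles();
        if (children == null) {
            return result;
        }
        for (File child : children) {
            if (child.isDirectory()) {
                result.addAll(collect(child, filter)); // descend into subdirectory
            } else if (filter.accept(child)) {
                result.add(child);
            }
        }
        return result;
    }

    /** Reads a whole file as UTF-8 text via java.nio.file (an assumed choice). */
    static String read(File file) throws IOException {
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    }
}
```

A caller might pass a filter such as `f -> f.getName().endsWith(".txt")` to `collect` and then call `read` on each result.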
butlermh / CorpusGeneratorQueue.java
Created June 2, 2011 21:10
Threaded Implementation of Corpus Generator
/**
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
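This preview is also truncated before the code. Going only by the file name (CorpusGeneratorQueue) and the word "Threaded", one plausible shape is a producer/consumer arrangement over a `BlockingQueue`. The sketch below is an assumption about that design, not the gist's code: `QueuedCorpusSketch`, `process`, the queue capacity, and the upper-casing stand-in for real document processing are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; the real CorpusGeneratorQueue may be organized differently.
class QueuedCorpusSketch {

    // Sentinel telling a worker to stop; compared by identity, so a real
    // document with the same text would not be mistaken for it.
    private static final String POISON = new String("<end-of-input>");

    /** Feeds documents through a bounded queue to a pool of worker threads. */
    static List<String> process(List<String> documents, int workers) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(16); // capacity is arbitrary
        List<String> output = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < workers; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String doc = queue.take(); // blocks until a document is available
                        if (doc == POISON) {
                            break;
                        }
                        // Placeholder for real per-document work.
                        output.add(doc.toUpperCase(Locale.ROOT));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            threads.add(worker);
        }

        for (String doc : documents) {
            queue.put(doc); // blocks when the queue is full
        }
        for (int i = 0; i < workers; i++) {
            queue.put(POISON); // one sentinel per worker
        }
        for (Thread worker : threads) {
            worker.join();
        }
        return output; // order depends on thread scheduling
    }
}
```

The bounded queue keeps the producer from reading far ahead of the workers, which matters when the documents are large files.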
butlermh / ivyexample2
Created June 8, 2011 13:02
How to get Ivy to resolve from a local Maven repository
<property name="local-maven2-pattern"
value="${user.home}/.m2/repository/[organisation]/[module]/[revision]/[module]-[revision]"
override="false" />
...
<resolvers>
<chain ...
<filesystem
name="local-maven2"
m2compatible="true" >
<ivy pattern="${local-maven2-pattern}.pom"/>
butlermh / ivyexample3
Created June 8, 2011 13:31
Using configurations in Ivy
<ivy-module version="1.0">
<info .... />
<configurations>
<conf name="job" description=".job file dependency"/>
<conf name="test" description="Test dependency"/>
<conf name="default" extends="job" description="Dependency required by users of .jar"/>
</configurations>
<publications .... />
<dependencies defaultconf="*->*,!sources,!javadoc">
<dependency org="uk.ac.gate" name="gate-core" rev="6.1" conf="job->default(*),optional(*)"/>
butlermh / antexample1
Created June 8, 2011 14:01
Using configurations in Ant
<!-- resolve the job and test dependencies to different directories -->
<target name="resolve" depends="clean-lib" description="--> resolve and retrieve dependencies with ivy">
<mkdir dir="${job.lib.dir}"/> <!-- not usually necessary, ivy creates the directory IF there are dependencies -->
<mkdir dir="${test.lib.dir}"/>
<ivy:resolve file="${ivy.file}" transitive="true"/>
<ivy:retrieve pattern="${job.lib.dir}/[artifact]-[revision].[ext]" conf="job" />
<ivy:retrieve pattern="${test.lib.dir}/[artifact]-[revision].[ext]" conf="test" />
</target>
butlermh / antexample2
Created June 8, 2011 14:50
Using uptodate to speed up builds
<target name="check.retrieve.necessary" description="Only retrieve jars with Ivy if necessary">
<ivy:resolve file="${ivy.file}" transitive="true"/>
<uptodate property="libs.uptodate">
<srcfiles dir="." includes="${ivy.file}"/>
<mapper type="merge" to="${lib.dir}/.done"/>
</uptodate>
</target>
<target name="resolve" depends="check.retrieve.necessary" description="--> resolve and retrieve dependencies with ivy"
unless="libs.uptodate">
butlermh / antexample3
Created June 8, 2011 14:55
Using uptodate to speed up builds 2
<target name="compile" depends="resolve" description="--> compile the project" unless="module.uptodate">
<mkdir dir="${classes.dir}" />
<javac srcdir="${src.dir}" destdir="${classes.dir}" classpathref="lib.path.id" debug="true" includeantruntime="false"/>
</target>
<target name="jar" depends="version, compile, copyclasses" description="--> make a jar file for this project" unless="module.uptodate">
<jar destfile="${jar.file}">
<fileset dir="${classes.dir}" />
<manifest>
<attribute name="Built-By" value="${user.name}"/>
butlermh / mavenexample1
Created June 8, 2011 23:00
Jena Schemagen in Maven
<build>
<plugins>
<plugin>
<artifactId>maven-antrun-plugin</artifactId>
<executions>
<execution>
<phase>compile</phase>
<configuration>
<tasks>
<property name="runtime_classpath" refid="maven.runtime.classpath" />
butlermh / hadoopexample1
Created June 8, 2011 23:26
Changing imports for new Hadoop API
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Mapper;
BECOME
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
BECOME
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
butlermh / hadoopexample2
Created June 8, 2011 23:29
Changing Mapper for new Hadoop API 1
public class MyMapper extends MapReduceBase implements
Mapper<InputKey, InputValue, OutputKey, OutputValue> {
BECOMES
public class MyMapper extends
Mapper<InputKey, InputValue, OutputKey, OutputValue> {