Søren Lund soren

soren / wc_mapper.pl
Created November 22, 2013 07:39
A Perl Word Count mapper script. Can be used as a mapper in Hadoop using the Streaming interface. Tested with Java 1.6 and Hadoop 1.0.4.
#!/usr/bin/env perl
use warnings;
use strict;
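# Emit one "word<TAB>1" record per token, the key/value format Hadoop Streaming reads.
while (<>) {
    chomp;
    # grep drops the empty leading field that split returns when a line starts with a delimiter.
    print lc $_, "\t1\n" foreach grep { length } split /[\s.,:;!?]+/;
}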
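For comparison, the mapper's per-line logic can be sketched in Java. This is a hypothetical port and not part of the gist: the class name `WcMapperSketch` and the method `mapLine` are invented for illustration, and the sketch assumes that empty tokens should be skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical Java port of the Perl mapper's per-line logic (not part of the gist).
final class WcMapperSketch {

    // Split on the same delimiter class the Perl script uses and
    // return one "word<TAB>1" record per non-empty token.
    static List<String> mapLine(String line) {
        List<String> records = new ArrayList<>();
        for (String token : line.split("[\\s.,:;!?]+")) {
            if (token.isEmpty()) {
                continue; // a leading delimiter yields an empty first token
            }
            records.add(token.toLowerCase(Locale.ROOT) + "\t1");
        }
        return records;
    }

    public static void main(String[] args) {
        // Print the records a streaming mapper would write for one sample line.
        for (String record : mapLine("Hello, world! Hello?")) {
            System.out.println(record);
        }
    }
}
```

Each record is a key and a count separated by a tab, which is the shape a streaming reducer would then sum per key.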
soren / MyWordCount.java
Last active December 26, 2015 21:08
A Hadoop Word Count example using my own map and reduce classes. Tested with Java 1.6 and Hadoop 1.0.4.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
soren / WordCount.java
Created October 29, 2013 11:51
A Hadoop Word Count example using built-in map and reduce classes. Tested with Java 1.6 and Hadoop 1.0.4.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
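The preview above is cut off before the job setup, so the following is only a plain-Java sketch of the data flow such a job performs, with no Hadoop dependency. It assumes whitespace tokenization (which is how `TokenCounterMapper` splits its input) and a reducer that sums the counts per word; the class name `WordCountFlowSketch` and the method `count` are invented for illustration.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java illustration of the word count data flow (assumption: not the gist's code).
final class WordCountFlowSketch {

    // "Map" each whitespace-separated token to (token, 1) and
    // "reduce" by summing the ones per token. TreeMap keeps keys sorted,
    // similar to how reducer input arrives ordered by key.
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                totals.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = count(Arrays.asList("a b a", "b c"));
        // Print in the tab-separated form a text output format would use.
        totals.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

In a real job the map and reduce steps run on different nodes with a shuffle in between; here they are collapsed into one loop to keep the sketch short.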
soren / JavaCodeSample.java
Created October 7, 2013 10:47
Example of my Java code formatting rules.
package net.twonky.demo;
import java.util.HashMap; // Always import fully qualified packages
import java.util.Map; // The imports should be sorted alphabetically
// import third party classes here, e.g. import oracle.jdbc.pool.OracleConnectionCacheImpl;
// import your own classes here, e.g. import net.twonky.demo.StringUtil;
/**
soren / JdbcCheck.java
Created January 16, 2013 08:31
A simple Java utility to test database connections using JDBC.
import java.io.Console;
import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashMap;
import java.util.Map;
public class JdbcCheck {
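The `JdbcCheck` preview stops at the class declaration, so its actual logic is not visible here. As a rough idea of what a JDBC connection test can look like, here is a minimal sketch using only `java.sql`; the class name `JdbcProbeSketch`, the method `probe`, and the 5-second validity timeout are assumptions, not taken from the gist.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Minimal connection probe (hypothetical; the gist's own implementation is not shown).
final class JdbcProbeSketch {

    // Try to open a connection and report the outcome as a short message.
    static String probe(String url, String user, String password) {
        try (Connection connection = DriverManager.getConnection(url, user, password)) {
            // isValid asks the driver to verify the connection within 5 seconds.
            return connection.isValid(5) ? "OK" : "FAILED: connection opened but not valid";
        } catch (SQLException e) {
            // Covers a missing driver, a bad URL, wrong credentials, and network errors.
            return "FAILED: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        if (args.length < 3) {
            System.err.println("usage: JdbcProbeSketch <jdbc-url> <user> <password>");
            return;
        }
        System.out.println(probe(args[0], args[1], args[2]));
    }
}
```

The matching JDBC driver jar has to be on the classpath; without it, `DriverManager` reports that no suitable driver was found and the probe returns a `FAILED` message.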
import java.util.HashMap;
import java.util.Map;
import com.google.common.collect.ImmutableMap;
public class PrintOS {
private static final Map<String, String> MSG = ImmutableMap
.of("unix", "This is a UNIX box and therefore good.",
"windows", "This is a Windows box and therefore bad.",
soren / scancalc.html
Created July 6, 2012 06:51
ScanCalc - a small JavaScript utility to convert between centimeters (cm), pixels and dots/inch (dpi).
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>ScanCalc</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style type="text/css">
table { border-collapse:collapse; }
th { background-color: #ccc; }
th,td { padding:3px; }
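The ScanCalc script itself is not visible in this preview, but the conversion it describes follows from one inch being exactly 2.54 cm: pixels = cm / 2.54 × dpi. A small Java sketch of the three conversions is given below; the class and method names are invented for illustration and are not ScanCalc's own.

```java
// Conversions between centimeters, pixels and dpi (illustrative; names are not from ScanCalc).
final class ScanCalcSketch {

    static final double CM_PER_INCH = 2.54;

    // Pixels needed to cover a length in cm at a given resolution.
    static double pixels(double cm, double dpi) {
        return cm / CM_PER_INCH * dpi;
    }

    // Physical length in cm of a pixel count at a given resolution.
    static double centimeters(double pixels, double dpi) {
        return pixels / dpi * CM_PER_INCH;
    }

    // Resolution at which a pixel count spans a length in cm.
    static double dpi(double pixels, double cm) {
        return pixels / (cm / CM_PER_INCH);
    }

    public static void main(String[] args) {
        System.out.println(pixels(21.0, 300));       // pixel width of a 21 cm page scanned at 300 dpi
        System.out.println(centimeters(600, 300));   // length of 600 pixels at 300 dpi
        System.out.println(dpi(300, 2.54));          // resolution when 300 pixels span one inch
    }
}
```

Any two of the three quantities determine the third, which is why one method per unknown is enough.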
soren / java_class_version.pl
Created March 17, 2012 11:19
A Perl script that determines the version of a Java class file.
#!/usr/bin/env perl
use warnings;
use strict;
use Pod::Usage;
=head1 NAME
java_class_version.pl - determines the version of a Java class file
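The rest of the Perl script is not shown, but the information it needs sits in the first eight bytes of every class file: the magic number `0xCAFEBABE`, followed by the minor and major version as two unsigned 16-bit big-endian values. The Java sketch below reads that header; the class name `ClassVersionSketch` and the method `readVersion` are invented for illustration and do not mirror the script's structure.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Reads the version fields from a class file header (illustrative sketch).
final class ClassVersionSketch {

    // Returns {major, minor}; throws if the stream does not start with the class file magic.
    static int[] readVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in); // DataInputStream reads big-endian
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file (bad magic number)");
        }
        int minor = data.readUnsignedShort(); // minor_version comes first
        int major = data.readUnsignedShort(); // then major_version
        return new int[] { major, minor };
    }

    public static void main(String[] args) throws IOException {
        // Hand-built header of a class compiled for Java 6 (major version 50 = 0x32).
        byte[] header = { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x32 };
        int[] version = readVersion(new ByteArrayInputStream(header));
        System.out.println("major=" + version[0] + " minor=" + version[1]);
    }
}
```

To inspect a real file, pass a `java.io.FileInputStream` for the `.class` file instead of the in-memory header.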
soren / jspwiki-add.sh
Created March 5, 2012 09:57
Create a new JSPWiki instance
#!/bin/bash
SRCDIR=/root/build/JSPWiki
WIKIDIR=/jspwiki
CONFDIR=/etc/jspwiki
DEPLOYDIR=/opt/java/tomcat/conf/Catalina/localhost
# Quote expansions so empty or space-containing arguments are handled correctly.
test -z "$1" && echo "USAGE: $(basename "$0") wikiname" && exit 1
test -d "$WIKIDIR/$1" && echo "The name $1 is already used" && exit 1
soren / dev_setup.bat
Created December 6, 2011 09:38
An init script for the default "shell" on Windows: cmd.
@echo off
rem This is an init script for the default "shell" on Windows: cmd.
rem
rem You can try it out by calling cmd /k c:\path\to\file\.cmdrc.bat
rem
rem For a more permanent solution do the following:
rem
rem 1) Copy this file to %USERPROFILE%
rem 2) Set the AutoRun key like this: