@blever
blever / gist:1361224
Created November 12, 2011 22:28
Loading legacy Writable data
object SequenceFileInput {
  /** Reading in from a sequence file:
   *  - specify the path to the sequence file
   *  - specify the Writable classes that have been serialised in the sequence file
   *  - provide functions that can get the value out of the Writables, plus the WireFormat definitions of K and V; this
   *    is all implicit so that for a lot of the common cases you don't have to fill it in (it's possible that the
   *    WireFormat args could be dropped and instead derived from the Writables themselves, given they implement
   *    write and readFields). */
  def fromSequenceFile[K, V, WtK <: WritableComparable, WtV <: Writable]
      (keyClass: Class[WtK], valueClass: Class[WtV], path: String)
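The preview above cuts off before the implicit parameter list, so the following is only a rough sketch of the idea described in the comment: implicitly supplied functions that pull a plain value out of a Writable. Everything here is an assumption made for illustration. `Writable`, `IntWritable` and `Text` are local stand-ins (not Hadoop's classes) so the snippet runs without Hadoop on the classpath, `FromWritable` and `fromRecords` are hypothetical names, and the WireFormat arguments are left out entirely.

```scala
// Local stand-ins for Hadoop's Writable types (assumption: NOT the real Hadoop classes).
trait Writable
final class IntWritable(val get: Int) extends Writable
final class Text(val value: String) extends Writable {
  override def toString: String = value
}

// Hypothetical type class: how to get a plain value V out of a Writable Wt.
trait FromWritable[Wt <: Writable, V] {
  def extract(w: Wt): V
}

object FromWritable {
  // Instances for the common cases, found implicitly via the companion object.
  implicit val intFromIntWritable: FromWritable[IntWritable, Int] =
    new FromWritable[IntWritable, Int] {
      def extract(w: IntWritable): Int = w.get
    }

  implicit val stringFromText: FromWritable[Text, String] =
    new FromWritable[Text, String] {
      def extract(w: Text): String = w.toString
    }
}

object SequenceFileSketch {
  // Hypothetical in-memory analogue of fromSequenceFile: the records are passed in
  // directly instead of being read from a path, and the extractors are implicit.
  def fromRecords[K, V, WtK <: Writable, WtV <: Writable](records: Seq[(WtK, WtV)])(
      implicit convK: FromWritable[WtK, K],
      convV: FromWritable[WtV, V]): Seq[(K, V)] =
    records.map { case (k, v) => (convK.extract(k), convV.extract(v)) }
}
```

With these definitions, a call such as `SequenceFileSketch.fromRecords[String, Int, Text, IntWritable](Seq((new Text("a"), new IntWritable(1))))` needs no explicit extractor arguments, which is the "common cases you don't have to fill in" behaviour the comment describes.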
@blever
blever / gist:1361178
Created November 12, 2011 21:58
Augmenting DataSource/DataSink with Converter idea
/* Something along the lines of Crunch's "Converter" but split out into separate input/output traits */
trait InputConverter[K, V, S] {
  def fromKeyValue(key: K, value: V): S
}
trait OutputConverter[K, V, S] {
  def toKeyValue(s: S): (K, V)
}
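As a minimal sketch of how the two traits might be used together, the example below implements both for a single record type. The traits are repeated (in valid Scala syntax) so the snippet stands alone. `WordCount` and `WordCountConverter` are hypothetical names introduced only for illustration and are not part of the gist.

```scala
// The two converter traits from the gist, repeated so the example is self-contained.
trait InputConverter[K, V, S] {
  def fromKeyValue(key: K, value: V): S
}
trait OutputConverter[K, V, S] {
  def toKeyValue(s: S): (K, V)
}

// Hypothetical record type used only for this example.
final case class WordCount(word: String, count: Int)

// One object can play both roles: build a WordCount from a key/value pair on input,
// and split it back into a key/value pair on output.
object WordCountConverter
    extends InputConverter[String, Int, WordCount]
    with OutputConverter[String, Int, WordCount] {

  def fromKeyValue(key: String, value: Int): WordCount = WordCount(key, value)

  def toKeyValue(s: WordCount): (String, Int) = (s.word, s.count)
}
```

Keeping the input and output sides as separate traits means a read-only source only has to supply `fromKeyValue`, and a write-only sink only `toKeyValue`.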