@rjeli
Created June 14, 2016 19:54
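// Inputs, declared elsewhere:
//   Set<Document> added
//   Featurizer f
//   sc::parallelize : List<T> -> JavaRDD<T>
// Pair each document's id with its feature vector (runs locally, on the driver).
Stream<Tuple2<Integer, Vector>> featurizedStream = added.stream()
    .map(d -> new Tuple2<Integer, Vector>(d.getId(), f.vector(d)));
// Materialize the stream into a List and distribute it as an RDD.
JavaRDD<Tuple2<Integer, Vector>> featurized =
    sc.parallelize(featurizedStream.collect(Collectors.toList()));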
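The same map-then-collect pattern can be tried without a Spark cluster. The following is a minimal sketch, not the gist author's code: `Doc`, `featurize`, and the toy featurizer are hypothetical stand-ins, `Map.Entry` takes the place of `Tuple2`, and `double[]` takes the place of `Vector`.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FeaturizeSketch {
    // Hypothetical stand-in for the gist's Document type.
    static final class Doc {
        private final int id;
        private final String text;
        Doc(int id, String text) { this.id = id; this.text = text; }
        int getId() { return id; }
        String getText() { return text; }
    }

    // Pairs each document id with its feature vector on the calling thread.
    // The explicit type witness on map keeps the element type fully generic
    // instead of falling back to a raw pair type.
    static List<Map.Entry<Integer, double[]>> featurize(
            Set<Doc> added, Function<Doc, double[]> f) {
        return added.stream()
            .<Map.Entry<Integer, double[]>>map(
                d -> new SimpleImmutableEntry<>(d.getId(), f.apply(d)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<Doc> added = new LinkedHashSet<>();
        added.add(new Doc(1, "hello world"));
        added.add(new Doc(2, "spark"));

        // Toy featurizer (an assumption): a one-element vector holding the text length.
        Function<Doc, double[]> f = d -> new double[] { d.getText().length() };

        for (Map.Entry<Integer, double[]> e : featurize(added, f)) {
            System.out.println(e.getKey() + " -> " + e.getValue()[0]);
        }
    }
}
```

In the Spark version, the resulting list is what gets handed to `sc.parallelize`, so all featurization happens on the driver before any data is distributed.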