Jun Li lasclocker

@gangliao
gangliao / tf_data_hdfs.py
Last active Jan 17, 2019
Using TensorFlow's tf.data to load data from HDFS
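import tensorflow as tf
filenames = ["hdfs://10.152.104.73:8020/sogou/train_data/1_final.feature_transform"]
# Each dataset element is one line of text read from the listed files.
dataset = tf.data.TextLineDataset(filenames)
# TF 1.x API; in TF 2.x use tf.compat.v1.data.make_one_shot_iterator(dataset).
iterator = dataset.make_one_shot_iterator()
# No .batch() is applied, so each evaluation yields a single line.
next_batch = iterator.get_next()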
@mdespriee
mdespriee / LDAIncrementalExample.scala
Created Jun 29, 2017
Example of how to build LDA incrementally in Spark, with comparison to one-shot learning.
// This code relates to PR https://github.com/apache/spark/pull/17461
// It shows how to use the setInitialModel() param of LDA to build a model incrementally,
// and compares its performance (perplexity) with a model built in one shot
import scala.collection.mutable
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.clustering.{LDA, LDAModel}