Pramod Biligiri (pramodbiligiri): GitHub gists
pramodbiligiri / rust-closure-invoke.rs
Created September 1, 2020 07:56
Invokes a closure in Rust
use std::collections::HashMap;

struct MapHolder {
    map: HashMap<String, String>,
}

fn main() {
    let mut map = HashMap::new();
    map.insert("key1".to_string(), "value1".to_string());
    let val = MapHolder { map };

    // The gist preview cuts off here; the closure below is a reconstruction
    // (an assumption) matching the gist's title.
    let get = |key: &str| val.map.get(key).cloned();
    println!("{:?}", get("key1")); // prints Some("value1")
}
pramodbiligiri / UseBertEmbeddings.scala
Last active March 4, 2020 12:14
Full program to use BERT embeddings
package com.pramodb.bert
import com.johnsnowlabs.nlp.annotators._
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings._
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
// SparkNLP version: "com.johnsnowlabs.nlp" % "spark-nlp_2.11" % "2.4.0",
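The preview stops after the imports, so the pipeline itself is not visible. A minimal sketch of how these pieces typically fit together in Spark NLP 2.4 follows; the object name, input data, and column names are assumptions, with the model name and location taken from the BertEmbeddingCode.scala gist below.

package com.pramodb.bert

import com.johnsnowlabs.nlp.annotators._
import com.johnsnowlabs.nlp.base.DocumentAssembler
import com.johnsnowlabs.nlp.embeddings._
import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.SparkSession

object UseBertEmbeddings {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("UseBertEmbeddings")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Toy input; judging by its imports, the original gist builds its
    // DataFrame from Rows with an explicit StructType schema instead.
    val data = Seq("Spark NLP computes BERT embeddings").toDF("text")

    // Wrap raw text into Spark NLP's document annotation type.
    val documentAssembler = new DocumentAssembler()
      .setInputCol("text")
      .setOutputCol("document")

    // Split documents into tokens, which BertEmbeddings consumes.
    val tokenizer = new Tokenizer()
      .setInputCols("document")
      .setOutputCol("token")

    // Pretrained BERT model; arguments mirror the snippet in the next gist.
    val embeddings = BertEmbeddings.pretrained("bert_base_cased", "en", "public/models")
      .setInputCols("document", "token")
      .setOutputCol("embeddings")

    val pipeline = new Pipeline()
      .setStages(Array(documentAssembler, tokenizer, embeddings))
    pipeline.fit(data).transform(data)
      .select("embeddings.embeddings")
      .show(truncate = false)
  }
}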
pramodbiligiri / BertEmbeddingCode.scala
Created March 4, 2020 07:31
Test code to load BertEmbeddings in Scala
// Based on: https://gist.github.com/nlittlepoole/476242b479eb3bd70f6f3d44b2349322#file-custom_pipeline-py
val embeddings = BertEmbeddings.pretrained("bert_base_cased", "en", "public/models")
  .setInputCols("document", "token")
  .setOutputCol("embeddings")
package com.megh.nlp

import org.apache.spark.ml.Pipeline
import org.apache.spark.sql.SparkSession
import org.slf4j.LoggerFactory

object NLPTokenizer {
  private val log = LoggerFactory.getLogger(NLPTokenizer.getClass)
pramodbiligiri / serialization-exception.txt
Created December 3, 2019 09:47
Exception when trying to use SparkNLP Tokenizer
User class threw exception: org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:198)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:156)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
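The preview ends inside Spark's "Job aborted" wrapper, so the underlying cause is not visible here. A common trigger for serialization failures in jobs like the NLPTokenizer above is a task closure capturing a non-serializable field such as an SLF4J Logger; the sketch below shows the usual mitigation and is an assumption, not the confirmed cause of this particular trace.

import org.apache.spark.sql.SparkSession
import org.slf4j.LoggerFactory

object SerializationSafeJob {
  // An SLF4J Logger is not Serializable. Marking it @transient lazy keeps it
  // out of serialized task closures; each executor re-creates it on first use.
  @transient private lazy val log = LoggerFactory.getLogger(getClass)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SerializationSafeJob")
      .master("local[*]")
      .getOrCreate()

    val lengths = spark.sparkContext
      .parallelize(Seq("one", "two", "three"))
      .map(_.length) // executor-side code references no driver-only state
      .collect()

    log.info(s"token lengths: ${lengths.mkString(",")}") // runs on the driver
  }
}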
pramodbiligiri / docker-install-sudo.txt
Created December 7, 2017 07:43
Install Docker on Ubuntu and run it without sudo
# Add Docker's official GPG key (assumed missing from the preview; it
# precedes add-apt-repository in Docker's install guide).
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
apt-cache policy docker-ce
sudo apt-get install -y docker-ce
sudo systemctl status docker
# Let the current user run docker without sudo.
sudo usermod -aG docker ${USER}
su - ${USER}