Maziyar Panahi (maziyarpanahi)
Playing around with Big Data!
sparknlp-vit-pipeline-base.py
from sparknlp.annotator import *
from sparknlp.base import *
from pyspark.ml import Pipeline

imageAssembler = ImageAssembler() \
    .setInputCol("image") \
    .setOutputCol("image_assembler")

imageClassifier = ViTForImageClassification \
    .pretrained("image_classifier_vit_base_patch16_224") \
    .setInputCols(["image_assembler"]) \
    .setOutputCol("class")

pipeline = Pipeline(stages=[imageAssembler, imageClassifier])
hf-vit-pipeline-gpu-base.py
from transformers import ViTFeatureExtractor, ViTForImageClassification
from transformers import pipeline
import torch
device = "cuda:0" if torch.cuda.is_available() else "cpu"
print(device)
feature_extractor = ViTFeatureExtractor.from_pretrained('google/vit-base-patch16-224')
model = ViTForImageClassification.from_pretrained('google/vit-base-patch16-224')
model = model.to(device)
hf-vit-pipeline-bench.py
from transformers import pipeline
from tqdm import tqdm

# model, feature_extractor, and dataset are defined in the previous snippets
pipe = pipeline("image-classification", model=model, feature_extractor=feature_extractor, device=-1)

for batch_size in [1, 8, 32, 64, 128]:
    print("-" * 30)
    print(f"Streaming batch_size={batch_size}")
    for out in tqdm(pipe(dataset, batch_size=batch_size), total=len(dataset)):
        pass
hf-vit-pipeline-base.py
from transformers import ViTFeatureExtractor, ViTForImageClassification
from transformers import pipeline

feature_extractor = ViTFeatureExtractor.from_pretrained('google/vit-base-patch16-224')
model = ViTForImageClassification.from_pretrained('google/vit-base-patch16-224')

# device=-1 keeps the pipeline on CPU; pass a GPU index (e.g. 0) to use CUDA
pipe = pipeline("image-classification", model=model, feature_extractor=feature_extractor, device=-1)
hf-vit-pytorch.py
from transformers import ViTFeatureExtractor, ViTForImageClassification
from PIL import Image
import requests

url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

feature_extractor = ViTFeatureExtractor.from_pretrained('google/vit-base-patch16-224')
model = ViTForImageClassification.from_pretrained('google/vit-base-patch16-224')

inputs = feature_extractor(images=image, return_tensors="pt")
logits = model(**inputs).logits
# map the top logit back to an ImageNet label
print(model.config.id2label[logits.argmax(-1).item()])
gist:0d67a7ee858da20ce94317358d0f5a2a
Vivek Gupta Sep 2nd, 2020 at 10:02 AM
I am new to Spark NLP. I am writing a custom transformer that removes tokens of length <= 2 from the text. The transformer works, but it returns only an Array of String instead of a properly structured output. I am struggling to get output in the following structure:
ArrayType(
    StructType([
        StructField("annotatorType", StringType(), False),
        StructField("begin", IntegerType(), False),
        StructField("end", IntegerType(), False),
        StructField("result", StringType(), False),
        StructField("metadata", MapType(StringType(), StringType()), True)
    ])
)
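One way to approach this (a sketch, not the questioner's actual transformer code): build one dict per kept token with exactly the fields of the struct above, then register that function as a UDF whose return type is the ArrayType(StructType(...)) schema. The helper below is plain Python; the name `to_annotations`, the whitespace tokenization, and the metadata content are illustrative assumptions.

```python
def to_annotations(text):
    """Tokenize on whitespace, drop tokens of length <= 2, and emit
    dicts whose keys match the annotation struct fields above."""
    annotations = []
    offset = 0
    for token in text.split():
        begin = text.index(token, offset)  # character offset of this token
        end = begin + len(token) - 1       # inclusive end offset
        offset = begin + len(token)
        if len(token) > 2:
            annotations.append({
                "annotatorType": "token",
                "begin": begin,
                "end": end,
                "result": token,
                "metadata": {"sentence": "0"},
            })
    return annotations

# In Spark, wrap it with the schema so the column keeps the struct type:
#   clean_tokens = F.udf(to_annotations, annotation_schema)
#   df = df.withColumn("cleaned", clean_tokens(F.col("text")))

print(to_annotations("a be sea door"))
```

Because the UDF's return type is the full annotation schema rather than ArrayType(StringType()), downstream Spark NLP annotators can consume the column.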
wikipedia-iso-country-codes.csv
English short name lower case,Alpha-2 code,Alpha-3 code,Numeric code,ISO 3166-2
Afghanistan,AF,AFG,004,ISO 3166-2:AF
Åland Islands,AX,ALA,248,ISO 3166-2:AX
Albania,AL,ALB,008,ISO 3166-2:AL
Algeria,DZ,DZA,012,ISO 3166-2:DZ
American Samoa,AS,ASM,016,ISO 3166-2:AS
Andorra,AD,AND,020,ISO 3166-2:AD
Angola,AO,AGO,024,ISO 3166-2:AO
Anguilla,AI,AIA,660,ISO 3166-2:AI
Antarctica,AQ,ATA,010,ISO 3166-2:AQ
readme.md (created Sep 4, 2019, forked from baraldilorenzo/readme.md)
VGG-16 pre-trained model for Keras

## VGG16 model for Keras

This is the Keras model of the 16-layer network used by the VGG team in the ILSVRC-2014 competition.

It has been obtained by directly converting the Caffe model provided by the authors.

Details about the network architecture can be found in the following arXiv paper:

Very Deep Convolutional Networks for Large-Scale Image Recognition

K. Simonyan, A. Zisserman

zeppelin-pyspark-yarn.txt
DEBUG [2019-02-18 11:27:25,397] ({YARN application state monitor} ProtobufRpcEngine.java[invoke]:249) - Call: getApplicationReport took 2ms
DEBUG [2019-02-18 11:27:25,878] ({FIFOScheduler-Worker-1} InterpreterOutputStream.java[processLine]:81) - Interpreter output:import org.apache.spark.sql.functions._
INFO [2019-02-18 11:27:25,931] ({pool-6-thread-2} RemoteInterpreterServer.java[getStatus]:818) - job:null
DEBUG [2019-02-18 11:27:25,931] ({pool-6-thread-2} Interpreter.java[getProperty]:204) - key: zeppelin.spark.concurrentSQL, value: false
INFO [2019-02-18 11:27:25,931] ({pool-6-thread-2} RemoteInterpreterServer.java[getStatus]:818) - job:null
INFO [2019-02-18 11:27:25,931] ({pool-6-thread-2} RemoteInterpreterServer.java[getStatus]:818) - job:null
INFO [2019-02-18 11:27:25,931] ({pool-6-thread-2} RemoteInterpreterServer.java[getStatus]:818) - job:org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob@f7c36f41
INFO [2019-02-18 11:27:25,931] ({pool-6-thread-2} RemoteInterpreterServer.
zeppelin-pyspark-yarn-client.txt
INFO [2019-02-06 22:23:16,364] ({main} RemoteInterpreterServer.java[<init>]:148) - Starting remote interpreter server on port 0, intpEventServerAddress: IP_ADDRESS:36131
INFO [2019-02-06 22:23:16,384] ({main} RemoteInterpreterServer.java[<init>]:175) - Launching ThriftServer at IP_ADDRESS:46727
INFO [2019-02-06 22:23:16,549] ({pool-6-thread-1} RemoteInterpreterServer.java[createInterpreter]:333) - Instantiate interpreter org.apache.zeppelin.spark.SparkInterpreter
INFO [2019-02-06 22:23:16,553] ({pool-6-thread-1} RemoteInterpreterServer.java[createInterpreter]:333) - Instantiate interpreter org.apache.zeppelin.spark.SparkSqlInterpreter
INFO [2019-02-06 22:23:16,556] ({pool-6-thread-1} RemoteInterpreterServer.java[createInterpreter]:333) - Instantiate interpreter org.apache.zeppelin.spark.DepInterpreter
INFO [2019-02-06 22:23:16,560] ({pool-6-thread-1} RemoteInterpreterServer.java[createInterpreter]:333) - Instantiate interpreter org.apache.zeppelin.spark.PySparkInterpreter
INFO [2019-02-06 22:23:16,563] ({pool