GCP Document Understanding

Google Vision is oriented toward its own Cloud Storage when it comes to big data. The API is extensive, although the SDK doesn't expose all of its features.

The advantages of GCP are a wide variety of pre-defined models and the ability to upload your own. Human-labeled models can give better results, although the price is higher and there is no free tier.

What's interesting

I think the killer feature is AutoML (might take some fun away from Pedro); it can simplify training custom models.
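
As a taste of what that can look like in code, here's a rough sketch of calling an already-trained AutoML Vision classification model from Python. The project, region, model ID, and file path are placeholders, and the exact client surface may differ by SDK version:

from google.cloud import automl_v1beta1 as automl

prediction_client = automl.PredictionServiceClient()
# Placeholder identifiers for an already-trained AutoML model
model_full_id = prediction_client.model_path('my-project', 'us-central1', 'my-model-id')

with open('image.jpg', 'rb') as image_file:
    payload = {'image': {'image_bytes': image_file.read()}}

# Sends the image to the custom model and prints predicted labels with scores
response = prediction_client.predict(model_full_id, payload)
for result in response.payload:
    print(result.display_name, result.classification.score)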

- Landmark Detection
- Logo Detection - could be useful for business aggregators?
- Syntax Analysis
- Content Classification
- Automatic description?
- Labeling

Examples

As a test scenario, I used the Python SDK for Google Vision and Google NLP.

Google Vision

Running a PDF failed with a bad-data error. There are ways to process PDFs, but they look more like cURL requests and write the results back to Google Storage:

https://cloud.google.com/vision/docs/pdf
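
A minimal sketch of that flow, based on the doc above: the PDF is read from Cloud Storage and the JSON results are written back to a bucket (the gs:// URIs here are placeholders).

from google.cloud import vision
from google.cloud.vision import types

client = vision.ImageAnnotatorClient()

# Input PDF and output prefix both live in Cloud Storage (placeholder URIs)
input_config = types.InputConfig(
    gcs_source=types.GcsSource(uri='gs://my-bucket/document.pdf'),
    mime_type='application/pdf')
output_config = types.OutputConfig(
    gcs_destination=types.GcsDestination(uri='gs://my-bucket/ocr-output/'),
    batch_size=2)

feature = types.Feature(type=vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION)
request = types.AsyncAnnotateFileRequest(
    features=[feature], input_config=input_config, output_config=output_config)

# Kicks off a long-running operation; JSON results land in the output bucket
operation = client.async_batch_annotate_files(requests=[request])
operation.result(timeout=300)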

Images work fine - Text detection

Super intuitive. Not that sophisticated, but simplicity has its own value. The API returns bounding boxes that split the text into lines, as well as one box per detected word.


import io
import os
from google.cloud import vision
from google.cloud.vision import types

client = vision.ImageAnnotatorClient()
text_img = os.path.join(os.path.dirname(__file__), 'text.jpg')  # placeholder path to an image containing text

with io.open(text_img, 'rb') as image_file:
    content = image_file.read()

image = types.Image(content=content)
# Performs text detection on the image file
resp = client.text_detection(image=image)
print('\n'.join([d.description for d in resp.text_annotations]))
# Response contains bounding boxes for the full text and for each detected word, in a format like:
# text_annotations {
#   description: "curious"
#   bounding_poly {
#     vertices {
#       x: 179
#       y: 13
#     }
#    .........
#     vertices {
#       x: 179
#       y: 54
#     }
#   }
# }
# I am curious about
# area-filling text
# rendering options
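
To make that structure concrete, a small hypothetical follow-up walks the same resp and prints each word with its box corners (index 0 of text_annotations is the full text block):

# Hypothetical helper: print each detected word with its bounding-box corners
for annotation in resp.text_annotations[1:]:
    corners = [(v.x, v.y) for v in annotation.bounding_poly.vertices]
    print(annotation.description, corners)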

Image labeling - I'd say impressive. I haven't seen labeling from AWS, but the model returns a wider set of labels than I could :)

just_img = os.path.join(os.path.dirname(__file__), 'jaguar.jpg')

with io.open(just_img, 'rb') as image_file:
    content = image_file.read()

image = types.Image(content=content)
# Performs label detection on the image file
resp = client.label_detection(image=image)
print(resp)
print('Labels')
print('\n'.join([d.description for d in resp.label_annotations]))
# Response is a set of objects like:
# label_annotations {
#   mid: "/m/01280g"
#   description: "Wildlife"
#   score: 0.98497146368
#   topicality: 0.98497146368
# }
# ...........
# Haven't seen AWS response but this one is quite impressive.
# Wildlife
# Terrestrial animal
# Whiskers
# Mammal
# Felidae
# Jaguar
# Facial expression
# Leopard
# Snout
# Roar
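
Since each label comes with a score, a tiny hypothetical follow-up is to keep only the confident ones:

# Hypothetical: keep only labels the model is at least 90% confident about
confident_labels = [l.description for l in resp.label_annotations if l.score >= 0.9]
print(confident_labels)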

Google NLP

It has a set of pre-defined models, including sentiment analysis.

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types as nlp_types

nlp_client = language.LanguageServiceClient()

# `text` is the string to analyze, e.g. the text detected from the image above
document = nlp_types.Document(
    content=text,
    type=enums.Document.Type.PLAIN_TEXT)

# Detects the sentiment of the text
sentiment = nlp_client.analyze_sentiment(document=document).document_sentiment

print('Sentiment: {}'.format(sentiment))
# Sentiment: magnitude: 0.300000011921
# score: 0.300000011921
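
The same client and document can be reused for the other pre-defined models; a hedged sketch of entity analysis, for example:

# Hypothetical follow-up: entity analysis on the same document
entities = nlp_client.analyze_entities(document=document).entities
for entity in entities:
    print(entity.name, enums.Entity.Type(entity.type).name, entity.salience)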