@gabidavila
Created September 30, 2018 00:36
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Integration"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"Composer\\Autoload\\ClassLoader {#144}"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"require 'vendor/autoload.php';"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"\"https://storage.googleapis.com/demo-cloud-vision/article.jpeg\""
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"use Google\\Cloud\\Language\\LanguageClient;\n",
"use Google\\Cloud\\Translate\\TranslateClient;\n",
"use Google\\Cloud\\Vision\\V1\\ImageAnnotatorClient;\n",
"\n",
"$article = \"https://storage.googleapis.com/demo-cloud-vision/article.jpeg\";"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Article](https://storage.googleapis.com/demo-cloud-vision/article.jpeg)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## OCR"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"239 texts found.\n",
"\n"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"Inscrições para o Enem começam\n",
"nesta segunda e vão até 18 de maio\n",
"SÃO PAULo Começa nesta se-\n",
"gunda (7) o período de ins-\n",
"crições para o Enem (Exame\n",
"Nacional do Ensino Médio)\n",
"Os candidatos têm até o dia\n",
"18 de maio para se inscrever\n",
"pela Página do Participante\n",
"no site do Inep\n",
"O custo da inscrição neste\n",
"ano é de R$ 82, mesmo valor\n",
"de 2017. O pagamento pode\n",
"ser feito até 23 de maio em\n",
"agências bancárias, nos Cor-\n",
"reios ou em lotéricas.\n",
"O exame acontece em 4 e\n",
"11 de novembro, dois domin\n",
"gos. No primeiro dia, as pro-\n",
"vas serão de redação, ciências\n",
"humanas e linguagens, com\n",
"tempo de cinco horas e meia.\n",
"No segundo dia, os candi\n",
"datos terão cinco horas pa-\n",
"ra completar as questões de\n",
"matemática e ciências natu\n",
"rais, meia hora a mais do que\n",
"no ano passado\n",
"Mesmo quem teve o pedi\n",
"do de isenção da taxa aceito\n",
"deve fazer a inscrição online\n",
"visto que a aprovaçãonão ga-\n",
"rante a participação no Enem.\n",
"Em 2018, mais de 3 milhões\n",
"de pessoas solicitaram o di-\n",
"reito de não pagar para fazer\n",
"as provas.\n",
"Para acompanhar as infor\n",
"mações sobre o exame, os\n",
"candidatos podem baixar o\n",
"aplicativo para celular Enem\n",
"2018, que está disponível gra\n",
"tuitamente no Google Play e\n",
"na App Store.\n",
"Anota obtida nas provas é o\n",
"principal meio de acesso dos\n",
"estudantes às universidades\n",
"públicas do país.\n",
"\n"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"null"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
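"// Create the Vision client and run text detection (OCR) on the image URL.\n",
"$imageAnnotator = new ImageAnnotatorClient();\n",
"$response = $imageAnnotator->textDetection($article);\n",
"$texts = $response->getTextAnnotations();\n",
"\n",
"printf('%d texts found.' . PHP_EOL, count($texts));\n",
"\n",
"// The first annotation holds the full detected text; the rest are individual words.\n",
"$ocrText = $texts[0]->getDescription();\n",
"\n",
"echo $ocrText;\n",
"\n",
"// Release the client's connection resources.\n",
"$imageAnnotator->close();"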
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Translating the OCR text"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"Registration for Enem begins on this Monday and runs until May 18 SÃO PAULO The Enem (National High School Examination) exam begins in the second (7). Candidates have until May 18 th sign up for the Participant's page at Inep's website The cost of enrollment for this year is R $ 82, the same amount as in 2017. Payment can be made until May 23 at bank branches, offices or in lottery. The exam takes place on November 4 and 11, two majors. On the first day, the essays will be in writing, humanities and languages, with a time of five and a half hours. On the second day, candidates will have five hours to complete math and natural sciences questions, half an hour more than last year. Even those who have asked for an exemption from the accepted fee must register online since the approval of the participation in Enem. In 2018, more than 3 million people applied for the right not to pay to take the tests. To track information about the exam, applicants can download the Enem 2018 mobile application, which is available in Google Play and the App Store. Anota obtained in the tests is the main means of access of the students to the public universities of the country.\n"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
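"$translate = new TranslateClient();\n",
"\n",
"// Translate the OCR text to English; with no 'source' option, the source language is auto-detected.\n",
"$result = $translate->translate($ocrText, [\n",
" 'target' => 'en'\n",
"]);\n",
"\n",
"// The result array holds the translated string under the 'text' key.\n",
"$translatedText = $result['text'];\n",
"\n",
"echo $translatedText;"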
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Classifying the content"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"Name: /Jobs & Education/Education/Standardized & Admissions Tests\n",
"Confidence: 0.5\n",
"\n"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"$language = new LanguageClient();\n",
"\n",
"// Classify the translated English text into content categories.\n",
"$categories = $language->classifyText($translatedText)->categories();\n",
"\n",
"// Print each category name with its confidence score.\n",
"foreach ($categories as $category) {\n",
" echo \"Name: {$category['name']}\" . PHP_EOL . \"Confidence: {$category['confidence']}\" . PHP_EOL;\n",
"}"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "PHP",
"language": "php",
"name": "jupyter-php"
},
"language_info": {
"file_extension": ".php",
"mimetype": "text/x-php",
"name": "PHP",
"pygments_lexer": "PHP",
"version": "7.2.10-0ubuntu0.18.04.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}