Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save norisk-marketing/c01b6afb3f17dfc807dd05a859d000b2 to your computer and use it in GitHub Desktop.
Save norisk-marketing/c01b6afb3f17dfc807dd05a859d000b2 to your computer and use it in GitHub Desktop.
norisk_FriendsOfSearch_PythonScripts.ipynb
Display the source blob
Display the rendered blob
Raw
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "norisk_FriendsOfSearch_PythonScripts.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [
"QdAg7oHKq6Bz",
"U0KkBAtRddP1",
"a60zDUMiOEXD",
"mgdj0ug-hp7l",
"eXeKf2JTExIu",
"omSN3SxBEr8o"
],
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/norisk-marketing/c01b6afb3f17dfc807dd05a859d000b2/norisk_friendsofsearch_pythonscripts.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"metadata": {
"id": "Ftl_U-chfVC2",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"![FOS](https://user-images.githubusercontent.com/11938030/51072997-fc7e7080-166a-11e9-9274-2a1c2bb7e9da.png)\n",
"\n",
"\n",
"# INTRO\n",
"\n",
"Welcome to the norisk **colab notebook** to play around with python code and machine learning libraries. \n",
"\n",
"The notebook was presented at the **Friends of Search Conferece 2019** in Amsterdam by Christopher Gutnecht. We hope you enjoy it!\n",
"\n",
"The slide deck can be found at: [https://slideshare.net/norisk](https://slideshare.net/norisk).\n",
"\n",
"Feel free to post any content with hashtag **#FOS19**.\n",
"\n",
"\n",
"\n",
"---\n",
"---\n",
"---\n",
"\n",
"\n",
"![norisk Group](https://user-images.githubusercontent.com/11938030/51073029-4ebf9180-166b-11e9-9d9e-5bfc6b2e0bf5.png)  ![Chris](https://user-images.githubusercontent.com/6991865/51346860-ed4e5700-1a9e-11e9-9132-01beff164a55.png)\n",
"\n",
"\n",
"In case of any questions or if you wish to start a conversation, reach out to me: cgutknecht@noriskshop.de (Twitter: @chrisgutknecht).\n",
"\n",
"Thanks and happy optimizing!\n",
"\n",
"Chris\n",
"\n",
"\n",
"\n",
"\n",
"---\n",
"---\n",
"---\n",
"## INDEX \n",
"\n",
"\n",
"\n",
"### <a href=\"#scrollTo=e7XHLlpbcibc\">1. Part I: Query Understanding</a>\n",
"\n",
"> #### 1.1 Typo Correction\n",
"> #### 1.2 Knowledge Graph API Search\n",
"> #### 1.3 Target Campaign Classification\n",
"\n",
"\n",
"\n",
"###<a href=\"#scrollTo=U40nw2hrdSM9\">2. Part II: Ngram Analysis with Big Query</a>\n",
"\n",
"> #### 2.1 Setting up the Google Ads Transfer\n",
"> #### 2.2 Loading Data via Google Cloud Package\n",
"> #### 2.3 Loading Data with Pandas\n",
"> #### 2.4 Using BigQuery ML for Prediction\n",
"\n",
"\n",
"###<a href=\"#scrollTo=utgKaJZte1KV\">3. Part III: Text Summarization</a>\n",
"\n",
"> #### 3.1 Text Summarization with Sumy\n",
"> #### 3.2 Keyphrase Extraction (Microsoft Azure)\n",
"\n"
]
},
{
"metadata": {
"id": "e7XHLlpbcibc",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"---\n",
"---\n",
"---\n",
"\n",
"#PART 1. QUERY UNDERSTANDING\n",
"\n",
"\n",
"\n"
]
},
{
"metadata": {
"id": "pk_lNxK0bVYc",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"\n",
"##1.1 Typo Check with Google Custom Search Engine\n",
"\n",
"\n"
]
},
{
"metadata": {
"id": "ukkG3CvFaJOQ",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- Create a Google Cloud Console account: https://console.cloud.google.com\n",
"- Create a project and a custom search engine (CSE): https://cse.google.com/cse/all \n",
"- Refer to the JSON API Documentation for more details: https://developers.google.com/custom-search/v1/\n",
"- Example API Call for \"[Topshpo](https://www.googleapis.com/customsearch/v1?cx=011253106589242236363:6w-avymda-w&key=AIzaSyB8do1KTfJ4jYTdZfKSU8OvHZTqf5AQSTw&gl=de&q=topshpo&alt=json&num=5)\"\n",
"---\n",
"- Pros: Excellent for **search queries**\n",
"- Cons: Can get expensive (1€ / 1000 requests), use as last . For complete sentences try [Bing Spell Check API](https://docs.microsoft.com/en-us/azure/cognitive-services/luis/luis-tutorial-bing-spellcheck)\n",
"\n",
"\n",
"---\n",
"> --> **USE CASE**: Clean misstyped queries before adding them \n",
"\n",
"> --> Authorization needed? **YES**\n",
"\n",
"> --> Can this be rewritten in *Ads Scripts*? **YES ** (See FULL Ads script BELOW)\n",
"---\n"
]
},
{
"metadata": {
"id": "UNkMJLEbbYJC",
"colab_type": "code",
"outputId": "816d5692-0e62-4452-83d1-a7a6de159018",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 6895
}
},
"cell_type": "code",
"source": [
"#################### START CONFIG ##################\n",
"\n",
"# Create a Google Cloud console account\n",
"# Retrieve the CSE id from your project console: https://cse.google.com/cse/all\n",
"cx = '011253106589242236363:g6cntpa1c7s'\n",
"\n",
"# Create an API key here after selecting your project: \n",
"# https://developers.google.com/custom-search/v1/introduction\n",
"api_key = 'AIzaSyA5dKxNMaFoHDbEeukFQkiMUTJTuvzoteg'\n",
"\n",
"# Specify your Google domain to your country\n",
"# Reference: https://developers.google.com/custom-search/docs/xml_results_appendices#countryCodes\n",
"googlehost = 'de'\n",
"\n",
"# Specifcy your language if differing from your country domain\n",
"# Reference: https://developers.google.com/custom-search/docs/ref_languages\n",
"lang = 'de' \n",
"\n",
"# Test your misstyped queries\n",
"queries = ['woolrich herren','topshpo', 'gina loura', 'lodenfrei', 'loden frey', 'sport shuster', 'sportshuster', 'sportschuster', 'sport schuster', 'ludwigbeck', 'luwigbeck', 'windsoor']\n",
"# queries = ['kriudvat', 'limagno', 'amstrdam', 'bestlst', 'wekmp']\n",
"\n",
"#################### END CONFIG ####################\n",
"\n",
"import json, requests\n",
"\n",
"print('Results of Google Custom Search Engine:')\n",
"for query in queries:\n",
" url ='https://www.googleapis.com/customsearch/v1/siterestrict?cx=' + cx + '&key=' + api_key + '&gl=' + googlehost + '&q=' + query + '&alt=json&num=1'\n",
" json_response = json.loads(requests.get(url).content)\n",
" \n",
" print('\\n' + json.dumps(json_response, indent=2, sort_keys=False))\n",
" \n",
" try:\n",
" print('\\n' + query + ' ( > ' + json_response['spelling']['correctedQuery'].replace(',',' ').replace('.',' ') + ' )') # json_response['spelling']['correctedQuery']\n",
" except:\n",
" print('\\n' + query)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Results of Google Custom Search Engine:\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - woolrich herren\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"woolrich herren\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.531554,\n",
" \"formattedSearchTime\": \"0.53\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" }\n",
"}\n",
"\n",
"woolrich herren\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - topshpo\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"topshpo\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.341345,\n",
" \"formattedSearchTime\": \"0.34\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" },\n",
" \"spelling\": {\n",
" \"correctedQuery\": \"topshop\",\n",
" \"htmlCorrectedQuery\": \"<b><i>topshop</i></b>\"\n",
" }\n",
"}\n",
"\n",
"topshpo ( > topshop )\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - gina loura\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"gina loura\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.293498,\n",
" \"formattedSearchTime\": \"0.29\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" },\n",
" \"spelling\": {\n",
" \"correctedQuery\": \"gina laura\",\n",
" \"htmlCorrectedQuery\": \"gina <b><i>laura</i></b>\"\n",
" }\n",
"}\n",
"\n",
"gina loura ( > gina laura )\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - lodenfrei\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"lodenfrei\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.258428,\n",
" \"formattedSearchTime\": \"0.26\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" },\n",
" \"spelling\": {\n",
" \"correctedQuery\": \"lodenfrey\",\n",
" \"htmlCorrectedQuery\": \"<b><i>lodenfrey</i></b>\"\n",
" }\n",
"}\n",
"\n",
"lodenfrei ( > lodenfrey )\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - loden frey\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"loden frey\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.330833,\n",
" \"formattedSearchTime\": \"0.33\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" }\n",
"}\n",
"\n",
"loden frey\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - sport shuster\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"sport shuster\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.39934,\n",
" \"formattedSearchTime\": \"0.40\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" },\n",
" \"spelling\": {\n",
" \"correctedQuery\": \"sport schuster\",\n",
" \"htmlCorrectedQuery\": \"sport <b><i>schuster</i></b>\"\n",
" }\n",
"}\n",
"\n",
"sport shuster ( > sport schuster )\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - sportshuster\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"sportshuster\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.253091,\n",
" \"formattedSearchTime\": \"0.25\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" },\n",
" \"spelling\": {\n",
" \"correctedQuery\": \"sportsmaster\",\n",
" \"htmlCorrectedQuery\": \"<b><i>sportsmaster</i></b>\"\n",
" }\n",
"}\n",
"\n",
"sportshuster ( > sportsmaster )\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - sportschuster\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"sportschuster\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.324232,\n",
" \"formattedSearchTime\": \"0.32\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" }\n",
"}\n",
"\n",
"sportschuster\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - sport schuster\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"sport schuster\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.322846,\n",
" \"formattedSearchTime\": \"0.32\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" }\n",
"}\n",
"\n",
"sport schuster\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - ludwigbeck\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"ludwigbeck\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.305889,\n",
" \"formattedSearchTime\": \"0.31\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" }\n",
"}\n",
"\n",
"ludwigbeck\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - luwigbeck\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"luwigbeck\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.21647,\n",
" \"formattedSearchTime\": \"0.22\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" }\n",
"}\n",
"\n",
"luwigbeck\n",
"\n",
"{\n",
" \"kind\": \"customsearch#search\",\n",
" \"url\": {\n",
" \"type\": \"application/json\",\n",
" \"template\": \"https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&relatedSite={relatedSite?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json\"\n",
" },\n",
" \"queries\": {\n",
" \"request\": [\n",
" {\n",
" \"title\": \"Google Custom Search - windsoor\",\n",
" \"totalResults\": \"0\",\n",
" \"searchTerms\": \"windsoor\",\n",
" \"count\": 1,\n",
" \"inputEncoding\": \"utf8\",\n",
" \"outputEncoding\": \"utf8\",\n",
" \"safe\": \"off\",\n",
" \"cx\": \"011253106589242236363:g6cntpa1c7s\",\n",
" \"gl\": \"de\"\n",
" }\n",
" ]\n",
" },\n",
" \"searchInformation\": {\n",
" \"searchTime\": 0.264335,\n",
" \"formattedSearchTime\": \"0.26\",\n",
" \"totalResults\": \"0\",\n",
" \"formattedTotalResults\": \"0\"\n",
" },\n",
" \"spelling\": {\n",
" \"correctedQuery\": \"windsor\",\n",
" \"htmlCorrectedQuery\": \"<b><i>windsor</i></b>\"\n",
" }\n",
"}\n",
"\n",
"windsoor ( > windsor )\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "aNizhiAfbiUm",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## 1.1 Ads Script: Near Exact Splitter by Word-Type"
]
},
{
"metadata": {
"id": "QdAg7oHKq6Bz",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"**> PLEASE (!) RUN IN PREVIEW FIRST !**\n",
"---\n",
"\n",
"Description of script behaviour: \n",
"\n",
"- Looks through all adgroups of specified campaigns and analyzes near-exact queries\n",
"\n",
"- Classifies near-exact into typos and variants (order, word variant etc) using \n",
"> Google Suggest ([Example](http://suggestqueries.google.com/complete/search?output=toolbar&hl=en&q=aveda foo relief)) and\n",
"> Google Custom Search:\n",
"\n",
"- Case 1 **\"New adGroup\"**: If a near-exact query \n",
"\n",
"\n",
"> 1. exceeds an absolute click threshold AND \n",
"> 2. relative to its keyword click volume (> 1%) AND \n",
"> 3. is more expensive\n",
"\n",
"> --> creates a new single keyword adgroup and handles reciprocal negative fencing\n",
"\n",
"\n",
"- Case 2 **\"Keyword added to adgroup\"**: If a NE query 1. exceeds the absolute threshold 2. but* not* relative\n",
"\n",
"> --> adds the non-typo keyword to the same adgroup\n",
"\n",
"---\n",
"\n",
"\n",
"**> PLEASE (!) RUN IN PREVIEW FIRST !**\n"
]
},
{
"metadata": {
"id": "dtjzR3OCrCLp",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"'''\n",
"// 1.1 ADS SCRIPT TO AUTO-ADD NEAR-EXACT: WORKING VERSION\n",
"\n",
"/***************** CONFIG_BLOCK START *****************/\n",
"\n",
"// Thresholds for near-exact lookup\n",
"var MIN_QUERY_CLICKS = 5;\n",
"var SKAG_CLICK_THRESHOLD = 5;\n",
"var DATE_RANGE = \"LAST_30_DAYS\";\n",
"\n",
"// Settings for adding new keywords and ads\n",
"var NEAR_EXACT_BID_MODIFIER = 1.1;\n",
"var SET_NEGATIVE_IN_QUERY_SOURCE = 0;\n",
"var ADD_CLOSE_VARIANTS_TO_SAME_ADGROUP = 1;\n",
"\n",
"// Include and exclude certain campaigns\n",
"var CAMPAIGN_INCLUDE_STRING = \"DE\"; \n",
"var CAMPAIGN_EXCLUDE_STRING = \"Display\"; \n",
"var SHO_CAMPAIGN_STRING = \"Shopping\";\n",
"var DSA_CAMPAIGN_STRING = \"Dynamic\"; \n",
"\n",
"// Identify missspelling adgroups (skipped)\n",
"var ADGROUP_MISSPELLING_EXCLUDE_STRING = [\"Misspelling\", \"Verschreiber\", \"Typo\"];\n",
"\n",
"// Service validate typos via Google\n",
"// Set up via https://cse.google.com/cse/all\n",
"var GOOGLE_CUSTOM_SEARCH_CONFIG = {\n",
" \n",
" // Specify your Google domain to your country\n",
" // https://developers.google.com/custom-search/docs/xml_results_appendices#countryCodes\n",
" \"googlehost\" : \"de\",\n",
" \n",
" // Specify your language if differing from your country domain\n",
" // https://developers.google.com/custom-search/docs/ref_languages\n",
" \"lang\" : \"de\"\n",
"};\n",
"\n",
" \n",
"// In case execution errors\n",
"var EMAIL_RECIPIENTS = [\"cgutknecht@noriskshop.de\"]; \n",
"\n",
"/***************** CONFIG_BLOCK END *****************/\n",
"\n",
"\n",
"function main(){\n",
" \n",
" try{\n",
" var scriptfile_name = \"https://scripts.adserver.cc/getScript.php?package=nrKeywords&version=\"+LICENSED_VERSION+\"&script=RelatedQuerySplitter/relatedQuerySplitter.js&aid=987-654-1230&key=\"+API_KEY;\n",
" var scriptFile_raw = UrlFetchApp.fetch(scriptfile_name).getContentText();\n",
"\n",
" try{\n",
" eval(scriptFile_raw);\n",
" relatedQuerySplitter();\n",
" } catch (e) {try {if(AdWordsApp.getExecutionInfo().isPreview() === false) MailApp.sendEmail(EMAIL_RECIPIENTS[0], \"Error in Script: \" + SCRIPT_NAME + \":\" + AdsApp.currentAccount().getName(), \"Exception: \"+e.message+\"\\r\\nStacktrace:\\r\\n\"+e.stack);} catch (e2) {Logger.log(e2.stack);} throw e;}\n",
" } catch(e3){ Logger.log(e3.stack);throw e3; }\n",
"}\n",
"\n",
"\n",
"// Constants\n",
"var LOG_LEVEL_VERBOSE = 1;\n",
"var API_KEY = \"N6Xm1bf4Xk3ZC4fNKuml1xukQFYpnfS2\"; \n",
"var LICENSED_VERSION = \"unstable\";\n",
"var MATCHTYPE_VARIANT = \"NEAR_EXACT\"; \n",
"var ADGROUP_LABEL = \"nrNearExact_AG-Split\";\n",
"'''"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "EA_tAVLEcOV_",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"\n",
"##1.2 Knowledge Graph Search API (Google)\n",
"\n"
]
},
{
"metadata": {
"id": "gH9YD-5sZ4sH",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- Example API Call for [\"Haarlem\"](https://kgsearch.googleapis.com/v1/entities:search?query=haarlem&languages=nl&limit=5&indent=true&key=AIzaSyBkQJPoUYb25K63wgQtUgLUGUmUcwR7fgk) and [\"CDG\"](https://kgsearch.googleapis.com/v1/entities:search?query=cdg&languages=nl&limit=5&indent=true&key=AIzaSyBkQJPoUYb25K63wgQtUgLUGUmUcwR7fgk)\n",
"- Documentation: https://developers.google.com/knowledge-graph/\n",
"\n",
"\n",
"---\n",
"\n",
"> --> **USE CASE**: Validate geographic entities in search queries\n",
" \n",
"> --> Authorization needed? **YES**\n",
"\n",
"> --> Can this be rewritten in *Ads Scripts*? **YES **\n",
"\n",
"---"
]
},
{
"metadata": {
"id": "am47BX7PbI-t",
"colab_type": "code",
"outputId": "feeea9ed-aad6-4432-e819-166cfe9e3beb",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1074
}
},
"cell_type": "code",
"source": [
"#################### START CONFIG ##################\n",
"\n",
"query = \"woolrich\" # Haarlem, Grachten\n",
"language = \"de\"\n",
"apiKey = \"AIzaSyBkQJPoUYb25K63wgQtUgLUGUmUcwR7fgk\"\n",
"\n",
"#################### END CONFIG ####################\n",
"\n",
"import requests\n",
"import json\n",
"\n",
"kgUrl = \"https://kgsearch.googleapis.com/v1/entities:search?query= \" + query + \"&languages=\" + language + \"&limit=2&indent=true&key=\" + apiKey\n",
"json_response = json.loads(requests.get(kgUrl).content)\n",
"\n",
"print('All place results and their scores:\\n')\n",
"[print(str(list['result'])) for list in json_response['itemListElement'] if 'Place' in list['result']['@type']]\n",
"\n",
"print('\\n\\n*****\\n\\n\\nEntire JSON response:\\n')\n",
"print('\\n' + json.dumps(json_response, indent=2, sort_keys=False)) \n",
"\n",
"# Or view directly in Browser: \n",
"# https://kgsearch.googleapis.com/v1/entities:search?query=haarlem&languages=de&limit=10&indent=true&key=AIzaSyBkQJPoUYb25K63wgQtUgLUGUmUcwR7fgk\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"All place results and their scores:\n",
"\n",
"\n",
"\n",
"*****\n",
"\n",
"\n",
"Entire JSON response:\n",
"\n",
"\n",
"{\n",
" \"@context\": {\n",
" \"@vocab\": \"http://schema.org/\",\n",
" \"goog\": \"http://schema.googleapis.com/\",\n",
" \"EntitySearchResult\": \"goog:EntitySearchResult\",\n",
" \"detailedDescription\": \"goog:detailedDescription\",\n",
" \"resultScore\": \"goog:resultScore\",\n",
" \"kg\": \"http://g.co/kg\"\n",
" },\n",
" \"@type\": \"ItemList\",\n",
" \"itemListElement\": [\n",
" {\n",
" \"@type\": \"EntitySearchResult\",\n",
" \"result\": {\n",
" \"@id\": \"kg:/m/04f5h7x\",\n",
" \"name\": \"Woolrich, Inc.\",\n",
" \"@type\": [\n",
" \"Organization\",\n",
" \"Corporation\",\n",
" \"Thing\"\n",
" ],\n",
" \"description\": \"Unternehmen\",\n",
" \"detailedDescription\": {\n",
" \"articleBody\": \"Woolrich, Inc. ist ein 1830 von John Rich in Pennsylvania gegr\\u00fcndetes, bis heute existentes amerikanisches Bekleidungsunternehmen, welches zun\\u00e4chst durch rustikale Outdoor-Bekleidung f\\u00fcr Herren und sein rot-schwarzes Wollkaromuster auf den selbst hergestellten Stoffen bekannt wurde.\",\n",
" \"url\": \"https://de.wikipedia.org/wiki/Woolrich,_Inc.\",\n",
" \"license\": \"https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License\"\n",
" }\n",
" },\n",
" \"resultScore\": 31.89395\n",
" },\n",
" {\n",
" \"@type\": \"EntitySearchResult\",\n",
" \"result\": {\n",
" \"@id\": \"kg:/m/01dgvj\",\n",
" \"name\": \"Cornell Woolrich\",\n",
" \"@type\": [\n",
" \"Thing\",\n",
" \"Person\"\n",
" ],\n",
" \"description\": \"Schriftsteller\",\n",
" \"detailedDescription\": {\n",
" \"articleBody\": \"Cornell George Hopley-Woolrich war ein US-amerikanischer Schriftsteller, der in der Hauptsache durch seine d\\u00fcsteren Kriminalromane und Detektivgeschichten bekannt ist. \",\n",
" \"url\": \"https://de.wikipedia.org/wiki/Cornell_Woolrich\",\n",
" \"license\": \"https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License\"\n",
" }\n",
" },\n",
" \"resultScore\": 25.324356\n",
" }\n",
" ]\n",
"}\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "uME4NJ80vBUn",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"\n",
"\n",
"## 1.3 Target Campaign Classification with Auto-ML (Google)\n"
]
},
{
"metadata": {
"id": "b6T6WZ9HZ-GP",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- [Quickstart Reference](https://cloud.google.com/natural-language/automl/docs/quickstart?hl=en) & [API Documentation](https://cloud.google.com/natural-language/automl/docs/?hl=en)\n",
"\n",
"\n",
"---\n",
"\n",
"> --> **USE CASE**: Find the right search campaign for a good shopping or DSA query\n",
"\n",
"> --> Authorization needed? **YES**\n",
"\n",
"> --> Can this be rewritten in *Ads Scripts*? **NO**\n",
"---\n"
]
},
{
"metadata": {
"id": "AGP8s_FQBekl",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"Description for **CampaignIdentifier**:\n",
"\n",
"- Load in your csv with the two columns query and campaign into an AutoML dataset: https://cloud.google.com/automl/\n",
"- Make sure all labels have at least **100 instances** or more, then train the model. \n",
"- Training will take about 5 - 25 hrs and cost up to 25€\n",
"- Call the CampaignIdentifier via the demo script & reference your service account credentials JSON\n",
"- To call from scripts, use a cloud function"
]
},
{
"metadata": {
"id": "4rrJmh7yvKYD",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"#################### START CONFIG ##################\n",
"\n",
"# Add your test query to be classified here\n",
"keyword1 = 'plus size swim shorts'\n",
"keyword2 = 'plus size panty'\n",
"\n",
"# Paste your project and model id info here\n",
"project_id = 'trafficestimator-191922' \n",
"model_id = 'TCN405734753472087990'\n",
"\n",
"# Create a service account in your cloud console project under IAM > service accounts\n",
"# Usually the JSON file would be stored in a folder and referenced. \n",
"# In this case the content is put into a file in the working directory\n",
"!touch credentials.json\n",
"!echo '{\"type\": \"service_account\",\"project_id\": \"trafficestimator-191922\",\"private_key_id\": \"317749b6cd332cf8a520504d6c31464548066d07\",\"private_key\": \"-----BEGIN PRIVATE KEY-----\\nMIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQCs0KH6YnuX3ADA\\nyMJxHzbM84bHENhvDTRpZMBdPvIi/CE/RMxBalDRzQUf+1/LsM+iAzT11sYBa6jb\\nYmVNE9FI4DB85OH3dvjQuG62EYdIaPmIhqCfdBSQb2tE0FgnvD7FJnOa4sket+aE\\nto3F5nZnrTUAEm0SI0rEXNxDRKZZnp259HF5fe4PySnLEYZtFqxcE1gC+xarUAce\\nA4kSbCzn7V3MBZjquUTQ2/Np/jf2dCyO2jOqlaAh+s3B1AFH/3XCDxmKzzr+tN8U\\n675TaRpfTZXnZmgUHxvTNIkM1SEiVQcC+q/d1FzQLVG/bcXKHdtaHmV9XZ235SjD\\n4tofhnlhAgMBAAECggEAAeC/LzGHG/aD5ZRo6ddyfitT9sbRQJ0D9EnoKJCTYLDy\\nYDXMWGuiLDLOUEjS2NZM3fsAm36gJooidXEuHkvNqufmGagCZ/DZhv8cvjVrjcG3\\nmyBcuiQwna8wRbj+iZQAmfOEhGrcJPgk9Pfni35+dpBJx/yV8IFaAc9BL36dUNJP\\n/oKnmQ+2Z1CoxkheeT79wi0IjhXxwCq1PaxIfo3kTNPtJyRWGbUZCKemOlsEqF4n\\nZXucRN+Gc/xSdUX6mid3TIpuC9LxKMSmgMfaIt1YeQDymk0yFgE9PJSyUf9CvhBo\\ntzPAUvmOnVY+OAMtwzHyzIUbuBD53keFPQgMYwbqzwKBgQDtll1AQSGMmO03cqkK\\nES6tebbriv05zOl9/mVY1O3q4idoxtyjBlfXD7Izil+qC3H5q3u2UHoTlPn8sbp/\\nqE70TP7ATetkxARj4u4mvUU8nWAiP8GSyNQ/XYnd/7Ct8xC70cWD9raGlbnIxtMv\\njwQblp/Ze71v+5Ho0G8j9YWk2wKBgQC6NTX6Q5KNjIY92qnBsdTSjMNT7Bl7Dc4Y\\nWo/PPiIba7hsk/3qyveKAGxaUo9kYXXpcqyF7NgIJLgLgHSdrJsTCeum2evPn5aP\\n463SCqx2SQ4+qe7dIVuMqbGIl79akHKvJNFo/aA7KBwa0pje1Bj47h/chiOG+KK+\\n0u/gbyaxcwKBgQCFdaegRnXgEq9+LCyEeS5tSCfWnFaut0qZ+eNYSRP0k2Fr2/lo\\nolGJf0ms5hfi31cA1rFBsmdAxLjAl92ZhCQF25Jm195Y1jtZGMqKfFt+NzEfTA0i\\ntSl1BshBKjZHIHjP6IA5JsdVRpn8DKPo3T+Dd9MHShkZ5IkL7LnhyzNtbQKBgCyd\\nyVQw0rG15Qh3F0vA52SXs8ms/aH9ec9eC601jwiQ8/yYLTQP8+YOGeg+dfswWwdX\\nobzpNg4ldi/TVJ3fGL0kGGG/P4BdFb9eZ2y3gAFLme0r3H8BmiUHF/HJomD50XDj\\n2oVUmAr0nl9S7Kwfl+g1bk7xmRujAaPFeRKGr3e/AoGAOZDShK29fA3CacN6mxW+\\njYmudrfxOx8z8lLhIbV8DGLtpHysISUe+LYIlcD+BcMiWZZaPiQl2zfgDC3AeaWI\\nx++VhvXYPMY+EnbHzgUHdvGV0srHrMLi/SvwH4AYJmC4Lv0LbvymCcNT7RuwMeks\\noF5XJjSelTeW/haaYMIF7Jo=\\n-----END PRIVATE KEY-----\\n\",\"client_email\": \"automl-campidentifier@trafficestimator-191922.iam.gserviceaccount.com\",\"client_id\": \"112136592414071134148\",\"auth_uri\": \"https://accounts.google.com/o/oauth2/auth\",\"token_uri\": \"https://oauth2.googleapis.com/token\",\"auth_provider_x509_cert_url\": \"https://www.googleapis.com/oauth2/v1/certs\",\"client_x509_cert_url\": \"https://www.googleapis.com/robot/v1/metadata/x509/automl-campidentifier%40trafficestimator-191922.iam.gserviceaccount.com\"}' > credentials.json\n",
"\n",
"#####\n",
"\n",
"!pip install google-cloud-automl\n",
"import sys, json, requests\n",
"import os, google\n",
"from google.cloud import automl_v1beta1\n",
"from google.cloud.automl_v1beta1.proto import service_pb2\n",
"\n",
"os.environ[\"GOOGLE_APPLICATION_CREDENTIALS\"] = \"/content/credentials.json\"\n",
"\n",
"def get_prediction(content, project_id, model_id):\n",
" prediction_client = automl_v1beta1.PredictionServiceClient()\n",
"\n",
" name = 'projects/{}/locations/us-central1/models/{}'.format(project_id, model_id)\n",
" payload = {'text_snippet': {'content': content, 'mime_type': 'text/plain'}}\n",
" params = {}\n",
" request = prediction_client.predict(name, payload, params)\n",
" \n",
" return request # waits till request is returned\n",
"\n",
"'''\n",
"if __name__ == '__main__':\n",
" content = sys.argv[1]\n",
" project_id = sys.argv[2]\n",
" model_id = sys.argv[3]\n",
"'''\n",
"print('\\n\\n\\n1. Classification results for \"' + keyword1 + '\"\\n')\n",
"print(get_prediction(keyword1, project_id, model_id))\n",
"print('\\n\\n\\n2. Classification results for \"' + keyword2 + '\"\\n')\n",
"print(get_prediction(keyword2, project_id, model_id))\n"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "SK1JD-d9WVFb",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"---\n",
"Predict view in UI:\n",
"\n",
"![Example Demo](https://user-images.githubusercontent.com/6991865/52154828-e66b3b00-267f-11e9-8bbe-d41b3d3cde8a.png)"
]
},
{
"metadata": {
"id": "U0KkBAtRddP1",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## 1.4 Addon: Spacy for Word Stemming > Not Impressive :-/\n"
]
},
{
"metadata": {
"id": "NB9cyp9SaDsc",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- Spacy is an Natural Language Processing Library that provides a Stemmer or Lemmatizer, which returns the root word of an inflection\n",
"\n",
"- Documentation: https://spacy.io/usage/\n",
"\n",
"---\n",
"\n",
"\n",
"- Spacy also has an **Entity Recognizer** module that can be traing > **used by Sixt's PPC Team**\n",
"\n",
"- Documentation: https://github.com/explosion/spacy/blob/master/examples/training/train_ner.py\n",
"\n",
"---"
]
},
{
"metadata": {
"id": "tSF395CuH4YB",
"colab_type": "code",
"outputId": "489834a5-ca9d-4633-f206-ac9f38619372",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 772
}
},
"cell_type": "code",
"source": [
"#################### START CONFIG ##################\n",
"\n",
"queries = ['bürostühle', 'besucher stühle', 'stühle kauf ich hier']\n",
"\n",
"#################### END CONFIG ##################\n",
"\n",
"!pip install -U textblob-de\n",
"!python3 -m spacy download de\n",
"import spacy\n",
"nlp = spacy.load('de')\n",
"from textblob_de.lemmatizers import PatternParserLemmatizer\n",
"\n",
"_lemmatizer = PatternParserLemmatizer()\n",
"\n",
"for query in queries: \n",
" sentence = query\n",
" \n",
" # https://github.com/explosion/spaCy/issues/701 > Only Uppercase entity detection\n",
" capSentence = []\n",
" [capSentence.append(word.capitalize()) for word in sentence.split(' ')]\n",
" capSentence = ' '.join(capSentence)\n",
"\n",
" print('\\n')\n",
" print(_lemmatizer.lemmatize(query))\n",
" \n",
" doc = nlp(capSentence)\n",
" \n",
" for token in doc: \n",
" print(token,' > stem: > ', token.lemma_)\n",
" \n",
" for chunk in doc.noun_chunks:\n",
" print(chunk.text + ' | ' + chunk.root.text + ' | ' + chunk.root.dep_ + ' | ' + chunk.root.head.text)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Collecting textblob-de\n",
"\u001b[?25l Downloading https://files.pythonhosted.org/packages/47/61/7a5759c3ac60bf9330a50ce81ebe7f0aac1bc6c674d45e00f7b3e190f5af/textblob_de-0.4.3-py2.py3-none-any.whl (468kB)\n",
"\u001b[K 100% |████████████████████████████████| 471kB 13.7MB/s \n",
"\u001b[?25hRequirement already satisfied, skipping upgrade: textblob>=0.9.0 in /usr/local/lib/python3.6/dist-packages (from textblob-de) (0.15.2)\n",
"Requirement already satisfied, skipping upgrade: nltk>=3.1 in /usr/local/lib/python3.6/dist-packages (from textblob>=0.9.0->textblob-de) (3.2.5)\n",
"Requirement already satisfied, skipping upgrade: six in /usr/local/lib/python3.6/dist-packages (from nltk>=3.1->textblob>=0.9.0->textblob-de) (1.11.0)\n",
"Installing collected packages: textblob-de\n",
"Successfully installed textblob-de-0.4.3\n",
"Collecting de_core_news_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/de_core_news_sm-2.0.0/de_core_news_sm-2.0.0.tar.gz#egg=de_core_news_sm==2.0.0\n",
"\u001b[?25l Downloading https://github.com/explosion/spacy-models/releases/download/de_core_news_sm-2.0.0/de_core_news_sm-2.0.0.tar.gz (38.2MB)\n",
"\u001b[K 100% |████████████████████████████████| 38.2MB 69.6MB/s \n",
"\u001b[?25hInstalling collected packages: de-core-news-sm\n",
" Running setup.py install for de-core-news-sm ... \u001b[?25ldone\n",
"\u001b[?25hSuccessfully installed de-core-news-sm-2.0.0\n",
"\n",
"\u001b[93m Linking successful\u001b[0m\n",
" /usr/local/lib/python3.6/dist-packages/de_core_news_sm -->\n",
" /usr/local/lib/python3.6/dist-packages/spacy/data/de\n",
"\n",
" You can now load the model via spacy.load('de')\n",
"\n",
"\n",
"\n",
"[('Bürostühle', 'NN')]\n",
"Bürostühle > stem: > Bürostühle\n",
"Bürostühle | Bürostühle | ROOT | Bürostühle\n",
"\n",
"\n",
"[('besuch', 'JJ'), ('stühle', 'NN')]\n",
"Besucher > stem: > Besucher\n",
"Stühle > stem: > Stuhl\n",
"Besucher Stühle | Stühle | ROOT | Stühle\n",
"\n",
"\n",
"[('stühl', 'JJ'), ('kauf', 'NN'), ('ich', 'PRP'), ('hier', 'RB')]\n",
"Stühle > stem: > Stuhl\n",
"Kauf > stem: > kaufen\n",
"Ich > stem: > ich\n",
"Hier > stem: > Hier\n",
"Stühle | Stühle | ROOT | Stühle\n",
"Kauf | Kauf | ROOT | Kauf\n",
"Ich | Ich | ROOT | Ich\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "a60zDUMiOEXD",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
" ## 1.5 Addon: Call a Cloud Function from Scripts: Tutorial\n",
" \n"
]
},
{
"metadata": {
"id": "veI3Kn3HZypI",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- Example Code for [Partial String Comparison](https://gist.github.com/norisk-marketing/afb26844fa3b68f2de9440bf7e006f4b#python-get-partialratio-via-cloud-functions-slide-25)\n",
"- Cloud Functions Docs: https://cloud.google.com/functions/docs/concepts/python-runtime\n",
"- Flask Tutorial: https://itnext.io/writing-google-cloud-functions-with-python-3-49ac2e5c8cb3\n",
"\n",
"---\n",
"\n",
"Description: \n",
"- A **POST request** is fired from an Ads script to call a cloud function, carrying a** payload** with the relevant JSON object or data, here the two strings (line 16-28)\n",
"- The cloud function receives the payload as a parameter, extracts the JSON attributes and then uses a package called **fuzzywuzzy** to compare (line 50-58)\n",
"- The cloud function returns the similarity value back to the script\n",
"\n",
"- **NOTE (!)**: The code shown below is only an example for Scripts and the cloud function, it will not (!) work here\n",
" "
]
},
{
"metadata": {
"id": "620lPfH1OLvB",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# NOTE: ONLY EXAMPLE CODE for Scripts, WILL NOT (!) WORK HERE \n",
"\n",
"\n",
"#############################################################\n",
"############ 1. ADS SCRIPTS to call Cloud Function ##########\n",
"#############################################################\n",
"\n",
"\n",
"var KEYWORDS = [\"salomon mens shoes\", \"salomon shoes for men\"]; \n",
"\n",
"function main() {\n",
" getPartialRatio(KEYWORDS); \n",
"}\n",
"\n",
"function getPartialRatio(keywords){\n",
" var formData = {\n",
" \"term1\" : keywords[0],\n",
" \"term2\" : keywords[1]\n",
" };\n",
"\n",
" var options = {\n",
" \"method\" : 'post',\n",
" \"contentType\" : \"application/json\",\n",
" \"payload\" : JSON.stringify(formData),\n",
" \"muteHttpExceptions\" : true\n",
" };\n",
"\n",
" var response = UrlFetchApp.fetch('https://europe-west1-adwords-scripts-big-query.cloudfunctions.net/partialStringRatio', options);\n",
" Logger.log(response);\n",
" return response; \n",
"}\n",
"\n",
"\n",
"#############################################################\n",
"############ 2. PYTHON code in cloud function ###############\n",
"#############################################################\n",
"\n",
"\n",
"import fuzzywuzzy\n",
"from flask import Flask\n",
"from flask import request\n",
"from fuzzywuzzy import fuzz\n",
"from fuzzywuzzy import process\n",
"\n",
"#app = Flask(__name__)\n",
"\n",
"#@app.route('/partialStringRatio', methods=['POST'])\n",
"def partialStringRatio(request):\n",
" print(request.is_json)\n",
" content = request.get_json()\n",
" print(content)\n",
"\n",
" term1 = content.get('term1')\n",
" term2 = content.get('term2')\n",
" print (term1)\n",
" print (term2)\n",
"\n",
" return str(fuzz.partial_ratio(term1, term2))"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "U40nw2hrdSM9",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"\n",
"\n",
"\n",
"---\n",
"---\n",
"---\n",
"\n",
"\n",
"# PART 2. NGRAM ANALYSIS WITH BIG QUERY\n"
]
},
{
"metadata": {
"id": "mgdj0ug-hp7l",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## 2.1 Setting Up the Google Ads Transfer (5-10 min)\n",
"\n"
]
},
{
"metadata": {
"id": "P-XS6T18hyqd",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- Create an Google Cloud Console account and setup a project: https://console.cloud.google.com\n",
"- Navigate to the old BigQuery UI: https://bigquery.cloud.google.com/welcome\n",
"\n",
"- Follow the steps of this tutorial: https://www.excelinppc.com/big-query-automation-powerhouse-for-google-ads/"
]
},
{
"metadata": {
"id": "eboP7vWA4uU5",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## 2.2 Loading Data via Google Cloud Package\n",
"\n",
"\n"
]
},
{
"metadata": {
"id": "zMhEpxpPhl4_",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"\n",
"- [Documentation](https://googleapis.github.io/google-cloud-python/latest/bigquery/usage/queries.html) & [BigQuery SQL Reference](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax?hl=en)\n",
"\n",
"---\n",
"\n",
"> --> **USE CASE**: Execute Big Query select statements and use the output in Python. \n",
"\n",
"> --> Authorization needed? **YES**\n",
"\n",
"> --> Can this be rewritten in *Ads Scripts*? **YES**\n",
"---\n",
"\n"
]
},
{
"metadata": {
"id": "r-Zi88aoKMLa",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### 2.2.1 Example Query: Analyzing Ad Performance by HeadlinePart2 - Fred Vallaeys\n",
"\n",
"- Use Case: Fred Vallaeys [Search Engine Land article](https://searchengineland.com/get-more-conversions-by-deconstructing-your-ppc-ads-310642): '*Get more conversions by deconstructing your PPC ads*'\n",
"\n",
"![Fred Val Script](https://user-images.githubusercontent.com/6991865/52275819-e2c20780-2950-11e9-9eca-a86a4a424234.png)\n",
"- Step 1: Intialize the Big Query client, then run a query"
]
},
{
"metadata": {
"id": "moq-eVsQ43-4",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"####################\n",
"# Create a Google cloud project\n",
"# Reference: https://cloud.google.com/resource-manager/docs/creating-managing-projects\n",
"project_id = 'trafficestimator-191922'\n",
"####################\n",
"\n",
"from google.colab import auth\n",
"auth.authenticate_user()\n",
"import os\n",
"from google.cloud import bigquery\n",
"\n",
"os.environ[\"GOOGLE_CLOUD_PROJECT\"] = project_id\n",
"client = bigquery.Client(project=project_id)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "-QhSplqQuAgX",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- Step 2: Run the sql query **directly in the code cell **with magic command '*%%bigquery*'"
]
},
{
"metadata": {
"id": "pPJMkD2LKdqU",
"colab_type": "code",
"outputId": "d0eced15-a4db-4bb4-8b44-a10c1070af5a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 266
}
},
"cell_type": "code",
"source": [
"%%bigquery hl2_performance\n",
"\n",
"WITH Ad_Elements AS (\n",
"SELECT CreativeId, HeadlinePart1,HeadlinePart2,Description\n",
"FROM `trafficestimator-191922.SPOR_Transfer.p_Ad_2131444427`\n",
"WHERE AdType = 'EXPANDED_TEXT_AD' AND HeadlinePart2 NOT LIKE '%chuster%'\n",
"LIMIT 100000),\n",
" \n",
"Ad_Stats AS (\n",
"SELECT CreativeId, SUM(Impressions) AS Impressions, SUM(Clicks) AS Clicks\n",
"FROM `trafficestimator-191922.SPOR_Transfer.p_AdStats_2131444427`\n",
"WHERE Impressions > 0\n",
"GROUP BY CreativeId),\n",
" \n",
"Ad_Elements_And_Stats AS(\n",
"SELECT *\n",
"FROM Ad_Elements\n",
"LEFT JOIN Ad_Stats USING (CreativeId))\n",
" \n",
"SELECT HeadlinePart2, SUM(Impressions) AS Impressions, SUM(Clicks) AS Clicks, ROUND(AVG(Clicks/Impressions),2) AS CTR\n",
"FROM Ad_Elements_And_Stats\n",
"GROUP BY HeadlinePart2\n",
"ORDER BY Clicks DESC LIMIT 7"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>HeadlinePart2</th>\n",
" <th>Impressions</th>\n",
" <th>Clicks</th>\n",
" <th>CTR</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Für die {=SaisonAds.Season_max13}-Saison {=Sai...</td>\n",
" <td>227761</td>\n",
" <td>3170</td>\n",
" <td>0.03</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Jetzt neue Angebote entdecken</td>\n",
" <td>10654</td>\n",
" <td>905</td>\n",
" <td>0.08</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Versandkostenfrei bestellen</td>\n",
" <td>4522</td>\n",
" <td>260</td>\n",
" <td>0.10</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Mind. {=nrAdBuild_Brand.nr_models} Modell(e) a...</td>\n",
" <td>6991</td>\n",
" <td>109</td>\n",
" <td>0.02</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Für Damen &amp; Herren</td>\n",
" <td>3902</td>\n",
" <td>87</td>\n",
" <td>0.02</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>Für Sie: 0€ Versand &amp; Retoure.</td>\n",
" <td>6054</td>\n",
" <td>83</td>\n",
" <td>0.02</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>Mind. {=nrAdBuild_Brand.nr_models} Modelle ab ...</td>\n",
" <td>3892</td>\n",
" <td>58</td>\n",
" <td>0.02</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" HeadlinePart2 Impressions Clicks \\\n",
"0 Für die {=SaisonAds.Season_max13}-Saison {=Sai... 227761 3170 \n",
"1 Jetzt neue Angebote entdecken 10654 905 \n",
"2 Versandkostenfrei bestellen 4522 260 \n",
"3 Mind. {=nrAdBuild_Brand.nr_models} Modell(e) a... 6991 109 \n",
"4 Für Damen & Herren 3902 87 \n",
"5 Für Sie: 0€ Versand & Retoure. 6054 83 \n",
"6 Mind. {=nrAdBuild_Brand.nr_models} Modelle ab ... 3892 58 \n",
"\n",
" CTR \n",
"0 0.03 \n",
"1 0.08 \n",
"2 0.10 \n",
"3 0.02 \n",
"4 0.02 \n",
"5 0.02 \n",
"6 0.02 "
]
},
"metadata": {
"tags": []
},
"execution_count": 20
}
]
},
{
"metadata": {
"id": "OBW8nFi0GX21",
"colab_type": "code",
"outputId": "d0b25bad-f68c-46e4-e984-af88a891524c",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 282
}
},
"cell_type": "code",
"source": [
"%matplotlib inline\n",
"hl2_performance.plot(kind='barh', x='HeadlinePart2', y='Clicks')"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"<matplotlib.axes._subplots.AxesSubplot at 0x7f5660ab8908>"
]
},
"metadata": {
"tags": []
},
"execution_count": 8
},
{
"output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAuEAAAD4CAYAAABCBjjoAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzs3Xm8XPP9x/HXFZGgKSW3toqg+q5d\niX1JrMXPVqIIJahqidJSa0lQFPWrEKoIoa01/Ki1qCSUIEJsrY+iKpaQWGONJPf3x/c7yWQyc2du\ncu/k5ub9fDzu486c5bucOTPzOd/zOWcampqaMDMzMzOz+lloXjfAzMzMzGxB4yDczMzMzKzOHISb\nmZmZmdWZg3AzMzMzszpzEG5mZmZmVmcLz+sGmJnZ/GHq1GlNH3zw2bxuxlz7xjcWw/1oXzpKX9yP\n9qc99KWxsVtDuekeCTczs5osvHCned2EVuF+tD8dpS/uR/vTnvviINzMzMzMrM4chJuZmZmZ1ZmD\ncDMzMzOzOnMQbmZmZmZWZw7CzczMzMzqzEG4mZmZmVmd+T7hZmZWk12Pvb3s9KtO3KbOLTEzm/95\nJNzMzMzMrM48Em5mZmZmbeaQ3z7YquXVcvZt/PjXueiiC/jkk4/58suvWHvtddhrrx9y+umnMnTo\nnxg48CROPnkgXbp0nW3ds84aRJ8+27L55lu2artLeSTczMzMzDqMadOm8etfH0+/fgcyfPhwhg79\nEwBXX33ljGVOP/2csgF4PXkk3MzMzMw6jDFjHqdHj55873sbANDQ0MARR/ycd96ZwOmnnwpA3767\ncu21N/Lxxx/xm98MZPr06Sy77HKccsqgGeVMnTqVY4/9OQcddAhTp07liisupUuXrnzjG0sxcOBv\nWHjhuQuj6x6ES+oJ/AfYNCIeK5o+BnghIvpLugE4OCI+r6G8J4G+EfFahfmbAX8GzoiIYXPZ9hWA\n14G9IuK2MvPXAoZERB9JI4HFgU+BxYC7I2JQM2VfCAwGDgImRcSQkvmTIqJ7M+u/BowHpgJfA4ZG\nxGUt6V9zqm3nepM0AOheaZtKGgYMj4g7y8y7DZgeEXuWTO9JlX2zhnZ9DXg+InpWmN8HGBARfau9\npjXU9RqwVkR80swyNwHfjoj157SeorL+CGwcEetVmD8cKOy3NwMv5MedgMMi4sUK660H/CAiBpbb\nJjW81v2BM4FXgAbgS+BHEfFOC7pXkaRdSPt+/xaudyIwKiJGt0Y7qtRV8TNT0slAP2CLiPiwqG3/\nQ3ptjomIJ9q6jWZm9fL666+x2mrfmWValy5d6dx5kdmWvfzyS9l33/3ZYoveXHrpYF588V8z5l10\n0QVss812rL9+L0444RcMGPAL1l33e4wa9SAfffQhSy89x1/hwLxLR3kV2K/wRNK3gW8UnkfEvrUE\n4DXaFrh4bgPwbF/g3/l/LQ6OiD7ApkA/SctVWjAijomI/8xl+3aKiK2BPsDpkjrNZXkdUkTsAaxS\nYXaz++Z8aBtgg7ktRFJnYFdgCUnfrWGVURHRJ+//VwC/qLRgRIyLiIFz2cQbc329gX8Ah8xleXMt\nIn5bjwA811XxMzMizgZGAGsDSPoe0BvYirSdzqlHG83M6qeB6dOn17TkSy+9yNprrwvAEUcczZpr\nrgXAPffcyTvvTGD33dN43dZbb8f555/DtddexWqraa4DcJh36SiPAdtL6hQR00hB7X2kEeMZI3yk\nUbW3SEFED2D/iHhK0kWkwDaA2Q9rZtUJ+CCXezPQWDRvSkTsIOll4I+kIKMLsB2wF7ATsDywb0S8\nSRpNGgDcIGnxiPhU0rdIo35fAs9UaEM30gj1J8UjoblNkyKiex45H1BYQdLCwHXAisCYKn0stRRp\nNH1aHiWc0Q/gWGAjoCtwWURcmUeNW7ydW7DdStcbBHQHvk0Khn9NCgZ6AjtHxKuSzgM2J+2jQyLi\nT5K2BS4EJgBvkwJmJJ0FbEl6rYdExPVFdfUgnQmZlss6ICL+C3xYYdtV2zf7AGcDXwFv5HZ3AW7J\n2/QfRXVvWbTseOCwchVKWoO0rzcBk4H+hRHLPP/rpH1h8dyOo4pGLk/O9UwljSaX9uuTiGiS9D/A\nr0rmXU56XbcgvS8EnB8RQyX9G7gbeDcizgJ2BJ4GxpEOUgbmth2fn/8X+Hr5TcoywJt5+ZGk/f/5\nwgg3UJjWt6jPZV/rGi0DPJ7LmdEP4E7gEtLrMR3YO7f5GtIo+rrA0xHxY0lrA9cC7+d5s8jvqx3z\n+t8Cfh8RV5fUtxowHPhbrmMl4AvgwNyvy0n7f2fgtIgoe+VSPkPzp9yOzYA/AOsAGwOXRMQl1T4z\ngfdI7w9I789bI6IJeJE0UGFm1mGstFJPbrnlplmmTZkyhc8//2y2ZRdaaCGmT2+abXpT03TeeutN\nxo9/nRVX7MGOO/4PG2+8KQ89NJITTvgFv/nNeay0Us+5aue8CsK/In1Jbg08AOwOnA70LbNsl4j4\nvqSfAgdK+oL0RbQRsALwcpW6VgaeBYiIvSssszDwYkScn0/rFr6UegCb5SBGwBIR8UAOJHYDrgd+\nDtwQEYMlnUD6Ii+4WtKnwOqk4GZyKqYmOwCdI2JTSRsDR9Wwzj2SmnJ9xcv3IG2zLsBrEfFLSYuS\nvtQLVynMyXauut2aaetSEbFjDqAPyo/PBHaT9BQpzWJzSYsDz+YUknNIQfQzku4GXs0B6EoRsZWk\nLsBTedmCvsD9EXGmpPWB5UgB46eSekTE6yXtqrZvXgZsHxHjJQ0hHZgtSkpB+YWkfZg5kn4RsG1E\nvJ8PKvYmB6MlLgYOj4h/SzoCOBI4q2j+ssCVEXGbpG2AE0gHOwDPRsTJkn4H/CiXBYCklYGPASLi\nLuCu0opzMLk26bVeDbgBGEoKDO+JiHvzov3yvKdJBxwDJS0JHAF8Ny9fHKz2zu+TbqSDhz5l+t2c\n2V7rKsvvI6kXKaifzMwDjhn9kLQ96QDmaUlnAPsDd5AC1n1IgfMbuV+nAoMi4nZJf6hQ55rA94Al\ngWckXVNS37C83EHAhIjoJ2lf0mfHJ8DbEXGopO7Ag6TAupL1gD1IB9gvkD7XupJei0tKlp3lvQw8\nRUqL65HfL0cB70raH/ht0Ws8xxobu81tEXU3P7a5nI7SD+g4fXE/2l61tu2883b88Y8X89xzY9hm\nm21YeunFOeeci/jwww9ZeOGFaGzsRqdOC9G9+9dYb711efnl59l5550ZPHgwG264IV27dmb33feh\na9euXHDB2fz5z3/m0ksv5YADDuDHPz6IKVM+5f3336ZXr7Xnqh/z8sLMm4H9JE0gBSaV8lofzv/f\nII38rAE8HhHTgfGSKn4556B4HeDQGtpTXM8S+fGYokCyEIRAGpXsTwrC18h9gTSit1NRmQfnEb8u\nwK2SxpFGLGuxBvAoQEQ8LqmW9JydIuKTPHL6QK6vuB9fSFpK0qPAFGY9KzCn27nadqukMJL7NmkE\nGOAdYGmgFzAKIJ9t+CcpQOwZEYWzDaNIwe9mwCY54IOUYlWc9nMf8H85sBpelB4whBSw9yyTU112\n35S0FNAUEePzciNIp/UXKrSXtA8gaZnc5lvzgdfiwCTKB+EbAVfk5bow+5mPd4BTJR2X539aNG9E\n/v8EKb2AXP+iuZwjytRXanQ+a1L8GhbKJB8IbQ/8JB9IfpkPaBYi5cp/Qdq3xhatO6robM9WwE3F\n7atBude6OTdGxHG5vh+RztD8qLgfpO14rqTFSGdq/pKnvxwRE/K6b5G2wYz3H7O/rwtGRcRUYJKk\nD0gHAMX1FawP/B0gIm7I9fwB2FLSFnmZRSUtEhFTKvTvlYh4T9KXpLMTbypdf7BEmWVL38uQRtLH\nMXvA3iomTpzcFsW2mcbGbvNdm8vpKP2AjtMX96O81v5Br1radt55F3HeeWcxZMgQGho6seGGG7P/\n/odw2mknMXHiZKZNm86kSZ+w//6HcPbZZ3DNNX9imWWWYZ99DuKLL27ho48+Z621erHCCj249NIr\n6NZtKQ444EC6dfs63bp1Y9dd9655G1U6aJiXQfgDpEDobdIp20qKg9aG/Fec6FMxrz0izpW0ESkv\n9r5K6SgV6oEUqBbsB0xXukirE7BKDuyK21O2LRHxpaS7SCkTpaecO1dofs39LFPfxzko3ZQU4E4B\nkNSbtC16R8RXkoqDzzndztW2WyVTKzxuyG1uKJq2SG5LufZMIV2EOktea+GMQz4IWpd0ZuEcSVdF\nxLWkNJKtygTgUHnfrNSuTsy+D0wB3sw50cXtmuV59hmwdTMHLsfksn6UR3t/V9Km2R5HxOd51Pwy\n4KZm0lGg/GtY6AOkEdiFgYfzdu1OStO5mRr2kYh4SNJ3lK5RKG5vpX2fWsptxi3Ab4qeF/oxGDg3\nj1IfR7qAGWY/MC7d/yvVXzy9sN8W11cwrUwZU4CzilOnqmju/dLcsoX5e5EuTr9U0lRgYkRcOfuq\nZmYdQ/fu3TnvvN/PdkBRuF3h8OF3ALDYYosxePCls6xbfIeUX/3q5BmPd9ppl1Zt4zy7T3ge8XmI\nNEp9R0tWBTaQ1CBpJdJp2eY8Rxr1IiL2Llwslv92qLIuAJI2BCZHxHcjYr2IWBu4kfTFFqSRW0gp\nDJVsnJf9mDxSK2kd0un6cmaUq3SHly61tDUv3wBsmMso1h0YnwPw3YBOkirl1Ld0O7emMeT0hTza\ntyrpgtg3lTQwM73hcWBXSQtJ6irp4uKC8un/tSLdzebXzHytGiPin+Uqr7RvRsQHQJNSnjmkUfAn\nKbMP5GUL+d5IOiq/3uU8Q8ovRtK+SvnQxbozM9XjB8yan1/4JYFNgH8VrxQRzzJz37+rZN/vExHX\nVWhPqX6ku42sF+nOKJuRUmteBVaXtEg++1L2AlBJqwIf5hz7Gfs/Kee/knKvda0K77VS3YFX8pmp\nnWn+epJa3tebSuqU00m6kfKuyxlDOvhF0i5Kdyt5nJTqhKRvSjq7mba0hkagsL/fQzrTs5CkVST9\nrY3rNjOzMub1j/XcDDwVER/VukIOLJ4DRpNuSzYOQNKOkn7WJq1MQcjVJdOuJo0GDgYOyV9kpXfR\nuFrSyJz+8RkpneUZUj7yo6TT5a9VqPMe0inqUbmecmkMs62TR8AfA0ZGxKMl8x8AVstlrkq6UK1s\nvus82s6Fuv8BjJX0EHA/cGJEfAqcQhqZvoN0oSO5jyNyOx8CxpYU9xIwRNKDpIsJK+X3lqq0bx4G\nXJe3c2fSa3otKSXm76SLGwsjooeS9oGHSRc/lgsMAY4mXWA5ipTm9HTJ/GuBX0q6jxS8LSvp4Dxv\nTUkPkNKu/lxj32omaelc9j2FaZFuU/kqKRf8GtK2H8qsaTS9874/Mre/kBJ2OXBJPjP0VjNVz/Za\nV7FPUX2nka7VKHUxcBvptb2YlKtdLp0D0kj6eUr56DNGtiXdkFN9IL13byad3Tolp26VcwOweH59\njyFts5tIF2o/mvv4cIV1W11EjAPuzXVeRz5rIOn2erXBzMygoampWuru/E3SScCXEfG/87ot1n5I\nejIielVfcv6W89qXbyZAtDmgdEHrWoU89PYunyG6Lubylom7Hnt72S+M1s73bGvO221/Okpf3I/2\npz30pbGxW7nUwQXiFzPvBG6T9EFElI5mz1ck/YQ0Kl/qpLn9cm0Lkm4l3c2h2EcRsfu8aE+BpDvI\nd8xZAPyVNLK+brUF27OcNnVfmVkREYfXuz1toa3e3zn9ZTPg+Dktw8zMWl+HHwk3M7PW4ZHw9qWj\n9AM6Tl/cj/anPfRlQR4JNzOzVnDHBbvP8y8zM7OOYl5fmGlmZmZmtsBxEG5mZmZmVmcOws3MzMzM\n6sxBuJmZmZlZnTkINzMzMzOrMwfhZmZmZmZ15iDczMzMzKzOHISbmZmZmdWZg3AzMzMzszpzEG5m\nZmZmVmcOws3MzMzM6sxBuJmZmZlZnTkINzMzMzOrMwfhZmZmZmZ1tvC8boCZmc0fDrv7qWbnn73h\nanVqiZnZ/M8j4WZmZmZmdeYg3MzMzMyszhyEm5mZmZnVmYNwMzMzM7M684WZZgsAST2B54CxRZPH\nRcQxkvoD44AdgWERMaGZcr4ODAWWAToBk4CDgK7A6RFxeAvbtSLwp1zW28CPIuJLSV2By4EewBfA\nARExKa8j4OaIWKeonAbgNWDDiHi3JW2YW5ImRUT3MtNPAnYhtX+viPiwZH5/4EzglTxpcWBoRFzW\nTF3rAF9ExEut1HwzM5tHPBJutuCIiOhT9HdMnjgsIsZFxG+bC8CzXwBPRMRWEbE5MAbYPyImtDQA\nz84ALomILYGXgUPy9J8C/4qIPsBtwBHFnQCmSFq9qJwtgBfrHYBXsXveRncD21RY5sbC60Hqw8/z\nAVMlewLfadVWmpnZPOGRcLMFmKQ+wICI6JufT4qI7pJGAs8DRMSAolWWBDoXnkTEb/J6PYHhEdFL\n0pbA2cBXwHjgsIiYIun2iNi9pAl9SAE3wB3AccAfgJ2BI3Md5UaGrwf2AQbl5z8ErsttORLoB0wH\nbouICyQNAlYBViaNTt8EdMl/RwLPAtcA3yKNSA+KiDvzdrifFER3B3YF3sp1rUg6CKlkuqRGYCdm\nHlxUlM8APAesImk86UzAKqTtfRowMW+riZLeBRZj5nZ+I9exH7BWRBwn6WvA8xHRU9K/SQcD7wJX\nk85mLAJMA34cEa9Xa5+ZmbUuB+FmVsnzZQLgS4D7JO0M/A24ISKeKVnmImDbiHhf0nnA3sBfygTg\nAItHxJf58bvAcpKOArYErpTUBOxbZoT+hlz/IEkLkYL2kyWtDPQljSoDPCLp5vx4kYjYUtKewBsR\ncaikVUgjy0sB90XENXnazcCdeb2PI2JbSb8ljUS/BHSOiE0lbQwcVdopSYuTgtxbgKNz35olaRlg\no1xeP+Dt3MbuwIMRsY6ke0kHO09IehHYPiLGSxqS12mqUHxn4J6IuFfSUOCCiHggv46nAodVa18t\nGhu7tUYxdTE/tbU5HaUf0HH64n60P+21Lw7CzRYcyiO7BfcDjzSz/BOlEyLi5ZyTvTXwfeDvko4H\nHswVLAOsBtyaFmNxUt54LRpqXI6IeFPSRElrkwLopyJicg4qVwNG5EW7AT1L+jMa+I2ky4Bbc2Da\nGdhQ0k9II+hLF1X3cP7/Rp6+BvBobsfjkj4vblvOT78DuIuUo/60pPMk3RIRj5d0ZR9JvUg59csC\nR0XEu5I2A7aUVDiYWFTSIkV1LAU0RcT4PGkE0Bto7td0Cv3fLBWhX5Ny8Sc2s06LTJw4ubWKalON\njd3mm7Y2p6P0AzpOX9yP9qc99KXSQYCDcLMFR+Tc4xkk9S5ZpnPR4ymlBUhaNCI+B+4jjYj/lZQS\n8mDROm+W1tOMT4rKXAF4KyIulrQLcGREvNzMuteRRtm/kR8X6r+rND9d0jaF/kTE25LWJR1I/EzS\nJsB/ScH8lvn/k0WrTy163JD/phdNK722Zl3S6PlASRflCzA3BU4s04cbc+rIYqSLZp8u6sdZEXF9\nST8KD5uY9aBlkdym4pHw4teyUGbh/94R8XaZ9piZWZ34wkyzBdvHwHIw484b1c7Z3S9pu6Ln3wJe\nLTyJiA9yWWvk/0flcit5ANgrP94LuDc/vhs4IJdxoKRTy6w7HNieFDjfnaeNBbaWtJikBkmDJS1a\nvFJu/3YRcR8p9aMXKd/7PxExnZRysgiVRV6HPGLdpWT+J6S0moWAk4DzSKku06kgIj4jXaT6+zzp\ncWD3XMc3JZ2dp08HFs7buUlSjzy9N+nAYcbrycyUnFKPA3vksreR1K+ZvpqZWRtxEG62YHsG+FTS\nI8D+wJtVlj8YOF7SQ5IeJAXOJ5cscyhwtaSHSYFgAEi6vUx5A4GD8rJLkS6OBLgMWFnSKKA/cEXp\nijkQfQcYW8grzxcYXgg8BDwGTMij7MVeBk7JqTnXAueTcrd3lfR34FPgDUmnVdgG95DSQ0YB+1Ky\nzfLo/T2ktJcRuU/flzS8QnmF9a4Hlpe0A+nC0U8kPUpKbSmkxDwMXCRpW1Ie93W5H51JefJ/Z2ba\n0XeZdcS+YBCwh6SHSNt/NGmlC3NOvZmZ1UFDU1Ol63jMzMxmOuzup5r9wjh7w9Xq1ZS50h5yRFtD\nR+kHdJy+uB/tT3voS2Njt7LXPHkk3MzMzMyszhyEm5mZmZnVme+OYmZmNbli5/Xn+WldM7OOwiPh\nZmZmZmZ15iDczMzMzKzOHISbmZmZmdWZg3AzMzMzszpzEG5mZmZmVmcOws3MzMzM6sxBuJmZmZlZ\nnTkINzMzMzOrMwfhZmZmZmZ15iDczMzMzKzOHISbmZmZmdWZg3AzMzMzszpzEG5mZmZmVmcOws3M\nzMzM6mzhed0AMzObP4y971dVl+nxvdPq0BIzs/mfR8LNzMzMzOrMQbiZmZmZWZ05CDczMzMzqzMH\n4WZmZmZmdeYLM82s3ZLUE3gOGFs0eVxEHCOpPzAO2BEYFhETmilnELA/8Cbpc+9V4JcRMaltWt48\nSQcCRwJfAT+KiP+UzO8DDIiIvkXThgHDI+LOOjbVzMzaiINwM2vvIiL6lJk4LD8cV2M5gyNiCEAO\n4P8KbNYK7ZsThwFbALsBPwD+dx61w8zM5pGqQbikHsDSEfF0yfSdIuKeNmuZmVkFpSPFkiZFRHdJ\nI4HnASJiQKX1I2KYpP0lbQqMB/6UZ3UGDoqIVyS9QgrUtwPuIaXvbQ/cExEnSloDGAI0AZOB/sCS\nwDXAK8C6wNMR8eMyTZgKfJ00Ov/bOej/WcCWQCdgSERcn0fKpwBLA3cAOwHLA/sCewD9gOnAbRFx\nQT47sAQgYFXgGH+mm5nVT7NBuKTjgJ8Dn0r6FPhBRIzPs39F+mIyM2tPno+Iy2pY7klgDVJAfEZE\njJB0CHAEcCywMvBH4BTgfaA3cCrwX+BE4GLg8Ij4t6QjSOklfwE2APYB3gXekLRkRHxYqFRSZ+AT\n4O+5nnGSGiKiqaR9vfNBRcF3geGStgRWioitJHUBnpJ0W17m/Yj4SR7p70Ea6e8J9CWNvAM8Iunm\n/HjFiNhZ0o7AT2mFz/TGxm5zW0RdzC/trKaj9AM6Tl/cj/anvfal2kj4vsB3IuILSXsC90nqHRHv\nAg1t3zwzM1QSjN4PPNLM8k/UWG43YBowAbhI0unAN5iZf/5xRLyYG/AJMDYipkoqXNC+EXCFJIAu\nwJg8/eVCfrqkt0ijzTOCcGAYabR+aWAU8BPgPeDGkvaNKpMTDimw3qRomywELFem72MioknSRsBq\nwIiifvfMj/+R/7+R2znXJk6c3BrFtKnGxm7zRTur6Sj9gI7TF/ej/WkPfal0EFAtCP8iIr4AiIhb\n85fPXZK2I52CNTNra7PlhEvqXbJM56LHU2ostxdwBXAG8LeIuExSX2CXPH9qSSOmlqz/GbB18Qh2\nvpC0dLmGovlLAmtGxP6SfkkaUd+SNHJeqynA0Ig4p3hiPhiYUrJc4f9dEXF4yfLblLTVAytmZnVU\n7RaF/5J0TT7lSUQMJ11A9DAzR1LMzOrtY/Lor6R1SKO7NZP0E+C9iHgG6A68IqkB2B1YpMZiniHd\nmQVJ+0ratoZ1pgBfz5+pF5LONr5RnK5Sg8eBXSUtJKmrpIurLD8W2FrSYpIaJA2WtGgL6jMzszZQ\nLQj/GekDf1phQkRcT8ovvLsN22Vm1pxnSNeqPMLMWw9Wc7SkkZKeJl1s2T9P/yMpv/se4AZSLvYO\ntZQHnCxpVC7r6eYXh4j4DDiHlAbyKPAQsKKkETlXvKqIeJSUWjI6rz+2yvKvkwL+h4DHgAkR8Xkt\ndZmZWdtpaGqqnlUi6bcRcWLJtCsrXPVvZmYd0Nj7flX1C6PH906rR1PmSnvIEW0NHaUf0HH64n60\nP+2hL42N3cqm+1W7O8oPgD2B7SQtXzRrEVIeo5mZmZmZtVC1CzPvJd1mqxfpdloF04GBbdUoMzMz\nM7OOrNkgPOcNPiLp4oi4pk5tMjOzdmiDHc6f56d1zcw6imoXZhbsJKlV7iFrZmZmZragq/qz9dmi\nwGuSgqL70EbEVm3SKjMzMzOzDqzWIPzMMtP8Yz1mZmZmZnOgpnSUiBhFuhftf/LfW8Dv2rBdZmZm\nZmYdVk1BuKTjgTeAIAXjT1PDD1OYmZmZmdnsar0wsy/wTeCxiGgE+gHPt1mrzMzMzMw6sFqD8MkR\nMYX0Iz1ExF+B3dusVWZmZmZmHVitF2Z+IGl/4HlJVwP/BJavso6ZmZmZmZVR60j4gcAjwC+AfwPf\nAvZrq0aZmZmZmXVkVUfCJX0DWBn4V/4FzbPbvFVmZmZmZh1YsyPhkn4A/Au4HHhRUq+6tMrMzMzM\nrAOrlo7yK2C9iOgF7AwMbPsmmZmZmZl1bNWC8CkRMQEgIl4AurV9k8zMzMzMOrZqQfj0Ks/NzMzM\nzKyFql2YubykQ4qeL1f8PCKuaptmmZlZe3PGsXfUvOzPTuzTdg0xM+sAqgXho4Eti54/VvS8CXAQ\nbmZmZmbWQs0G4RFxcL0aYmZmZma2oKjpx3okrSvpSUkv5uenStq4bZtmZmZmZtYx1fqLmUOAQ4C3\n8/Mbgf9tkxaZmZmZmXVwVX8xM/sqIp6VBEBEvCRpaqWFJfUE/gNsGhGPFU0fA7wQEf0l3QAcnH+F\ns1mSngT6RsRrFeZvBvwZOCMihtXYp0p1rQC8DuwVEbeVmb8WMCQi+kgaCSwOfAosBtwdEYOaKftC\nYDBwEDApIoaUzJ8UEd2bWf81YDwwFfgaMDQiLmtJ/5pTbTvXm6QBQPdK21TSMGB4RNxZZt5twPSI\n2LNkek+q7Js1tOtrwPMR0bPC/D7AgIjoW/yaStoR2CUiBpRZpxvwV2CPiPiohjb0JPW94g9oSVoS\neAQYUa7Olspnwu6NiGMqzJ8UEd0lDQL2B94kfcZMAA6MiM8qrNcf+Aj4gLzdSuYPJ73nRlZYfxiw\nAfAe0BUYBxwREa1yNydJvyO93sNauF7Nn3FzQ9KywOkRcXiF+bO8FyR1Jf0AWw/gC+CAiJjUlm00\nM7PZ1ToSPlXSyqSLMZG0E9BQZZ1Xgf0KTyR9G/hG4XlE7NuKX07bAhfPbQCe7Qv8O/+vxcER0QfY\nFOgnablKC0bEMRHxn7ls3042aMnMAAAgAElEQVQRsTXQBzhdUqe5LK9Diog9gFUqzG5232wLkroA\n5wEnlpsfEZOBi4CzWrHatYGXWykA34D0nu8rqZbPjcER0ScitgA+BnavtGBEDIuI/5vLJp6U69sE\nWA2Y5+lyrfwZ11w9EyoF4Hl+6Xvhp8C/8ufWbcARbdtCMzMrp9aR8GOB2wFJ+gh4jTSa25zHgO0l\ndYqIaaSg9j7SiHFhVHctUqrLW6SRrB7A/hHxlKSLSIFtAItUqasTaRQNSTcDjUXzpkTEDpJeBv4I\n7Ap0AbYD9gJ2ApYH9o2IN4F+wADgBkmLR8Snkr4F3Ax8CTxToQ3dSCPUnxSPhOY2FUYIR+ayydMX\nBq4DVgTGVOljqaVIo+nT8kjijH6QXq+NSKOCl0XElXm0sMXbuQXbrXS9QUB34NukAODXpJSmnsDO\nEfGqpPOAzUn74ZCI+JOkbYELSaOnb5MCZiSdRbozT6e87PVFdfUgnQmZlss6ICL+C3xYYdtV2zf7\nAGcDXwFv5HZ3AW7J2/QfRXVvWbTseOCwCnXuDTwYEZ/kA6fL83bpDJwWEQ+SAqJzJX0tIj4pqmNd\n4JJcx/RcFkBnSX8GvgM8XSYQK35fnAJsXzL/COB4SvYL4H3S9vyEtK3vJL0vrgT2AHoDI2rZf3Nf\nu5NGxWc521MY4SYdUE4Cni9a73jSgdJ/ga+X3aLl6+tCOkv0Tj5bMKMfwBLAUaT95IWI+El+72xB\n+swQcH5EDJV0AHAC6fX/vLhtuZ5hudzv5v4dTNrWxfUNIX3GLQ1cQ3o9/kv67FwGGEp6z00DfhwR\nr1foU3/SNu8OrAmckrfNGqTX6x3yWZFy79d8gFf8XtgZOBKgNc+kmZlZy9QUhEfEc8A6khqBLyPi\n4xpW+wp4HNgaeIA0EnY60LfMsl0i4vuSfgocKOkLYDNSILkC8HKVulYGns1t3bvCMgsDL0bE+fk0\n8bZ5eg9gs4hoUsq3WSIiHsgB827A9cDPgRsiYrCkE4B1i8q9WtKnwOqkL/DJhbSdGuwAdI6ITfOF\nrkfVsM49kppyfcXL9yBtsy7AaxHxS0mLAq+QgieYs+1cdbs109alImLHHEAflB+fCewm6SlgrYjY\nXNLiwLP5tPk5pCD6GUl3A6/mQHeliNgqB1lP5WUL+gL3R8SZktYHliMFO59K6lEmuKm2b14GbB8R\n4yUNIQWgi5JSEn4haR9mjqRfBGwbEe/ng4q9yQFniW2AQtpMP+DtiDhUUnfgQWCdvA8+SToour9o\n3W8CR0XE05LOIAVed5CCsF1IgeITktbO79WClUmj0ETEWZQZZc/76iz7Bekg6HtAj4h4L498/5B0\nwPQ56aBlBM3vv0dL6gt8C3iOlBZTk5xGcwQpwO1M2oerOUfSccCqwP/lg7yeJf34CbBjRHwo6SFJ\na+d11ya9D1YjHXxfRTqw6kUKrMdWqHPhiNhO0q7AacAvSuorpJudBfxvRPw17yO9gJ8AF+TPmp2B\nU6l8AEdu25bAj4GTcj39SfvhhcVtYvb3623k9wJpX98SuDJ/juwb+VeRW1tjY/v+geX23r5adZR+\nQMfpi/vR/rTXvtQUhEtak/ThvxTQoJm54QdWWfVmYD9JE0iByScVlns4/3+DdBp5DeDxnNM5XtKr\nzbTtBGAd4NAaulJczxL58ZiiQLIfcEN+fB3pS+763J6b8/SRpFHggoMj4vkcHN4qaRxpRLwWawCP\nAkTE45JqOXW9Ux5N/TrwQK6vuB9fSFpK0qPAFGY9KzCn27nadqvkifz/bXIqE2nUbmlSIDIKIJ9t\n+Ccp0OgZEYWzDaNIwe9mwCb5wAhSGlVx2s99wP/l4G14RIzO04eQAvaexSPLWdl9U9JSQFNEjM/L\njSCNQi5UaC9pH0DSMrnNt+b3xOKkEd1yQfjypO1H7s+WkrbIzxeVtEhETMnLrFiy7jukEfLFcjl/\nydNfLrRTKaddpIC3MEJ/Himgr6Z0vwB4JSLey497A/+NiNcl3QT8Wilfv7n9d3Dkax4knQoMIgWa\ntfg2aaT6C9L+XCkILnZSRNyZDxguk3Qo8PeSfrwP3J5fq9VJ+yHA6HxGqbB/Lw1Mjoh3c/srHUA8\nUFgfODc/Lq6vYH3gaICIOD6XOSz9069JI+QTq/TvyXyQ9jbwbG7vO6RR/FLl3q9DgKdIB5t1MXHi\n5HpV1WKNjd3adftq1VH6AR2nL+5H+9Me+lLpIKDWdJSbSHdEGVdtwRIPkD783waGN7NccdDakP+K\nL6qqmIMaEedK2og00nifKqSjVKgHUqBasB8wXdIupC/GVXJgV9yesm2JiC8l3UUaZXqwZHbnCs2v\nuZ9l6vs4B6WbkgLcKQCSepO2Re+I+EpScfA5p9u52narZGqFxw25zcXXFSyS21KuPVNIF6GeU1x4\n0cHg8zllYwfSiOhVEXEtKY1kqzIBOFTeNyu1qxOz7wNTgDdzbm1xu2Z5XlJ2Yb2zilNqqhgMnBsR\n9+bR3q+VlFdaPhHxcD7rcAhwcjPpKFD99e0H9Cw64Fssl1Xr/nML8Icy09vifTE9nyXZhxSEF94X\ni5BSetaNiAmSii/mndP3RWF6YX+G8u+LaWXKmALsHRFvl1m+nObeS80tW5hfeC/8M3++HRkR1c4w\nmplZG6o1CJ8QEWe0tPCImCLpIdIo9XdJp1BrWhX4haQGUtrDylWWf440QthcOkqzJG1IGv3aoGja\nVaT85yCN3I4lpTBUsjHwN1IKwHK5jHVI+eLlBDmtQekOL11a0N4GYEPSqebvFM3qDozPAfhuQKcc\ngFSqvyXbuTWNIeWJ/1bpbiOrki6IfTOnBb1EyhUeTUod+Z2kc0lB8fkRMSP1QdK+wKsRcZukSaTU\niWuBxoj4Z7nKK+2bEfGBpKaiNJbepBzwpUj7wC3kfSAvi6Q1cnBzFDNHy0u9RUrNGJP7sztwvaRv\nAsdExMl5uRWYOcJa0B14JZ9t2ZmU0w6wqtKFwO+Q9oUhJes9SxqFrZaOUlHed3YF1iyM8Eo6kLTf\n3kxt++/GpH0NoCmP6EPlz4NXgNVz3V1J+eotUVxfQTdgag7AVyS9lpXeF+8BS+QD8E9JaTijyyy3\nJWmAYlOg7H6WjSEdGN+Y04keIu0DewB/kLQNsGxEXFdL5+ZQ8XvhbuAAYFB+LVeKiDPbsG4zMyuj\n1hGmeyTtIGkRSQsV/mpc92bgqajhtmsFEfEsKbAeDZxJHoGXtKOkn9VaTgv1A64umXY1Kf91MHCI\npL8x+100rpY0Mqd/fEZKZ3mGlIP5KPAj0oWs5dxDSkUYlespl8Yw2zp5BPwxYGREPFoy/wFgtVzm\nqqQ85HKjkPNqOxfq/gcwNgfC9wMnRsSnpIvOhpNynsfnZR8lpYWMJgUwpekJLwFDJD0IDKRCf8uo\ntG8eBlyXt3Nn0mt6LSkl5u+ktI/CyOehpH3gYVJqQGnwVzCCFLRBCtw+yfvHHeT0gaIDq9LX9GLS\nwdbN+fFBpDSDZ0iB9WhSSkVzgeCc2gn4R0mKxXDSgcgIKu+/R+f3xUjSQdFpefofSAHo1VTItY6I\n90kXMo4mXbxYy0XL5+T6HialmgwuKfM94P6ctjOQlKrze8qMxuf0rEGkA6rh5IsyJS0r6Y9Fi3bN\nI+pnAs0NUgwEDsvbaWXSdhsE7JH3/4GUD/LbymXAyrk9/YErJK0nqW6pKmZmBg1NTdXSekHSh8x+\nh4KmiGgXt8eTdBLpglH/gJDNIOnJaOY+2vWkdG/mJ0j3J/+0wjK7A9+PiFa5ZZykTUm50ru1Rnk2\nk5q5R3171FrvhTOOvaP6F0b2sxP7zG11baY95Ii2ho7SD+g4fXE/2p/20JfGxm5lb+td691Rlmzd\n5rS6O4HbJH0QEaWj2fMVpTs49Csz66SiCw7bDUm3klI1in0UERXvC10Pku4g3zGnPYiIL5QuIv4t\nZe6Co/RjPccAe5bOmwtPAytIuiQijmzFcusuX/dxXplZN0ZErWc+2jVJl5Iudi21U8zF/cbb23vB\nzMySZkfCJR0cEVfnPMbZRMRp5aabmVnH45Hw9qWj9AM6Tl/cj/anPfRlTkfCC3cImNa6zTEzs/nN\naRfsOs+/zMzMOopqQfif8gWYvnLezMzMzKyVVAvCpzL7vYhh5n1x28WFmWZmZmZm85Nmg/CIqPlH\nMszMzMzMrDbNBuGSmr3wck5+wMfMzMzMbEFXLR2l8EMWq+W/h0gpKL1Jtz8zMzMzM7MWqpaOciqA\npL8CG0XEtPy8M3Bj2zfPzMzMzKzjqTXnuwfpYsyCJmCl1m+OmZmZmVnHV9MvZgJ3AS9JGku6d/j6\nwO1t1iozMzMzsw6s1p+tP0XSMGBt0oj46cCbbdguMzMzM7MOqyW3IOwMTAImAisAj7VJi8zMzMzM\nOriaRsIlDQZ2AJYFXgZWBX7Xhu0yMzMzM+uwah0J3ygiVgfGRcSGwPbAYm3XLDMzMzOzjqvWIPzL\n/L+LpIaIGAts3kZtMjMzMzPr0Gq9O0pIOoL0Yz33SwpgybZrlpmZmZlZx1VrEP5T4BvAh8C+wDLA\nOW3VKDMza38e2X2vutTznSuH1aUeM7N5qaZ0lIhoAjYFjoiI64C/4lsUmpmZmZnNkZqCcEnnAocC\nB+dJ/YCL2qpRZmZmZmYdWa0XZvaOiD2BjwEi4kzSr2aamZmZmVkL1RqEf57/NwFI6kTt+eRmZmZm\nZlak1kD60fyz9ctL+iWwFzCqzVplZu2GpNHAgHxr0sK0c4BJEXFBndvyO+D5/HStiDhuDsvZCngx\nIt5t4Xr9gIHAjyPi4SrLLgucHhGHF03rCQyPiF4tb/WMMnYD7o2IKRXmDyL9uvHzpNet75zWZWZm\nbafZIFxSYaT8VGBP4FPST9ZfCNzatk0zs3biOuCHwNiiaXsBW8+b5rSKQ0i/+tuiIBzYDjihWgAO\nEBETgMOrLTcHfgk8CJQNws3MbP5QbSR8KjkFBWjIjxvy8yagUxu1y8zajxuBR4ATACRtALwZEW9K\nWgMYQvo8mAz0J/2GwJ+BT/K8NUkH8dOBOyLibEn7A0cB04AXIuInkvoDWwCNgIDzI2KopANy3W+Q\nUuMKI+Hk9pxDGiA4F7gcWAXoApwWEfdJOqG4fmAMsAewpqS9gF7AsaTPuycj4thybQFeB3YGNpT0\nAXAl8BRwHzC6wnYoN+rdWdKfge8AT0fE4ZKWB4YCi+Rt8uOIeF3SRbl9nYA/5HmbAPdI2hY4jHSh\n/HTgtkpnJiTtWUsfI2JoufXNzKz1NRuER8QsOeOSRkTE/Dz6ZWYtFBHvSnpV0kYR8QRpVPy6PPti\n4PCI+Hf+Qa8jgb8A3wN6RMR7kq4GliMFkD/N6y0O7BgRH0p6SNLaefrawGbAasANkq4CziYFoh8w\n62g8kvYGVoyIAyQdCHwREb1zUDuSFOgeV1x/RNwvaRwwAHgf+DWwaUR8KekmSYVfA56lLRGxnqR7\nSYH1KEmrAHtExAuS/l5hO5SzBrAL6aDiidz3Y4ALIuIBSTsDp+aDh/+JiFUldQb6R8QwSWcCO5HO\nSvYlBdIAj0i6ubQySV+rtY+kA4F5rrGxW4eoox46Sj+g4/TF/Wh/2mtfWnpxZVP1RcysA7oO2Ad4\nAtiNFLgBbARcIQnS6POYPP2ViHgvPx4OPJDLKASm7wO35/VWB5bO00dHxDRJbwBL5OmTC7nbkh4p\nalNhhH2N/LwXKfAmIt6S9KWkpSrUX1xGD+BvuS1LACtVaEupTyPihSrboZyXI2J87s8Y0ij0Zump\nfk0a9Z4YEe9LeknS7cDNwLUl5WxECp5H5OfdgJ5l6pubPs4TEydObtPyGxu7tXkd9dBR+gEdpy/u\nR/vTHvpS6SCgpUF4Q/VFzKwDuhU4WdL1wEsR8UGe/hmwdf5BL2DGxYcz8pUj4meSvksaQR8paRPg\nEmDdiJgg6c6ieqYWPW7If9OLphWfnesJvEAaDf4zs6bLQUrtmF6m/o2KlpkCjI2I7xd3NqdqlLal\nVHFOdqXtUE7pYEZTLmvviHi7eEZE7CRpfVLKyYHADiX131V84Weud5sy7ZzTPpqZWRvxSLiZVRUR\nkyU9C5zMzFQUgGeAHUk5yvsCE4FXCjMlLQEcHRFnAGfku5J0A6bmAHxF0gj2IhWqfg9YQtKSpLzv\nzUn51wB3kfLA/yHpftLo89akNJYVScF7k6TTSur/ep63MBDA6pK+mdNuTifllbdUs9uhxKqSlgPe\nATYk5ZI/TspT/0MOopcFHgV2i4iLgKckFVJxCm0fC5wraTFSrvyFwIll6mutPpqZWSuqdneU8cwa\neH9T0uvkizQjokdbNs7M2pXrSCkR+xdNOxq4XNKJpECwHynIBSAiPpLUKOkJ0oWaj+Y88ftzKsYz\nwHnA70lB5CwiYnq+5d4o4DVKLsqMiImSBpIuWvwh0EfSCFJQf3iF+t+XNIqUprI7KR/7bklfAk8D\nb83Btml2O5R4BjiLlCYyOiL+mft4taT9SJ+5/XM7NstB/ZfAVXn9kcA/gD6kbfYQKd/9toj4PKec\nFG+jzyS1Rh/NzKwVNTQ1VR7clrRSxZlARPy31VtkZmbt0iO771WXs6HfuXJYm5bfHnJEW0NH6Qd0\nnL64H+1Pe+hLY2O3sul+1e6O4iDbzMzMzKyV1fqz9WZmZmZm1kpaemGmmZktoDa//ZZ5flrXzKyj\n8Ei4mZmZmVmdOQg3MzMzM6szB+FmZmZmZnXmINzMzMzMrM4chJuZmZmZ1ZmDcDMzMzOzOnMQbmZm\nZmZWZw7CzczMzMzqzEG4mZmZmVmdOQg3MzMzM6szB+FmZmZmZnXmINzMzMzMrM4chJuZmZmZ1ZmD\ncDMzMzOzOlt4XjfAzMzmDz+88WfzugkLjEu2OW9eN8HM2phHws3MzMzM6sxBuJmZmZlZnTkINzMz\nMzOrMwfhZmZmZmZ15iDcbAEg6R5JEyTt0swy60j6TjPzt5L0zbZpYXWSNpXUJGm9Vi53UguW3U3S\nIq1Zfy63h6SNykwfIGlQC8sa1tzrbGZm7YODcLMFQETsBNxbZbE9gYpBOHAIMM+CcKAfEMC+87AN\nvwRaPQgHtgFmC8LNzKzj8i0KzRYwkjoBlwOrAJ2B04CJwE+BiZImA6fnxTsDvYBdgD2ANSXtFRGv\n57L6AAOA6cDqwPCIOF3SGsAQoAmYDPQHlszze+V1nwT6AlOAoaTgdhrw40L5JW3eixSAXwOcmKcP\nA94CNgB6APtHxFOSLgI2A14AlNerWI+kwcCGwDvAD4HFgGG5zZ2BnwNrApsA90jaFjiMdGAwHbgt\nIi4oaXOlbXAN8AqwLvA0cBIwCPhK0uvAp8CFwATgbeDVXN5ZwJZAJ2BIRFwvaaVcXifgv8BBRfV3\nBu4BziIdvMzWd0kvA7cBmwMfAv8TEdMxM7M25yDcbMHTD3g7Ig6V1B14MCLWkXQvKUgeBfQBkHQ+\ncEtE3C9pHDCgNEAmjeB+l3Rm7TVSAH8xcHhE/FvSEcCRwF8qtOdM4IKIeEDSzsCppAC32HbAvyLi\nIUnvSdo0IkbneV0i4vuSfgocKOkrYAvSwcOapEC3uXqWBq6PiKMlDQd2BL4HPBYR50rqBfw+InpL\nOhPYCViBdACxRS77EUk3l2ybSttgA2Af4F3gDeA4UsA/KSL+KukJ4ICIeEbS3cCrkrYEVoqIrSR1\nAZ6SdBspwP7fvN55uc8FvwduiogRkoZW6PsqwLURcZykx4B1gHEVXiero8bGbq263Pygo/TF/Wh/\n2mtfHISbLXg2A7aUVAggFy2X5yxpO2At4Pgq5T0VEZ/ldQrTNgKuyM+7AGOqtEeSfk0a0Z1YZpl+\nwPX58XXAfkAhCH84/38D2Jg0Iv9YHtF9TtJrVer5IiIey4+fII2c9yIFuETEk5K+XdKejYDVgBH5\neTegJ/B6yTLltsHLETGB1Ji3gCVKyu4ZEc/kx6OARXPbN5E0Mk9fCFgOWB84Orfz+Fzmz0gj4l0i\nYkCVvn8cEc8Wbb/Sttg8MnHi5KrLNDZ2q2m5+UFH6Yv70f60h75UOghwEG7WgUlaEvgsIqaQArep\npLSMsyLi+pJlix93B34H7BgRTVWqmVpm2mfA1sXr5tSJYp3z/ynA3hHxdoU+dAV2AzaQNICUUrGk\npGPK1N+Q/4pTKgptqFRPaf+a8l9D0bROJctMAe6KiMPLtTkrtw16Mvv2aih5Xtz2wnU7U4ChEXFO\n8YKSplH+2p6FgFUkrRYR/6Zy36u1xczM2ogvzDTr2C4BfiCpgZQyEsDjwO4Akr4p6ey87HRmHpgP\nBU4ujNiWmV/NM6S0DiTtm3OoPwaWkdQgaVlg1bzs46R8cyRtI6lfSVm7klJm1oqI9SJiDeBFYOsK\ndb9CCtgbJK0OFIL/SvUsKmmD/HgT4F+kUeut87KbAM+XbIOxwNaSFsv1DJa0aA3boJLibfumkgZy\nWlBu+66SFpLUVdLFefoY0kWdSDojn70AuJqUxz40l1NtG5uZWZ05CDfr2AYBxwCPAHdHxH+Am4BP\nJD0K3MHMdI6HgYsknULKwT5e0sj8921SasRwSWvWUO/RwMmSRpEuSHw6Ij4AHiAFjmcxM1d7ELCH\npIeAgcxMMynoRwoqi11NhbukRMSTwEukwPMY4J+kixEr1fMWsH+ePg34GzCYFMg/CPw29wdgJPAP\n0ij3hcBDwGPAhIj4vNo2KNfebDRpe+8PnAIMJ70243OfHiWlvozOdY7N6w0EDst1rMzM9Bgi4sHc\n958303czM5tHGpqaqp1pNjObf+QLF/eJiGslLU4aNV85IsqlzVgL/PDGn/kLo04u2ea8qsu0h1zX\n1tJR+uJ+tD/toS+Njd3Kpvp5JNzMOpSI+BLYMN8CcQRwqgNwMzNrb3xhppl1OBFx1Lxug5mZWXMc\nhJuZWU1u2ucP8/y0bmtoD6enzcycjmJmZmZmVmcOws3MzMzM6sxBuJmZmZlZnTkINzMzMzOrMwfh\nZmZmZmZ15iDczMzMzKzOHISbmZmZmdWZg3AzMzMzszpzEG5mZmZmVmcOws3MzMzM6sxBuJmZmZlZ\nnTkINzMzMzOrMwfhZmZmZmZ15iDczMzMzKzOHISbmZmZmdXZwvO6AWZmNn/Y9djb53UTzMzq7qoT\nt2mTcj0SbmZmZmZWZw7CzczMzMzqzEG4mZmZmVmdOSe8HZPUE3gOGFs0eVxEHCOpPzAO2BEYFhET\naixzANAduA34QUQMrHG9m4BvR8T6zSxzJPAj4EtgUeDkiHigwrLrtaT+GtrXFZgADIqIC8vM/xrw\nfET0rKGsPsCZwHSgG/CniPh9a7SzPcnb7I/AmhHRK09bDBgGLAN0JW2HN4AbgVMj4qYy5fShBdtL\n0u0RsXsr92VV4G/AVRFxdoVlvg4MJfWtEzAJOCgiPqyw/InAqIgY3Upt3A+4FlguIiaVmT8A6B4R\ng6qUcyXwRERcnp93A54Gtqj1c6BMmU+T+nrMnKxvZmYt5yC8/YuI6FNm4rD8cNwcFjquhetuAzRW\nmpkPGA4DNoyIryStBlwJ/H979x5v13zmcfyThmiKhozUNa4vvqSUaShBJBGXtop5kbokxvU1WiWh\nxTBl3KZtTBQzLiOTQeLezmhJShHUnSGaogwPQg0SE5W6hEhIzvzx/Hbs7Ox9kjjsczbf9+t1XvZZ\ne+21nmf99j559rN+a6lbhH+M/S/JHmQRfgCwWBG+jMYBgyNiuqSewB2SfhERMzoaZBdzDjkGX61a\ntifwaESMkbQecHtEbCJpOPADYLEinGU8Xp90AV5sD9zcqAAvfkgWr+cASDoVGAFcXG/liDj7E45x\nODANGAaM7cB2TgXulXRNRLwL/D1w2cctwIv+wJQOvN7MzJaRi/AWVDqPx0TEsPL7nyNiNUl3A08C\nRMQxVesPJQvT14AZwAvV25C0D3A88CFZgB1fZ7ezI6JN0h7AiTXPjQOeIjunPYAPIuI5YFDZ/y5k\np3Qe8BdgP7Joquz/AmBrsjt5SURMkLQf8KMS0+8j4lhJZwC9AAEbAcdFxC0lhuHA6cDPJW0QES+W\nzuevSlz3Vx2Pk4B9yM7tb+oUbr2BlcpxnAPsUF63MjAeWJX87IyMiCckjQBGAvOBpyLiSEnrAleX\nZcsBBwHTy7HaEFgBOC0iJkt6nuxI71mW7xIR79QZA8oY3wXsWuK/Aji07GcosCZwVVl9eeAQctrZ\n1cAAYAOyqz0A+DHwV2QhSsn3l1W760t2wQHeIMennkbHa53aWCJiWtX79WDgGPJ98XhEHC1pC7Io\nXgC8U+L/WllvAbAZcH1EnFkTQ3fyvYWkC4Etap7fB1ilxFHJ9SeVx5LOA75BvlfGRsSlkiYA1wNP\n0MGxlNS7bP9wsmgeW/Zb77PZi/yys0L5OToiplbF/ZqkK4ETJP1Hya1/2d5A4GfAB8DL5Bfjyvtk\nHWBF8mzRTbV/LyTVfc+Zmdmnw3PCP3uerC7Ai9HAQRGxKzkVZaEyTeNUYOeIGAT0lbRDzTobAG8D\nRMTNETG45ufaiHgceAR4UdIESftJqnzJWxUYXrb/NrB71bZ7A3tExPbAjsDyJaafkQXMjsCGkoaU\nl/SNiG8DxwLfK9v4MrATMIksMA8o6x5UjsdAFu26n0AWittTCrca/whMkTRR0tGSVi3LjwNujYih\nwFHAuWX5isA3I2IHYNNSSA4ju8hDSqxrAgcC75fjsA9wUXn9csAzEbET8CJZTLdnRjku3YHeJb/u\nZOG5JnBW2e/lwA/KF6JbyALwbODYiPigUaEPIOlB4NqSM8B7ZBFXT6PjtVgsNa87Adi35PJo6aL/\nK3BiOftzD3nsIAvYQ8kvDyPrxFD9Hh1Z5z06iyzuh0uaKmm0pC1Lrl8E/lTiGAicVbPtT2Isvwvc\nBNwKbCxp7bK83mdzKPBKOQYjgK/UyfdcYP+S01kR8X5ZfgGwd0TsDPxf2W9vYHKJdT+g+gtM9d+L\nFcpnz8zMqvTps3KHfqPOmKAAAArbSURBVBpxEd71SdLdVT+nLGH9R+osW78UyZCFTbWvAusCt5XO\n2MbAelU770mepv4JSxARB5Pd78fIbt/tkroBrwOXSroHGEJ2XyuvmQU8K2kiWVRcCWwCPBcRs8tq\ndwN/XR5XOtqvkF1xgH2B20oX9lqyQALoBzxYtY2K68lpMn8HXFMnj0vIbvuvgV2A/5G0Jlm0f78c\np3+r2v8sYGLJb7OS32TgYEnnAitExH+T3f67yz6mA3PLlxCA++rk1UhljGeQc4EhC65eZEd1lKR7\nyekXlWM9Gvg+8HZEPLCE7VO+FO0FXC2pW0TMBGZKGldn3UbHq1EsFdcBN0g6DvhtGb9+EfFwef4u\nPhr3qRHxXtV7YqEy1/ow8ixFezk9X+I8mTxjc6ekw0sB27t88biFxaddfRJjORy4LiLmk++//cvy\nep/Nh4ABksaS12HcWieXOcDPgXUrZy8krU5+fn9d3qNDgLXJL5rbSHqA7IhXj0P134vxwFTMzGwR\nr7/+Tod+GvF0lK5vsTnhkgbVrLN81eN5dbaxoOpx7ReveeR0j92pIyLmSNqZPH3+n+1MR7mOLFCe\nBp4uUwKeIQv8y8lu99OSLqp5LRHxLUlfJwuVg4GTgG5Vq/QA5pTHH1Ytr6wzHNhIUqXbvYmkfuX5\nSu4L846IoyRtSnYF75b0jYhYuF1JPcv82iuAKySNB3Yrx2pkVF2oJ6kH2Y3cskwTuKns48nSad0N\nGC3pcqCtTl6V+Orl1ciHDR53I7u4t0XEWEnDgO+U575Unl+9vQ1L6g/MjIiXI+Kxcjajj6S55PSU\nwZLOJL9s/TEiRrZzvAY3iAWAiBgt6Rqy0/w7STvVhNPo+CwiIq6TtA3Z9R3XznSUOaV4nQxMljQJ\nOEPSNPKah0GR1zPMrtl+h8ayTMvZFjhXUhs5Fm8C51HnsxkRM8r+hgBHSdouImq78wAvkN32innA\nq3X+XhxCdsMHlv8+WvOaihHkF00zM2sCd8Jb09vkKXEkfY28I0V7XlXqRhZG1QLYTNJXyvbOrDpV\nnitEPAGsVR7XnY4CHEEWQJXCpBf5/ppZHv+vpFXIwqJHZduS1pc0KiKmRsQJZJfuWfKUfSWvQSxa\nOFD1+jXIjvcmEbFVRGxFTmU5sOS2dVl1SFm/l6TTIuKZUtjMAr5ctb2Ngd9XTstL+kLJ/QXgYeBv\nyvJ+kn5EHvsPSwHet+yvh6QDgM0j4kZyus/W5BmFShx9gQXR4M4cHbAaMK2Mw958dKxHk3PmX5K0\nf6MXk9N6ji8xrk7O9f4zOaVoWkTMj4jTy7iPXMLxahQLkr4g6afk1JrzyO7vesCTkgaU1RqOex3V\n79FG01FuV16fULFOVZwvlwJ8L6B7+XJVibWjY3kgcHFEbFnenyI77xtR57NZYtwlIiaTU2+2brDd\nRUREZU58v/LfkeXvw2rAixGxgPwy0qPBJtqizl1bzMzs0+FOeGt6HHi3nF6+H3h1CeufQp4Cf4m8\nWGuhiHivMh2gdDv/QF50tqzGA5sCD5dO4vLAqNJJvxh4gCyuxwBnkBcFUva1fSl05pK3mHtX0onA\nrZIWAPdHxP01BVTF/uRp/uru4xVkt3NbcrrDneRxaouItyT1kfQIMBt4MCJmqep2dJL+mZyq8B55\nYdzEiLivdNonSLqPnIM9KiLekHS7pCnkuIwBzifnX19UjsV8YBTwHNlJvosshL73MY7zkvw7cCHw\np/LfccoLUdeLvBjvIeAeSbeQt+vrS5nyRJ7RGAtcVnLsSV4UuEBS3Z1FxHPtHK9edWLZrbxugfJC\nwIckvUUWw4+Rx+ni0jH+CznNpOFtMZfRYWXbp5Hd6jfJuf3vAycppxPdSM7dvqTqdc8CYzswlgeS\nZ3gAiLzA+Qry2oV6n83nyWlAJ5Gd8tNhqW/teAQwXtI8Prp49G1gkqTtyLNSr5RjYGZmnahbW1tb\nZ8dgLUDSa8BapZtmnzNlmsqREfFpfHHoMOW88AERMaqzY2lFkrqTX0i3bW+9PY+f6H8wzOxz5/KT\nd+7Q6/v0WbnuNFN3wm1pTSK75Ft2diCfdcrbG15Z56l74hP6nxstYzxbkXP+692VpKu4CzhN0o+j\n/XuFW31TgRs6Owgzs88Td8LNzGypuBNuZp9Hn1Yn3EW4mZktrbb2brfVKvr0Wbnd24a1is9KHvDZ\nycV5dD1dIZdGRbjvjmJmZmZm1mQuws3MzMzMmsxFuJmZmZlZk7kINzMzMzNrMhfhZmZmZmZN5iLc\nzMzMzKzJXISbmZmZmTWZ7xNuZmZmZtZk7oSbmZmZmTWZi3AzMzMzsyZzEW5mZmZm1mQuws3MzMzM\nmsxFuJmZmZlZk7kINzMzMzNrMhfhZmZmZmZNtlxnB2BmZl2bpPOB7YA24NiImNLJIbVL0mDgv4Cn\nyqI/AmOAq4DuwAzgbyNirqQRwHHAAmBcRFzW/IgXJ2lzYCJwfkRcJKkvSxm/pOWBCcB6wHzgsIh4\noYvkMQHoD7xRVjknIm5ugTzGAAPJumk0MIUWHA+om8tetNiYSPpSiWN14IvAPwGP02Jj4k64mZk1\nJGkQsHFEDACOAC7o5JCW1j0RMbj8jATOAi6OiIHA88DhklYETgN2AQYDP5TUu9MiLkpcFwJ3Vi1e\nlviHA29GxI7AT8lCq+ka5AHwD1Vjc3ML5DEE2Lx8Br4J/AstOB7QMBdosTEB9gQejYhBwH7AebTg\nmLgINzOz9gwFbgSIiKeBVSV9uXND+lgGA5PK49+Q/yhvC0yJiLciYg7wALBD54S3iLnAt4HpVcsG\ns/TxDwVuKOveQeflVC+Perp6HvcC3y2P3wRWpDXHA+rn0r3Oel06l4j4ZUSMKb/2BV6hBcfERbiZ\nmbVnDeD1qt9fL8u6un6SJkm6X9KuwIoRMbc8NxNYk8VzqyzvVBHxYSkYqi1L/AuXR8QCoE1Sj083\n6sU1yAPgGEm/k/QLSavR9fOYHxHvll+PAH5LC45H2X+9XObTYmNSIelB4FpyuknLjYmLcDMzWxbd\nOjuApfAccCawN3AIcBmLXgPVKIdWyA2WPf6ulNdVwMkRsTPwGHBGnXW6ZB6S9iYL12Nqnmq58ajJ\npWXHJCK2J+e0X10TS0uMiYtwMzNrz3QW7XyvRV701GVFxKvldHVbREwDXiOn0fQsq6xN5lWbW2V5\nVzR7GeJfuLxcgNYtIuY1MdaGIuLOiHis/DoJ2IIWyEPS7sApwLci4i1aeDxqc2nFMZHUv1ysTIl9\nOeCdVhsTF+FmZtaeycAwAElfB6ZHxDudG1L7JI2QdEJ5vAZ5B4XxwL5llX2BW4GHgW0krSJpJXJe\n6H2dEPLSuIOlj38yH8373RO4q8mxNiTpV5I2LL8OBp6ki+chqRdwDvCdiJhVFrfkeNTLpRXHBNgJ\nOB5A0urASrTgmHRra2vrjP2amVmLkHQ2+Y/eAuDoiHi8k0Nql6SVyXmiqwA9yKkpfwCuJG9n9hJ5\nS7IPJA0DTiRvv3hhRFzTOVF/RFJ/4FxgfeAD4FVgBHlLtSXGL6k7cCmwMXlx5KER8XIXyeNC4GTg\nPWB2yWNmF8/jSHKKxrNViw8psbXMeEDDXMaT01JaaUx6ktPM+gI9yc/4oyzlZ7yr5OEi3MzMzMys\nyTwdxczMzMysyVyEm5mZmZk1mYtwMzMzM7MmcxFuZmZmZtZkLsLNzMzMzJrMRbiZmZmZWZO5CDcz\nMzMza7L/B6/iIjoY1Qo3AAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"tags": []
}
}
]
},
{
"metadata": {
"id": "jE8rtFpHRbqV",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## 2.3 Loading BigQuery Data with Pandas"
]
},
{
"metadata": {
"id": "qR1gtR0H2Ivj",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- Create a Cloud Console account and setup a project: https://console.cloud.google.com\n",
"- [Pandas Documentation](http://pandas.pydata.org/pandas-docs/stable/) & [Pandas BigQuery for Colab ](https://colab.research.google.com/notebooks/snippets/bigquery.ipynb#scrollTo=KcAZ2RHCg_Ze)\n",
"---\n",
"\n",
"> --> **USE CASE**: To work with a dataset for ML, load it into a pandas **dataframe**\n",
"\n",
"> --> Authorization needed? **YES**\n",
"\n",
"> --> Can this be rewritten in *Ads Scripts*? **NO**\n",
"---"
]
},
{
"metadata": {
"id": "Jl6muHefLQaY",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### 2.3.1 Example Query: Group performance data by Word count / N-gram\n",
"\n"
]
},
{
"metadata": {
"id": "ulH03WIlZFkk",
"colab_type": "code",
"outputId": "46423bbf-19f1-4346-a8de-d33bb2ec15c5",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 266
}
},
"cell_type": "code",
"source": [
"####################\n",
"project_id = 'trafficestimator-191922'\n",
"#################### \n",
"\n",
"from google.colab import auth\n",
"auth.authenticate_user()\n",
"import os, pandas as pd\n",
"os.environ[\"GOOGLE_CLOUD_PROJECT\"] = project_id\n",
"\n",
"wordGroupDf = pd.io.gbq.read_gbq('''\n",
" SELECT \n",
" LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 AS WordCount, \n",
" SUM(Impressions) AS Impressions,\n",
" SUM(Clicks) AS Clicks,\n",
" ROUND(SUM(Cost/1000000),2) AS Cost,\n",
" ROUND(SUM(Conversions),2) AS Conversions,\n",
" ROUND(SUM(Cost/1000000) / SUM(Clicks),2) AS Cpc,\n",
" ROUND(SUM(Conversions) / SUM(Clicks)*100 ,2)AS CRInPerc,\n",
" CASE \n",
" WHEN SUM(Conversions) > 0 THEN ROUND(SUM(Cost/1000000) / SUM(Conversions),2)\n",
" ELSE 0\n",
" END AS CPA\n",
" FROM\n",
" [1049448236043.SPOR_Transfer.p_SearchQueryStats_2131444427]\n",
" WHERE\n",
" Clicks > 1 AND Query NOT LIKE '%chuster%'\n",
" GROUP BY \n",
" WordCount\n",
" ORDER BY \n",
" WordCount ASC''', project_id=project_id, verbose=False) # % (sample_count, row_count)\n",
"\n",
"wordGroupDf.head(7)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>WordCount</th>\n",
" <th>Impressions</th>\n",
" <th>Clicks</th>\n",
" <th>Cost</th>\n",
" <th>Conversions</th>\n",
" <th>Cpc</th>\n",
" <th>CRInPerc</th>\n",
" <th>CPA</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>84464</td>\n",
" <td>5219</td>\n",
" <td>2005.78</td>\n",
" <td>107.08</td>\n",
" <td>0.38</td>\n",
" <td>2.05</td>\n",
" <td>18.73</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>297563</td>\n",
" <td>32959</td>\n",
" <td>13318.82</td>\n",
" <td>819.23</td>\n",
" <td>0.40</td>\n",
" <td>2.49</td>\n",
" <td>16.26</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>180769</td>\n",
" <td>33177</td>\n",
" <td>12018.56</td>\n",
" <td>1075.24</td>\n",
" <td>0.36</td>\n",
" <td>3.24</td>\n",
" <td>11.18</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>41037</td>\n",
" <td>17612</td>\n",
" <td>6047.65</td>\n",
" <td>683.29</td>\n",
" <td>0.34</td>\n",
" <td>3.88</td>\n",
" <td>8.85</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>10004</td>\n",
" <td>7400</td>\n",
" <td>2460.86</td>\n",
" <td>318.59</td>\n",
" <td>0.33</td>\n",
" <td>4.31</td>\n",
" <td>7.72</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>3067</td>\n",
" <td>2860</td>\n",
" <td>962.40</td>\n",
" <td>143.71</td>\n",
" <td>0.34</td>\n",
" <td>5.02</td>\n",
" <td>6.70</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>1131</td>\n",
" <td>1034</td>\n",
" <td>329.90</td>\n",
" <td>42.93</td>\n",
" <td>0.32</td>\n",
" <td>4.15</td>\n",
" <td>7.68</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" WordCount Impressions Clicks Cost Conversions Cpc CRInPerc \\\n",
"0 1 84464 5219 2005.78 107.08 0.38 2.05 \n",
"1 2 297563 32959 13318.82 819.23 0.40 2.49 \n",
"2 3 180769 33177 12018.56 1075.24 0.36 3.24 \n",
"3 4 41037 17612 6047.65 683.29 0.34 3.88 \n",
"4 5 10004 7400 2460.86 318.59 0.33 4.31 \n",
"5 6 3067 2860 962.40 143.71 0.34 5.02 \n",
"6 7 1131 1034 329.90 42.93 0.32 4.15 \n",
"\n",
" CPA \n",
"0 18.73 \n",
"1 16.26 \n",
"2 11.18 \n",
"3 8.85 \n",
"4 7.72 \n",
"5 6.70 \n",
"6 7.68 "
]
},
"metadata": {
"tags": []
},
"execution_count": 21
}
]
},
{
"metadata": {
"id": "bu-Z6fJF-IRo",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## 2.4 Using BigQueryML for Prediction\n"
]
},
{
"metadata": {
"id": "zvO-Uxsn0bBo",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"\n",
"### 2.4.1. Introdcution to BigQueryML\n",
"- Documentation: https://cloud.google.com/bigquery/docs/bigqueryml\n",
"- Tutorial for [Linear Regression (Medium)](https://towardsdatascience.com/how-to-do-online-prediction-with-bigquery-ml-db2248c0ae5)\n",
"- Tutorial for [Multi-label classification (Medium)](https://medium.com/analytics-vidhya/multi-class-logistic-regression-using-bigquery-ml-c4a2c8cddad6)"
]
},
{
"metadata": {
"id": "-317PiBVckb3",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### 2.4.2: BQML Demo I: Predicting Product CPA in Shopping by Brand, Category, Device, etc\n"
]
},
{
"metadata": {
"id": "SX02Mejz033T",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"#### STEP 1: **Create the model**\n",
"- Define the model type: linear regression, logistic regression (ie probability of an event) or multi-label classification\n",
"- Set the input features for the prediction\n",
"- Set training parameters if needed"
]
},
{
"metadata": {
"id": "Hzy9r25O-Lv1",
"colab_type": "code",
"outputId": "b130385e-0f82-415d-8aea-8d46987f10af",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 227
}
},
"cell_type": "code",
"source": [
"####################\n",
"project_id = 'trafficestimator-191922'\n",
"#################### \n",
"\n",
"from google.colab import auth\n",
"auth.authenticate_user()\n",
"from google.cloud import bigquery\n",
"client = bigquery.Client(project=project_id)\n",
"\n",
"# Train a model\n",
"train_query = \"\"\"\n",
" create or replace model\n",
" `1049448236043.SPOR_Transfer.cpo_PredictionModel`\n",
" options\n",
" ( model_type='linear_reg',\n",
" input_label_cols=['CostPerConversion'],\n",
" max_iterations=10) AS\n",
" select \n",
" CostPerConversion, AverageCpc, Clicks, Brand, ProductTypeL1, ProductTypeL2, DayOfWeek, Device # \n",
" from `1049448236043.SPOR_Transfer.p_ShoppingProductStats_2131444427`\n",
" where Clicks > 0\n",
" \n",
" limit 100000\n",
" \"\"\"\n",
"training_job = client.query(train_query, location=\"EU\").to_dataframe()\n",
"print(training_job)\n",
"\n",
"training_info = \"\"\"\n",
" select * from ML.TRAINING_INFO(MODEL `1049448236043.SPOR_Transfer.cpo_PredictionModel`)\n",
" \"\"\"\n",
"\n",
"training_info_job = client.query(training_info).to_dataframe()\n",
"print('\\n******\\ntraining_job info\\n')\n",
"print(training_info_job)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Empty DataFrame\n",
"Columns: []\n",
"Index: []\n",
"\n",
"******\n",
"training_job info\n",
"\n",
" training_run iteration loss eval_loss duration_ms \\\n",
"0 0 0 8.205823e+12 1.307841e+13 6311 \n",
"\n",
" learning_rate \n",
"0 None \n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "q5-wD5eU8QsT",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"#### STEP 2: **Evaluate Model Metrics**\n",
"- Interpret [model metrics: F1-score, accuracy, recall, confusion matrix, AuPRC ](https://cloud.google.com/natural-language/automl/docs/evaluate?hl=en)\n",
"\n",
"--> **NOTE**: All cost metrics need to be divided by 100000\n"
]
},
{
"metadata": {
"id": "hJfApeANCNjZ",
"colab_type": "code",
"outputId": "d4bddbe1-e522-4506-f831-320b4de9209e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 103
}
},
"cell_type": "code",
"source": [
"# 2. Perform Evaluation\n",
"\n",
"query_evaluate = \"\"\"\n",
" select * \n",
" from ML.EVALUATE (MODEL `1049448236043.SPOR_Transfer.cpo_PredictionModel`, \n",
" (\n",
" select \n",
" CostPerConversion, AverageCpc, Clicks, Brand, ProductTypeL1, ProductTypeL2, DayOfWeek, Device \n",
" from `1049448236043.SPOR_Transfer.p_ShoppingProductStats_2131444427`\n",
" where Clicks > 0\n",
" ))\n",
" \"\"\"\n",
"\n",
"# NOTE: All cost metrics need to be divided by 100000, not done here\n",
"\n",
"evaluation_job = client.query(query_evaluate).to_dataframe()\n",
"print(evaluation_job)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
" mean_absolute_error mean_squared_error mean_squared_log_error \\\n",
"0 352777.803151 2.471107e+14 123.890247 \n",
"\n",
" median_absolute_error r2_score explained_variance \n",
"0 78407.174374 -45.620518 -45.618866 \n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "0o4yg0zK-40s",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"#### STEP 3: **Predict Output Label**\n",
"- Define the model type: linear regression, logistic regression (ie probability of an event) or multi-label classification\n",
"- Set the input features for the prediction\n",
"- Set training parameters if needed\n",
"\n",
"--> ** NOTE**: All cost metrics need to be divided by 100000"
]
},
{
"metadata": {
"id": "BA2z3CDF98IX",
"colab_type": "code",
"outputId": "e6d9a108-ee74-40a2-c883-a9de6e8e2628",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 292
}
},
"cell_type": "code",
"source": [
"# 3. Perform Prediction\n",
"# Pass in the a) model and b) a select from your original table as arguments\n",
"\n",
"query_predict = \"\"\"\n",
" SELECT *\n",
" FROM ML.PREDICT (MODEL `1049448236043.SPOR_Transfer.cpo_PredictionModel`,\n",
" (\n",
" SELECT\n",
" AverageCpc,Clicks,Brand,ProductTypeL1,ProductTypeL2,DayOfWeek,Device\n",
" FROM\n",
" `1049448236043.SPOR_Transfer.p_ShoppingProductStats_2131444427`\n",
" WHERE\n",
" Clicks > 30\n",
" LIMIT\n",
" 5))\n",
" \"\"\"\n",
"prediction_job = client.query(query_predict).to_dataframe()\n",
"prediction_job['predicted_CostPerConversion'] = prediction_job['predicted_CostPerConversion'].apply(lambda x: x/1000000)\n",
"prediction_job['AverageCpc'] = prediction_job['AverageCpc'].apply(lambda x: x/1000000)\n",
"prediction_job.head()\n"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>predicted_CostPerConversion</th>\n",
" <th>AverageCpc</th>\n",
" <th>Clicks</th>\n",
" <th>Brand</th>\n",
" <th>ProductTypeL1</th>\n",
" <th>ProductTypeL2</th>\n",
" <th>DayOfWeek</th>\n",
" <th>Device</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>10.664417</td>\n",
" <td>0.423766</td>\n",
" <td>77</td>\n",
" <td>garmin</td>\n",
" <td>ausruestung</td>\n",
" <td>elektronik</td>\n",
" <td>SATURDAY</td>\n",
" <td>HIGH_END_MOBILE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>5.191296</td>\n",
" <td>0.325789</td>\n",
" <td>38</td>\n",
" <td>nike</td>\n",
" <td>schuhe</td>\n",
" <td>running-walkingschuhe</td>\n",
" <td>SUNDAY</td>\n",
" <td>HIGH_END_MOBILE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>4.978100</td>\n",
" <td>0.417027</td>\n",
" <td>37</td>\n",
" <td>fivefingers</td>\n",
" <td>schuhe</td>\n",
" <td>sandalen-amphibienschuhe</td>\n",
" <td>SUNDAY</td>\n",
" <td>HIGH_END_MOBILE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>6.070629</td>\n",
" <td>0.196136</td>\n",
" <td>44</td>\n",
" <td>adidas</td>\n",
" <td>schuhe</td>\n",
" <td>running-walkingschuhe</td>\n",
" <td>MONDAY</td>\n",
" <td>HIGH_END_MOBILE</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>4.784602</td>\n",
" <td>1.315625</td>\n",
" <td>32</td>\n",
" <td>on</td>\n",
" <td>schuhe</td>\n",
" <td>running-walkingschuhe</td>\n",
" <td>SATURDAY</td>\n",
" <td>HIGH_END_MOBILE</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" predicted_CostPerConversion AverageCpc Clicks Brand ProductTypeL1 \\\n",
"0 10.664417 0.423766 77 garmin ausruestung \n",
"1 5.191296 0.325789 38 nike schuhe \n",
"2 4.978100 0.417027 37 fivefingers schuhe \n",
"3 6.070629 0.196136 44 adidas schuhe \n",
"4 4.784602 1.315625 32 on schuhe \n",
"\n",
" ProductTypeL2 DayOfWeek Device \n",
"0 elektronik SATURDAY HIGH_END_MOBILE \n",
"1 running-walkingschuhe SUNDAY HIGH_END_MOBILE \n",
"2 sandalen-amphibienschuhe SUNDAY HIGH_END_MOBILE \n",
"3 running-walkingschuhe MONDAY HIGH_END_MOBILE \n",
"4 running-walkingschuhe SATURDAY HIGH_END_MOBILE "
]
},
"metadata": {
"tags": []
},
"execution_count": 12
}
]
},
{
"metadata": {
"id": "A5p0WSQ9EzRb",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"### 2.4.2 BQML Demo Query II: Predict Query Performance by Ngram Components > Next project\n",
"\n",
"> --> **USE CASE**: Calculate performance data for unigrams, bigrams and use as input features for linear regression\n",
"\n",
"> --> **CALL FOR COLLABORATION**: Get in touch if you want to work on this"
]
},
{
"metadata": {
"id": "eXeKf2JTExIu",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"#### STEP 0: **Create an ngram perfomance table **\n",
"- Separate and aggregate performance by bigrams and unigrams, to later use as predictors for query performance"
]
},
{
"metadata": {
"id": "YI8a_p-oD_Q7",
"colab_type": "code",
"outputId": "ea7bf672-ed54-48e2-cd59-e73d08929d19",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 458
}
},
"cell_type": "code",
"source": [
"%%bigquery\n",
"\n",
"--- Unigrams Table\n",
"\n",
"WITH\n",
" unigram_TempTable AS (\n",
" SELECT\n",
" Query,\n",
" LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 AS WordCount,\n",
" LENGTH(Query) AS len,\n",
" unigram,\n",
" QueryMatchTypeWithVariant,\n",
" Conversions,\n",
" ConversionValue,\n",
" Clicks,\n",
" ROUND(Cost/1000000,2) AS Cost,\n",
" Impressions,\n",
" ROUND(AverageCpc/1000000,2) AS Cpc\n",
" FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
" CROSS JOIN\n",
" UNNEST(SPLIT(Query,' ')) AS unigram\n",
" WHERE\n",
" Query NOT LIKE '%chuster%'# Exclude branded queries\n",
" ORDER BY\n",
" Conversions DESC),\n",
" \n",
"--- UniGrams Table * 5\n",
"\n",
"UniGrams1 AS ( \n",
"SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram1_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram1_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram1_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram1_Clicks\n",
"FROM\n",
" unigram_TempTable\n",
"WHERE\n",
" Clicks > 0\n",
"GROUP BY\n",
" unigram\n",
"ORDER BY\n",
" Unigram1_Cost DESC),\n",
"\n",
"\n",
"UniGrams2 AS ( \n",
"SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram2_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram2_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram2_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram2_Clicks\n",
"FROM\n",
" unigram_TempTable\n",
"WHERE\n",
" Clicks > 0\n",
"GROUP BY\n",
" unigram\n",
"ORDER BY\n",
" Unigram2_Cost DESC),\n",
"\n",
"UniGrams3 AS ( \n",
"SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram3_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram3_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram3_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram3_Clicks\n",
"FROM\n",
" unigram_TempTable\n",
"WHERE\n",
" Clicks > 0\n",
"GROUP BY\n",
" unigram\n",
"ORDER BY\n",
" Unigram3_Cost DESC),\n",
" \n",
" \n",
"UniGrams4 AS ( \n",
"SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram4_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram4_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram4_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram4_Clicks\n",
"FROM\n",
" unigram_TempTable\n",
"WHERE\n",
" Clicks > 0\n",
"GROUP BY\n",
" unigram\n",
"ORDER BY\n",
" Unigram4_Cost DESC),\n",
" \n",
"\n",
"UniGrams5 AS ( \n",
"SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram5_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram5_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram5_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram5_Clicks\n",
"FROM\n",
" unigram_TempTable\n",
"WHERE\n",
" Clicks > 0\n",
"GROUP BY\n",
" unigram\n",
"ORDER BY\n",
" Unigram5_Cost DESC),\n",
"\n",
"--- FirstBiGrams Table\n",
"\n",
"FirstBiGrams AS (\n",
"SELECT\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (0)], ' ',SPLIT(Query,' ')[ OFFSET (1)])\n",
" ELSE ''\n",
" END AS FirstBiGram_Word,\n",
" ROUND(SUM(Conversions),2) AS FirstBiGram_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS FirstBiGram_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS FirstBiGram_Cost,\n",
" SUM(Clicks) AS FirstBiGram_Clicks\n",
"FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
"WHERE\n",
" Clicks > 0\n",
" AND LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2\n",
" AND Query NOT LIKE '%chuster%'\n",
"GROUP BY\n",
" FirstBiGram_Word\n",
"ORDER BY\n",
" FirstBiGram_Clicks DESC), \n",
" \n",
" \n",
"--- SecondBiGrams Table\n",
"\n",
"SecondBiGrams AS (\n",
"SELECT\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (1)], ' ',SPLIT(Query,' ')[ OFFSET (2)])\n",
" ELSE ''\n",
" END AS SecondBiGram_Word,\n",
" ROUND(SUM(Conversions),2) AS SecondBiGram_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS SecondBiGram_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS SecondBiGram_Cost,\n",
" SUM(Clicks) AS SecondBiGram_Clicks\n",
"FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
"WHERE\n",
" Clicks > 0\n",
" AND LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2\n",
" AND Query NOT LIKE '%chuster%'\n",
"GROUP BY\n",
" SecondBiGram_Word\n",
"ORDER BY\n",
" SecondBiGram_Clicks DESC), \n",
" \n",
" \n",
"--- ThirdBiGrams Table\n",
"\n",
"ThirdBiGrams AS (\n",
"SELECT\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 3 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (2)], ' ',SPLIT(Query,' ')[ OFFSET (3)])\n",
" ELSE ''\n",
" END AS ThirdBiGram_Word,\n",
" ROUND(SUM(Conversions),2) AS ThirdBiGram_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS ThirdBiGram_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS ThirdBiGram_Cost,\n",
" SUM(Clicks) AS ThirdBiGram_Clicks\n",
"FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
"WHERE\n",
" Clicks > 0\n",
" AND LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 3\n",
" AND Query NOT LIKE '%chuster%'\n",
"GROUP BY\n",
" ThirdBiGram_Word\n",
"ORDER BY\n",
" ThirdBiGram_Clicks DESC), \n",
"\n",
"\n",
"--- FourthBiGrams Table\n",
"\n",
"FourthBiGrams AS (\n",
"SELECT\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 4 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (3)], ' ',SPLIT(Query,' ')[ OFFSET (4)])\n",
" ELSE ''\n",
" END AS FourthBiGram_Word,\n",
" ROUND(SUM(Conversions),2) AS FourthBiGram_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS FourthBiGram_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS FourthBiGram_Cost,\n",
" SUM(Clicks) AS FourthBiGram_Clicks\n",
"FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
"WHERE\n",
" Clicks > 0\n",
" AND LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 4\n",
" AND Query NOT LIKE '%chuster%'\n",
"GROUP BY\n",
" FourthBiGram_Word\n",
"ORDER BY\n",
" FourthBiGram_Clicks DESC), \n",
" \n",
"\n",
"--- AdGroup Performance Table\n",
"\n",
"AdGroup_Perfomance AS (\n",
"SELECT\n",
" AdgroupId,\n",
" ROUND(SUM(Conversions),2) AS Adgroup_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS Adgroup_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS Adgroup_Cost,\n",
" SUM(Clicks) AS Adgroup_Clicks\n",
"FROM\n",
" `trafficestimator-191922.SPOR_Transfer.AdGroupStats_2131444427`\n",
"WHERE \n",
" Clicks > 0\n",
"GROUP BY\n",
" AdgroupId\n",
"ORDER BY\n",
" Adgroup_Cost DESC),\n",
"\n",
"--- Campaign Performance Table\n",
" \n",
"Campaign_Perfomance AS (\n",
"SELECT\n",
" CampaignId,\n",
" ROUND(SUM(Conversions),2) AS Campaign_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS Campaign_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS Campaign_Cost,\n",
" SUM(Clicks) AS Campaign_Clicks\n",
"FROM\n",
" `trafficestimator-191922.SPOR_Transfer.CampaignStats_2131444427`\n",
"WHERE\n",
"Clicks > 0 \n",
"GROUP BY\n",
" CampaignId\n",
"ORDER BY\n",
" Campaign_Cost DESC),\n",
"\n",
"\n",
"--- Main Queries Table\n",
"\n",
"Queries AS \n",
"\n",
"(SELECT\n",
" CampaignId,\n",
" AdgroupId,\n",
" Query,\n",
"\n",
" -- bigrams \n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (0)], ' ',SPLIT(Query,' ')[ OFFSET (1)])\n",
" ELSE ''\n",
" END AS FirstBiGram_Word, \n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (1)], ' ',SPLIT(Query,' ')[ OFFSET (2)])\n",
" ELSE ''\n",
" END AS SecondBiGram_Word,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 3 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (2)], ' ',SPLIT(Query,' ')[ OFFSET (3)])\n",
" ELSE ''\n",
" END AS ThirdBiGram_Word,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 4 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (3)], ' ',SPLIT(Query,' ')[ OFFSET (4)])\n",
" ELSE ''\n",
" END AS FourthBiGram_Word,\n",
"\n",
" -- unigrams\n",
" SPLIT(Query,' ')[OFFSET(0)] AS Unigram_firstWord,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 1 THEN SPLIT(Query,' ')[OFFSET(1)]\n",
" ELSE ''\n",
" END AS Unigram_secondWord,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN SPLIT(Query,' ')[OFFSET(2)]\n",
" ELSE ''\n",
" END AS Unigram_thirdWord,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 3 THEN SPLIT(Query,' ')[OFFSET(3)]\n",
" ELSE ''\n",
" END AS Unigram_fourthWord,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 4 THEN SPLIT(Query,' ')[OFFSET(4)]\n",
" ELSE ''\n",
" END AS Unigram_fifthWord,\n",
" \n",
" ROUND(SUM(Conversions),2) AS Conversions,\n",
" ROUND(SUM(Clicks),2) AS Clicks,\n",
" ROUND(SUM(Cost/1000000),2) AS Cost,\n",
" ROUND(SUM(ConversionValue), 2) AS Revenue # /1000000\n",
"FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
"WHERE\n",
" Clicks > 0\n",
" AND Query NOT LIKE '%chuster%'\n",
"GROUP BY\n",
" CampaignId, AdgroupId, Query\n",
"ORDER BY\n",
" Conversions DESC\n",
"LIMIT\n",
" 50000), \n",
" \n",
"\n",
"--- LEFT JOIN Queries and Ngrams\n",
"\n",
"JoinedData AS(\n",
"SELECT * FROM Queries\n",
"LEFT JOIN FirstBiGrams USING (FirstBiGram_Word)\n",
"LEFT JOIN SecondBiGrams USING (SecondBiGram_Word)\n",
"LEFT JOIN ThirdBiGrams USING (ThirdBiGram_Word)\n",
"LEFT JOIN FourthBiGrams USING (FourthBiGram_Word)\n",
"\n",
"LEFT JOIN AdGroup_Perfomance USING (AdgroupId)\n",
"LEFT JOIN Campaign_Perfomance USING (CampaignId)\n",
"\n",
"LEFT JOIN Unigrams1 ON Queries.Unigram_firstWord = Unigrams1.unigram\n",
"LEFT JOIN Unigrams2 ON Queries.Unigram_secondWord = Unigrams2.unigram\n",
"LEFT JOIN Unigrams3 ON Queries.Unigram_thirdWord = Unigrams3.unigram\n",
"LEFT JOIN Unigrams4 ON Queries.Unigram_fourthWord = Unigrams4.unigram\n",
"LEFT JOIN Unigrams5 ON Queries.Unigram_fifthWord = Unigrams5.unigram),\n",
"\n",
"NgramData_By_Query AS (SELECT \n",
" Query, # Clicks,\tConversions,\tCost, Revenue, \n",
" LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 AS WordCount,\n",
" LENGTH(Query) AS CharLength,\n",
" \n",
" /*CASE\n",
" WHEN Conversions > 0 THEN ROUND(Cost / Conversions, 2)\n",
" ELSE null\n",
" END AS CPO,\n",
" \n",
" CASE\n",
" WHEN Revenue > 0 THEN ROUND(Cost / Revenue, 2)\n",
" ELSE null\n",
" END AS CRR,*/\n",
" \n",
" # ROUND((Conversions * 10 - Cost)/ Clicks,2) AS MarginPerClick_MaxCPO_10,\n",
" ROUND((Revenue * 0.35 - Cost)/ Clicks,2) AS MarginPerClick_maxCRR_35,\n",
" \n",
" # Adgroup_Conversions, Adgroup_ConvValue, Adgroup_Cost, Adgroup_Clicks,\n",
" # ROUND((SecondBiGram_Conversions * 10 - SecondBiGram_Cost)/ SecondBiGram_Clicks,2) AS SecBG_MarginPc_MaxCPO_10,\n",
" # AdgroupId, \n",
" ROUND((Adgroup_ConvValue * 0.35 - Adgroup_Cost)/ Adgroup_Clicks,2) AS AdGr_MarginPc_mCRR_35,\n",
" \n",
" # Campaign_Conversions, Campaign_ConvValue, Campaign_Cost, Campaign_Clicks,\n",
" # ROUND((SecondBiGram_Conversions * 10 - SecondBiGram_Cost)/ SecondBiGram_Clicks,2) AS SecBG_MarginPc_MaxCPO_10,\n",
" # CampaignId,\n",
" ROUND((Campaign_ConvValue * 0.35 - Campaign_Cost)/ Campaign_Clicks,2) AS Camp_MarginPc_mCRR_35,\n",
" \n",
" FirstBiGram_Word AS __FirBG_Word, \n",
" # FirstBiGram_Clicks,\tFirstBiGram_Conversions, FirstBiGram_ConvValue, FirstBiGram_Cost,\n",
" # ROUND((FirstBiGram_Conversions * 10 - FirstBiGram_Cost)/ FirstBiGram_Clicks,2) AS FirBG_MarginPc_MaxCPO_10,\n",
" ROUND((FirstBiGram_ConvValue * 0.35 - FirstBiGram_Cost)/ FirstBiGram_Clicks,2) AS FirBG_MarginPc_mCRR_20,\n",
" \n",
" SecondBiGram_Word AS __SecBG_Word,\t\n",
" # SecondBiGram_Clicks,\tSecondBiGram_Conversions,\tSecondBiGram_ConvValue, SecondBiGram_Cost,\n",
" # ROUND((SecondBiGram_Conversions * 10 - SecondBiGram_Cost)/ SecondBiGram_Clicks,2) AS SecBG_MarginPc_MaxCPO_10,\n",
" ROUND((SecondBiGram_ConvValue * 0.35 - SecondBiGram_Cost)/ SecondBiGram_Clicks,2) AS SecBG_MarginPc_mCRR_35,\n",
" \n",
" ThirdBiGram_Word AS __ThiBG_Word,\t\n",
" # ThirdBiGram_Clicks,\tThirdBiGram_Conversions, ThirdBiGram_ConvValue, ThirdBiGram_Cost,\n",
" # ROUND((ThirdBiGram_Conversions * 10 - ThirdBiGram_Cost)/ ThirdBiGram_Clicks,2) AS ThiBG_MarginPc_MaxCPO_10,\n",
" ROUND((ThirdBiGram_ConvValue * 0.35 - ThirdBiGram_Cost)/ ThirdBiGram_Clicks,2) AS ThiBG_MarginPc_mCRR_35,\n",
" \n",
" FourthBiGram_Word AS __FouBG_Word,\t\n",
" # FourthBiGram_Clicks,\tFourthBiGram_Conversions, FourthBiGram_ConvValue, FourthBiGram_Cost,\n",
" # ROUND((FourthBiGram_Conversions * 10 - FourthBiGram_Cost)/ FourthBiGram_Clicks,2) AS FouBG_Profitpc_MaxCPO_10,\n",
" ROUND((FourthBiGram_ConvValue * 0.35 - FourthBiGram_Cost)/ FourthBiGram_Clicks,2) AS FouBG_MarginPc_mCRR_35,\n",
" \n",
" Unigram_firstWord AS __UniG1_Word, # Unigram1_Clicks,\n",
" ROUND((Unigram1_ConvValue * 0.35 - Unigram1_Cost)/ Unigram1_Clicks,2) AS UniG1_MarginPc_mCRR_35,\n",
" \n",
" Unigram_secondWord, #Unigram2_Clicks,\n",
" ROUND((Unigram2_ConvValue * 0.35 - Unigram2_Cost)/ Unigram2_Clicks,2) AS UniG2_MarginPc_mCRR_35,\n",
" \n",
" Unigram_thirdWord, #Unigram3_Clicks,\n",
" ROUND((Unigram3_ConvValue * 0.35 - Unigram3_Cost)/ Unigram3_Clicks,2) AS UniG3_MarginPc_mCRR_35,\n",
" \n",
" Unigram_fourthWord, #Unigram4_Clicks,\n",
" ROUND((Unigram4_ConvValue * 0.35 - Unigram4_Cost)/ Unigram4_Clicks,2) AS UniG4_MarginPc_mCRR_35,\n",
" \n",
" Unigram_fifthWord, #Unigram5_Clicks,\n",
" ROUND((Unigram5_ConvValue * 0.35 - Unigram5_Cost)/ Unigram5_Clicks,2) AS UniG5_MarginPc_mCRR_35\n",
"FROM JoinedData\n",
"WHERE\n",
" Query IS NOT null\n",
"ORDER BY\n",
" Cost DESC) \n",
"\n",
"SELECT * FROM NgramData_By_Query LIMIT 10"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Query</th>\n",
" <th>WordCount</th>\n",
" <th>CharLength</th>\n",
" <th>MarginPerClick_maxCRR_35</th>\n",
" <th>AdGr_MarginPc_mCRR_35</th>\n",
" <th>Camp_MarginPc_mCRR_35</th>\n",
" <th>__FirBG_Word</th>\n",
" <th>FirBG_MarginPc_mCRR_20</th>\n",
" <th>__SecBG_Word</th>\n",
" <th>SecBG_MarginPc_mCRR_35</th>\n",
" <th>...</th>\n",
" <th>__UniG1_Word</th>\n",
" <th>UniG1_MarginPc_mCRR_35</th>\n",
" <th>Unigram_secondWord</th>\n",
" <th>UniG2_MarginPc_mCRR_35</th>\n",
" <th>Unigram_thirdWord</th>\n",
" <th>UniG3_MarginPc_mCRR_35</th>\n",
" <th>Unigram_fourthWord</th>\n",
" <th>UniG4_MarginPc_mCRR_35</th>\n",
" <th>Unigram_fifthWord</th>\n",
" <th>UniG5_MarginPc_mCRR_35</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>canada goose herren</td>\n",
" <td>3</td>\n",
" <td>19</td>\n",
" <td>-0.24</td>\n",
" <td>0.17</td>\n",
" <td>0.57</td>\n",
" <td>canada goose</td>\n",
" <td>0.92</td>\n",
" <td>goose herren</td>\n",
" <td>-0.15</td>\n",
" <td>...</td>\n",
" <td>canada</td>\n",
" <td>0.00</td>\n",
" <td>goose</td>\n",
" <td>0.00</td>\n",
" <td>herren</td>\n",
" <td>0.01</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>meindl wanderschuhe</td>\n",
" <td>2</td>\n",
" <td>19</td>\n",
" <td>0.50</td>\n",
" <td>0.06</td>\n",
" <td>0.15</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>meindl</td>\n",
" <td>0.01</td>\n",
" <td>wanderschuhe</td>\n",
" <td>0.00</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>on cloud</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" <td>0.15</td>\n",
" <td>0.22</td>\n",
" <td>0.17</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>on</td>\n",
" <td>0.01</td>\n",
" <td>cloud</td>\n",
" <td>0.01</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>on schuhe</td>\n",
" <td>2</td>\n",
" <td>9</td>\n",
" <td>-0.09</td>\n",
" <td>0.12</td>\n",
" <td>0.15</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>on</td>\n",
" <td>0.01</td>\n",
" <td>schuhe</td>\n",
" <td>0.01</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>on schuhe</td>\n",
" <td>2</td>\n",
" <td>9</td>\n",
" <td>-0.31</td>\n",
" <td>0.43</td>\n",
" <td>0.36</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>on</td>\n",
" <td>0.01</td>\n",
" <td>schuhe</td>\n",
" <td>0.01</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>on cloud</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" <td>0.95</td>\n",
" <td>1.28</td>\n",
" <td>0.81</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>on</td>\n",
" <td>0.01</td>\n",
" <td>cloud</td>\n",
" <td>0.01</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>on cloud</td>\n",
" <td>2</td>\n",
" <td>8</td>\n",
" <td>-0.07</td>\n",
" <td>0.73</td>\n",
" <td>0.57</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>on</td>\n",
" <td>0.01</td>\n",
" <td>cloud</td>\n",
" <td>0.01</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>lowa schuhe</td>\n",
" <td>2</td>\n",
" <td>11</td>\n",
" <td>-0.61</td>\n",
" <td>-0.03</td>\n",
" <td>0.15</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>lowa</td>\n",
" <td>0.01</td>\n",
" <td>schuhe</td>\n",
" <td>0.01</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>canada goose damen</td>\n",
" <td>3</td>\n",
" <td>18</td>\n",
" <td>0.69</td>\n",
" <td>0.17</td>\n",
" <td>0.57</td>\n",
" <td>canada goose</td>\n",
" <td>0.92</td>\n",
" <td>goose damen</td>\n",
" <td>1.11</td>\n",
" <td>...</td>\n",
" <td>canada</td>\n",
" <td>0.00</td>\n",
" <td>goose</td>\n",
" <td>0.00</td>\n",
" <td>damen</td>\n",
" <td>0.01</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>ecco sandalen</td>\n",
" <td>2</td>\n",
" <td>13</td>\n",
" <td>-0.17</td>\n",
" <td>0.20</td>\n",
" <td>0.15</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td>...</td>\n",
" <td>ecco</td>\n",
" <td>0.01</td>\n",
" <td>sandalen</td>\n",
" <td>0.01</td>\n",
" <td></td>\n",
" <td>NaN</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" <td></td>\n",
" <td>None</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>10 rows × 24 columns</p>\n",
"</div>"
],
"text/plain": [
" Query WordCount CharLength MarginPerClick_maxCRR_35 \\\n",
"0 canada goose herren 3 19 -0.24 \n",
"1 meindl wanderschuhe 2 19 0.50 \n",
"2 on cloud 2 8 0.15 \n",
"3 on schuhe 2 9 -0.09 \n",
"4 on schuhe 2 9 -0.31 \n",
"5 on cloud 2 8 0.95 \n",
"6 on cloud 2 8 -0.07 \n",
"7 lowa schuhe 2 11 -0.61 \n",
"8 canada goose damen 3 18 0.69 \n",
"9 ecco sandalen 2 13 -0.17 \n",
"\n",
" AdGr_MarginPc_mCRR_35 Camp_MarginPc_mCRR_35 __FirBG_Word \\\n",
"0 0.17 0.57 canada goose \n",
"1 0.06 0.15 \n",
"2 0.22 0.17 \n",
"3 0.12 0.15 \n",
"4 0.43 0.36 \n",
"5 1.28 0.81 \n",
"6 0.73 0.57 \n",
"7 -0.03 0.15 \n",
"8 0.17 0.57 canada goose \n",
"9 0.20 0.15 \n",
"\n",
" FirBG_MarginPc_mCRR_20 __SecBG_Word SecBG_MarginPc_mCRR_35 \\\n",
"0 0.92 goose herren -0.15 \n",
"1 NaN NaN \n",
"2 NaN NaN \n",
"3 NaN NaN \n",
"4 NaN NaN \n",
"5 NaN NaN \n",
"6 NaN NaN \n",
"7 NaN NaN \n",
"8 0.92 goose damen 1.11 \n",
"9 NaN NaN \n",
"\n",
" ... __UniG1_Word UniG1_MarginPc_mCRR_35 \\\n",
"0 ... canada 0.00 \n",
"1 ... meindl 0.01 \n",
"2 ... on 0.01 \n",
"3 ... on 0.01 \n",
"4 ... on 0.01 \n",
"5 ... on 0.01 \n",
"6 ... on 0.01 \n",
"7 ... lowa 0.01 \n",
"8 ... canada 0.00 \n",
"9 ... ecco 0.01 \n",
"\n",
" Unigram_secondWord UniG2_MarginPc_mCRR_35 Unigram_thirdWord \\\n",
"0 goose 0.00 herren \n",
"1 wanderschuhe 0.00 \n",
"2 cloud 0.01 \n",
"3 schuhe 0.01 \n",
"4 schuhe 0.01 \n",
"5 cloud 0.01 \n",
"6 cloud 0.01 \n",
"7 schuhe 0.01 \n",
"8 goose 0.00 damen \n",
"9 sandalen 0.01 \n",
"\n",
" UniG3_MarginPc_mCRR_35 Unigram_fourthWord UniG4_MarginPc_mCRR_35 \\\n",
"0 0.01 None \n",
"1 NaN None \n",
"2 NaN None \n",
"3 NaN None \n",
"4 NaN None \n",
"5 NaN None \n",
"6 NaN None \n",
"7 NaN None \n",
"8 0.01 None \n",
"9 NaN None \n",
"\n",
" Unigram_fifthWord UniG5_MarginPc_mCRR_35 \n",
"0 None \n",
"1 None \n",
"2 None \n",
"3 None \n",
"4 None \n",
"5 None \n",
"6 None \n",
"7 None \n",
"8 None \n",
"9 None \n",
"\n",
"[10 rows x 24 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 29
}
]
},
{
"metadata": {
"id": "omSN3SxBEr8o",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"#### STEP 1: **Create the model**\n",
"- Define the model type: linear regression, logistic regression (ie probability of an event) or multi-label classification\n",
"- Set the input features for the prediction\n",
"- Set training parameters if needed"
]
},
{
"metadata": {
"id": "ajVAj6QNF-NW",
"colab_type": "code",
"colab": {}
},
"cell_type": "code",
"source": [
"# NGRAM PREDICTION MODEL\n",
"\n",
"from google.colab import auth\n",
"auth.authenticate_user()\n",
"\n",
"from google.cloud import bigquery\n",
"client = bigquery.Client(project=project_id)\n",
"\n",
"# Train a model\n",
"train_query = \"\"\"\n",
" create or replace model\n",
" `1049448236043.SPOR_Transfer.marginByQuery_PredictionModel`\n",
" options\n",
" ( model_type='linear_reg',\n",
" input_label_cols=['MarginPerClick_maxCRR_35'],\n",
" max_iterations=10) AS\n",
"\n",
" --- Unigrams Table\n",
" WITH\n",
" unigram_TempTable AS (\n",
" SELECT\n",
" Query,\n",
" LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 AS WordCount,\n",
" LENGTH(Query) AS len,\n",
" unigram,\n",
" QueryMatchTypeWithVariant,\n",
" Conversions,\n",
" ConversionValue,\n",
" Clicks,\n",
" ROUND(Cost/1000000,2) AS Cost,\n",
" Impressions,\n",
" ROUND(AverageCpc/1000000,2) AS Cpc\n",
" FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
" CROSS JOIN\n",
" UNNEST(SPLIT(Query,' ')) AS unigram\n",
" WHERE\n",
" Query NOT LIKE '%chuster%'\n",
" ORDER BY\n",
" Conversions DESC),\n",
"\n",
" --- UniGrams Table * 5\n",
"\n",
" UniGrams1 AS ( \n",
" SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram1_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram1_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram1_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram1_Clicks\n",
" FROM\n",
" unigram_TempTable\n",
" WHERE\n",
" Clicks > 0\n",
" GROUP BY\n",
" unigram\n",
" ORDER BY\n",
" Unigram1_Cost DESC),\n",
"\n",
"\n",
" UniGrams2 AS ( \n",
" SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram2_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram2_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram2_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram2_Clicks\n",
" FROM\n",
" unigram_TempTable\n",
" WHERE\n",
" Clicks > 0\n",
" GROUP BY\n",
" unigram\n",
" ORDER BY\n",
" Unigram2_Cost DESC),\n",
"\n",
" UniGrams3 AS ( \n",
" SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram3_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram3_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram3_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram3_Clicks\n",
" FROM\n",
" unigram_TempTable\n",
" WHERE\n",
" Clicks > 0\n",
" GROUP BY\n",
" unigram\n",
" ORDER BY\n",
" Unigram3_Cost DESC),\n",
"\n",
"\n",
" UniGrams4 AS ( \n",
" SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram4_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram4_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram4_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram4_Clicks\n",
" FROM\n",
" unigram_TempTable\n",
" WHERE\n",
" Clicks > 0\n",
" GROUP BY\n",
" unigram\n",
" ORDER BY\n",
" Unigram4_Cost DESC),\n",
"\n",
"\n",
" UniGrams5 AS ( \n",
" SELECT\n",
" unigram,\n",
" ROUND(SUM(Conversions),2) AS Unigram5_Conversions,\n",
" ROUND(SUM(Conversions),2) AS Unigram5_ConvValue,\n",
" ROUND(SUM(cost/1000000),2) AS Unigram5_Cost,\n",
" ROUND(SUM(clicks),0) AS Unigram5_Clicks\n",
" FROM\n",
" unigram_TempTable\n",
" WHERE\n",
" Clicks > 0\n",
" GROUP BY\n",
" unigram\n",
" ORDER BY\n",
" Unigram5_Cost DESC),\n",
"\n",
" --- FirstBiGrams Table\n",
"\n",
" FirstBiGrams AS (\n",
" SELECT\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (0)], ' ',SPLIT(Query,' ')[ OFFSET (1)])\n",
" ELSE ''\n",
" END AS FirstBiGram_Word,\n",
" ROUND(SUM(Conversions),2) AS FirstBiGram_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS FirstBiGram_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS FirstBiGram_Cost,\n",
" SUM(Clicks) AS FirstBiGram_Clicks\n",
" FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
" WHERE\n",
" Clicks > 0\n",
" AND LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2\n",
" AND Query NOT LIKE '%chuster%'\n",
" GROUP BY\n",
" FirstBiGram_Word\n",
" ORDER BY\n",
" FirstBiGram_Clicks DESC), \n",
"\n",
"\n",
" --- SecondBiGrams Table\n",
"\n",
" SecondBiGrams AS (\n",
" SELECT\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (1)], ' ',SPLIT(Query,' ')[ OFFSET (2)])\n",
" ELSE ''\n",
" END AS SecondBiGram_Word,\n",
" ROUND(SUM(Conversions),2) AS SecondBiGram_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS SecondBiGram_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS SecondBiGram_Cost,\n",
" SUM(Clicks) AS SecondBiGram_Clicks\n",
" FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
" WHERE\n",
" Clicks > 0\n",
" AND LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2\n",
" AND Query NOT LIKE '%chuster%'\n",
" GROUP BY\n",
" SecondBiGram_Word\n",
" ORDER BY\n",
" SecondBiGram_Clicks DESC), \n",
"\n",
"\n",
" --- ThirdBiGrams Table\n",
"\n",
" ThirdBiGrams AS (\n",
" SELECT\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 3 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (2)], ' ',SPLIT(Query,' ')[ OFFSET (3)])\n",
" ELSE ''\n",
" END AS ThirdBiGram_Word,\n",
" ROUND(SUM(Conversions),2) AS ThirdBiGram_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS ThirdBiGram_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS ThirdBiGram_Cost,\n",
" SUM(Clicks) AS ThirdBiGram_Clicks\n",
" FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
" WHERE\n",
" Clicks > 0\n",
" AND LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 3\n",
" AND Query NOT LIKE '%chuster%'\n",
" GROUP BY\n",
" ThirdBiGram_Word\n",
" ORDER BY\n",
" ThirdBiGram_Clicks DESC), \n",
"\n",
"\n",
" --- FourthBiGrams Table\n",
"\n",
" FourthBiGrams AS (\n",
" SELECT\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 4 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (3)], ' ',SPLIT(Query,' ')[ OFFSET (4)])\n",
" ELSE ''\n",
" END AS FourthBiGram_Word,\n",
" ROUND(SUM(Conversions),2) AS FourthBiGram_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS FourthBiGram_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS FourthBiGram_Cost,\n",
" SUM(Clicks) AS FourthBiGram_Clicks\n",
" FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
" WHERE\n",
" Clicks > 0\n",
" AND LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 4\n",
" AND Query NOT LIKE '%chuster%'\n",
" GROUP BY\n",
" FourthBiGram_Word\n",
" ORDER BY\n",
" FourthBiGram_Clicks DESC), \n",
"\n",
"\n",
" --- AdGroup Performance Table\n",
"\n",
" AdGroup_Perfomance AS (\n",
" SELECT\n",
" AdgroupId,\n",
" ROUND(SUM(Conversions),2) AS Adgroup_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS Adgroup_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS Adgroup_Cost,\n",
" SUM(Clicks) AS Adgroup_Clicks\n",
" FROM\n",
" `trafficestimator-191922.SPOR_Transfer.AdGroupStats_2131444427`\n",
" WHERE \n",
" Clicks > 0\n",
" GROUP BY\n",
" AdgroupId\n",
" ORDER BY\n",
" Adgroup_Cost DESC),\n",
"\n",
" --- Campaign Performance Table\n",
"\n",
" Campaign_Perfomance AS (\n",
" SELECT\n",
" CampaignId,\n",
" ROUND(SUM(Conversions),2) AS Campaign_Conversions,\n",
" ROUND(SUM(ConversionValue),2) AS Campaign_ConvValue,\n",
" ROUND(SUM(Cost/1000000),2) AS Campaign_Cost,\n",
" SUM(Clicks) AS Campaign_Clicks\n",
" FROM\n",
" `trafficestimator-191922.SPOR_Transfer.CampaignStats_2131444427`\n",
" WHERE\n",
" Clicks > 0 \n",
" GROUP BY\n",
" CampaignId\n",
" ORDER BY\n",
" Campaign_Cost DESC),\n",
"\n",
"\n",
" --- Main Queries Table\n",
"\n",
" Queries AS \n",
"\n",
" (SELECT\n",
" CampaignId,\n",
" AdgroupId,\n",
" Query,\n",
"\n",
" -- bigrams \n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (0)], ' ',SPLIT(Query,' ')[ OFFSET (1)])\n",
" ELSE ''\n",
" END AS FirstBiGram_Word, \n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (1)], ' ',SPLIT(Query,' ')[ OFFSET (2)])\n",
" ELSE ''\n",
" END AS SecondBiGram_Word,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 3 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (2)], ' ',SPLIT(Query,' ')[ OFFSET (3)])\n",
" ELSE ''\n",
" END AS ThirdBiGram_Word,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 4 THEN CONCAT (SPLIT(Query,' ')[ OFFSET (3)], ' ',SPLIT(Query,' ')[ OFFSET (4)])\n",
" ELSE ''\n",
" END AS FourthBiGram_Word,\n",
"\n",
" -- unigrams\n",
" SPLIT(Query,' ')[OFFSET(0)] AS Unigram_firstWord,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 1 THEN SPLIT(Query,' ')[OFFSET(1)]\n",
" ELSE ''\n",
" END AS Unigram_secondWord,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 2 THEN SPLIT(Query,' ')[OFFSET(2)]\n",
" ELSE ''\n",
" END AS Unigram_thirdWord,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 3 THEN SPLIT(Query,' ')[OFFSET(3)]\n",
" ELSE ''\n",
" END AS Unigram_fourthWord,\n",
" CASE\n",
" WHEN LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 > 4 THEN SPLIT(Query,' ')[OFFSET(4)]\n",
" ELSE ''\n",
" END AS Unigram_fifthWord,\n",
"\n",
" ROUND(SUM(Conversions),2) AS Conversions,\n",
" ROUND(SUM(Clicks),2) AS Clicks,\n",
" ROUND(SUM(Cost/1000000),2) AS Cost,\n",
" ROUND(SUM(ConversionValue), 2) AS Revenue # /1000000\n",
" FROM\n",
" `trafficestimator-191922.SPOR_Transfer.SearchQueryStats_2131444427`\n",
" WHERE\n",
" Clicks > 0\n",
" AND Query NOT LIKE '%chuster%'\n",
" GROUP BY\n",
" CampaignId, AdgroupId, Query\n",
" ORDER BY\n",
" Conversions DESC\n",
" LIMIT\n",
" 50000), \n",
"\n",
"\n",
" --- LEFT JOIN Queries and Ngrams\n",
"\n",
" JoinedData AS(\n",
" SELECT * FROM Queries\n",
" LEFT JOIN FirstBiGrams USING (FirstBiGram_Word)\n",
" LEFT JOIN SecondBiGrams USING (SecondBiGram_Word)\n",
" LEFT JOIN ThirdBiGrams USING (ThirdBiGram_Word)\n",
" LEFT JOIN FourthBiGrams USING (FourthBiGram_Word)\n",
"\n",
" LEFT JOIN AdGroup_Perfomance USING (AdgroupId)\n",
" LEFT JOIN Campaign_Perfomance USING (CampaignId)\n",
"\n",
" LEFT JOIN Unigrams1 ON Queries.Unigram_firstWord = Unigrams1.unigram\n",
" LEFT JOIN Unigrams2 ON Queries.Unigram_secondWord = Unigrams2.unigram\n",
" LEFT JOIN Unigrams3 ON Queries.Unigram_thirdWord = Unigrams3.unigram\n",
" LEFT JOIN Unigrams4 ON Queries.Unigram_fourthWord = Unigrams4.unigram\n",
" LEFT JOIN Unigrams5 ON Queries.Unigram_fifthWord = Unigrams5.unigram),\n",
"\n",
" NgramData_By_Query AS (SELECT \n",
" Query, # Clicks,\tConversions,\tCost, Revenue, \n",
" LENGTH(Query) - LENGTH(REPLACE(Query, ' ', '')) + 1 AS WordCount,\n",
" LENGTH(Query) AS CharLength,\n",
"\n",
" /*CASE\n",
" WHEN Conversions > 0 THEN ROUND(Cost / Conversions, 2)\n",
" ELSE null\n",
" END AS CPO,\n",
"\n",
" CASE\n",
" WHEN Revenue > 0 THEN ROUND(Cost / Revenue, 2)\n",
" ELSE null\n",
" END AS CRR,*/\n",
"\n",
" # ROUND((Conversions * 10 - Cost)/ Clicks,2) AS MarginPerClick_MaxCPO_10,\n",
" ROUND((Revenue * 0.35 - Cost)/ Clicks,2) AS MarginPerClick_maxCRR_35,\n",
"\n",
" # Adgroup_Conversions, Adgroup_ConvValue, Adgroup_Cost, Adgroup_Clicks,\n",
" # ROUND((SecondBiGram_Conversions * 10 - SecondBiGram_Cost)/ SecondBiGram_Clicks,2) AS SecBG_MarginPc_MaxCPO_10,\n",
" AdgroupId, \n",
" ROUND((Adgroup_ConvValue * 0.35 - Adgroup_Cost)/ Adgroup_Clicks,2) AS AdGr_MarginPc_mCRR_35,\n",
"\n",
" # Campaign_Conversions, Campaign_ConvValue, Campaign_Cost, Campaign_Clicks,\n",
" # ROUND((SecondBiGram_Conversions * 10 - SecondBiGram_Cost)/ SecondBiGram_Clicks,2) AS SecBG_MarginPc_MaxCPO_10,\n",
" CampaignId,\n",
" ROUND((Campaign_ConvValue * 0.35 - Campaign_Cost)/ Campaign_Clicks,2) AS Camp_MarginPc_mCRR_35,\n",
"\n",
" # FirstBiGram_Word AS __FirBG_Word, \n",
" # FirstBiGram_Clicks,\tFirstBiGram_Conversions, FirstBiGram_ConvValue, FirstBiGram_Cost,\n",
" # ROUND((FirstBiGram_Conversions * 10 - FirstBiGram_Cost)/ FirstBiGram_Clicks,2) AS FirBG_MarginPc_MaxCPO_10,\n",
" ROUND((FirstBiGram_ConvValue * 0.35 - FirstBiGram_Cost)/ FirstBiGram_Clicks,2) AS FirBG_MarginPc_mCRR_20,\n",
"\n",
" # SecondBiGram_Word AS __SecBG_Word,\t\n",
" # SecondBiGram_Clicks,\tSecondBiGram_Conversions,\tSecondBiGram_ConvValue, SecondBiGram_Cost,\n",
" # ROUND((SecondBiGram_Conversions * 10 - SecondBiGram_Cost)/ SecondBiGram_Clicks,2) AS SecBG_MarginPc_MaxCPO_10,\n",
" ROUND((SecondBiGram_ConvValue * 0.35 - SecondBiGram_Cost)/ SecondBiGram_Clicks,2) AS SecBG_MarginPc_mCRR_35,\n",
"\n",
" # ThirdBiGram_Word AS __ThiBG_Word,\t\n",
" # ThirdBiGram_Clicks,\tThirdBiGram_Conversions, ThirdBiGram_ConvValue, ThirdBiGram_Cost,\n",
" # ROUND((ThirdBiGram_Conversions * 10 - ThirdBiGram_Cost)/ ThirdBiGram_Clicks,2) AS ThiBG_MarginPc_MaxCPO_10,\n",
" ROUND((ThirdBiGram_ConvValue * 0.35 - ThirdBiGram_Cost)/ ThirdBiGram_Clicks,2) AS ThiBG_MarginPc_mCRR_35,\n",
"\n",
" # FourthBiGram_Word AS __FouBG_Word,\t\n",
" # FourthBiGram_Clicks,\tFourthBiGram_Conversions, FourthBiGram_ConvValue, FourthBiGram_Cost,\n",
" # ROUND((FourthBiGram_Conversions * 10 - FourthBiGram_Cost)/ FourthBiGram_Clicks,2) AS FouBG_Profitpc_MaxCPO_10,\n",
" ROUND((FourthBiGram_ConvValue * 0.35 - FourthBiGram_Cost)/ FourthBiGram_Clicks,2) AS FouBG_MarginPc_mCRR_35,\n",
"\n",
" # Unigram_firstWord AS __UniG1_Word, # Unigram1_Clicks,\n",
" ROUND((Unigram1_ConvValue * 0.35 - Unigram1_Cost)/ Unigram1_Clicks,2) AS UniG1_MarginPc_mCRR_35,\n",
"\n",
" # Unigram_secondWord, #Unigram2_Clicks,\n",
" ROUND((Unigram2_ConvValue * 0.35 - Unigram2_Cost)/ Unigram2_Clicks,2) AS UniG2_MarginPc_mCRR_35,\n",
"\n",
" # Unigram_thirdWord, #Unigram3_Clicks,\n",
" ROUND((Unigram3_ConvValue * 0.35 - Unigram3_Cost)/ Unigram3_Clicks,2) AS UniG3_MarginPc_mCRR_35,\n",
"\n",
" # Unigram_fourthWord, #Unigram4_Clicks,\n",
" ROUND((Unigram4_ConvValue * 0.35 - Unigram4_Cost)/ Unigram4_Clicks,2) AS UniG4_MarginPc_mCRR_35,\n",
"\n",
" # Unigram_fifthWord, #Unigram5_Clicks,\n",
" ROUND((Unigram5_ConvValue * 0.35 - Unigram5_Cost)/ Unigram5_Clicks,2) AS UniG5_MarginPc_mCRR_35\n",
" FROM JoinedData\n",
" WHERE\n",
" Query IS NOT null\n",
" ORDER BY\n",
" Cost DESC)\n",
"\n",
"\n",
" select MarginPerClick_maxCRR_35, Query, WordCount,CharLength, AdgroupId, AdGr_MarginPc_mCRR_35, CampaignId,\tCamp_MarginPc_mCRR_35 \n",
" # Additional Columns: , FirBG_MarginPc_mCRR_20,\tSecBG_MarginPc_mCRR_35,\tThiBG_MarginPc_mCRR_35,\tFouBG_MarginPc_mCRR_35,\tUniG1_MarginPc_mCRR_35,\tUniG2_MarginPc_mCRR_35,\tUniG3_MarginPc_mCRR_35,\tUniG4_MarginPc_mCRR_35,\tUniG5_MarginPc_mCRR_35\n",
" from NgramData_By_Query\n",
"\n",
" \n",
" \"\"\"\n",
"training_job = client.query(train_query, location=\"EU\").to_dataframe()\n",
"print(training_job)"
],
"execution_count": 0,
"outputs": []
},
{
"metadata": {
"id": "_VGEVeWOsKlk",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"#### STEP 1.1: **Get training info**\n"
]
},
{
"metadata": {
"id": "hy5kzjZGRZNL",
"colab_type": "code",
"outputId": "3773d3e8-8a6b-4971-d2dd-d5132f6a92c8",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 68
}
},
"cell_type": "code",
"source": [
"training_info = \"\"\"\n",
" select\n",
" *\n",
" from\n",
" ML.TRAINING_INFO(MODEL `1049448236043.SPOR_Transfer.marginByQuery_PredictionModel`)\n",
" \"\"\"\n",
"\n",
"training_info_job = client.query(training_info).to_dataframe()\n",
"print(training_info_job)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
" training_run iteration loss eval_loss duration_ms learning_rate\n",
"0 0 1 65.278271 285.882289 5917 0.4\n",
"1 0 0 164.848539 290.005527 4439 0.2\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "dxowKJrUGpYk",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"OH SH... the prediction model is not converging aka ** CRAP** > Back to the drawing board!\n",
"\n",
"![crap](https://user-images.githubusercontent.com/6991865/52307346-a74c2b00-299a-11e9-9135-cd5ff49d89e1.png)"
]
},
{
"metadata": {
"id": "utgKaJZte1KV",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"\n",
"\n",
"---\n",
"---\n",
"---\n",
"\n",
"\n",
"# PART 3. TEXT SUMMARIZATION"
]
},
{
"metadata": {
"id": "rEHm08XApAcF",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"## 3.1 Text Summarization with Sumy (Dan Shapiro)\n"
]
},
{
"metadata": {
"id": "-f9tLmi5i70N",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- Documentation: https://searchengineland.com/reducing-the-time-it-takes-to-write-meta-descriptions-for-large-websites-299887\n",
"\n",
"- Alternative: Britney Mueller shared a similar summary implementation at TechSEO 2018 with Goose3: https://youtu.be/fyQW5yE1x-g?t=5258\n",
"\n",
"---\n",
"> --> **USE CASE**: Get better long descriptions for pages, if meta description not useful.\n",
"\n",
"> --> Authorization needed? **NO**\n",
"\n",
"> --> Can be rewritten in *Ads Scripts*? **NO**\n",
"\n",
"---"
]
},
{
"metadata": {
"id": "pOTVefQWpMaS",
"colab_type": "code",
"outputId": "bd4d632c-171b-4dba-d2d1-dc4ad3e2e9c8",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 2715
}
},
"cell_type": "code",
"source": [
"####\n",
"\n",
"LANGUAGE = \"german\"\n",
"SENTENCES_COUNT = 2\n",
"urls = ['https://www.bergzeit.de/kletterseil/', 'https://www.bergzeit.de/winterjacke-herren/', 'https://www.bergzeit.de/skijacke-herren/']\n",
"\n",
"# Dutch examples: \n",
"# urls = ['https://www.bol.com/nl/m/baby-kinderwagens/','https://www.zalando.nl/dameskleding-jurken/', 'https://www.wehkamp.nl/kinderen/babykleding/C23_3K3/']\n",
"\n",
"####\n",
"\n",
"!pip install csv\n",
"!pip install bs4\n",
"!pip install sumy\n",
"import csv\n",
"import os\n",
"import nltk\n",
"nltk.download('punkt')\n",
"from sumy.parsers.html import HtmlParser\n",
"from sumy.parsers.plaintext import PlaintextParser\n",
"from sumy.nlp.tokenizers import Tokenizer\n",
"from sumy.summarizers.lsa import LsaSummarizer as Lsa\n",
"from sumy.summarizers.luhn import LuhnSummarizer as Luhn\n",
"from sumy.summarizers.text_rank import TextRankSummarizer as TxtRank\n",
"from sumy.summarizers.lex_rank import LexRankSummarizer as LexRank\n",
"from sumy.summarizers.sum_basic import SumBasicSummarizer as SumBasic\n",
"from sumy.summarizers.kl import KLSummarizer as KL\n",
"from sumy.summarizers.edmundson import EdmundsonSummarizer as Edmundson\n",
"from sumy.nlp.stemmers import Stemmer\n",
"from sumy.utils import get_stop_words\n",
"\n",
"import requests\n",
"from bs4 import BeautifulSoup\n",
"\n",
"\n",
"for line in urls:\n",
" print('\\n*****\\nPrinting 1. META DESCRIPTION for : ' + line + '\\n*****\\n')\n",
" response = requests.get(line)\n",
" soup = BeautifulSoup(response.text)\n",
" metas = soup.find_all('meta')\n",
" print([meta.attrs['content'] for meta in metas if 'name' in meta.attrs and meta.attrs['name'] == 'description'])\n",
"\n",
" stemmer = Stemmer(LANGUAGE)\n",
" print('\\n*****\\n*****\\nPrinting 2. TEXT SUMMARIZATION with ' + str(SENTENCES_COUNT) + 'sentences for : ' + line + '\\n*****\\n')\n",
" lsaSummarizer = Lsa(stemmer)\n",
" lsaSummarizer.stop_words = get_stop_words(LANGUAGE)\n",
"\n",
" luhnSummarizer = Luhn(stemmer)\n",
" luhnSummarizer.stop_words = get_stop_words(LANGUAGE)\n",
"\n",
" lexrankSummarizer = LexRank(stemmer)\n",
" lexrankSummarizer.stop_words = get_stop_words(LANGUAGE)\n",
"\n",
" textrankSummarizer = TxtRank(stemmer)\n",
" textrankSummarizer.stop_words = get_stop_words(LANGUAGE)\n",
"\n",
" sumbasicSummarizer = SumBasic(stemmer)\n",
" sumbasicSummarizer.stop_words = get_stop_words(LANGUAGE)\n",
"\n",
" klSummarizer = KL(stemmer)\n",
" klSummarizer.stop_words = get_stop_words(LANGUAGE)\n",
"\n",
" parser = HtmlParser.from_url(line, Tokenizer(LANGUAGE))\n",
"\n",
" for sentence in lsaSummarizer(parser.document, SENTENCES_COUNT):\n",
" print(sentence)\n",
" print(\"\\n> Summarizing URL via LSA\\n\\n\")\n",
"\n",
" for sentence in luhnSummarizer(parser.document, SENTENCES_COUNT):\n",
" print(sentence)\n",
" print(\"\\n> Summarizing URL via Luhn\\n\\n\")\t\n",
"\n",
" for sentence in lexrankSummarizer(parser.document, SENTENCES_COUNT):\n",
" print(sentence)\n",
" print(\"\\n> Summarizing URL via LexRank\\n\\n\")\t\n",
"\n",
" '''for sentence in textrankSummarizer(parser.document, SENTENCES_COUNT):\n",
" print(sentence)\n",
" print(\"\\n> Summarizing URL via TextRank\\n\\n\")*/'''\n",
"\n",
" for sentence in sumbasicSummarizer(parser.document, SENTENCES_COUNT):\n",
" print(sentence)\n",
" print(\"\\n> Summarizing URL via SumBasic\\n\\n\")\t\n",
"\n",
" for sentence in klSummarizer(parser.document, SENTENCES_COUNT):\n",
" print(sentence)\n",
" print(\"\\n> Summarizing URL via KL-Sum\\n\\n\")"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Collecting csv\n",
"\u001b[31m Could not find a version that satisfies the requirement csv (from versions: )\u001b[0m\n",
"\u001b[31mNo matching distribution found for csv\u001b[0m\n",
"Requirement already satisfied: bs4 in /usr/local/lib/python3.6/dist-packages (0.0.1)\n",
"Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.6/dist-packages (from bs4) (4.6.3)\n",
"Collecting sumy\n",
"\u001b[?25l Downloading https://files.pythonhosted.org/packages/2f/0e/30ebd2fb0925537a3b2f9fccf0a13171ba557e9450b1702d278159d3c592/sumy-0.7.0-py2.py3-none-any.whl (78kB)\n",
"\u001b[K 100% |████████████████████████████████| 81kB 3.2MB/s \n",
"\u001b[?25hRequirement already satisfied: requests>=2.7.0 in /usr/local/lib/python3.6/dist-packages (from sumy) (2.18.4)\n",
"Collecting breadability>=0.1.20 (from sumy)\n",
" Downloading https://files.pythonhosted.org/packages/ad/2d/bb6c9b381e6b6a432aa2ffa8f4afdb2204f1ff97cfcc0766a5b7683fec43/breadability-0.1.20.tar.gz\n",
"Requirement already satisfied: nltk>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from sumy) (3.2.5)\n",
"Requirement already satisfied: docopt<0.7,>=0.6.1 in /usr/local/lib/python3.6/dist-packages (from sumy) (0.6.2)\n",
"Requirement already satisfied: idna<2.7,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests>=2.7.0->sumy) (2.6)\n",
"Requirement already satisfied: urllib3<1.23,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests>=2.7.0->sumy) (1.22)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests>=2.7.0->sumy) (2018.11.29)\n",
"Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests>=2.7.0->sumy) (3.0.4)\n",
"Requirement already satisfied: lxml>=2.0 in /usr/local/lib/python3.6/dist-packages (from breadability>=0.1.20->sumy) (4.2.6)\n",
"Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from nltk>=3.0.2->sumy) (1.11.0)\n",
"Building wheels for collected packages: breadability\n",
" Building wheel for breadability (setup.py) ... \u001b[?25ldone\n",
"\u001b[?25h Stored in directory: /root/.cache/pip/wheels/5a/4d/a1/510b12c5e65e0b2b3ce539b2af66da0fc57571e528924f4a52\n",
"Successfully built breadability\n",
"Installing collected packages: breadability, sumy\n",
"Successfully installed breadability-0.1.20 sumy-0.7.0\n",
"[nltk_data] Downloading package punkt to /root/nltk_data...\n",
"[nltk_data] Unzipping tokenizers/punkt.zip.\n",
"\n",
"*****\n",
"Printing 1. META DESCRIPTION for : https://www.bergzeit.de/kletterseil/\n",
"*****\n",
"\n",
"['Kletterseile im Bergzeit Online Shop ➤ Große Auswahl aller Marken ✓ Kostenloser Versand & Rückversand ab 50€ ✓ Rechnungskauf ✓ Online seit 1999']\n",
"\n",
"*****\n",
"*****\n",
"Printing 2. TEXT SUMMARIZATION with 2sentences for : https://www.bergzeit.de/kletterseil/\n",
"*****\n",
"\n"
],
"name": "stdout"
},
{
"output_type": "stream",
"text": [
"/usr/local/lib/python3.6/dist-packages/sumy/summarizers/lsa.py:76: UserWarning: Number of words (64) is lower than number of sentences (107). LSA algorithm may not work properly.\n",
" warn(message % (words_count, sentences_count))\n"
],
"name": "stderr"
},
{
"output_type": "stream",
"text": [
"Extrem robustes Seil, speziell fürs Training in der Halle entwickelt\n",
"Robustes Einfachseil, das für den Vorstieg in der Halle konzipiert wurde\n",
"\n",
"> Summarizing URL via LSA\n",
"\n",
"\n",
"Robustes, langlebiges Halbseil für erhöhte Sicherheit\n",
"Neu EdelridEagle Lite Pro Dry 9,5 mm Kletterseil\n",
"\n",
"> Summarizing URL via Luhn\n",
"\n",
"\n",
"Imprägniertes Einfachseil zum Sportklettern am Fels, im Eis und in der Halle\n",
"Neu EdelridEagle Lite Pro Dry 9,5 mm Kletterseil\n",
"\n",
"> Summarizing URL via LexRank\n",
"\n",
"\n",
"Mammut9.9 Gym Dry Einfachseil\n",
"Neu EdelridEagle Lite 9,5 mm Kletterseil\n",
"\n",
"> Summarizing URL via SumBasic\n",
"\n",
"\n",
"EdelridEagle Lite Pro Dry 9,5 mm Kletterseil\n",
"Neu EdelridEagle Lite Pro Dry 9,5 mm Kletterseil\n",
"\n",
"> Summarizing URL via KL-Sum\n",
"\n",
"\n",
"\n",
"*****\n",
"Printing 1. META DESCRIPTION for : https://www.bergzeit.de/winterjacke-herren/\n",
"*****\n",
"\n",
"['Herren Winterjacken im Bergzeit Online Shop ➤ Große Auswahl aller Marken ✓ Kostenloser Versand & Rückversand ab 50€ ✓ Rechnungskauf']\n",
"\n",
"*****\n",
"*****\n",
"Printing 2. TEXT SUMMARIZATION with 2sentences for : https://www.bergzeit.de/winterjacke-herren/\n",
"*****\n",
"\n",
"Ob Weihnachtsmarkt oder Skihang – dieser wetterfeste, isolierte Parka macht alles mit\n",
"Lang geschnitten, warm und absolut schick, dieser Parka begeistert in der kalten Jahreszeit\n",
"\n",
"> Summarizing URL via LSA\n",
"\n",
"\n",
"Ob Weihnachtsmarkt oder Skihang – dieser wetterfeste, isolierte Parka macht alles mit\n",
"Ob Weihnachtsmarkt oder Skihang – dieser wetterfeste, isolierte Parka macht alles mit\n",
"\n",
"> Summarizing URL via Luhn\n",
"\n",
"\n",
"AlmgwandHerren Nordgipfel-1 Jacke\n",
"VaudeHerren Manukau Jacke\n",
"\n",
"> Summarizing URL via LexRank\n",
"\n",
"\n",
"343,96 € statt 429,95 €\n",
"AlmgwandHerren Nordgipfel-1 Jacke\n",
"\n",
"> Summarizing URL via SumBasic\n",
"\n",
"\n",
"Diese nachhaltig produzierte Winterjacke weiß optisch und funktionell voll zu überzeugen\n",
"Lang geschnitten, warm und absolut schick, dieser Parka begeistert in der kalten Jahreszeit\n",
"\n",
"> Summarizing URL via KL-Sum\n",
"\n",
"\n",
"\n",
"*****\n",
"Printing 1. META DESCRIPTION for : https://www.bergzeit.de/skijacke-herren/\n",
"*****\n",
"\n",
"['Herren Skijacken im Bergzeit Online Shop ➤ Große Auswahl aller Marken ✓ Kostenloser Versand & Rückversand ab 50€ ✓ Rechnungskauf']\n",
"\n",
"*****\n",
"*****\n",
"Printing 2. TEXT SUMMARIZATION with 2sentences for : https://www.bergzeit.de/skijacke-herren/\n",
"*****\n",
"\n",
"Ein neues Zeitalter der Hardshelljacken bricht an: nahtlos gestrickt, hoch funktionell und angenehm weich\n",
"Ein neues Zeitalter der Hardshelljacken bricht an: nahtlos gestrickt, hoch funktionell und angenehm weich\n",
"\n",
"> Summarizing URL via LSA\n",
"\n",
"\n",
"Ein neues Zeitalter der Hardshelljacken bricht an: nahtlos gestrickt, hoch funktionell und angenehm weich\n",
"Ein neues Zeitalter der Hardshelljacken bricht an: nahtlos gestrickt, hoch funktionell und angenehm weich\n",
"\n",
"> Summarizing URL via Luhn\n",
"\n",
"\n",
"Jacke\n",
"Jacke\n",
"\n",
"> Summarizing URL via LexRank\n",
"\n",
"\n",
"ab 397,46 € statt 529,95 €\n",
"20% HaglöfsHerren Nengal Parka\n",
"\n",
"> Summarizing URL via SumBasic\n",
"\n",
"\n",
"Rundum auf Skifahren und Freeriden bei schlechtem Wetter eingestellte Hardshelljacke mit längerem Schnitt\n",
"Stabile dreilagige Hardshelljacke fürs Freeriden mit herausnehmbarer Primaloft Weste\n",
"\n",
"> Summarizing URL via KL-Sum\n",
"\n",
"\n"
],
"name": "stdout"
}
]
},
{
"metadata": {
"id": "F-Y_HXODVknN",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"\n",
"---\n",
"\n",
"## 3.2 KeyPhrase Dectection API (Microsoft Azure)\n",
"\n"
]
},
{
"metadata": {
"id": "gZ3LnPIli2JN",
"colab_type": "text"
},
"cell_type": "markdown",
"source": [
"- Create a Microsoft Azure account here: https://portal.azure.com\n",
"- API Documentation: https://docs.microsoft.com/en-us/azure/cognitive-services/luis/luis-tutorial-bing-spellcheck\n",
"- Alternative is Amazon Comprehend\n",
"\n",
"---\n",
"> --> **USE CASE**: Writing an ad for brands or categories, getting the top-used nouns\n",
"\n",
"> --> Authorization needed? **YES**\n",
"\n",
"> --> Can be rewritten in *Ads Scripts*? **NO**\n",
"\n",
"---"
]
},
{
"metadata": {
"id": "hUEsGoE0LFUP",
"colab_type": "code",
"outputId": "de93748b-f528-462a-f84d-e9e480124d6a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 587
}
},
"cell_type": "code",
"source": [
"#################### START CONFIG ##################\n",
"\n",
"# Enter your Google Shopping feed here\n",
"feedUrl = 'https://transport.productsup.io/1053cf37b6bd2d103672/1/lodenfrey_googleshopping.csv'\n",
"\n",
"# Enter your column here separator here \n",
"separator = '~'\n",
"\n",
"# Specify the brand you want to find words for\n",
"brand = \"Woolrich\" # Kiehl's\n",
"\n",
"#################### END CONFIG ####################\n",
"\n",
"\n",
"import html, http.client, urllib.request, urllib.parse, urllib.error, base64, json, requests\n",
"import pandas as pd\n",
"import numpy as np\n",
"from bs4 import BeautifulSoup\n",
"\n",
"# Filter dataframe by Brand, get description and drop duplicates\n",
"dataBlob = pd.read_csv(feedUrl,sep=separator)[['description','brand']]\n",
"filter = dataBlob['brand'] == brand\n",
"dataBlob.where(filter, inplace = True)\n",
"dataBlob = dataBlob.dropna().drop_duplicates()\n",
"dataBlob = dataBlob.drop(['brand'], axis=1)\n",
"print(dataBlob[0:3])\n",
"\n",
"documents, id = { \"documents\" : []}, 1\n",
"\n",
"for desc in dataBlob['description']: \n",
" documents['documents'].append({\"language\": \"de\",\"id\": id,\"text\": str(desc)})\n",
" id += 1\n",
"\n",
"headers = {'Content-Type': 'application/json','Ocp-Apim-Subscription-Key': '80e865a0306e4c218e5305f07577adc2'}\n",
"params = urllib.parse.urlencode({})\n",
"\n",
"phraseCountDf = pd.DataFrame(np.nan, index=[0], columns=['count'])\n",
"\n",
"try:\n",
" conn = http.client.HTTPSConnection('westeurope.api.cognitive.microsoft.com')\n",
" conn.request(\"POST\", \"/text/analytics/v2.0/keyPhrases?%s\" % params, json.dumps(documents), headers)\n",
" json_response = json.loads(conn.getresponse().read())\n",
" \n",
" print('\\n**** Printing key phrase summaries for ' + brand + ' :\\n')\n",
" \n",
" for document in json_response['documents']:\n",
" for phrase in document['keyPhrases']:\n",
" \n",
" # If a column doesn't exists. If not, create a row\n",
" if(phrase not in phraseCountDf.index):\n",
" phraseCountDf.loc[phrase] = [1]\n",
" else:\n",
" phraseCountDf.loc[phrase,'count'] += 1\n",
" \n",
" phraseCountDf = phraseCountDf.sort_values(by=['count'], ascending=False).drop(labels=brand)\n",
" print(phraseCountDf[0:25])\n",
" \n",
" #print('\\n****\\n' + json.dumps(json_response, indent=2))\n",
" \n",
" conn.close()\n",
"except Exception as e:\n",
" print(e)"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
" description\n",
"328 Der Klassiker schlechthin! Der Daunen-Parka vo...\n",
"329 Der Klassiker schlechthin! Der Daunen-Parka vo...\n",
"453 Arctic Parka DF für Herren von Woolrich in Sch...\n",
"\n",
"**** Printing key phrase summaries for Woolrich :\n",
"\n",
" count\n",
"Damen 35.0\n",
"Herren 34.0\n",
"Qualität 33.0\n",
"Funktionalität 33.0\n",
"Tradition der Extraklasse 30.0\n",
"Label stehtallgegenwärtig 26.0\n",
"Kapuze 18.0\n",
"Besatz 13.0\n",
"Modelle 13.0\n",
"Daunenparka 12.0\n",
"kälte 11.0\n",
"wärmenden Daunen 10.0\n",
"Feder 10.0\n",
"allgegenwärtigfür Qualität 9.0\n",
"Outdoor 9.0\n",
"Mädchen 8.0\n",
"Beschichtung 8.0\n",
"Füllung 7.0\n",
"Navy 7.0\n",
"Schwarz 7.0\n",
"abnehmbarem Finnraccoon 6.0\n",
"Details 6.0\n",
"zeitloser Aufmachung 6.0\n",
"Grau 6.0\n",
"hoher Qualität 6.0\n"
],
"name": "stdout"
}
]
}
]
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment