@rezapci
Created August 27, 2019 01:47
{
"nbformat_minor": 1,
"cells": [
{
"source": "This is the first assgiment for the Coursera course \"Advanced Machine Learning and Signal Processing\"\n\nThe first step is to insert the credentials to the Apache CouchDB / Cloudant database where your sensor data ist stored to. \n\n1. In the project's overview tab of this project just select \"Add to project\"->Connection\n2. From the section \"Your service instances in IBM Cloud\" select your cloudant database and click on \"create\"\n3. Now click in the empty cell below labeled with \"your cloudant credentials go here\"\n4. Click on the \"10-01\" symbol top right and selecrt the \"Connections\" tab\n5. Find your data base connection and click on \"Insert to code\"\n\nThe following video illustrates this process: https://www.youtube.com/watch?v=dCawUGv7qgs\n\nDone, just execute all cells one after the other and you are done - just note that in the last one you have to update your email address (the one you've used for coursera) and obtain a submittion token, you get this from the programming assingment directly on coursera.",
"cell_type": "markdown",
"metadata": {}
},
{
"execution_count": 1,
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": "# The code was removed by Watson Studio for sharing."
},
{
"execution_count": 2,
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": "spark = SparkSession\\\n .builder\\\n .appName(\"Cloudant Spark SQL Example in Python using temp tables\")\\\n .config(\"cloudant.host\",credentials_1['custom_url'].split('@')[1])\\\n .config(\"cloudant.username\", credentials_1['username'])\\\n .config(\"cloudant.password\",credentials_1['password'])\\\n .config(\"jsonstore.rdd.partitions\", 1)\\\n .getOrCreate()"
},
{
"execution_count": 16,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "+-----+--------+------+------+------+--------------------+--------------------+\n|CLASS|SENSORID| X| Y| Z| _id| _rev|\n+-----+--------+------+------+------+--------------------+--------------------+\n| 0|asdfghjk| -0.03| -0.03| -0.03|3bab9dca5c34a1686...|1-967b15a020f89c2...|\n| 0|asdfghjk| 0.3| 0.3| 0.3|3bab9dca5c34a1686...|1-771db8b270eed18...|\n| 0|asdfghjk| 0.13| 0.13| 0.13|3bab9dca5c34a1686...|1-03394f9bd867b28...|\n| 0|asdfghjk| 0.33| 0.33| 0.33|3bab9dca5c34a1686...|1-eba940c4d894cca...|\n| 0|asdfghjk| 0.04| 0.04| 0.04|3bab9dca5c34a1686...|1-004320ad84dbdd4...|\n| 0|asdfghjk| 0.11| 0.11| 0.11|3bab9dca5c34a1686...|1-47a51f94506efe1...|\n| 0|asdfghjk| 0.06| 0.06| 0.06|3bab9dca5c34a1686...|1-b0e0ea0b0976f0b...|\n| 0|asdfghjk| 0.08| 0.08| 0.08|3bab9dca5c34a1686...|1-d56ad70b3d29672...|\n| 0|asdfghjk| 1.94| 1.94| 1.94|3bab9dca5c34a1686...|1-deade6c17c7032d...|\n| 0|asdfghjk| 11.29| 11.29| 11.29|3bab9dca5c34a1686...|1-6c46eb08ecdaad5...|\n| 0|asdfghjk| 9.58| 9.58| 9.58|3bab9dca5c34a1686...|1-262e9aac0093c8a...|\n| 0|asdfghjk| 10.71| 10.71| 10.71|3bab9dca5c34a1686...|1-d866494e72fbf3c...|\n| 0|asdfghjk| -0.8| -0.8| -0.8|3bab9dca5c34a1686...|1-53e3e2d2a710b67...|\n| 0|asdfghjk|-32.62|-32.62|-32.62|3bab9dca5c34a1686...|1-f19cdd0864d878b...|\n| 0|asdfghjk| 11.5| 11.5| 11.5|3bab9dca5c34a1686...|1-404266b326293b3...|\n| 0|asdfghjk| 37.79| 37.79| 37.79|3bab9dca5c34a1686...|1-1285882df0469a7...|\n| 0|asdfghjk| 1.74| 1.74| 1.74|3bab9dca5c34a1686...|1-2dd0320839cde3b...|\n| 0|asdfghjk| 1.21| 1.21| 1.21|3bab9dca5c34a1686...|1-8d866076546dabe...|\n| 0|asdfghjk| -1.92| -1.92| -1.92|3bab9dca5c34a1686...|1-556a90fd09abb36...|\n| 0|asdfghjk| 0.16| 0.16| 0.16|3bab9dca5c34a1686...|1-f19648c9b394fb3...|\n+-----+--------+------+------+------+--------------------+--------------------+\nonly showing top 20 rows\n\n"
}
],
"source": "df=spark.read.load('shake', \"com.cloudant.spark\")\n\ndf.createOrReplaceTempView(\"df\")\nspark.sql(\"SELECT * from df\").show()\n"
},
{
"execution_count": 17,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"execution_count": 17,
"metadata": {},
"data": {
"text/plain": "DataFrame[summary: string, CLASS: string, SENSORID: string, X: string, Y: string, Z: string, _id: string, _rev: string]"
},
"output_type": "execute_result"
}
],
"source": "df.describe()"
},
{
"execution_count": 4,
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": "!rm -Rf a2_m1.parquet"
},
{
"execution_count": 5,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'df' is not defined",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-5-ef0980a402bd>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mdf\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mdf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrepartition\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mdf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwrite\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjson\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'a2_m1.json'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
],
"output_type": "error"
}
],
"source": "df = df.repartition(1)\ndf.write.json('a2_m1.json')"
},
{
"execution_count": 6,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "--2019-08-26 23:36:24-- https://raw.githubusercontent.com/romeokienzler/developerWorks/master/coursera/ai/rklib.py\r\nResolving raw.githubusercontent.com (raw.githubusercontent.com)... 199.232.8.133\r\nConnecting to raw.githubusercontent.com (raw.githubusercontent.com)|199.232.8.133|:443... connected.\r\nHTTP request sent, awaiting response... 200 OK\r\nLength: 2289 (2.2K) [text/plain]\r\nSaving to: \u2018rklib.py\u2019\r\n\r\n100%[======================================>] 2,289 --.-K/s in 0s \r\n\r\n2019-08-26 23:36:24 (40.5 MB/s) - \u2018rklib.py\u2019 saved [2289/2289]\r\n\r\n"
}
],
"source": "!rm -f rklib.py\n!wget https://raw.githubusercontent.com/romeokienzler/developerWorks/master/coursera/ai/rklib.py"
},
{
"execution_count": 7,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "\tzip warning: name not matched: a2_m1.json\r\n\r\nzip error: Nothing to do! (try: zip -r a2_m1.json.zip . -i a2_m1.json)\r\n"
}
],
"source": "!zip -r a2_m1.json.zip a2_m1.json"
},
{
"execution_count": 8,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "base64: a2_m1.json.zip: No such file or directory\r\n"
}
],
"source": "!base64 a2_m1.json.zip > a2_m1.json.zip.base64"
},
{
"execution_count": 1,
"cell_type": "code",
"metadata": {},
"outputs": [
{
"ename": "ModuleNotFoundError",
"evalue": "No module named 'rklib'",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-1-38aca612ae08>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mfrom\u001b[0m \u001b[0mrklib\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0msubmit\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mkey\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"1injH2F0EeiLlRJ3eJKoXA\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0mpart\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"wNLDt\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0memail\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"rezapci@msn.com\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0msecret\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"pZTmtIwHWLc68gjF\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'rklib'"
],
"output_type": "error"
}
],
"source": "from rklib import submit\nkey = \"1injH2F0EeiLlRJ3eJKoXA\"\npart = \"wNLDt\"\nemail = \"rezapci@msn.com\"\nsecret = \"pZTmtIwHWLc68gjF\"\n\nwith open('a2_m1.json.zip.base64', 'r') as myfile:\n data=myfile.read()\nsubmit(email, secret, key, part, [part], data)"
},
{
"execution_count": null,
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": ""
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.6",
"name": "python3",
"language": "python"
},
"language_info": {
"mimetype": "text/x-python",
"nbconvert_exporter": "python",
"version": "3.6.8",
"name": "python",
"file_extension": ".py",
"pygments_lexer": "ipython3",
"codemirror_mode": {
"version": 3,
"name": "ipython"
}
}
},
"nbformat": 4
}