Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save raov5/664df16e0788908d5f2d206c4c55478a to your computer and use it in GitHub Desktop.
Save raov5/664df16e0788908d5f2d206c4c55478a to your computer and use it in GitHub Desktop.
This notebook teaches users how to upload an .RData file into an R Jupyter Notebook in IBM's Data Science Experience (DSX).
Display the source blob
Display the rendered blob
Raw
{
"metadata": {
"kernelspec": {
"language": "R",
"display_name": "R with Spark 2.0",
"name": "r-spark20"
},
"language_info": {
"version": "3.3.2",
"file_extension": ".r",
"mimetype": "text/x-r-source",
"name": "R",
"codemirror_mode": "r",
"pygments_lexer": "r"
}
},
"cells": [
{
"execution_count": 1,
"cell_type": "code",
"outputs": [],
"metadata": {
"collapsed": true
},
"source": "#@author: Venky Rao raove@us.ibm.com\n#@last edited: 13 Sep 2017\n#huge thanks to Charles Gomes of IBM"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "# Add a .RData file to a DSX R Notebook in 5 steps"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Step 1: Uploaded your .RData file into your Object Store"
},
{
"execution_count": 3,
"cell_type": "code",
"outputs": [],
"metadata": {
"collapsed": true
},
"source": "#before you try adding the file to your notebook, the file needs to be loaded into your Object Store\n#you can do this by clicking the 1001 button near the top right corner of your screen\n#then you can either drop your file there or click the \"browse\" button to find your file and upload it\n#for this notebook, I am attempting to upload the file called \"Titanic.RData\" which I have in my Object Store"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Step 2: Insert your Credentials"
},
{
"execution_count": 5,
"cell_type": "code",
"outputs": [],
"metadata": {
"collapsed": true
},
"source": "#the next step is to insert your credentials so that DSX can access your Object Store\n#you do this by choosing a new cell in your Notebook and,\n#clicking on the \"Insert to code\" drop down arrow that is displayed below,\n#the name of the file that you want to upload\n\n#when you click the drop down arrow, you will see a drop down list\n#pick a new cell in your notebook and then click the \"Insert Credentials\" option"
},
{
"execution_count": null,
"cell_type": "code",
"outputs": [],
"metadata": {
"collapsed": true
},
"source": "# The code was removed by DSX for sharing."
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Step 3: Insert textConnectionObject"
},
{
"execution_count": 7,
"cell_type": "code",
"outputs": [],
"metadata": {
"collapsed": true
},
"source": "#the next step is to insert textConnectionObject\n#you do this by choosing a new cell in your Notebook and,\n#clicking on the \"Insert to code\" drop down arrow that is displayed below,\n#the name of the file that you want to upload\n\n#when you click the drop down arrow, you will see a drop down list\n#pick a new cell in your notebook and then click the \"Insert textConnectionObject\" option\n\n#after you do this, ENSURE that you make the following changes:\n\n#REPLACE\n#data <- content(httr::GET(url = access_url, add_headers (\"Content-Type\" = \"application/json\", \"X-Auth-Token\" = x_subject_token)), as=\"text\")\n #textConnection(data)\n\n#WITH\n#rawdata <- content(httr::GET(url = access_url, add_headers (\"Content-Type\" = \"application/json\", \"X-Auth-Token\" = x_subject_token)), as=\"raw\")\n #rawdata\n\n#then, after the following line of code:\n# Your data file was loaded into a textConnection object and you can process the data with your package of choice.\n#data.1 <- getObjectStorageFileWithCredentials_2d7ac8c3aee941f7a4287c68e0f0b586(\"DSXTipsandTricks\", \"Titanic.RData\")\n\n#ADD the following line of code:\n\n#writeBin(data.1, \"Titanic.RData\")"
},
{
"execution_count": 8,
"cell_type": "code",
"outputs": [
{
"output_type": "stream",
"name": "stderr",
"text": "Loading required package: httr\nLoading required package: RCurl\nLoading required package: bitops\n\nAttaching package: \u2018RCurl\u2019\n\nThe following object is masked from \u2018package:SparkR\u2019:\n\n base64\n\n"
}
],
"metadata": {},
"source": "# The code was removed by DSX for sharing."
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Step 4: Load the .RData file"
},
{
"execution_count": 15,
"cell_type": "code",
"outputs": [],
"metadata": {
"collapsed": true
},
"source": "load(file = \"Titanic.RData\")"
},
{
"execution_count": 17,
"cell_type": "code",
"outputs": [
{
"metadata": {},
"data": {
"text/latex": "\\begin{enumerate*}\n\\item 'Titanic.rdata'\n\\item 'Titanic.RData'\n\\end{enumerate*}\n",
"text/html": "<ol class=list-inline>\n\t<li>'Titanic.rdata'</li>\n\t<li>'Titanic.RData'</li>\n</ol>\n",
"text/plain": "[1] \"Titanic.rdata\" \"Titanic.RData\"",
"text/markdown": "1. 'Titanic.rdata'\n2. 'Titanic.RData'\n\n\n"
},
"output_type": "display_data"
}
],
"metadata": {},
"source": "#check whether the file has been added to the notebook\nsystem(\"ls\", intern=TRUE) #invoke the UNIX shell from an R kernel and list all the files"
},
{
"execution_count": 20,
"cell_type": "code",
"outputs": [
{
"metadata": {},
"data": {
"text/latex": "\\begin{enumerate*}\n\\item 'data.1'\n\\item 'getObjectStorageFileWithCredentials\\_2d7ac8c3aee941f7a4287c68e0f0b586'\n\\item 'q'\n\\item 'sc'\n\\item 'sqlContext'\n\\item 'Titanic'\n\\end{enumerate*}\n",
"text/html": "<ol class=list-inline>\n\t<li>'data.1'</li>\n\t<li>'getObjectStorageFileWithCredentials_2d7ac8c3aee941f7a4287c68e0f0b586'</li>\n\t<li>'q'</li>\n\t<li>'sc'</li>\n\t<li>'sqlContext'</li>\n\t<li>'Titanic'</li>\n</ol>\n",
"text/plain": "[1] \"data.1\" \n[2] \"getObjectStorageFileWithCredentials_2d7ac8c3aee941f7a4287c68e0f0b586\"\n[3] \"q\" \n[4] \"sc\" \n[5] \"sqlContext\" \n[6] \"Titanic\" ",
"text/markdown": "1. 'data.1'\n2. 'getObjectStorageFileWithCredentials_2d7ac8c3aee941f7a4287c68e0f0b586'\n3. 'q'\n4. 'sc'\n5. 'sqlContext'\n6. 'Titanic'\n\n\n"
},
"output_type": "display_data"
}
],
"metadata": {},
"source": "ls() #lists all the variables in the R workspace\n #we can see that \"Titanic\" is listed; this contains all our data"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Step 5: Explore your dataset"
},
{
"execution_count": 21,
"cell_type": "code",
"outputs": [],
"metadata": {
"collapsed": true
},
"source": "#now we can explore our dataset as shown below:"
},
{
"execution_count": 22,
"cell_type": "code",
"outputs": [
{
"metadata": {},
"data": {
"text/latex": "\\begin{tabular}{r|llllll}\n Gender & Age & Name & Fare & Class & Survived\\\\\n\\hline\n\t Female & 29 & Allen, Miss. Elisabeth Walton & 211.34 & Upper & Yes \\\\\n\t Male & 1 & Allison, Master. Hudson Trevor & 151.55 & Upper & Yes \\\\\n\t Female & 2 & Allison, Miss. Helen Loraine & 151.55 & Upper & No \\\\\n\t Male & 30 & Allison, Mr. Hudson Joshua Creighton & 151.55 & Upper & No \\\\\n\t Female & 25 & Allison, Mrs. Hudson J C (Bessie Waldo Daniels) & 151.55 & Upper & No \\\\\n\t Male & 48 & Anderson, Mr. Harry & 26.55 & Upper & Yes \\\\\n\\end{tabular}\n",
"text/html": "<table>\n<thead><tr><th scope=col>Gender</th><th scope=col>Age</th><th scope=col>Name</th><th scope=col>Fare</th><th scope=col>Class</th><th scope=col>Survived</th></tr></thead>\n<tbody>\n\t<tr><td>Female </td><td>29 </td><td>Allen, Miss. Elisabeth Walton </td><td>211.34 </td><td>Upper </td><td>Yes </td></tr>\n\t<tr><td>Male </td><td> 1 </td><td>Allison, Master. Hudson Trevor </td><td>151.55 </td><td>Upper </td><td>Yes </td></tr>\n\t<tr><td>Female </td><td> 2 </td><td>Allison, Miss. Helen Loraine </td><td>151.55 </td><td>Upper </td><td>No </td></tr>\n\t<tr><td>Male </td><td>30 </td><td>Allison, Mr. Hudson Joshua Creighton </td><td>151.55 </td><td>Upper </td><td>No </td></tr>\n\t<tr><td>Female </td><td>25 </td><td>Allison, Mrs. Hudson J C (Bessie Waldo Daniels)</td><td>151.55 </td><td>Upper </td><td>No </td></tr>\n\t<tr><td>Male </td><td>48 </td><td>Anderson, Mr. Harry </td><td> 26.55 </td><td>Upper </td><td>Yes </td></tr>\n</tbody>\n</table>\n",
"text/plain": " Gender Age Name Fare Class\n1 Female 29 Allen, Miss. Elisabeth Walton 211.34 Upper\n2 Male 1 Allison, Master. Hudson Trevor 151.55 Upper\n3 Female 2 Allison, Miss. Helen Loraine 151.55 Upper\n4 Male 30 Allison, Mr. Hudson Joshua Creighton 151.55 Upper\n5 Female 25 Allison, Mrs. Hudson J C (Bessie Waldo Daniels) 151.55 Upper\n6 Male 48 Anderson, Mr. Harry 26.55 Upper\n Survived\n1 Yes \n2 Yes \n3 No \n4 No \n5 No \n6 Yes "
},
"output_type": "display_data"
}
],
"metadata": {},
"source": "head(Titanic) #lists the frst 6 rows of our dataset"
},
{
"execution_count": 23,
"cell_type": "code",
"outputs": [
{
"metadata": {},
"data": {
"text/latex": "\\begin{tabular}{r|llllll}\n & Gender & Age & Name & Fare & Class & Survived\\\\\n\\hline\n\t1040 & Female & 15 & Yasbeck, Mrs. Antoni (Selini Alexander) & 14.45 & Lower & Yes \\\\\n\t1041 & Male & 46 & Youseff, Mr. Gerious & 7.22 & Lower & No \\\\\n\t1042 & Female & 14 & Zabour, Miss. Hileni & 14.45 & Lower & No \\\\\n\t1043 & Male & 26 & Zakarian, Mr. Mapriededer & 7.22 & Lower & No \\\\\n\t1044 & Male & 27 & Zakarian, Mr. Ortin & 7.22 & Lower & No \\\\\n\t1045 & Male & 29 & Zimmerman, Mr. Leo & 7.88 & Lower & No \\\\\n\\end{tabular}\n",
"text/html": "<table>\n<thead><tr><th></th><th scope=col>Gender</th><th scope=col>Age</th><th scope=col>Name</th><th scope=col>Fare</th><th scope=col>Class</th><th scope=col>Survived</th></tr></thead>\n<tbody>\n\t<tr><th scope=row>1040</th><td>Female </td><td>15 </td><td>Yasbeck, Mrs. Antoni (Selini Alexander)</td><td>14.45 </td><td>Lower </td><td>Yes </td></tr>\n\t<tr><th scope=row>1041</th><td>Male </td><td>46 </td><td>Youseff, Mr. Gerious </td><td> 7.22 </td><td>Lower </td><td>No </td></tr>\n\t<tr><th scope=row>1042</th><td>Female </td><td>14 </td><td>Zabour, Miss. Hileni </td><td>14.45 </td><td>Lower </td><td>No </td></tr>\n\t<tr><th scope=row>1043</th><td>Male </td><td>26 </td><td>Zakarian, Mr. Mapriededer </td><td> 7.22 </td><td>Lower </td><td>No </td></tr>\n\t<tr><th scope=row>1044</th><td>Male </td><td>27 </td><td>Zakarian, Mr. Ortin </td><td> 7.22 </td><td>Lower </td><td>No </td></tr>\n\t<tr><th scope=row>1045</th><td>Male </td><td>29 </td><td>Zimmerman, Mr. Leo </td><td> 7.88 </td><td>Lower </td><td>No </td></tr>\n</tbody>\n</table>\n",
"text/plain": " Gender Age Name Fare Class Survived\n1040 Female 15 Yasbeck, Mrs. Antoni (Selini Alexander) 14.45 Lower Yes \n1041 Male 46 Youseff, Mr. Gerious 7.22 Lower No \n1042 Female 14 Zabour, Miss. Hileni 14.45 Lower No \n1043 Male 26 Zakarian, Mr. Mapriededer 7.22 Lower No \n1044 Male 27 Zakarian, Mr. Ortin 7.22 Lower No \n1045 Male 29 Zimmerman, Mr. Leo 7.88 Lower No "
},
"output_type": "display_data"
}
],
"metadata": {},
"source": "tail(Titanic) #lists the last 6 rows of our dataset"
},
{
"execution_count": 24,
"cell_type": "code",
"outputs": [
{
"metadata": {},
"data": {
"text/plain": " Gender Age Name \n Female:388 Min. : 0.00 Connolly, Miss. Kate : 2 \n Male :657 1st Qu.:21.00 Kelly, Mr. James : 2 \n Median :28.00 Abbing, Mr. Anthony : 1 \n Mean :29.84 Abbott, Master. Eugene Joseph : 1 \n 3rd Qu.:39.00 Abbott, Mr. Rossmore Edward : 1 \n Max. :80.00 Abbott, Mrs. Stanton (Rosa Hunt): 1 \n (Other) :1037 \n Fare Class Survived \n Min. : 0.00 Lower :500 No :618 \n 1st Qu.: 8.05 Middle:261 Yes:427 \n Median : 15.75 Upper :284 \n Mean : 36.69 \n 3rd Qu.: 35.50 \n Max. :512.33 \n "
},
"output_type": "display_data"
}
],
"metadata": {},
"source": "summary(Titanic) #provides a summary of our dataset"
},
{
"execution_count": 26,
"cell_type": "code",
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "'data.frame':\t1045 obs. of 6 variables:\n $ Gender : Factor w/ 2 levels \"Female\",\"Male\": 1 2 1 2 1 2 1 2 1 2 ...\n $ Age : num 29 1 2 30 25 48 63 39 53 71 ...\n $ Name : Factor w/ 1307 levels \"Abbing, Mr. Anthony\",..: 22 24 25 26 27 31 46 47 51 55 ...\n $ Fare : num 211 152 152 152 152 ...\n $ Class : Factor w/ 3 levels \"Lower\",\"Middle\",..: 3 3 3 3 3 3 3 3 3 3 ...\n $ Survived: Factor w/ 2 levels \"No\",\"Yes\": 2 2 1 1 1 2 2 1 2 1 ...\n - attr(*, \"na.action\")=Class 'omit' Named int [1:264] 16 38 41 47 60 70 71 75 81 107 ...\n .. ..- attr(*, \"names\")= chr [1:264] \"16\" \"38\" \"41\" \"47\" ...\n"
}
],
"metadata": {},
"source": "str(Titanic) #provides the structure of our dataset"
}
],
"nbformat": 4,
"nbformat_minor": 1
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment