GitHub repo for the rest of the specialization: Data Science Coursera
R was developed by statisticians working at...
The University of Auckland
The definition of free software consists of four freedoms (freedoms 0 through 3). Which of the following is NOT one of the freedoms that are part of the definition?
The freedom to sell the software for any price.
In R the following are all atomic data types EXCEPT
matrix
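As a rough illustration of why matrix is the odd one out, the sketch below checks the class of one value of each atomic type and then inspects a matrix. The name m is illustrative only.

```r
# The atomic classes in R: character, numeric, integer, complex, logical
class("a")     # "character"
class(1.5)     # "numeric"
class(2L)      # "integer"
class(1+2i)    # "complex"
class(TRUE)    # "logical"

# A matrix is an atomic vector with a dim attribute, not a type of its own
m <- matrix(1:4, nrow = 2)
typeof(m)      # "integer": the storage type of its elements
dim(m)         # 2 2
```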
If I execute the expression x <- 4 in R, what is the class of the object 'x' as determined by the 'class()' function?
numeric
x <- 4
class(x)
What is the class of the object defined by x <- c(4, TRUE)?
numeric
x <- c(4, TRUE)
class(x)
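The answer follows from implicit coercion: c() converts its arguments to the most flexible type present. A minimal sketch of that rule, with v as an illustrative name:

```r
# Coercion order when mixing types in c(): logical < numeric < character
class(c(4, TRUE))      # "numeric": TRUE is converted to 1
class(c(4, "a"))       # "character": 4 is converted to "4"
class(c(TRUE, "a"))    # "character": TRUE is converted to "TRUE"

v <- c(4, TRUE)
v                      # 4 1
```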
If I have two vectors x <- c(1,3, 5) and y <- c(3, 2, 10), what is produced by the expression cbind(x, y)?
a 3 by 2 numeric matrix
x <- c(1,3, 5)
y <- c(3, 2, 10)
cbind(x, y)
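To confirm the shape rather than read it off the printed output, the result can be inspected with dim(). This is only a sketch; m is an illustrative name, and rbind() is shown for contrast.

```r
x <- c(1, 3, 5)
y <- c(3, 2, 10)

m <- cbind(x, y)   # each vector becomes one column
dim(m)             # 3 2
colnames(m)        # "x" "y"
dim(rbind(x, y))   # 2 3: rbind stacks the vectors as rows instead
```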
A key property of vectors in R is that
elements of a vector all must be of the same class
Suppose I have a list defined as x <- list(2, "a", "b", TRUE). What does x[[1]] give me?
a numeric vector containing the element 2
x <- list(2, "a", "b", TRUE)
x[[1]]
class(x[[1]])
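The distinction that matters here is between single and double brackets. A short sketch, with as_list and as_elem as illustrative names:

```r
x <- list(2, "a", "b", TRUE)

as_list <- x[1]     # single brackets return a list of length 1
as_elem <- x[[1]]   # double brackets extract the element itself

class(as_list)      # "list"
class(as_elem)      # "numeric"
length(as_elem)     # 1
```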
Suppose I have a vector x <- 1:4 and a vector y <- 2. What is produced by the expression x + y?
a numeric vector with elements 3, 4, 5, 6.
x <- 1:4
y <- 2
x + y
class(x + y)
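The result comes from recycling: the shorter vector y is reused for every element of x. The sketch below writes the recycling out explicitly and shows why the class is numeric rather than integer.

```r
x <- 1:4
y <- 2

x + y                    # 3 4 5 6
x + rep(y, length(x))    # the same addition with y repeated by hand

class(x)                 # "integer"
class(x + y)             # "numeric": integer plus double gives double
```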
Suppose I have a vector x <- c(17, 14, 4, 5, 13, 12, 10) and I want to set all elements of this vector that are greater than 10 to be equal to 4. What R code achieves this?
x[x >= 11] <- 4
x <- c(17, 14, 4, 5, 13, 12, 10)
x[x >= 11] <- 4
x
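Since every element is a whole number, x > 10 selects the same elements as x >= 11. A sketch that makes the logical index visible before assigning (mask is an illustrative name):

```r
x <- c(17, 14, 4, 5, 13, 12, 10)

mask <- x > 10   # TRUE where the element exceeds 10
mask             # TRUE TRUE FALSE FALSE TRUE TRUE FALSE

x[mask] <- 4     # assign only at the TRUE positions
x                # 4 4 4 5 4 4 10
```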
In the dataset provided for this Quiz, what are the column names of the dataset?
Ozone, Solar.R, Wind, Temp, Month, Day
# Install the package only if it is not already installed
if (!requireNamespace("data.table", quietly = TRUE)) install.packages("data.table")
library("data.table")
# Reading in data
quiz_data <- fread('hw1_data.csv')
# Column names of the dataset
names(quiz_data)
Extract the first 2 rows of the data frame and print them to the console. What does the output look like?
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
# First two rows
quiz_data[c(1,2)]
How many observations (i.e. rows) are in this data frame?
153
nrow(quiz_data)
Extract the last 2 rows of the data frame and print them to the console. What does the output look like?
    Ozone Solar.R Wind Temp Month Day
152    18     131  8.0   76     9  29
153    20     223 11.5   68     9  30
tail(quiz_data, 2)
What is the value of Ozone in the 47th row?
21
quiz_data[47, Ozone]
How many missing values are in the Ozone column of this data frame?
37
# Going back to data.frame because data.table has not been taught yet in this specialization
hw1 = read.csv('hw1_data.csv')
sub = subset(hw1, is.na(Ozone))
nrow(sub)
# Can also remove rows with missing values using something like this
quiz_data[complete.cases(quiz_data),]
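A more direct way to count missing values is to sum the logical vector returned by is.na(). The sketch below uses a small made-up data frame (toy is hypothetical, not the quiz data) so it runs without the CSV file.

```r
# Hypothetical data for illustration only
toy <- data.frame(Ozone = c(41, NA, 12, NA),
                  Temp  = c(67, 72, NA, 62))

n_missing <- sum(is.na(toy$Ozone))   # TRUE counts as 1, so this counts the NAs
n_missing                            # 2

complete <- toy[complete.cases(toy), ]   # keep rows with no NA in any column
nrow(complete)                           # 1
```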
What is the mean of the Ozone column in this dataset? Exclude missing values (coded as NA) from this calculation.
42.1
The 'mean' function can be used to calculate the mean.
hw1 = read.csv('hw1_data.csv')
sub = subset(hw1, !is.na(Ozone), select = Ozone)
apply(sub, 2, mean)
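An alternative to subsetting first is the na.rm argument of mean(). A minimal sketch on a made-up vector (ozone here is hypothetical, not the quiz column):

```r
# Hypothetical values for illustration only
ozone <- c(41, 36, NA, 18)

mean(ozone)                 # NA: a single missing value makes the mean NA
mean(ozone, na.rm = TRUE)   # 31.66667: NA is dropped before averaging
```

On the quiz data the equivalent call would be mean(hw1$Ozone, na.rm = TRUE).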
Extract the subset of rows of the data frame where Ozone values are above 31 and Temp values are above 90. What is the mean of Solar.R in this subset?
212.8
quiz_data = read.csv('hw1_data.csv')
sub = subset(quiz_data, Ozone > 31 & Temp > 90, select = Solar.R)
apply(sub, 2, mean)
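The same filter can also be written with bracket indexing instead of subset(). The sketch below uses a small made-up data frame (toy and keep are hypothetical names, and the values are not the quiz data). which() is used because it drops rows where the condition is NA, which is also what subset() does.

```r
# Hypothetical data for illustration only
toy <- data.frame(Ozone   = c(40, 97, NA, 20),
                  Temp    = c(95, 92, 93, 91),
                  Solar.R = c(200, 250, 180, 100))

keep <- which(toy$Ozone > 31 & toy$Temp > 90)   # row numbers meeting both conditions
keep                                            # 1 2

mean(toy$Solar.R[keep])                         # 225
```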
What is the mean of "Temp" when "Month" is equal to 6?
79.1
quiz_data = read.csv('hw1_data.csv')
sub = subset(quiz_data, Month == 6, select = Temp)
apply(sub, 2, mean)
What was the maximum ozone value in the month of May (i.e. Month = 5)?
115
quiz_data = read.csv('hw1_data.csv')
sub = subset(quiz_data, Month == 5 & !is.na(Ozone), select = Ozone)
apply(sub, 2, max)