@selfboot
Last active November 8, 2015 07:05
Mining Massive Datasets Quiz Week7B Basic.
# Question 1
# Suppose we have an LSH family h of (d1,d2,.6,.4) hash functions.
# We can use three functions from h and the AND-construction to form a (d1,d2,w,x) family,
# and we can use two functions from h and the OR-construction to form a (d1,d2,y,z) family.
# Calculate w, x, y, and z, and then identify the correct value of one of these in the list below.
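# A minimal sketch of the calculation asked for above, assuming the standard
# amplification rules: an r-way AND-construction maps a probability p to p^r,
# and a b-way OR-construction maps p to 1 - (1 - p)^b.
# The helper names and_construct and or_construct are illustrative only.
and_construct <- function(p, r) p^r
or_construct <- function(p, b) 1 - (1 - p)^b
p1 <- 0.6
p2 <- 0.4
w <- and_construct(p1, 3)  # 0.6^3
x <- and_construct(p2, 3)  # 0.4^3
y <- or_construct(p1, 2)   # 1 - (1 - 0.6)^2
z <- or_construct(p2, 2)   # 1 - (1 - 0.4)^2
c(w = w, x = x, y = y, z = z)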
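# Note: the code below is a PageRank power iteration with teleporting, not an LSH calculation.
# Column-stochastic link matrix (filled column by column), scaled by beta = 0.7.
matrix_m <- matrix(c(0,0.5,0.5,0,1,0,0,0,0,0,0,1,0,0,1,0) * 0.7, 4, 4)
# Teleport term, scaled by 1 - beta = 0.3: every column sends 2/3 to node 1 and 1/3 to node 2.
beta_set <- matrix(c(2/3,1/3,0,0,2/3,1/3,0,0,2/3,1/3,0,0,2/3,1/3,0,0) * 0.3, 4, 4)
a <- matrix_m + beta_set
# Start from the uniform distribution and apply the transition matrix 500 times.
v <- matrix(c(0.25,0.25,0.25,0.25), ncol=1)
for (i in 1:500) {
  v <- a %*% v
}
# Print the resulting PageRank vector.
v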
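# A hedged sketch of the same power iteration with a convergence test instead
# of a fixed 500 passes. The function name power_iterate, the tolerance tol,
# and the names m_link, teleport, a_check and v_check are assumptions for
# illustration, not part of the original script.
power_iterate <- function(a, v, tol = 1e-12, max_iter = 1000) {
  for (i in seq_len(max_iter)) {
    v_next <- a %*% v
    # Stop once the L1 change between successive vectors is negligible.
    if (sum(abs(v_next - v)) < tol) return(v_next)
    v <- v_next
  }
  v
}
# Rebuild the matrices here so the sketch runs on its own.
m_link <- matrix(c(0,0.5,0.5,0,1,0,0,0,0,0,0,1,0,0,1,0), 4, 4)
teleport <- matrix(rep(c(2/3,1/3,0,0), 4), 4, 4)
a_check <- 0.7 * m_link + 0.3 * teleport
v_check <- power_iterate(a_check, matrix(0.25, 4, 1))
v_check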
# Question 2
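# Let y be the PageRank of the target page and x the PageRank it receives from pages outside the farm.
# Let w be the PageRank of each of the k second-tier pages, and let z be the PageRank of each of the m supporting pages.
# With taxation parameter β and n pages in total, the equations relating y, w, and z are: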
# y = x + βzm + (1-β)/n
# w = βy/k + (1-β)/n
# z = βkw/m + (1-β)/n
# Then:
# y = x + β^3y + β(1-β)m/n + β^2(1-β)k/n + (1-β)/n
# Neglect the last term (1-β)/n, per the directions in the statement of the problem.
# If we move the term β^3y to the left, and note that 1-β^3 = (1-β)(1+β+β^2), we get
# y = x/(1-β^3) + (β(1-β)/(1-β^3))(m/n) + (β^2(1-β)/(1-β^3))(k/n)
# For β = 0.85, these coefficients evaluate to:
# y = 2.59x + 0.33(m/n) + 0.28(k/n)
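# A small numeric check of the coefficients above, assuming β = 0.85.
# The variable names denom, coef_x, coef_m and coef_k are illustrative only.
beta <- 0.85
denom <- 1 - beta^3
coef_x <- 1 / denom                    # multiplies x
coef_m <- beta * (1 - beta) / denom    # multiplies m/n
coef_k <- beta^2 * (1 - beta) / denom  # multiplies k/n
round(c(x = coef_x, m_over_n = coef_m, k_over_n = coef_k), 2)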