@adgaudio
Created April 26, 2014 05:00
SparkR install notes
# My SparkR install notes. SparkR gives R access to Apache Spark.
#
# For details about SparkR, see their site:
# http://amplab-extras.github.io/SparkR-pkg/
#
# Author: Alex Gaudio <adgaudio@gmail.com>
#
# Note the awful hack where I symlink libjvm.so into /usr/lib. I did that to get rJava installed.
set -e
set -u
cd ~/
# install R
#wget http://lib.stat.cmu.edu/R/CRAN/src/base/R-3/R-3.1.0.tar.gz
#tar xvzf R-3.1.0.tar.gz
# ubuntu deps
sudo apt-get install libcurl4-openssl-dev # devtools needs RCurl, which needs libcurl; probably already covered by the r-base build deps
sudo apt-get build-dep r-base
# detect the Java setup and record it in R's configuration (for the rJava install)
sudo R CMD javareconf
R CMD javareconf -e echo # necessary?
# awful hack, but it works: expose libjvm.so in /usr/lib so rJava can find it
sudo ln -s /usr/lib/jvm/jdk1.7.0_51/jre/lib/amd64/server/libjvm.so /usr/lib/
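# A hedged alternative to hardcoding the JDK path above: a minimal sketch that
# searches a JDK root for libjvm.so. find_libjvm is a hypothetical helper (not
# part of R, rJava, or SparkR), and the default root /usr/lib/jvm is an
# assumption that matches typical Ubuntu JDK packages. If several copies
# exist, whichever one find reports first is returned.
find_libjvm() {
  # print the first libjvm.so found under the given root (default: /usr/lib/jvm)
  find "${1:-/usr/lib/jvm}" -name libjvm.so -print 2>/dev/null | head -n 1
}
# Usage (not run here): sudo ln -s "$(find_libjvm)" /usr/lib/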
# configure R with default mirror for package installs
echo 'options(repos=structure(c(CRAN="http://lib.stat.cmu.edu/R/CRAN/")))' >> ~/.Rprofile
# install R package dependencies
fp="/tmp/sparkR.$$"
cat > "$fp" <<EOF
install.packages("devtools")
install.packages("rJava")
library(devtools)
# install_github("amplab-extras/SparkR-pkg", subdir="pkg") # not used: SparkR is built from source below
EOF
Rscript "$fp"
rm "$fp"
# install SparkR
git clone https://github.com/amplab-extras/SparkR-pkg
(
  cd SparkR-pkg
  SPARK_HADOOP_VERSION="2.0.0-mr1-cdh4.4.0" ./install-dev.sh
  # prepend the SparkR library directory to R's library search path
  echo ".libPaths(c(\"$(pwd)/lib\", .libPaths()))" >> ~/.Rprofile
)
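# Re-running this script appends the same lines to ~/.Rprofile again. A minimal
# sketch of an idempotent append follows; append_once is a hypothetical helper
# (not something R or SparkR provides) that could stand in for the
# `echo ... >> ~/.Rprofile` lines above.
append_once() {
  # usage: append_once LINE FILE
  # adds LINE to FILE unless an identical whole line is already present
  line=$1
  file=$2
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
# Usage (not run here):
#   append_once 'options(repos=structure(c(CRAN="http://lib.stat.cmu.edu/R/CRAN/")))' ~/.Rprofile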