installation notes for DATA201 and DATA422
Data Wrangling Stack
--------------------
In this course we will use:
- R as the default programming language
- Tidyverse as the R dialect of choice
- Shell commands (bash or zsh), through the terminal
- JupyterLab as the interface to R (and a bit of RStudio)
- Python, because JupyterLab needs it in order to be built and installed
- Julia, Java(script), Scala, you-name-it: all of this is optional (and some exploration is suggested for 400 level students)
Don't worry if some, or all, of these names don't ring a bell.
You'll get familiar :-)
Linux, Mac OS X, and Windows all have very good support for this software.
Yet each of the three operating systems has its own quirks when it comes to installing it.
We first introduce you to the general strategy for installing the software stack we need:
0) Make sure that you can use a terminal
1) Check whether Python 3.5 (or a newer version) is installed; if not, install it
2) Check whether R 3.4 (or a newer version) is installed; if not, install it
3) Check whether you have conda or pip installed; if you have neither, install pip
4) Install JupyterLab
5) Run R and install the R kernel for JupyterLab
6) Run JupyterLab, open the InstallationTest.ipynb and run its cells
6.1) InstallationTest.ipynb will guide you through the installation of Tidyverse (it may take a while) and do some little checks
7) Install other programming languages and their Jupyter kernels if you feel explorish (talk to me about this)
If you get to the end of 6.1) without errors, you are good to go!
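Before going through the details, it can help to see which pieces are already on your machine. The following is a minimal sketch, assuming a POSIX shell (bash and zsh both work); `have_tool` is a hypothetical helper name, and the list of commands is just the checklist above.

```shell
#!/bin/sh
# have_tool NAME succeeds when NAME can be run as a command from the terminal.
# (have_tool is a helper made up for this sketch.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Walk through the tools from the checklist and report on each one.
for tool in python3 R pip3 conda jupyter; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```

Anything reported as missing is covered by one of the steps below; note that only one of pip and conda is required.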
### Details
0) **Windows** is not always ready to provide a good terminal. So, you are in for a ride!
You first need to install a serious terminal. Cygwin is a good choice. A step-by-step walkthrough is here: https://smarttechnicalworld.com/terminal-for-windows/
0) **Mac or Linux**: you are good to go, a terminal is already installed.
If you are on Mac OS X, it's a good idea to install the free Xcode suite https://developer.apple.com/xcode/ and the "Command Line Tools". For the latter, run
xcode-select --install
in the terminal.
If you are not familiar with the terminal, check out this tutorial: https://www.digitalocean.com/community/tutorials/an-introduction-to-the-linux-terminal
(and maybe this *optional* one about navigating folders and files, https://www.digitalocean.com/community/tutorials/basic-linux-navigation-and-file-management)
1) In your terminal, check whether you have Python installed and which version it is. The command is:
$ python --version
or
$ python3 --version
if the outcome is something like
Python 3.5
or a higher number, you are good to go.
If you don't have it, you need to install it.
Linux: use your package manager.
Mac OS X: install Homebrew and then Python (follow these steps http://docs.python-guide.org/en/latest/starting/install3/osx/, don't worry about the last section on Pipenv and Virtual Environments).
Windows: Download and install it from here: https://www.python.org/downloads/ (at the moment of writing this the most recent version is 3.7.0).
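The version check above can also be done by the shell itself. This is a sketch only, assuming a POSIX shell; `python_ok` is a hypothetical helper that looks at the major and minor numbers of the text printed by `python --version`.

```shell
#!/bin/sh
# python_ok takes the text printed by `python --version` (e.g. "Python 3.7.0")
# and succeeds when the version is 3.5 or newer.
python_ok() {
    version=${1#Python }          # "Python 3.7.0" -> "3.7.0"
    major=${version%%.*}          # "3"
    rest=${version#*.}            # "7.0"
    minor=${rest%%.*}             # "7"
    # Unparseable input makes the numeric tests fail, which counts as "not ok".
    { [ "$major" -gt 3 ] ||
        { [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; }; } 2>/dev/null
}

# Try both command names, since either may be the one installed.
for cmd in python3 python; do
    if command -v "$cmd" >/dev/null 2>&1; then
        found=$("$cmd" --version 2>&1)   # Python 2 prints its version on stderr
        if python_ok "$found"; then
            echo "$cmd is recent enough: $found"
        else
            echo "$cmd is too old: $found"
        fi
    fi
done
```

If only `python3` passes the check, use `python3` (and `pip3`) wherever these notes say `python` (and `pip`).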
2) Check whether R 3.4 (or a newer version) is installed; if not, install it
In the terminal, run
$ R --version
if the output is something like
R version 3.4 ...
or a higher number, you are good to go.
If not, you need to install it.
Linux: once again, use your package manager.
Mac OS X: you'll first need to install some dependencies, and then R.
Go to: https://cran.r-project.org/bin/macosx/tools/ and install the latest versions of clang and gfortran.
Then go to: https://cran.r-project.org/bin/macosx/ and install the latest version of R
Windows: Download and install it from here: https://cran.r-project.org/bin/windows/base/
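The same kind of check works for R. A minimal sketch, assuming a POSIX shell with `awk` available and assuming the first line of `R --version` has the usual release form `R version X.Y.Z (date) ...`; `r_ok` is a hypothetical helper name.

```shell
#!/bin/sh
# r_ok takes the text printed by `R --version` and succeeds when the
# version on its first line is 3.4 or newer.
r_ok() {
    # The third word of the first line is the version number, e.g. "3.5.1".
    version=$(printf '%s\n' "$1" | awk 'NR == 1 { print $3 }')
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    # Unparseable input makes the numeric tests fail, which counts as "not ok".
    { [ "$major" -gt 3 ] ||
        { [ "$major" -eq 3 ] && [ "$minor" -ge 4 ]; }; } 2>/dev/null
}

if command -v R >/dev/null 2>&1; then
    if r_ok "$(R --version)"; then
        echo "R is recent enough"
    else
        echo "R is too old (or its version could not be read)"
    fi
else
    echo "R not found"
fi
```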
3) Check whether you have conda or pip installed.
For pip, the command is:
$ pip --version
or
$ pip3 --version
and the outcome should be something like
pip 10.0.1 ...
For conda, the command is:
$ conda --version
and the outcome should be something like
conda 4.4.5
If you see either of these (or higher numbers), you are good to go. You need at least one of conda and pip.
Otherwise, you need to install pip.
Linux: use your package manager to install pip.
Mac OS X: you should have installed Python and pip in one go using Homebrew in step 1). Are you sure you don't have pip?
Windows: download the file https://bootstrap.pypa.io/get-pip.py in any folder (paste the link in a browser and "save as..."); open the terminal and navigate to that folder; run
$ python get-pip.py
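Since either pip or conda is enough, the check can stop at the first one it finds. A sketch under the same POSIX shell assumption; `first_available` is a hypothetical helper, and the order `pip3 pip conda` is just one reasonable choice.

```shell
#!/bin/sh
# first_available prints the first of its arguments that exists as a command,
# and fails when none of them does. (Helper made up for this sketch.)
first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

if installer=$(first_available pip3 pip conda); then
    echo "Using $installer: $("$installer" --version 2>&1 | head -n 1)"
else
    echo "Neither pip nor conda found; install pip as described above."
fi
```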
4) Install JupyterLab
That's the easy part! Open the terminal and run
$ pip install jupyter
and
$ pip install jupyterlab
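Step 3 accepts either pip or conda, while the commands above use pip. The sketch below shows how the choice maps to an install command. `jupyterlab_install_command` is a hypothetical helper that only prints the command without running it, and the conda line assumes the conda-forge channel, which is what the JupyterLab documentation suggests for conda users.

```shell
#!/bin/sh
# jupyterlab_install_command INSTALLER prints (but does not run) the command
# that installs JupyterLab with the given package manager.
jupyterlab_install_command() {
    case "$1" in
        pip|pip3) echo "$1 install jupyter jupyterlab" ;;
        conda)    echo "conda install -c conda-forge jupyterlab" ;;
        *)        echo "unknown installer: $1" >&2; return 1 ;;
    esac
}

jupyterlab_install_command pip
jupyterlab_install_command conda
```

Copy the printed line that matches your setup into the terminal to perform the actual installation.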
5) Run R and install the R kernel for JupyterLab
Go in the terminal. Run
$ R
this will open the console for R: the commands you type now will be interpreted by R. Type (without the >)
> install.packages('devtools')
then press the enter key. Magic will happen. Now you have installed a package that extends the base capabilities of R.
Go again with
> install.packages(c('repr', 'IRdisplay', 'evaluate', 'crayon', 'pbdZMQ', 'uuid', 'digest'))
This will install a bunch of other things.
Once you have installed the libraries run
> devtools::install_github('IRkernel/IRkernel')
which will install the library "IRkernel" and
> IRkernel::installspec()
which will allow JupyterLab to "speak" to R.
Then you can quit R. Type
> quit()
and press enter. R will ask whether to save the workspace image: type
y
press enter and you are out of it.
More details and options here: https://irkernel.github.io/installation/
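To confirm that JupyterLab can now "speak" to R, you can look for the R kernel in Jupyter's list of kernels. A minimal sketch, assuming the kernel was registered under IRkernel's default name `ir` and that `awk` is available; `has_kernel` is a hypothetical helper.

```shell
#!/bin/sh
# has_kernel NAME reads the output of `jupyter kernelspec list` on stdin
# and succeeds when a kernel called NAME is listed.
has_kernel() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

if command -v jupyter >/dev/null 2>&1; then
    if jupyter kernelspec list 2>/dev/null | has_kernel ir; then
        echo "R kernel is registered"
    else
        echo "R kernel not found; rerun IRkernel::installspec() in R"
    fi
else
    echo "jupyter not found; install JupyterLab first"
fi
```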
6) Run JupyterLab, open the InstallationTest.ipynb and run its cells
6.1) InstallationTest.ipynb will guide you through the installation of Tidyverse (it may take a while) and do some little checks
To run JupyterLab, open a terminal in the folder where your work files are (in this case, the folder where InstallationTest.ipynb is) and run
$ jupyter-lab .
It should start a browser interface to JupyterLab. You can double click on the InstallationTest.ipynb (you will find it on the left column) to open it.
To terminate JupyterLab, go back to the terminal and press the control+c keys (keep control key pressed and press c). It will ask you if you want to stop the server, say yes.
7) Install other programming languages and their Jupyter kernels if you feel explorish (talk to me about this)
For 400 level students: you are supposed to at least try all the Julia notebooks. To do that, you need to install Julia.
Windows and Mac OS X: go to https://julialang.org/downloads/ and download the right version of Julia for your system.
Linux: install Julia using your distribution package manager.
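Then, open Julia in the terminal by running
$ julia
and, to let JupyterLab speak with Julia, type (without the >; the first line loads the package manager and is needed on Julia 0.7 and newer)
> using Pkg
> Pkg.update()
> Pkg.add("IJulia")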
The next time you run JupyterLab, your Julia kernel will be available.
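You can check for the Julia kernel from the terminal without starting JupyterLab. This sketch assumes that IJulia registers its kernel under a name beginning with `julia` (for example `julia-1.0`); `julia_kernels` is a hypothetical helper.

```shell
#!/bin/sh
# julia_kernels reads the output of `jupyter kernelspec list` on stdin and
# prints the name of every kernel whose name starts with "julia".
julia_kernels() {
    awk '$1 ~ /^julia/ { print $1 }'
}

if command -v jupyter >/dev/null 2>&1; then
    kernels=$(jupyter kernelspec list 2>/dev/null | julia_kernels)
    if [ -n "$kernels" ]; then
        echo "Julia kernel(s) registered: $kernels"
    else
        echo "No Julia kernel found; rerun Pkg.add(\"IJulia\") in Julia"
    fi
else
    echo "jupyter not found; install JupyterLab first"
fi
```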