
@trouleau
Last active February 17, 2021 12:36
ix-faq

Frequently asked questions

This is a list of questions related to the technical infrastructure required to do the labs.

I want to use an EPFL-provided computer. How should I proceed?

You can use the computers in BC 07-08. These computers run virtual machines; choose one of the following two images, which contain the software that you will use during the course:

IC-CO-IN-SC
IC-BLC-IN-SC

Many programs that will be useful during the course (such as Python, Jupyter) are not the "default" ones found in $PATH. You can find them in /opt/anaconda3/bin. (The same holds for the cluster.)

Basically, you will find these two commands useful:

/opt/anaconda3/bin/jupyter console   # Launch a command-line interpreter
/opt/anaconda3/bin/jupyter notebook  # Launch a notebook server

Be careful: launching jupyter notebook (without the absolute path as described above) seems to work at first, but many of the libraries that we use in the course will not be available (e.g., matplotlib).
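If you prefer typing the short command names, one workaround (our suggestion, not an official setup step) is to put the Anaconda directory at the front of your `PATH` for the current shell session:

```shell
# Make the Anaconda tools shadow the system defaults for this session
export PATH="/opt/anaconda3/bin:$PATH"

# After this, a plain `jupyter notebook` resolves to /opt/anaconda3/bin/jupyter
echo "${PATH%%:*}"   # first PATH entry
```

Note that this only lasts for the current session; you would need to re-run the `export` in every new shell (or add it to your shell startup file).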

I need to use my own computer. How should I proceed?

You need to have Python 3 installed on your machine to do the labs. We recommend installing Python with Anaconda.

To install Anaconda, go to https://www.anaconda.com/distribution/, choose your distribution, and download the Python 3.7 graphical installer. The installer will then walk you through the installation steps.

How can I connect to the cluster?

The server is iccluster040.iccluster.epfl.ch. You need to be on the EPFL network to be able to reach it (either on-campus or connected via VPN). You can connect to the server using your GASPAR credentials. On the command line, type:

ssh -l USERNAME iccluster040.iccluster.epfl.ch

where USERNAME is your GASPAR username.
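To avoid retyping the full hostname every time, you can add an entry to your ~/.ssh/config file (the alias name iccluster below is just our choice, not an official name):

```
Host iccluster
    HostName iccluster040.iccluster.epfl.ch
    User USERNAME
```

With this in place, `ssh iccluster` is equivalent to the full command above.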

How should I transfer a file to/from the cluster?

Linux and Apple OS X users can use scp to transfer files. The basic syntax of scp is scp [from] [to]. The [from] portion can be a filename or a directory/folder. The [to] portion will contain your username, the hostname of the cluster login node, and the destination directory/folder. For example:

scp /SOME/LOCAL/FILE ${USER}@iccluster040.iccluster.epfl.ch:/SOME/REMOTE/DIRECTORY

It is possible to transfer a directory using scp with the options -r and -p: -r makes the copy recursive, and -p preserves the dates, times, and permissions of the files.
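For example, to copy a whole directory from the cluster back to your machine (the paths below are placeholders; this requires being on the EPFL network and having valid credentials):

```shell
scp -r -p ${USER}@iccluster040.iccluster.epfl.ch:/SOME/REMOTE/DIRECTORY /SOME/LOCAL/DIRECTORY
```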

You can also transfer files between your local computer and the cluster using an SFTP client, such as Cyberduck (OS X), FileZilla (Linux), or WinSCP (Windows).

How can I configure Jupyter on the cluster to be able to use it remotely?

Step-by-step instructions are available here.
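In case you cannot access those instructions, the usual pattern (a sketch only; port 8888 is Jupyter's default and the official setup may use a different one) is to forward a local port to the notebook server over SSH:

```shell
# On your machine: forward local port 8888 to port 8888 on the cluster
ssh -L 8888:localhost:8888 USERNAME@iccluster040.iccluster.epfl.ch
# Then, with the notebook server running on the cluster,
# open http://localhost:8888 in your local browser
```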

I get an error about Spark UI and port binding, what should I do?

If you get an error that looks like this:

17/02/28 19:30:19 ERROR SparkUI: Failed to bind SparkUI
java.net.BindException: Address already in use: Service 'SparkUI' failed
after 16 retries! Consider explicitly setting the appropriate port for the
service 'SparkUI' (for example spark.ui.port for SparkUI) to an available
port or increasing spark.port.maxRetries.
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:433)
    at sun.nio.ch.Net.bind(Net.java:425)
    ...

You most likely did not set the spark.ui.port number properly in your .profile. To fix this, edit your .profile and add the line

alias pyspark='pyspark --conf spark.ui.port=xxxx0'

where xxxx0 is the last four digits of your SCIPER number followed by a 0.

If the first of those four digits is 0, or if the four-digit number is larger than 6553, the port number will not be set properly. In that case, replace xxxx with any random number between 1024 and 6553.
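As a sanity check, the steps above can be computed directly in the shell (the SCIPER value below is a made-up example; substitute your own):

```shell
# Made-up example SCIPER; substitute your own
SCIPER=123456

# Take the last four digits and append a 0 -> 34560
PORT="${SCIPER: -4}0"

# Fall back to a random port if the four digits start with 0
# or the four-digit number exceeds 6553
if [ "${PORT#0}" != "$PORT" ] || [ "${SCIPER: -4}" -gt 6553 ]; then
    PORT="$(( RANDOM % 5530 + 1024 ))0"
fi

echo "$PORT"
```

The resulting value is what you would put after `spark.ui.port=` in the alias.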

I am having trouble with PySpark. Help!

If too many users are connected to the cluster and have requested resources from YARN (e.g., using large values in the call to pyspark; see above), there may be no resources left for you. There are two symptoms of this:

  1. The Jupyter kernel seems to be "working" permanently, and code cells never execute.
  2. The Spark context (variable sc) is empty.

You might also see messages like this in the terminal.

17/02/28 19:34:07 INFO Client: Application report for
application_1484292377252_0269 (state: ACCEPTED)

In this case, be patient and try again a bit later, when the server is less crowded. You can also try to run Spark without requesting resources from YARN, by simply typing pyspark in the shell without arguments (results are not guaranteed).

I am getting permission errors when editing files with vim on the BC07-08 machines, what should I do?

When saving a file with vim on the BC07-08 machines, you get the following error:

E137: Viminfo file is not writable: /home/USERNAME/.viminfo

You can safely ignore this error. It is caused by a permission issue on your Myfile directory, but it does not prevent you from actually saving the file you were editing.

I can't connect to the cluster, the hostname cannot be resolved, what should I do?

When connecting to the cluster with ssh, if you get an error

ssh: Could not resolve hostname [hostname]: nodename nor servname provided, or not known 

Try disconnecting and reconnecting your Wi-Fi, or connecting through the VPN.

I am getting an error about Malformed HTTP message on jupyter logs and can't access my notebooks, what should I do?

Try clearing your web browser's cache and reloading the page.
