@vankesteren
Last active October 14, 2024 08:52
Documentation for using the Methods & Statistics department compute server

Department compute server

This is the documentation for using the Methods & Statistics department compute server.

Connecting to the server

You can connect to the server in only two ways:

  • from our department, over a wired Ethernet connection
  • from anywhere, via the Utrecht University VPN.

To connect, enter the following URL in your browser: mscomputer.fss.uu.nl. You will be greeted with an RStudio login window. This works best in Google Chrome or Mozilla Firefox.

User account

To access the server, you need a login / user account, which is available on request.

Creating a user

A user account needs to be manually created for you. Send an email to the admin (Erik-Jan) for this, with an explanation of why you want to use the computer. You will receive a default password which you can change when you first log in.

Updating the password

Log in to the server, open a terminal within the RStudio browser window (shift + alt + R), type passwd <your-user-name> (for example passwd erikjan) and follow the prompts.

Note for admin
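export username=newuser  # replace newuser with the new account name
sudo useradd -m -d "/data/$username" "$username"  # home directory on the main disk
sudo passwd "$username"  # set the default password
sudo chown -R "$username" "/data/$username"
sudo chmod -R go-rw "/data/$username"  # remove read/write access for other users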

Using the server

To use the server, abide by these rules:

  1. Please read the below carefully.
  2. If you aren't sure about something, read again and then ask before doing.
  3. If you misuse the computer, your account will be suspended.

Multiple users can connect to the server at the same time. If you are only preparing a script, you can log in to the server at any time. If you want to run a large simulation, please reserve time for this on the Google sheet schedule.

Server specifications

CPU     :  2 x Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Threads :  48
Memory  :  64GB
GPU     :  Nvidia GTX 1080Ti 
Storage :  Main disk : 2TB    WDC WD20EFRX-68EUZN0  /data
           OS disk   : 120GB  INTEL SSDSC2KW120H6   /

Data storage

Please use only your home directory (/data/<your-user-name>/, or alternatively ~/), which is on the 2TB main hard drive. No other user can see your files there.

Example

my_big_matrix <- matrix(0, 1e4, 1e4)
saveRDS(my_big_matrix, "~/bigfile.rds")
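
Because the main disk is shared by all users, it can be useful to check how much space your home directory takes up. The following is a minimal sketch for the RStudio terminal that uses the standard du and df tools; the function name home_usage is made up for illustration and is not installed on the server.

```shell
# Print the total size of a directory (defaults to your home directory),
# followed by the free space left on the file system that holds it.
# home_usage is a hypothetical helper name, not something installed on the server.
home_usage() {
    dir="${1:-$HOME}"
    du -sh "$dir"    # total size of everything below dir
    df -h "$dir"     # free space on the disk containing dir
}
```

Calling home_usage without arguments reports on your own home directory; calling it as home_usage ~/simulation_folder reports on a single folder.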

Backing up your data

You are responsible for archiving your data. The server is not backed up in any way and we provide no guarantees. Consider your home directory as temporary/scratch storage.

After running simulations, it is wise to archive your results somewhere you can access them in case the hard drive of the server breaks. You can do this from the RStudio server by selecting "download" in the files tab.

It can also be done using the scp command from your own computer via the terminal (if you have it installed).

Example

To copy the entire folder simulation_folder to the local backup folder local_backup, the user testuser can run the following command:

scp -rC testuser@mscomputer.fss.uu.nl:~/simulation_folder local_backup
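
If a results folder contains many small files, it may be more convenient to bundle it into a single compressed archive on the server before downloading it. The following is a minimal sketch for the RStudio terminal; archive_results and the folder names are hypothetical and only serve as an illustration.

```shell
# Bundle a results folder into one dated .tar.gz archive and print its path.
# archive_results and the folder names are hypothetical, for illustration only.
archive_results() {
    src_dir="${1%/}"        # folder to archive (trailing slash removed)
    out_dir="${2:-.}"       # where to write the archive (default: current directory)
    name="$(basename "$src_dir")"
    archive="$out_dir/${name}_$(date +%Y%m%d).tar.gz"
    # -C makes the paths inside the archive relative to the folder's parent
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$name" || return 1
    printf '%s\n' "$archive"
}
```

For example, archive_results ~/simulation_folder ~ would write the archive to your home directory, from where it can be downloaded via the files tab or copied with scp. Write the archive outside the folder being archived, so that it does not end up inside itself.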

Note for admin
Checking storage space can be done as follows:
df -h /data
sudo du -hs /data/*

SMART tests for the hard drive (/dev/sda) should be run every now and then using smartctl. Check the RAW_VALUE column for attributes such as Reallocated_Sector_Ct; a non-zero or increasing value suggests that the disk is starting to fail. The MTBF of the hard disk is 1 million hours, so this should be fine for a while.

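sudo smartctl -a /dev/sda        # show current attributes and health status
sudo smartctl -t short /dev/sda  # start a short self-test (runs in the background)
sudo smartctl -a /dev/sda        # rerun once the test has finished to see the result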

R sessions

When you log in, you start an R session. It will remain open until you stop it (red button in the top right corner). Please close your R session when you are done. The R version is the following:

R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R packages

Everyone automatically gets their own personal R package library in ~/R/packages, so package versions should not interfere between users.

If R warns you that a system library is needed to install a package (e.g. libssl-dev), please contact the admin to install it for you. This is what such a warning may look like when you try to install a package:

checking mpfr.h usability... no
checking mpfr.h presence... no
checking for mpfr.h... no
configure: error: Header file mpfr.h not found; maybe use --with-mpfr-include=INCLUDE_PATH

In addition, global packages are installed for everyone to use by default. If you want to use a newer version of a preinstalled package, simply install it as normal and your own version will be used.

Preinstalled package  Version
devtools              2.4.5

Note for admin
install.packages("devtools", lib = "/opt/R/4.0.3/lib/R/library")

Parallel processing

Don't use more than 46 cores. This leaves 2 cores for other people preparing their stuff. If you want to know how many cores are currently being used (and by whom), open a terminal (shift + alt + R) and type htop.
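
The core limit can also be computed rather than typed in by hand. The following is a minimal sketch for the terminal, assuming the nproc tool is available (it is part of GNU coreutils on most Linux systems); max_workers is a hypothetical helper name.

```shell
# Compute how many workers to use while leaving some cores free for others.
# Arguments: total cores (default: what nproc reports), cores to keep free (default: 2).
# max_workers is a hypothetical helper name, for illustration only.
max_workers() {
    total="${1:-$(nproc)}"
    reserve="${2:-2}"
    workers=$(( total - reserve ))
    # Never return fewer than one worker
    if [ "$workers" -lt 1 ]; then
        workers=1
    fi
    printf '%s\n' "$workers"
}

max_workers 48 2   # prints 46
```

Called without arguments, max_workers uses the core count that nproc reports on the machine it runs on. As a quick alternative to htop, the uptime command prints the current load averages.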

Parallel processing clusters can be set up using cl <- parallel::makeCluster(46). Make sure to close your cluster once you're done with it: parallel::stopCluster(cl).

I like using the package pbapply for parallel simulations, but you can use any method :)

In R, some packages / functions use OpenMP to parallelize underlying C/C++ code. By default, these programs use all the cores of a system. An example of such a function is mgcv::bam(). If your code uses such functions, you can restrict the number of cores used by putting the following code at the top of your R script:

Sys.setenv("OMP_THREAD_LIMIT" = 46)
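
When a script is started from the terminal rather than from the RStudio console, the same limit can be set in the shell so that it is already in place when R starts. The following is a minimal sketch; run_with_thread_limit and my_simulation.R are hypothetical names.

```shell
# Run a command with OMP_THREAD_LIMIT set only for that command.
# Usage: run_with_thread_limit <limit> <command> [args...]
# run_with_thread_limit is a hypothetical helper name, for illustration only.
run_with_thread_limit() {
    limit="$1"
    shift
    env OMP_THREAD_LIMIT="$limit" "$@"
}

# Example (hypothetical script name):
# run_with_thread_limit 46 Rscript my_simulation.R
```

The variable is set only for the command that is run, so other programs started from the same terminal are not affected.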

Additional software

Software  Location
Mplus     /opt/mplus/8.11
JAGS      /usr/bin/jags
Matlab    /bin/matlab
Julia     /opt/julia
PyCharm   /opt/pycharm

How to run additional software with a GUI

For Windows:

  1. Install Xming and PuTTY.
  2. Run Xming on your computer; this starts an X server that accepts incoming display connections.
  3. Run PuTTY with the following configuration:
       • Host name: mscomputer.fss.uu.nl
       • Port: 22
       • Under Connection > SSH > X11: check "Enable X11 forwarding" and set "X display location" to localhost:0.0
  4. Click Open in PuTTY.
  5. Type in your username and password.
  6. Run the program, for example pycharm.

A display should now open. If it does not, contact the administrator.

Matlab

To use Matlab, it first needs to be activated for your account. For this, create a MathWorks account using your UU email address at https://nl.mathworks.com/. You can then activate Matlab for your account together with the administrator. Once activated, you can run Matlab .m files in the terminal as follows:

cat path/to/mymatlabfile.m | matlab -nodisplay -nosplash -nodesktop

Note for admin
First, connect via ssh with X forwarding, then run:
sudo activate_matlab

If you want to install/use additional software, please send an email to the admin (Erik-Jan; or, better yet, come and find me in C1.22).

GPU

You can use the GPU if you are doing neural network stuff. Please also indicate in the Google sheet that you are using the GPU.

  • Current Nvidia driver installed: 510.39.01
  • Current CUDA version installed: 11.6