@riga
Last active September 1, 2023 09:21
First steps with cmsdist

1. Create fork of cmsdist

  • Go to https://github.com/cms-sw/cmsdist
  • Fork the repository into your user space by clicking on the "Fork" button at the top right. If you don't have a personal GitHub account yet, please create one first and then create the fork.

2. Set up cmsdist and pkgtools

Info 1: pkgtools contains all the tools needed to create the software packages defined in cmsdist. This way, package definitions and the tools for their installation can be developed and maintained separately. We are not going to make changes to pkgtools, so we don't need a personal fork.

Info 2: We are going to clone both repositories into your user space, either on /afs/user/... or /afs/work/...; the exact location is up to you. For the following example, let's assume it's ${HOME}/mlprod. Later on, we are going to use dedicated CMS build machines with a lot of cores, and these machines have access to /afs.

  • Open a shell and connect to lxplus via ssh.
    • ssh lxplus
  • Create the directory where you want to keep the repositories.
    • mkdir ${HOME}/mlprod && cd ${HOME}/mlprod
  • Clone pkgtools.
    • git clone git@github.com:cms-sw/pkgtools.git
  • Clone your cmsdist fork.
    • git clone git@github.com:YOUR_USER/cmsdist.git
  • Add the cms-sw repository as a remote, which makes it easier to pull updates from upstream later on.
    • ( cd cmsdist && git remote add cms git@github.com:cms-sw/cmsdist.git )

3. Set up a working directory on a cms build machine

  • Login to lxplus.
    • ssh lxplus
  • From lxplus, open a shell on a build machine, say cmsdev23.
    • ssh cmsdev23
  • Go to the central build directory and create a directory with your name. The central directory is usually used by everyone who builds cms software, which makes it easier for admins to clean up the machines at some point.
    • mkdir -p /build/$( whoami ) && cd /build/$( whoami )
  • Create symlinks to your cmsdist and pkgtools repositories.
    • ln -s ${HOME}/mlprod/{pkgtools,cmsdist} .
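
Since the build commands later on refer to pkgtools and cmsdist through these links, it can help to confirm that both resolve to real directories. The following is a minimal sketch; check_links is a hypothetical helper and not part of pkgtools or cmsdist.

```shell
# Sketch: verify that the pkgtools and cmsdist links in a directory resolve.
# check_links is a hypothetical helper name used only for this example.
check_links() {
    local dir="$1" name status=0
    for name in pkgtools cmsdist; do
        # -L: the entry is a symlink, -d: its target is an existing directory
        if [ -L "${dir}/${name}" ] && [ -d "${dir}/${name}" ]; then
            echo "ok: ${name} -> $( readlink "${dir}/${name}" )"
        else
            echo "missing or broken link: ${dir}/${name}" >&2
            status=1
        fi
    done
    return "${status}"
}
```

On the build machine this could be called as check_links /build/$( whoami ), which returns a non-zero code if one of the links is missing or dangling.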

4. Prepare the environment

This step consists of two items: the cleaning of your current environment, and the preparation of special environment variables for the build process.

  • First we make sure that the environment is clean, i.e., that paths and libraries you might have set up previously (e.g. in your bashrc) are removed. For this, the environment variables PATH, PYTHONPATH and LD_LIBRARY_PATH must not point to custom software installations. There are various ways to achieve this; here we create a small file called setup.sh that resets them (and that we also use to define build variables later). Start the file with these lines:
export PATH="/afs/cern.ch/cms/caf/scripts:/cvmfs/cms.cern.ch/common:/usr/sue/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/puppetlabs/bin"
export PYTHONPATH=""
export LD_LIBRARY_PATH=""
  • Certain variables (such as the current release branch, queues and default architecture flags) must be defined for the build process to run smoothly. The general procedure is described in the central cmsdist README. You can use the following lines to append these variables to your setup.sh.
ARCH="slc7_amd64_gcc11"
CMSSW="CMSSW_13_0_X"

# append (>>) so that the lines added to setup.sh above are kept
echo "" >> setup.sh
for v in $( \
    curl -s https://raw.githubusercontent.com/cms-sw/cms-bot/master/config.map \
        | grep "SCRAM_ARCH=${ARCH};" \
        | grep "RELEASE_QUEUE=${CMSSW};" \
        | tr ";" " "
); do
    [ -n "$v" ] && echo "export ${v}" >> setup.sh
done
  • Make sure to source this file every time you start working in a new shell.
    • source setup.sh
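
To illustrate what the loop over config.map does, the sketch below converts a single semicolon-separated line into export statements without any network access. The helper name config_line_to_exports and the sample line are assumptions made for this example; in particular, the PKGTOOLS_TAG value is a placeholder and the real file may contain more or different fields.

```shell
# Sketch: turn one config.map-style line into export statements.
# config_line_to_exports is a hypothetical helper name.
config_line_to_exports() {
    local line="$1" entry
    # split the line at ";" and emit one export per non-empty KEY=VALUE field
    echo "${line}" | tr ";" "\n" | while read -r entry; do
        if [ -n "${entry}" ]; then
            echo "export ${entry}"
        fi
    done
}

# made-up sample line, only meant to show the format
sample="SCRAM_ARCH=slc7_amd64_gcc11;PKGTOOLS_TAG=V00-XX-XX;RELEASE_QUEUE=CMSSW_13_0_X;"
config_line_to_exports "${sample}"
```

For the sample line, this prints one export statement each for SCRAM_ARCH, PKGTOOLS_TAG and RELEASE_QUEUE, which is the kind of content that ends up in setup.sh.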

5. Build a package

As you can see in the cmsdist repository, a lot of software packages are integrated into cmsdist, with files named PACKAGE.spec. Python packages are defined either in py3-PACKAGE.spec or, when they can be installed directly via pip, in pip/PACKAGE.file.

Let's pick a non-python package, say alpaka, and install it.

  • Build alpaka.
    • ./pkgtools/cmsBuild -c cmsdist -a ${SCRAM_ARCH} -i mydir --weekly -j 2 build alpaka
    • -c sets the location of your cmsdist repository.
    • -a is the build architecture and the current one is slc7_amd64_gcc11.
    • -i defines the directory in which the build files are written and you can pick any directory name here.
    • --weekly enables the usage of the weekly cache. Without the cache, all dependencies would have to be built recursively, which we would always like to avoid.
    • -j sets the number of cores to use. It's always good to first check if other people are currently running on the same machine (e.g. via htop), and then set the number of cores to use accordingly.
    • At the end of the command, we define what we want to do, and this is build alpaka.
  • The command will run for a while. You will see that the build tool first identifies all of alpaka's dependencies and then tries to build them if they were subject to changes in your cmsdist repository. If not (and this is the case we expect for now), pre-built packages are fetched from the cmsdist build cache and they don't need to be re-built. The alpaka package itself is re-built though.
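
For the -j option described above, a value can be estimated from the number of cores and the current load instead of reading it off htop. This is only a sketch: suggest_jobs is a hypothetical helper, and the heuristic (cores minus the load average rounded up, but at least 1) is an assumption, not a CMS convention.

```shell
# Sketch: suggest a value for -j from the core count and the load average.
# suggest_jobs is a hypothetical helper; the heuristic is an assumption.
suggest_jobs() {
    local cores="$1" load="$2"
    awk -v c="${cores}" -v l="${load}" 'BEGIN {
        busy = int(l)
        if (busy < l) busy++        # round the load average up
        free = c - busy
        print (free < 1 ? 1 : free) # never suggest less than one job
    }'
}

# On a Linux machine the inputs could be taken from nproc and /proc/loadavg:
#   suggest_jobs "$( nproc )" "$( cut -d " " -f 1 /proc/loadavg )"
```

With 16 cores and a load average of 3.2, this suggests 12 jobs. It remains a good idea to look at who else is using the machine before picking a large number.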

If everything works out, the return code of the command should be zero, which you can check with echo $?. If not, the build process most likely raised an error which you can use to debug and fix the issue.
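
Because build output can be long, one option is to write it to a log file and act on the return code directly. The following sketch wraps an arbitrary command this way; run_and_report and the log file name are hypothetical and not part of pkgtools.

```shell
# Sketch: run a command, keep its output in a log file, report the return code.
# run_and_report is a hypothetical helper name.
run_and_report() {
    local log="$1" rc
    shift
    # run the remaining arguments as the command, capturing stdout and stderr
    if "$@" > "${log}" 2>&1; then rc=0; else rc=$?; fi
    if [ "${rc}" -eq 0 ]; then
        echo "succeeded, log in ${log}"
    else
        echo "failed with return code ${rc}, last lines of ${log}:"
        tail -n 20 "${log}"
    fi
    return "${rc}"
}
```

A build could then be started with run_and_report build_alpaka.log ./pkgtools/cmsBuild -c cmsdist -a ${SCRAM_ARCH} -i mydir --weekly -j 2 build alpaka, and the tail of the log is shown only when something goes wrong.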

Let's assume everything worked just fine. Before going into detail on how to use the new package, we first install a python package, since the naming of python packages is slightly different. Say we want to build the arrow package defined in pip/arrow.file. Unlike for packages that are located directly in the top directory of cmsdist, for python packages one needs to define the main python version to build a package for. These days, this will mostly be python 3.

  • Build arrow.
    • ./pkgtools/cmsBuild -c cmsdist -a ${SCRAM_ARCH} -i mydir --weekly -j 2 build py3-arrow
    • Note the prefix py3- which picks the python version.
  • Since the general setup was already done in the previous step when we built alpaka, this build command should run rather quickly.
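
The naming rules used so far can be summarized in a small function that maps a file in cmsdist to the name passed to the build command. This is a sketch that only encodes the conventions described above and assumes python 3; build_target is a hypothetical helper.

```shell
# Sketch: derive the build target name from a file in cmsdist.
# build_target is a hypothetical helper; it assumes python 3 for pip packages.
build_target() {
    local file="$1"
    case "${file}" in
        # pip/PACKAGE.file is built as py3-PACKAGE
        pip/*.file) file="${file#pip/}"; echo "py3-${file%.file}" ;;
        # PACKAGE.spec (including py3-PACKAGE.spec) is built under its own name
        *.spec)     echo "${file%.spec}" ;;
        *)          echo "unrecognized file: ${file}" >&2; return 1 ;;
    esac
}
```

For example, build_target pip/arrow.file yields py3-arrow, while build_target alpaka.spec yields alpaka.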

6. Build the full software stack

Before we can test the custom software packages, we need to build the full software stack, which is eventually used in a cmssw environment in the next step. For this, we only need to build a single package, cmssw-tool-conf, which requires all available packages in cmsdist.

  • Build cmssw-tool-conf.
    • ./pkgtools/cmsBuild -c cmsdist -a ${SCRAM_ARCH} -i mydir --weekly -j 2 build cmssw-tool-conf
    • cmssw-tool-conf is a wrapper for all tool files needed by CMSSW (i.e. instructions for CMSSW on how to load any possible software package), which themselves trigger all known packages.
  • This takes a while, so you might want to consider using more than just 2 cores.

7. Set up a cmssw environment and test the software

The next steps should be executed in a second shell, but on the same cms build machine.

  • Login to lxplus.
    • ssh lxplus
  • Login to the build machine.
    • ssh cmsdev23
  • In a directory of your choice, create a fresh cmssw environment. The version used below might be outdated, so please check whether a newer one is available.
    • export SCRAM_ARCH=slc7_amd64_gcc11
    • cmsrel CMSSW_13_1_X_2023-02-14-1100 && cd CMSSW_13_1_X_2023-02-14-1100 (or any other existing release)
  • Before doing the usual cd src && cmsenv, we want to configure this release to use the custom software built above. For this, go to the tools directory.
    • cd config/toolbox/${SCRAM_ARCH}/tools
  • Delete the default tools, i.e., configurations of software packages.
    • rm -rf available selected
  • Copy the tool files from your build directory.
    • cp -r /build/$( whoami )/mydir/${SCRAM_ARCH}/cms/cmssw-tool-conf/THE_VERSION/tools/* .
    • The cmssw-tool-conf directory might contain several subdirectories representing versions (THE_VERSION) of the software stack. Use the one that you installed when building cmssw-tool-conf above; the build command usually prints the created version at the end of the install process.
  • Then, go back to the top directory of the CMSSW release and set up the tools you just copied.
    • cd ../../../..
    • scram setup
  • Now, you can finally set up your release as always.
    • cd src && cmsenv
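
When the cmssw-tool-conf directory contains several versions, the most recently modified one can be looked up instead of scanning the build output. The sketch below assumes that the version you just built is the newest subdirectory, which should be checked against what the build command printed; newest_subdir is a hypothetical helper.

```shell
# Sketch: print the name of the most recently modified subdirectory.
# newest_subdir is a hypothetical helper; "newest" is judged by modification time.
newest_subdir() {
    local parent="$1" d newest=""
    for d in "${parent}"/*/; do
        [ -d "${d}" ] || continue   # skips the unexpanded pattern if nothing matches
        if [ -z "${newest}" ] || [ "${d}" -nt "${newest}" ]; then
            newest="${d}"
        fi
    done
    [ -n "${newest}" ] || return 1
    basename "${newest}"
}
```

It could be used as THE_VERSION="$( newest_subdir /build/$( whoami )/mydir/${SCRAM_ARCH}/cms/cmssw-tool-conf )" before running the cp command from the list above.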