This repository contains resources for the LINMA2710 course given at UCLouvain.
| Week | Wednesday | Topic | Thursday | Topic | Lecturer |
|---|---|---|---|---|---|
| S1 | 04/02/2026 | | 05/02/2026 | C++ | Absil |
| S2 | 11/02/2026 | | 12/02/2026 | C++ | Absil |
| S3 | 18/02/2026 | Parallel | 19/02/2026 | Parallel | Legat |
| S4 | 25/02/2026 | | 26/02/2026 | Parallel | Legat |
| S5 | 04/03/2026 | Distributed | 05/03/2026 | Distributed | Legat |
| S6 | 11/03/2026 | | 12/03/2026 | Distributed | Legat |
| S7 | 18/03/2026 | GPU | 19/03/2026 | GPU | Legat |
| S8 | 25/03/2026 | | 26/03/2026 | GPU | Legat |
| S9 | 01/04/2026 | | 02/04/2026 | PDE | Absil |
| S10 | 08/04/2026 | | 09/04/2026 | PDE | Absil |
| S11 | 15/04/2026 | Q&A project | 16/04/2026 | PDE | Absil |
| 🥚 | 22/04/2026 | 🐣 | 23/04/2026 | 🐇 | 🐰 |
| 🥚 | 29/04/2026 | 🐣 | 30/04/2026 | 🐇 | 🐰 |
| S12 | 06/05/2026 | Oral project | 07/05/2026 | Power Consumption | Legat |
| S13 | 13/05/2026 | Oral project | 14/05/2026 | ✝️⬆️☁️ | |
In order to use the CECI clusters, you need a CECI account.
If you don't already have an account (if you don't know whether you have an account, chances are you don't have one), first create one.
You will receive an email; follow the link in it and, in the field labelled "Email of Supervising Professor", enter benoit.legat@uclouvain.be.
Follow the steps detailed here in order to download your private key, create the corresponding public key and create the file .ssh/config.
You should now be able to connect to the manneback cluster with
(your computer) $ ssh manneback
Tip
To choose a cluster, check this list and this one, which includes manneback, to see which cluster has the hardware you need. The Lyra cluster was recently added with GPU support, so it can also be used for P3 if manneback is overloaded.
In addition to the information below and the CECI documentation, here is a little FAQ.
We mention here 4 ways to sync your files:
- Copy file by file with scp
- Using git
- Mount a whole folder with sshfs
- Use an extension of your IDE
Follow this guide to copy files from your computer to the cluster. For instance, with scp you can copy a file submit.sh from your computer with:
(your computer) $ scp submit.sh manneback:.
It might however be a bit tedious to keep the files in sync with scp. I recommend pushing your project to a private git repository (for instance on https://forge.uclouvain.be/) and pulling it from the CECI cluster. Don't use a public repository, as your code shouldn't be accessible to other students! You can then easily update the code on the CECI cluster with git pull.
Warning
Do not sync the binaries with the CECI cluster, as it might have a different architecture than your computer. Exclude them from git by adding them to the .gitignore file and simply recompile them on the cluster.
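As a minimal sketch of the advice above, the following sets up a throwaway repository and checks which files git would ignore. The patterns `a.out`, `*.o` and `build/` are only assumptions about what your build produces; adapt them to your project.

```shell
# Create a throwaway repository in a temporary directory.
demo_dir=$(mktemp -d)
git -C "$demo_dir" init --quiet

# Hypothetical ignore patterns: adjust them to whatever your build produces.
cat > "$demo_dir/.gitignore" <<'EOF'
a.out
*.o
build/
EOF

# `git check-ignore` prints only the paths that match an ignore rule,
# so main.cpp (a source file) does not appear in the output.
git -C "$demo_dir" check-ignore a.out main.o main.cpp
```

Running the same `git check-ignore` command in your own repository is a quick way to confirm that a binary will not be pushed by accident.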
You can also modify the files in a folder locally using sshfs.
For instance, I have a LINMA2710 folder in my home directory on the manneback cluster.
To access these files locally in a new folder manneback-sshfs, I can do
(local computer)$ mkdir manneback-sshfs
(local computer)$ sshfs manneback:/home/ucl/inma/blegat/LINMA2710 ./manneback-sshfs
You can then open the manneback-sshfs folder with your favorite IDE on your local computer and you will be modifying files directly on the cluster!
If you open a terminal in your IDE, it will still be running on the CPU and GPU of your local computer even though it will see the files on the cluster.
Therefore, in order to compile and run your program on the cluster, you still need to ssh to the cluster in that terminal.
A popular approach to remote development over ssh is using the Remote - SSH extension of VSCode as detailed here.
This will open a VSCode window where you will see the files in the cluster folder, like with the sshfs approach, but the terminal you open in VSCode will also be running on the cluster.
Warning
In the community open-source releases of VSCode such as VSCodium, Open VSX is used instead of the VS Marketplace. As the Remote - SSH extension is available in the marketplace but not in Open VSX, you can use the Open Remote - SSH extension instead. Note that the well-known workaround to use the VS Marketplace in VSCodium violates the terms of use of the marketplace, which only allow it to be used with the binaries provided by Microsoft.
The commands that you run directly after connecting with ssh run on the login node, which has limited resources: it is only meant for you to connect and submit jobs via Slurm, which are then executed on compute nodes. You will also not have any GPU on the login node. So don't just run your program with [blegat@mbackf1 ~] ./a.out (note mbackf1, which means you are on a login node).
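If you are unsure where a shell is running, a small helper like the sketch below can classify the hostname. It assumes that login nodes are all named like `mbackf1` (prefix `mbackf`), which is an extrapolation from the prompt shown above; the function name `node_kind` is hypothetical.

```shell
# Classify a hostname as login node or not.
# Assumption: login nodes are named mbackf<number>, like mbackf1 above.
node_kind() {
    case "$1" in
        mbackf*) echo "login node" ;;
        *)       echo "not a known login node" ;;
    esac
}

node_kind "mbackf1"      # the login node from the prompt above
node_kind "mb-sky002"    # a compute node name
node_kind "$(uname -n)"  # the machine this shell is running on
```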
To run your code, submit a job with Slurm.
Use this tool to generate a submission script.
Warning
The --partition option depends on the cluster. As manneback is not an option in the tool, use another cluster and then remove the line with --partition or update it with one of the partitions listed by sinfo.
Save this script as a file, say submit.sh. You can then use it with
```sh
[blegat@mbackf1 ~] sbatch submit.sh
```
The output produced by the job is written in the file `slurm-<JOBID>.out` where `<JOBID>` is the job id listed in the `JOBID` column of the table outputted by
```sh
[blegat@mbackf1 ~] squeue --me
```
You can also use salloc to be able to execute commands interactively in the allocated compute nodes.
[blegat@mbackf1 ~]$ salloc --ntasks=4
salloc: Pending job allocation 56630153
salloc: job 56630153 queued and waiting for resources
salloc: job 56630153 has been allocated resources
salloc: Granted job allocation 56630153
salloc: Waiting for resource configuration
salloc: Nodes mb-sky002 are ready for job
[blegat@mb-sky002 examples]$ ml OpenMPI
[blegat@mb-sky002 examples]$ srun ./a.out
Process 3/4 is running on node <<mb-sky002.cism.ucl.ac.be>>
Process 0/4 is running on node <<mb-sky002.cism.ucl.ac.be>>
Process 1/4 is running on node <<mb-sky002.cism.ucl.ac.be>>
Process 2/4 is running on node <<mb-sky002.cism.ucl.ac.be>>
Note that the output is displayed directly in the terminal and not written to a slurm-<JOBID>.out file.
This means that, if you lose the ssh connection (which can easily happen, e.g., if your laptop is suspended),
you will lose the output of the terminal as well as the ability to interact with the allocated session on the compute nodes (although you could use sattach to reattach to it).
One useful trick is to use screen. If your ssh connection is lost, simply reconnect and run screen -r to get your session back. More details here.
The command lines that are either executed in the shell opened by salloc or that are inside the submit.sh script executed by sbatch are each using only one process.
To allocate several processes for one command, use srun. The srun command inherits the options passed to salloc and sbatch, so there is no need to repeat options such as --ntasks for srun.
When using MPI, you would like to run your executable with several processes.
For this, you typically use mpiexec when running it on your laptop.
Inside a salloc shell or inside a sbatch submit.sh script, either use srun (recommended by Slurm), mpirun (recommended by OpenMPI), or mpiexec which is mostly equivalent to mpirun. See also the CECI doc.
Don't combine them (e.g., srun mpirun ./a.out): srun would then launch mpirun ntasks times, and each mpirun would itself start ntasks processes, which is not what you want.
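To make this concrete, here is a sketch that writes a minimal MPI submission script and checks its syntax locally. The file name `submit_mpi.sh`, the job name and the resource values are illustrative assumptions, not the output of the generation tool; the script has no --partition line, so add one from the sinfo output if your cluster needs it.

```shell
# Write a minimal MPI submission script (hypothetical name and resource values).
cat > submit_mpi.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=linma2710
#SBATCH --time=00:05:00
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=1024

module load OpenMPI

# srun inherits --ntasks=4 from the directives above: 4 processes in total.
# Do not wrap it as `srun mpirun ./a.out`.
srun ./a.out
EOF

# Check the shell syntax locally; this does not submit or run anything.
bash -n submit_mpi.sh && echo "submit_mpi.sh: syntax OK"
```

On the cluster, the script would then be submitted with sbatch submit_mpi.sh, assuming ./a.out was compiled there.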
Do not use module load CUDA. This command uses Lmod to set LD_LIBRARY_PATH (as detailed in the output of module show CUDA) which is discouraged.
Running Julia interactively on a compute node is as simple as running $ srun --pty julia.
If CUDA was precompiled on a node with no GPU (such as the login node), you will see the error
julia> using CUDA
┌ Error: CUDA.jl could not find an appropriate CUDA runtime to use.
│
│ CUDA.jl's JLLs were precompiled without an NVIDIA driver present.
│ This can happen when installing CUDA.jl on an HPC log-in node,
│ or in a container. In that case, you need to specify which CUDA
│ version to use at run time by calling `CUDA.set_runtime_version!`
│ or provisioning the preference it sets at compile time.
│
│ If you are not running in a container or on an HPC log-in node,
│ try re-compiling the CUDA runtime JLL and re-loading CUDA.jl:
│ pkg = Base.PkgId(Base.UUID("76a88914-d11a-5bdc-97e0-2f5a05c973a2"),
│ "CUDA_Runtime_jll")
│ Base.compilecache(pkg)
│ # re-start Julia and re-load CUDA.jl
│
│ For more details, refer to the CUDA.jl documentation at
│ https://cuda.juliagpu.org/stable/installation/overview/
└ @ CUDA ~/.julia/packages/CUDA/1kIOw/src/initialization.jl:118
Just copy-paste the re-compilation lines from this message (from pkg = ... to Base.compilecache(pkg)) into the REPL to re-compile CUDA.jl, then exit and start a new Julia session with srun again.
If you still get the error, leave the REPL, then run the following (replacing v1.11 by your Julia version of course):
(manneback cluster) $ rm -r ~/.julia/compiled/v1.11/CUDA*
Now, start a new Julia session with srun, and using CUDA should no longer error.
See here for additional information.
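If you are not sure which directory name matches your Julia version, the sketch below derives it from a full version string. It assumes the default depot location `~/.julia` used in the command above; the helper name `julia_compiled_dir` and the version `1.11.2` are hypothetical, and the snippet only prints the rm command rather than deleting anything.

```shell
# Derive the compiled-cache directory from a full Julia version string.
julia_compiled_dir() {
    version=$1                   # e.g. "1.11.2"
    major_minor=${version%.*}    # drop the patch component: "1.11"
    echo "$HOME/.julia/compiled/v$major_minor"
}

# Hypothetical version: replace it with the one reported by your Julia installation.
dir=$(julia_compiled_dir "1.11.2")

# Dry run: print the command instead of deleting anything.
echo "rm -r $dir/CUDA*"
```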