Nektar++ on the Imperial College HPC cluster – HX1
Basic Information about HX1
HX1 (or Hex) is the new High Performance Computing Cluster provided by the Research Computing Services at Imperial College London. HX1 is designed for the following types of applications:
- Multi-node parallel applications, typically using MPI to communicate between the compute nodes.
- GPU accelerated applications such as those utilising CUDA.
For single node parallel applications, continue using the CX3 service. (Taken from: HX1 User Guide)
Access to the cluster
To get access, you need to follow two steps:
- Contact your supervisor (see request access for details)
- Request access from the Research Computing Services (RCS) by filling in this form
Before you start using the HPC cluster, you are also expected to be familiar with the Linux command line, SSH, and the PBSPro job submission system.
Once your supervisor has given you access and your request is accepted by the RCS Team, you can log in with your Imperial College credentials by typing the following command into a Linux shell or macOS terminal
ssh username@login.hx1.hpc.ic.ac.uk
as long as you are connected to the university's network or have the university VPN activated.
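For convenience, you can optionally add an entry to your SSH configuration file so that a short alias is enough; this is only a suggestion, and username below is a placeholder for your college username:
# Optional entry in ~/.ssh/config
Host hx1
    HostName login.hx1.hpc.ic.ac.uk
    User username
With this entry in place, ssh hx1 connects to the same login node (still only from the college network or VPN).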
When you access the HPC cluster, you will connect to a login node. The login nodes are shared by all users, and you should therefore not run any compute intensive tasks on these nodes. Instead, the login nodes should only be used to
- Download and compile Nektar++ (git works for cloning)
- Prepare a PBS script to submit and run jobs
- Submit the job to the queue
Compilation Instructions
Before you start compiling Nektar++, you need to load the following modules
module load SCOTCH/7.0.1-iimpi-2022a intel/2022a CMake/3.24.3-GCCcore-11.3.0
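To confirm that the toolchain is active, or to look for other versions, the generic commands of the module system can be used; they are not HX1-specific:
# List the currently loaded modules
module list
# Search the available modules for a given package
module avail SCOTCH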
It is advised that you install Nektar++ in your $HOME directory, which is the directory you enter when you access the cluster using ssh. You can now download and install Nektar++ in your Programs directory by following these instructions:
To organise all your files, you can create a new directory for all your programs
mkdir $HOME/Programs
Go to the newly created Programs directory
cd $HOME/Programs
Download the Nektar++ source code to your Programs directory
git clone https://gitlab.nektar.info/nektar/nektar.git nektar
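This clones the default branch of the repository. If you would rather build a tagged release, you can list the available tags and check one out; the tag name below is a placeholder, pick one from the list:
# Show the available release tags and switch to one of them
git -C nektar tag -l
git -C nektar checkout <tag-name>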
Create a directory where Nektar++ will be installed. The name of this directory will be “build”.
cd nektar
mkdir build
cd build
To configure the build of Nektar++, use the following command
cmake -DCMAKE_CXX_COMPILER=$(which icpx) \
-DCMAKE_C_COMPILER=$(which icx) \
-DCMAKE_Fortran_COMPILER=$(which ifort) \
-DNEKTAR_USE_MPI=ON \
-DNEKTAR_USE_SCOTCH=ON \
-DNEKTAR_USE_SYSTEM_BLAS_LAPACK=OFF \
-DTHIRDPARTY_BUILD_BLAS_LAPACK=OFF \
-DNEKTAR_USE_MKL=ON \
-DNEKTAR_USE_HDF5=ON \
-DTHIRDPARTY_BUILD_HDF5=ON \
-DTHIRDPARTY_BUILD_BOOST=ON \
-DNEKTAR_ENABLE_SIMD_AVX512=ON ..
If you only want to compile a specific solver, or want to configure the build further, you can run ccmake [FLAGS] ../ instead. For details on how to configure Nektar++, see the user guide available in Chapter 1 of https://www.nektar.info/getting-started/documentation. All options are summarized under Section 1.3.5.
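As an example of further configuration, you can open the interactive interface from the build directory and toggle individual options there; solver-specific options are listed in that interface (check the user guide for their exact names):
# From within the build directory, open the interactive configuration screen;
# press [c] to configure and [g] to generate when you are done
ccmake ../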
After the build has been configured using cmake, you can compile the code using
make -j 4 install
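As a quick sanity check (not a replacement for the regression tests below), you can look at the executables that were installed; with the configuration above they end up under the dist subdirectory of the build directory, and the solvers should print their version when asked:
# The solver executables are installed under build/dist/bin
ls $HOME/Programs/nektar/build/dist/bin
$HOME/Programs/nektar/build/dist/bin/IncNavierStokesSolver --version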
Note: Do not run the regression tests on the login node, as other users connect to the cluster through it and submit their jobs from there. Instead, follow the instructions below.
Running Regression Tests
To ensure that the Nektar++ source code has been built and compiled correctly, it is advised to run regression tests. It is a good practice to use the following submission script to execute the unit tests of the code:
#!/bin/bash
#PBS -l walltime=1:00:00
#PBS -l select=1:ncpus=10:mpiprocs=10:mem=80gb
# Clean up previous modules
module purge
# Path to Nektar++ installation
export NEKTAR_BUILD=$HOME/Programs/nektar/build
# Load all the needed modules
module load intel/2022a
module load SCOTCH/7.0.1-iimpi-2022a
module load CMake/3.24.3-GCCcore-11.3.0
module list
# Update the LD_LIBRARY_PATH variable to include the newly compiled libraries of your nektar build
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${NEKTAR_BUILD}/ThirdParty/dist/lib
# Bootstrapping Intel MPI, see hx1-user-guide
export I_MPI_HYDRA_BOOTSTRAP="rsh"
export I_MPI_HYDRA_BOOTSTRAP_EXEC="/opt/pbs/bin/pbs_tmrsh"
export I_MPI_HYDRA_BRANCH_COUNT=0
# Execute the regression tests
cd $NEKTAR_BUILD
ctest | tee reg_output.txt
cp reg_output.txt $HOME/
Assuming the script file is regression_testing.pbs, you can submit a job to run the test as follows
qsub -q hx regression_testing.pbs
This submits the regression tests job to an execution queue. The PBS queue system will execute your job once resources are available. Check the status of your job(s) using
qstat -u $USER
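If a job needs to be cancelled, or you want more detail on why a job is still queued, the standard PBS client commands can be used from the login node (the job ID is the one printed by qsub and shown by qstat):
# Show full information for a single job
qstat -f <job-id>
# Remove a job from the queue
qdel <job-id>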
You will find the results of the regression tests in the reg_output.txt file, which has been copied to your $HOME directory.
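If some of the tests fail, it is often convenient to rerun only a subset rather than the whole suite. The options below are standard ctest options, so the last lines of the submission script above can be adapted accordingly:
# Run only the tests whose names match a pattern, printing the output of failing tests
ctest -R IncNavierStokesSolver --output-on-failure
# Rerun only the tests that failed in the previous run
ctest --rerun-failed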
Running Simulations
To execute Nektar++ in parallel on the HPC cluster, you need to create and submit a PBS script. The PBS script tells the queue system or scheduler which resources you need, how long your simulation will take, and how to execute your code. Your PBS submission script needs to include the following settings:
#!/bin/bash
#PBS -l select=N:ncpus=X:mem=Ygb
#PBS -l walltime=HH:MM:SS
where N denotes the number of nodes and X the number of processors per node (the maximum is 64 cores per node). The total number of MPI processes that will be launched is therefore N*X. You need to specify the requested memory per node via mem=Y in GB (gigabytes). Your selection of N affects the total requested memory as Total Requested Memory = N*Y. The maximum available memory per node is 512 GB. The final option is the requested wall clock time, walltime=HH:MM:SS, which expresses how long your job is allowed to run with the requested resources. For further documentation, see the HX1 User Guide.
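For example, with N=2, X=64 and Y=200 the job can run 2*64 = 128 MPI processes and reserves 2*200 = 400 GB of memory in total (the numbers are placeholders to illustrate the arithmetic, not a recommendation; an mpiprocs entry can be added as in the regression-test script above):
#!/bin/bash
#PBS -l select=2:ncpus=64:mem=200gb
#PBS -l walltime=08:00:00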
After you have specified the resources you need, make sure all required modules are loaded. To do this, simply add the following line to your PBS script
module load SCOTCH/7.0.1-iimpi-2022a intel/2022a CMake/3.24.3-GCCcore-11.3.0
These are the same modules as we loaded before when compiling Nektar++.
The environment variables $PBS_O_WORKDIR and $TMP_DIR point to the directory from which you submitted the job and the local directory in which the cluster executes your code, respectively. Therefore, if you submit the job from the directory where you store all the input files to Nektar++, you can make sure that the cluster knows where to find your input files without having to specify absolute paths (see the sketch after the run command below). Finally, we add the actual command for executing Nektar++
# Setup Environment Variables
NEK_PATH=$HOME/Programs/nektar
INC_SOLVER=$NEK_PATH/build/dist/bin/IncNavierStokesSolver
COMP_SOLVER=$NEK_PATH/build/dist/bin/CompressibleFlowSolver
JOB_NAME=name-of-input-file
# Export Third Party Libraries
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$NEK_PATH/build/ThirdParty/dist/lib
# Run the solver executable
mpiexec -v6 $INC_SOLVER ${JOB_NAME}.xml --io-format Hdf5 > $PBS_O_WORKDIR/output.txt
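Note that, depending on how PBS is set up, the job does not necessarily start in the directory from which qsub was called. A minimal sketch of handling this, assuming the input file sits in the submission directory, is to change into $PBS_O_WORKDIR before launching the solver:
# Sketch: run from the submission directory so that relative paths resolve
cd $PBS_O_WORKDIR
mpiexec -v6 $INC_SOLVER ${JOB_NAME}.xml --io-format Hdf5 > output.txt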
Note that you don't have to specify the number of MPI processes. If you don't use the HDF5 I/O format, you can also remove the associated flag. To submit your job, change to the directory where the input files for Nektar++ are stored. After this, run the following command
qsub -q hx job_script.pbs
Here, we assume that the PBS script file is called job_script.pbs, and that it is stored under the $HOME directory.
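Putting the pieces of this section together, a complete job_script.pbs could look as follows. This is a sketch with placeholder resource numbers and job name, assembled from the commands above; adapt it to your own simulation:
#!/bin/bash
#PBS -l select=2:ncpus=64:mpiprocs=64:mem=200gb
#PBS -l walltime=08:00:00

# Load the same modules that were used to compile Nektar++
module purge
module load SCOTCH/7.0.1-iimpi-2022a intel/2022a CMake/3.24.3-GCCcore-11.3.0

# Paths to the Nektar++ installation, the solver executable and the input file
NEK_PATH=$HOME/Programs/nektar
INC_SOLVER=$NEK_PATH/build/dist/bin/IncNavierStokesSolver
JOB_NAME=name-of-input-file

# Export Third Party Libraries
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$NEK_PATH/build/ThirdParty/dist/lib

# Bootstrapping Intel MPI, as in the regression-test script above
export I_MPI_HYDRA_BOOTSTRAP="rsh"
export I_MPI_HYDRA_BOOTSTRAP_EXEC="/opt/pbs/bin/pbs_tmrsh"
export I_MPI_HYDRA_BRANCH_COUNT=0

# Run the solver from the submission directory
cd $PBS_O_WORKDIR
mpiexec -v6 $INC_SOLVER ${JOB_NAME}.xml --io-format Hdf5 > output.txt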