====== Description of the Equipment ======

High Performance Computing (HPC) is a specially designed network of computers capable of running applications that can exchange data efficiently.

The VU MIF HPC supercomputer consists of the following clusters (the first number is the actually available amount, the second the total):

^Title ^Nodes ^CPU ^GPU ^RAM        ^HDD    ^Network ^Notes|
^main        ^35/36  ^48  ^0   ^384GiB     ^0      ^1Gbit/s, 2x10Gbit/s, 4xEDR(100Gbit/s) infiniband ^[[https://ark.intel.com/content/www/us/en/ark/products/192447/intel-xeon-gold-6252-processor-35-75m-cache-2-10-ghz.html|CPU]]|
^gpu         ^3/   ^40  ^8   ^512GB/32GB  ^7TB   ^2x10Gbit/s, 4xEDR(100Gbit/s) infiniband ^[[https://ark.intel.com/content/www/us/en/ark/products/91753/intel-xeon-processor-e5-2698-v4-50m-cache-2-20-ghz.html|CPU]] [[https://en.wikipedia.org/wiki/Nvidia_DGX#DGX-1|NVIDIA DGX-1]]|
^power       ^2/   ^32  ^4   ^1024GB/32GB ^1.8TB ^2x10Gbit/s, 4xEDR(100Gbit/s) infiniband ^[[https://www.ibm.com/products/power-systems-ac922|IBM Power System AC922]]|

In total: **40/41** nodes, **1912** CPU cores with **17TB** RAM, and **32** GPUs with **1TB** RAM.

In the text below, processor = CPU = core: a single core of the processor (with all hyperthreads, if they are enabled).

====== Software ======

The **main** and **gpu** partitions run the [[https://docs.qlustar.com/Qlustar/11.0/HPCstack/hpc-user-manual.html|Qlustar 11]] OS, which is based on Ubuntu 18.04 LTS. The **power** partition runs Ubuntu 18.04 LTS.

You can check the list of OS packages with the command ''dpkg -l'' (on the login node **hpc** or on the **power** nodes).

With the [[https://sylabs.io/guides/3.2/user-guide/index.html|singularity]] command you can use the ready-made container files in the directories ''/apps/local/hpc'', ''/apps/local/nvidia'', ''/apps/local/intel'', ''/apps/local/lang'', or download images from the Singularity and Docker online repositories. You can also create your own Singularity containers using the MIF cloud service.

You can prepare your container with singularity, for example:
<code shell>
# build a writable sandbox directory from the Docker Hub python:3.8 image
$ singularity build --sandbox /tmp/python docker://python:3.8
# install the Python packages you need inside the writable sandbox
$ singularity exec -w /tmp/python pip install package
# pack the sandbox into a read-only SIF image and remove the sandbox
$ singularity build python.sif /tmp/python
$ rm -rf /tmp/python
</code>
Similarly, you can use R, Julia or other containers whose packages do not require root privileges to install.

If you want to add OS packages to a singularity container, you need root/superuser privileges. These can be simulated with fakeroot by copying the required library ''libfakeroot-sysv.so'' into the container, for example:
<code shell>
# build a writable sandbox from the Docker Hub ubuntu:18.04 image
$ singularity build --sandbox /tmp/python docker://ubuntu:18.04
# copy the fakeroot library into the sandbox and install packages under fakeroot
$ cp /libfakeroot-sysv.so /tmp/python/
$ fakeroot -l /libfakeroot-sysv.so singularity exec -w /tmp/python apt-get update
$ fakeroot -l /libfakeroot-sysv.so singularity exec -w /tmp/python apt-get install python3.8 ...
$ fakeroot -l /libfakeroot-sysv.so singularity exec -w /tmp/python apt-get clean
# you can clean up more of what you don't need
$ rm -rf /tmp/python/libfakeroot-sysv.so /tmp/python/var/lib/apt/lists
# pack the sandbox into a SIF image and remove the sandbox
$ singularity build python.sif /tmp/python
$ rm -rf /tmp/python
</code>

Ready-made scripts for running your **hadoop** jobs with the [[https://github.com/LLNL/magpie|Magpie]] suite are available in the directory ''/apps/local/bigdata''.

With [[https://hpc.mif.vu.lt/hub/|JupyterHub]] you can run calculations from the Python command line in a web browser and use the [[https://jupyter.org|JupyterLab]] environment. If you install your own JupyterLab environment in your home directory, you need to install the additional ''batchspawner'' package, which will start your environment, for example:

<code shell>
$ python3.7 -m pip install --upgrade pip setuptools wheel
$ python3.7 -m pip install --ignore-installed batchspawner jupyterlab
</code>

Alternatively, you can use a container that you made via JupyterHub. In that container you need to install the ''batchspawner'' and ''jupyterlab'' packages, and create a script ''~/.local/bin/batchspawner-singleuser'' with execute permissions (''chmod +x ~/.local/bin/batchspawner-singleuser''):
<code shell>
#!/bin/sh
exec singularity exec --nv myjupyterlab.sif batchspawner-singleuser "$@"
</code>

====== Registration ======

  * **For VU MIF network users** - HPC can be used without additional registration as long as the available resources are enough (monthly limit: **100 CPU-h and 6 GPU-h**). Once this limit has been reached, you can request more by filling in the [[https://forms.office.com/Pages/ResponsePage.aspx?id=ghrFgo1UykO8-b9LfrHQEidLsh79nRJAvOP_wV9sgmdUM0ZMR1FINFg3TzVaNlhDSEhUN1A3QTlVUC4u|ITOAC service request form]].

  * **For users of the VU computer network** - you must fill in the [[https://forms.office.com/Pages/ResponsePage.aspx?id=ghrFgo1UykO8-b9LfrHQEidLsh79nRJAvOP_wV9sgmdUM0ZMR1FINFg3TzVaNlhDSEhUN1A3QTlVUC4u|ITOAC service request form]] to get access to MIF HPC. After your request is confirmed, you must create your account in the [[https://hpc.mif.vu.lt|Waldur portal]]. Read more details [[waldur|here]].

  * **For other users (non-members of the VU community)** - you must fill in the [[https://forms.office.com/Pages/ResponsePage.aspx?id=ghrFgo1UykO8-b9LfrHQEidLsh79nRJAvOP_wV9sgmdUMDE1QUo3Slo3UVYwTjM4TDMyTEdZT0tSNi4u|ITOAC service request form]] to get access to MIF HPC. After your request is confirmed, you must come to VU MIF, Didlaukio str. 47, Room 302/304, to receive your login credentials. Please arrange the exact time by phone +370 5219 5005. With these credentials you can create an account in the [[https://hpc.mif.vu.lt|Waldur portal]]. Read more details [[waldur|here]].

====== Connection ======

To connect to **HPC** you need to use SSH applications (ssh, putty, winscp, mobaxterm) and Kerberos or SSH key authentication.

If **Kerberos** is used (see the example after this list):

  * Log in to the Linux environment in a VU MIF classroom or public terminal with your VU MIF username and password, or log in to **uosis.mif.vu.lt** with your VU MIF username and password using **ssh** or **putty**.
  * Check whether you have a valid Kerberos key (ticket) with the **klist** command. If the key is not available or has expired, use the **kinit** command.
  * Connect to the **hpc** node with the command **ssh hpc** (no password should be required).
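
A minimal sketch of the Kerberos steps above, assuming you are already logged in to **uosis.mif.vu.lt** or a MIF Linux classroom computer:
<code shell>
# check for a valid Kerberos ticket
$ klist
# if there is no valid ticket, obtain a new one (asks for your VU MIF password)
$ kinit
# then connect to the login node; no password should be requested
$ ssh hpc
</code>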

If **SSH keys** are used (e.g. if you need to copy big files):
  * If you don't have SSH keys, you can find instructions on how to create them in a Windows environment **[[duk:ssh_key|here]]**.
  * Before you can use this method, you need to log in with Kerberos at least once. Then create a ''~/.ssh'' directory in the HPC file system and put your **ssh public key** (in OpenSSH format) into the ''~/.ssh/authorized_keys'' file.
  * Connect with **ssh**, **sftp**, **scp**, **putty**, **winscp** or any other software supporting the **ssh** protocol to **hpc.mif.vu.lt** with your **ssh private key**, specifying your VU MIF username. It should not ask for a login password, but may ask for your ssh private key passphrase. See the sketch after this list.
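
A minimal sketch of the key setup described above; ''id_rsa.pub'', ''id_rsa'' and ''username'' are only example names:
<code shell>
# on hpc, after logging in with Kerberos at least once:
$ mkdir -p ~/.ssh && chmod 700 ~/.ssh
$ cat id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys
# afterwards, connect from your own machine with the private key
$ ssh -i ~/.ssh/id_rsa username@hpc.mif.vu.lt
</code>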

The **first time** you connect, you **will not** be able to run **SLURM jobs** for the first **5 minutes**. After that, your SLURM account is created.

====== Lustre - Shared File System ======

The VU MIF HPC shared file system is available in the directory ''/scratch/lustre''.

The system creates the directory ''/scratch/lustre/home/username'' for each HPC user, where **username** is the HPC username.

The files in this file system are equally accessible on all compute nodes and on the **hpc** node.

Please use these directories only for their intended purpose and clean them up after your calculations (see the example below).
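
For instance, a common pattern is to keep each job's data in its own subdirectory and remove it once the results have been copied away (a sketch; ''myjob'' is only an example name):
<code shell>
# create a working directory for a job in the shared file system
$ mkdir -p /scratch/lustre/home/$USER/myjob
# ... run the job there and copy away the results you want to keep ...
# clean up when the calculations are finished
$ rm -rf /scratch/lustre/home/$USER/myjob
</code>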

====== HPC Partitions ======

^Partition ^Time limit ^RAM    ^Notes|
^main             ^7d            ^7000MB  ^CPU cluster|
^gpu              ^48h           ^12000MB ^GPU cluster|
^power            ^48h           ^2000MB  ^IBM Power9 cluster|

If no time limit is specified, a task's time limit is **2h** in all partitions. The table shows the maximum time limit.

The **RAM** column gives the amount of RAM allocated for each reserved **CPU** core.
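
For example, a job that needs more than the default 2h has to request a time limit explicitly. A sketch of the relevant ''sbatch'' directives (the values and ''my-program'' are only examples):
<code shell>
#!/bin/bash
#SBATCH -p main            # partition: main, gpu or power
#SBATCH -n2                # 2 CPU cores, i.e. about 2 x 7000MB RAM in the main partition
#SBATCH --time=04:00:00    # 4 hours instead of the default 2h limit
./my-program               # placeholder for your own executable
</code>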

====== Batch Processing of Tasks (SLURM) ======

To use the computing resources of the HPC, you need to create task scripts (sh or csh).

Example:

<code shell mpi-test-job.sh>
#!/bin/bash
#SBATCH -p main
#SBATCH -n4
module load openmpi
mpicc -o mpi-test mpi-test.c
mpirun mpi-test
</code>

After your application for the ITOAC services has been submitted and confirmed, you need to create a user at https://hpc.mif.vu.lt/. The created user will be included in the relevant project, which has a certain amount of resources. To use the project resources for calculations, you need to provide your allocation number. Below is an example with the allocation parameter "alloc_xxxx_project" (not applicable for VU MIF users; VU MIF users do not have to specify the --account parameter).

<code shell mpi-test-job.sh>
#!/bin/bash
#SBATCH --account=alloc_xxxx_project
#SBATCH -p main
#SBATCH -n4
#SBATCH --time=minutes
module load openmpi
mpicc -o mpi-test mpi-test.c
mpirun mpi-test
</code>

The script contains instructions for the scheduler as special comments:

 -p - which partition (queue) to submit the job to (main, gpu or power).

 -n4 - how many processors to reserve (**NOTE:** if you request x cores but your program actually uses fewer, the accounting will still count all x "requested" cores, so we recommend estimating this in advance).

The initial running directory of the task is the current directory (**pwd**) on the login node from which the task is submitted, unless it is changed with the -D parameter. Use the HPC shared file system directories under **/scratch/lustre** as the initial running directory, because it must exist on the compute node and the job output file **slurm-JOBID.out** is created there, unless redirected with -o or -e (for these it is also advisable to use the shared file system). See the example below.
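
For example, a sketch of submitting from a directory in the shared file system:
<code shell>
# run the job from your Lustre directory so that slurm-JOBID.out is created there
$ cd /scratch/lustre/home/$USER
$ sbatch mpi-test-job.sh
# alternatively, set the initial running directory explicitly with -D
$ sbatch -D /scratch/lustre/home/$USER mpi-test-job.sh
</code>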

The created script is submitted with the command //sbatch//,

''$ sbatch mpi-test-job.sh''

which returns the number of the submitted job, **JOBID**.

The status of a pending or running task can be checked with the command //squeue//

''$ squeue -j JOBID''

With the //scancel// command you can cancel a running task or remove it from the queue

''$ scancel JOBID''

If you forgot your tasks' **JOBID**, you can check them with the command //squeue//

''$ squeue''

Completed tasks are no longer displayed by **squeue**.

If the specified number of processors is not available, your task is added to the queue. It will remain in the queue until a sufficient number of processors becomes available or until you remove it with **scancel**.

The **output** of the running job is written to the file **slurm-JOBID.out**. The error output is written to the same file unless you specify otherwise. The file names can be changed with the **sbatch** parameters -o (output file) and -e (error file).
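
For example, a sketch redirecting the output and errors to separate files (''%j'' is replaced with the job number; the file names are only examples):
<code shell>
$ sbatch -o result-%j.out -e result-%j.err mpi-test-job.sh
</code>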

You can read more about SLURM's capabilities in the [[https://slurm.schedmd.com/quickstart.html|Quick Start User Guide]].

====== Interactive Tasks (SLURM) ======

Interactive tasks can be run with the //srun// command:

<code>
$ srun --pty $SHELL
</code>

The above command will connect you to the compute node environment assigned by SLURM and allow you to run and debug programs directly on it.

After the commands are done, disconnect from the compute node with the command

<code>
$ exit
</code>

If you want to run graphical programs, you need to connect with **ssh -X** to **uosis.mif.vu.lt** and **hpc**:

<code>
$ ssh -X uosis.mif.vu.lt
$ ssh -X hpc
$ srun --pty $SHELL
</code>

In the **power** cluster, interactive tasks can be run with

<code>
$ srun -p power --mpi=none --pty $SHELL
</code>

====== GPU Tasks (SLURM) ======

To use GPUs you additionally need to specify ''--gres gpu:N'', where N is the desired number of GPUs.

Inside the job, you can check how many GPUs were assigned with ''nvidia-smi''.

Example of an interactive task with 1 GPU:
<code>
$ srun -p gpu --gres gpu --pty $SHELL
</code>
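
A batch job can request GPUs in the same way; a minimal sketch (the script name is only an example):
<code shell gpu-test-job.sh>
#!/bin/bash
#SBATCH -p gpu
#SBATCH --gres gpu:1
# print the GPU(s) assigned to this job
nvidia-smi
</code>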

====== Introduction to OpenMPI ======

Ubuntu 18.04 LTS provides OpenMPI packages of version **2.1.1**.
To use the newer version **4.0.1**, run
<code>
module load openmpi/4.0
</code>
before executing MPI commands.

===== Compiling MPI Programs =====

An example of a simple MPI program is in the directory ''/scratch/lustre/test/openmpi''. **mpicc** (**mpiCC**, **mpif77**, **mpif90**, **mpifort**) are wrappers for the C (C++, F77, F90, Fortran) compilers that automatically add the required **MPI** include and library files to the command line.

<code>
$ mpicc -o foo foo.c
$ mpif77 -o foo foo.f
$ mpif90 -o foo foo.f
</code>
===== Running MPI Programs =====

MPI programs are started with the **mpirun** or **mpiexec** program. You can learn more about them with the commands **man mpirun** or **man mpiexec**.

A simple (SPMD) program can be started with the following mpirun command line.

<code>
$ mpirun foo
</code>

This will use all allocated processors, according to how many were reserved. If you want to use fewer, you can pass the ''-np count'' parameter to **mpirun**, as in the example below. Using fewer processors than reserved for a longer time is undesirable, because the unused CPUs remain idle. Using more than the reserved amount is strictly forbidden, as it can affect the execution of other tasks.
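
For instance, to start only two MPI processes of the program ''foo'' compiled above:
<code>
$ mpirun -np 2 foo
</code>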

More about the installed **OpenMPI** can be found on the [[https://www.open-mpi.org|OpenMPI]] page.

====== Task Efficiency ======

  * Please use at least 50% of the requested CPU amount.
  * Using more CPUs than requested will not increase efficiency, because your task can only use as many CPUs as were requested.
  * If you use the ''--mem=X'' parameter, the task may reserve more **CPUs**, proportionally to the requested amount of memory. For example, requesting ''--mem=14000'' in the **main** queue reserves at least 2 CPUs, unless other parameters specify more (see the sketch after this list). If your task then uses fewer, the resources are used inefficiently, and it may also run slower, because memory of a processor other than the executing one may be used.
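
A sketch illustrating the memory point above (''my-program'' is only a placeholder):
<code shell>
#!/bin/bash
#SBATCH -p main
#SBATCH --mem=14000
# 14000MB in main (7000MB per CPU) reserves at least 2 CPUs
./my-program
</code>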

====== Resource Limits ======

If your tasks do not start and the reason given is **AssocGrpCPUMinutesLimit** or **AssocGrpGRESMinutes**, check whether any unused CPU/GPU resources are left from your (monthly) limit.

To see how many resources have been used:

<code>
sreport -T cpu,mem,gres/gpu cluster AccountUtilizationByUser Start=0101 End=0131 User=USERNAME
</code>

where **USERNAME** is your MIF username, and **Start** and **End** are the start and end dates of the current month. They can also be given as ''$(date +%m01)'' and ''$(date +%m31)'', which denote the first and last days of the current month, as in the example below.
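
A sketch of the same report for the current month, assuming your shell variable ''$USER'' is your MIF username:
<code>
sreport -T cpu,mem,gres/gpu cluster AccountUtilizationByUser Start=$(date +%m01) End=$(date +%m31) User=$USER
</code>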

Note that the usage is reported in minutes; to convert to hours, divide by 60.

Another way to view the limits and their usage:

<code>
sshare -l -A USERNAME_mif -p -o GrpTRESRaw,GrpTRESMins,TRESRunMins
</code>

where **USERNAME** is the MIF username. Alternatively, specify in the **-A** parameter the account whose usage you want to view. The data is given in minutes: **GrpTRESRaw** - how much has been used, **GrpTRESMins** - what the limit is, **TRESRunMins** - resources of tasks that are still running.
  