====== Software ======
In the **main** and **gpu** partitions there are installed [[https://
You can check the list of OS packages with the command ''

===== Singularity =====

With the command [[https://
$ rm -rf /tmp/python
</code>

To use this container, it is advisable to use a separate working directory instead of your home directory, so that the Python packages you install do not get mixed up with those already installed in your home directory:
<code shell>
$ mkdir ~/workdir
$ singularity exec -H ~/
</code>
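For illustration only, a complete command could look like the following sketch; the image path ''~/python.sif'' and the script name ''myscript.py'' are placeholders, not names prescribed by this page:
<code shell>
# Run a script inside the container, using ~/workdir as the home directory
$ singularity exec -H ~/workdir ~/python.sif python3 myscript.py
</code>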

Similarly, you can use R, Julia or other containers that do not require root privileges to install packages.
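As an illustrative sketch only (the Docker Hub image ''rocker/r-base'' and the file names are assumptions, not something this page prescribes), an R container can be pulled and used in the same way:
<code shell>
# Pull an R image from Docker Hub and convert it to a SIF file
$ singularity pull docker://rocker/r-base
# Run an R script inside the container, again with a separate working directory as home
$ singularity exec -H ~/workdir r-base_latest.sif Rscript myscript.R
</code>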
$ rm -rf /tmp/python
</code>

===== Hadoop =====

There are ready-made scripts to run your **hadoop** tasks using the [[https://

===== JupyterHub =====

With [[https://
====== Registration ======

  * **For VU MIF network users** - HPC can be used without additional registration if the available resources are enough (monthly limit - **500 CPU-h and 6 GPU-h**). Once this limit has been reached, you can request more by filling in [[https://
  * **For users of the VU computer network** - you must fill in the [[https://
If **SSH keys** are used (e.g. if you need to copy big files; a sketch follows this list):
  * If you don't have SSH keys, you can find instructions on how to create them in a Windows environment **[[en:duk:
  * 
  * 
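A minimal sketch of key creation and file copying from a Linux/macOS machine, assuming the login host **uosis.mif.vu.lt** used elsewhere on this page; the user name and file name are placeholders:
<code shell>
# Generate a key pair on your own computer (skip if you already have one)
$ ssh-keygen -t ed25519
# Copy the public key to the login host so that key-based login works
$ ssh-copy-id username@uosis.mif.vu.lt
# Copy a big file to your home directory on the cluster
$ scp bigfile.dat username@uosis.mif.vu.lt:~/
</code>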
The **first time** you connect, you **will not** be able to run **SLURM
====== Lustre - Shared File System ======
Please use these directories only for their purpose and clean them up after calculations.

====== HPC Partition ======

^ Partition ^ Time limit ^ RAM ^
^ main  |     |  |
^ gpu   | 48h |  |
^ power |     |  |

If a time limit has not been specified, tasks in all partitions get a **2h** limit. The table shows the maximum time limit.

The **RAM** column gives the amount of RAM allocated to each reserved **CPU** core.
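Since the default is **2h**, longer jobs should request more time explicitly, up to the partition maximum. For example (the value is illustrative, given in minutes as in the job script example below):
<code shell>
#SBATCH --time=1440   # request 24 hours (in minutes) instead of the 2h default
</code>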

====== Batch Processing of Tasks (SLURM) ======

To use the HPC computing resources, you need to create task scripts (sh or csh).

Example:

<code shell mpi-test-job.sh>
#!/bin/bash
#SBATCH -p main          # partition (queue) to submit to
#SBATCH -n4              # number of CPU cores to reserve
module load openmpi
mpicc -o mpi-test mpi-test.c
mpirun mpi-test
</code>

After submission and confirmation of your application to the ITOAC services, you need to create a user at https://

<code shell mpi-test-job.sh>
#!/bin/bash
#SBATCH --account=alloc_xxxx_projektas   # your project allocation account
#SBATCH -p main                          # partition (queue) to submit to
#SBATCH -n4                              # number of CPU cores to reserve
#SBATCH --time=minutes                   # time limit, given in minutes
module load openmpi
mpicc -o mpi-test mpi-test.c
mpirun mpi-test
</code>

The script contains instructions for the task scheduler in the form of special comments:

''-p'' - which partition (queue) to submit to (main, gpu, power).

''-n4'' - how many processors to reserve (**NOTE:** if you set the number of cores to x but your program actually uses fewer cores,

The initial working directory of the task is the current directory (**pwd**) on the login node from which the task is submitted, unless it is changed with the -D parameter, as in the sketch below. For the initial working directory, use the HPC shared filesystem directories **/
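A minimal sketch with a placeholder path (use your own directory on the shared filesystem):
<code shell>
#SBATCH -D /path/to/your/shared/workdir   # start the job in this directory
</code>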

The prepared script is submitted with the //sbatch// command,

''sbatch mpi-test-job.sh''

which returns the number of the submitted job, **JOBID**.
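For example (the job number shown is illustrative):
<code shell>
$ sbatch mpi-test-job.sh
Submitted batch job 123456
</code>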

The status of a pending or running task can be checked with the //squeue// command

''squeue -j JOBID''

With the //scancel// command you can cancel a running task or remove it from the queue

''scancel JOBID''

If you forgot your task's **JOBID**, you can check your tasks with the //squeue// command

''squeue -u USERNAME''

Completed tasks are no longer displayed in **squeue**.

If the specified number of processors is not available, your task is added to the queue. It will remain in the queue until a sufficient number of processors become available or until you remove it with **scancel**.

The **output** of a running job is written to the file **slurm-JOBID.out**. Error output is written to the same file unless you specify otherwise. The file names can be changed with the **sbatch** parameters -o (output file) and -e (error file).
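An illustrative snippet (''%j'' is the standard SLURM placeholder for the job ID; the file names are examples):
<code shell>
#SBATCH -o my-task-%j.out   # standard output file
#SBATCH -e my-task-%j.err   # standard error file
</code>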

You can read more about SLURM's capabilities at [[https://

====== Interactive Tasks (SLURM) ======

Interactive tasks can be done with the //srun// command:

<code shell>
$ srun --pty $SHELL
</code>

The above command will connect you to a compute node environment assigned by SLURM and allow you to run and debug programs directly on it.

When the commands are done, disconnect from the compute node with the command

<code shell>
$ exit
</code>

If you want to run graphical programs, you need to connect with **ssh -X** to **uosis.mif.vu.lt** and then to **hpc**:

<code shell>
$ ssh -X uosis.mif.vu.lt
$ ssh -X hpc
$ srun --pty $SHELL
</code>

In the **power** cluster, interactive tasks can be performed with

<code shell>
$ srun -p power --mpi=none --pty $SHELL
</code>

====== GPU Tasks (SLURM) ======

To use a GPU, you additionally need to specify ''--gres gpu''.

With ''

Example of an interactive task with 1 GPU:
<code shell>
$ srun -p gpu --gres gpu --pty $SHELL
</code>
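A sketch of a batch script requesting one GPU; the script and program names are placeholders, and ''--gres=gpu:1'' is the standard SLURM form for requesting a single GPU, equivalent to the ''--gres gpu'' used above:
<code shell gpu-test-job.sh>
#!/bin/bash
#SBATCH -p gpu           # gpu partition
#SBATCH --gres=gpu:1     # request one GPU
#SBATCH -n1              # one CPU core
./my-gpu-program         # placeholder for your GPU program
</code>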

====== Introduction to OpenMPI ======

Ubuntu 18.04 LTS ships OpenMPI version **2.1.1** as a system package.
To use the newer version **4.0.1**, you need to run
<code shell>
module load openmpi/4.0
</code>
before running MPI commands.

===== Compiling MPI Programs =====

An example of a simple MPI program is in the directory ''/

<code shell>
$ mpicc -o foo foo.c    # C
$ mpif77 -o foo foo.f   # Fortran 77
$ mpif90 -o foo foo.f   # Fortran 90
</code>
===== Running MPI Programs =====

MPI programs are started with **mpirun** or **mpiexec**. You can learn more about them with the **man mpirun** or **man mpiexec** commands.

A simple (SPMD) program can be started with the following mpirun command line:

<code shell>
$ mpirun foo
</code>

All allocated processors will be used, according to the number reserved. If you want to use fewer, you can pass the ''-np'' parameter to **mpirun**, as shown below. It is not recommended to use fewer CPUs than reserved for a long period of time, as the unused CPUs remain idle.
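For example, to run only two processes out of those reserved:
<code shell>
$ mpirun -np 2 foo
</code>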

**ATTENTION** It is strictly forbidden to use more CPUs than you have reserved, as this may affect the performance of other tasks.

Find more information at [[https://

====== Task Efficiency ======

  * Please use at least 50% of the reserved CPU quantity.
  * Using more CPUs than reserved will not improve performance,
  * If you use the ''

====== Resource Limits ======

If your tasks do not start because of **AssocGrpCPUMinutesLimit** or **AssocGrpGRESMinutes**,

//The first way to see how many resources have been used://

<code shell>
sreport -T cpu,
</code>

Where **USERNAME** is your MIF user name. **Start** and **End** give the start and end days of the current month. You can also specify them with ''

**NOTE** Resource usage is given in minutes; divide the number by 60 to get hours (for example, a reported value of 3000 corresponds to 50 hours).

//The second way to see how many resources have been used://

<code shell>
sshare -l -A USERNAME_mif -p -o GrpTRESRaw,GrpTRESMins,GrpTRESRunMins
</code>

Where **USERNAME** is your MIF user name, or specify in **-A** the account whose usage you want to see. The data is also displayed in minutes:
  * **GrpTRESRaw** - how much has been used.
  * **GrpTRESMins** - what the limit is.
  * **GrpTRESRunMins** - the resources allocated to tasks that are still running.

====== Links ======

  * [[waldur|HPC Waldur portal description]]
  * [[https://
  * [[https://
  * [[https://
  * [[http://
  * [[pagalba@mif.vu.lt]] - for reporting **HPC** problems.