RE: Slurm squeue: see number of gpus

How can I see the number of GPUs requested by or allocated to jobs in the `squeue` output?

Bobbel Asked on December 1, 2023 in uncategorized.
1 Answer
To see the number of GPUs requested by or allocated to your jobs in Slurm's queue, use the `squeue` command with formatting options that include the GRES (generic resource) field. Here's what you can do:

1. The standard `squeue` output lists your jobs, but shows no GPU information:

```bash
squeue -u $USER
```

2. To see the number of GPUs requested, format the output with the `-o`/`--format` option. Note that the GRES field is `%b`; `%G` prints the job's group ID, not its GPUs:

```bash
squeue -u $USER -o "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %C %b"
```

- `%i`: Job ID
- `%P`: Partition
- `%j`: Job name
- `%u`: User name
- `%t`: Job state
- `%M`: Time running
- `%D`: Number of nodes
- `%C`: Number of CPUs
- `%b`: Generic resources (GRES) requested, e.g. `gres/gpu:2`

3. For a compact view showing just the job ID, job state, and GPU request (two sketches that build on `%b` follow at the end of this answer):

```bash
squeue -u $USER --format="%i %t %b"
```

With typed GPUs, the `%b` column looks like `gres/gpu:a100:4`; jobs without GPUs show `N/A`.

4. Alternatively, `scontrol show job JOB_ID` gives detailed information about a single job, including its GPU allocation:

```bash
scontrol show job JOB_ID
```

Replace `JOB_ID` with your actual job ID, then look for the `TresPerNode` field (or `Gres` on older Slurm versions), which lists GPU resources if any are allocated.

5. To see which GPUs are configured on each node, which is also a quick way to discover the GRES names your cluster uses, run (no administrator access required):

```bash
sinfo --Node --long --Format="NodeHost,Gres:25"
```

Note that this shows node configuration, not per-job usage. Your specific Slurm setup may use custom GRES names for GPUs or require different options; if you get unexpected output, consult your cluster's documentation or your system administrator.
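If you check this often, a small alias saves typing. This is a minimal sketch, assuming a Slurm version where `%b` prints the job's GRES and where the long-form named fields (including `tres-per-node`) are available; the alias name `sqgpu` is just an illustration:

```bash
# Hypothetical convenience alias: your jobs plus their GPU request.
# %b prints the GRES/TRES string, e.g. "gres/gpu:2" ("N/A" if none).
alias sqgpu='squeue -u $USER -o "%.18i %.9P %.8j %.2t %.10M %b"'

# Roughly equivalent named fields with -O/--Format on newer Slurm:
squeue -u $USER -O "jobid:20,partition:10,name:12,statecompact:4,timeused:12,tres-per-node:20"
```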
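And if you want a single number instead, such as the total GPUs held by your running jobs, you can parse the `%b` column. This is a sketch rather than a built-in Slurm feature: the `grep` pattern assumes GRES strings like `gres/gpu:2` or typed ones like `gres/gpu:a100:4`, and may need adjusting for your site's naming:

```bash
# Sum the GPU counts across all of your RUNNING jobs.
# -h suppresses the header; %b prints each job's GRES string.
squeue -u "$USER" -t RUNNING -h -o "%b" \
  | grep -oE 'gpu[^,]*:[0-9]+' \
  | awk -F: '{n += $NF} END {print n+0}'
```

Jobs without GPUs print `N/A`, which the `grep` simply skips, so the command prints `0` when nothing matches.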
Answered on December 1, 2023.