Running Jobs on RCC Systems Using the Slurm Workload Manager and Best Practices
RCC users can run jobs either interactively or by submitting them to the workload manager, Slurm (originally the Simple Linux Utility for Resource Management), for scheduling on allocated resources. Either way, the job request specifies parameters such as wall-clock time, memory, and the number of CPU cores and GPUs. The RCC uses Slurm to manage workloads across its compute clusters, including Midway for open science and MidwayR for restricted data in the Secure Data Enclave (SDE).
This workshop aims to provide a comprehensive understanding of job submission across all compute partitions at the RCC. Participants will learn how to configure and submit Slurm jobs, use essential Slurm commands, and avoid common errors that can lead to extended queue times or job failures. Additionally, the workshop will cover best practices for optimizing job performance and resource utilization.
By the end of the workshop, participants will be able to:
- Understand the different Midway resources and partitions for job submission.
- Create and submit Slurm batch scripts.
- Use the RCC module system to manage software and runtime environments for serial and parallel jobs.
- Use Slurm job arrays to execute jobs with varying initial parameters.
- Efficiently manage and schedule independent processes within a Slurm job allocation.
- Manage jobs that use distributed-memory (e.g., MPI) and shared-memory (e.g., OpenMP) programming models, either on their own or combined in a hybrid (e.g., MPI+OpenMP) fashion.
- Request and use Slurm interactive sessions.
- Apply best practices and debugging techniques for writing and optimizing Slurm scripts.
Participants will leave the workshop with the skills to use Slurm effectively, ensuring efficient job submission and resource utilization on RCC's compute clusters.
Level: Intermediate
Duration: 1.5 hours
Prerequisites: Basic understanding of a programming or scripting language, some familiarity with the Linux command line, and an active RCC account.
Date: Wednesday, February 26, 2025, 13:00 to 14:30