Using Slurm Workload Manager

RCC users can run jobs either interactively or by submitting them to the workload manager, which schedules them on allocated resources. This involves specifying parameters such as wall-clock time, memory, and the number of CPU cores and GPUs. Slurm (Simple Linux Utility for Resource Management) is the tool RCC uses to manage workloads across its compute clusters, including Midway for open science and MidwayR for restricted systems in the SDE.

This workshop aims to provide a thorough understanding of job submission across all compute partitions at RCC. Participants will learn how to configure a Slurm job, use essential Slurm commands, submit jobs effectively, and avoid common errors that lead to extended queue times or job failures. The workshop will also cover best practices for optimizing job performance and resource utilization.

By the end of the workshop, attendees will:

  • Understand the different Midway resources and partitions for job submission.

  • Be able to create and submit Slurm batch scripts. 

  • Gain proficiency in using the RCC module system to set up software and runtime environments for serial and parallel jobs.

  • Utilize Slurm job arrays for executing jobs with varying initial parameters.

  • Efficiently manage and schedule independent processes within a Slurm job allocation.

  • Manage jobs that use distributed-memory (e.g., MPI) and shared-memory (e.g., OpenMP) programming models, either on their own or combined in a hybrid (e.g., MPI+OpenMP) fashion.

  • Request and utilize Slurm interactive sessions.

  • Apply best practices and debugging techniques for writing and optimizing Slurm scripts. 

Participants will leave the workshop with the skills needed to navigate the Slurm workload manager effectively, ensuring efficient job submission and resource usage on RCC’s compute clusters.

Level: Introductory

Duration: 1.5 hours

Prerequisites: Basic understanding of a programming or scripting language; some familiarity with the Linux command line; an active RCC account.

 

Register

Thursday, November 21, 2024 - 14:00 to 16:00