[Online] Scaling CUDA-Accelerated Applications

Europe/Berlin
Online

Description

NHR@FAU

Schedule & Format

  • Date: September 7-9, 2026
  • Times:
    • Sep 7: 9:00 - 15:00 CEST
    • Sep 8: 9:00 - 15:00 CEST
    • Sep 9: 9:00 - 15:00 CEST
  • Format: Three-day course
  • Location: Online via Zoom
  • Language: English

Registered participants will receive the video conferencing link by email the day before the course.

From Zero to Multi-Node GPU Programming

This event is part of the From Zero to Multi-Node GPU Programming series. Registration is done individually for each part of the series.

  • Part 1 - Introduction to CUDA C/C++ (September 3-4, 2026)
  • Part 2 - Scaling CUDA-Accelerated Applications (this course) (September 7-9, 2026)

Instructors

This course is organized by the Erlangen National High Performance Computing Center (NHR@FAU) in collaboration with NHR@TUD.

Course Description

Scaling a GPU application beyond a single accelerator requires both intra-node and inter-node parallelism. This course covers both. The first half of the course covers CUDA streams, multi-GPU execution within a node, and direct peer-to-peer GPU memory access. The second half extends that foundation to multi-node deployments using CUDA-aware MPI and NVSHMEM, including domain decomposition and halo exchange patterns. The material progresses from a CPU baseline through managed memory and algorithmic work partitioning to full distributed execution.

This course was developed to replace two formerly separate NVIDIA DLI courses, Accelerating CUDA C++ Applications with Multiple GPUs and Scaling CUDA C++ Applications to Multiple Nodes, which were first put on hold and then discontinued in 2025 and 2026.

Prerequisites

Knowledge

  • Experience with CUDA C++ GPU programming, including memory allocation, kernel launches, grid-stride loops, and error handling (equivalent to the Introduction to CUDA C/C++ course)
  • Familiarity with the Linux command line as well as compiling and running CUDA applications

Technical

  • A modern web browser (for JupyterHub access to NHR@FAU's HPC clusters)
  • A local installation of NVIDIA Nsight Systems

Course Structure

  • CPU baseline and GPU porting: managed memory, algorithmic work partitioning
  • CUDA streams and copy/compute overlap: concurrent execution and Nsight Systems profiling
  • Multi-GPU programming: device management, workload indexing, and peer-to-peer communication
  • Multi-node parallelism: MPI fundamentals, CUDA-aware MPI, and halo exchanges
  • NVSHMEM: symmetric memory model, GPU-initiated transfers, and distributed solvers
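
To give a flavor of the work partitioning and workload indexing topics, here is a minimal host-only C++ sketch of splitting N elements across several devices. It is an illustration under stated assumptions, not taken from the course materials: the names `Chunk` and `chunk_for_device` are hypothetical, and no CUDA calls are made. In an actual multi-GPU program, each range would typically be processed after selecting the corresponding device with `cudaSetDevice`.

```cpp
#include <cassert>
#include <cstddef>

// Half-open index range [begin, end) owned by one device.
struct Chunk {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical helper: split n elements across num_devices as evenly as
// possible. The first (n % num_devices) devices get one extra element.
Chunk chunk_for_device(std::size_t n, int num_devices, int device) {
    assert(num_devices > 0 && device >= 0 && device < num_devices);
    const std::size_t d = static_cast<std::size_t>(device);
    const std::size_t nd = static_cast<std::size_t>(num_devices);
    const std::size_t base = n / nd;   // minimum chunk size
    const std::size_t extra = n % nd;  // leftover elements to distribute
    const std::size_t begin = d * base + (d < extra ? d : extra);
    const std::size_t size = base + (d < extra ? 1 : 0);
    return Chunk{begin, begin + size};
}
```

For example, `chunk_for_device(10, 4, 2)` yields the range [6, 8): the ten elements are split into chunks of sizes 3, 3, 2, and 2.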

Learning Outcomes

After completing this course, you will be able to:

  • Use concurrent CUDA streams to overlap memory transfers with GPU computation
  • Scale CUDA C++ workloads across multiple GPUs within a single compute node
  • Enable and exploit direct peer-to-peer GPU memory access for efficient intra-node communication
  • Write portable, scalable SPMD code using CUDA-aware MPI with inter-node GPU communication
  • Apply NVSHMEM for GPU-initiated data transfers using the symmetric memory model
  • Implement domain decomposition and halo exchange patterns for distributed GPU workloads
  • Profile multi-GPU execution and identify performance bottlenecks with NVIDIA Nsight Systems
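
As an illustration of the halo exchange pattern named above, the following host-only C++ sketch simulates a 1D domain decomposition inside a single process. It is a simplified sketch under stated assumptions, not the course's implementation: the names `Subdomain` and `exchange_halos` are hypothetical, each subdomain keeps exactly one ghost cell per side, and the outer boundaries are non-periodic. In a distributed program, each subdomain would live on a different rank and the two copies would be replaced by communication calls such as `MPI_Sendrecv`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each subdomain stores: [left ghost | interior cells ... | right ghost].
using Subdomain = std::vector<double>;

// Hypothetical helper: fill ghost cells from the neighbouring subdomains.
// The outer ghosts of the first and last subdomain are left untouched
// (non-periodic boundary).
void exchange_halos(std::vector<Subdomain>& parts) {
    for (std::size_t r = 0; r + 1 < parts.size(); ++r) {
        Subdomain& left = parts[r];
        Subdomain& right = parts[r + 1];
        // Need at least one interior cell between the two ghost cells.
        assert(left.size() >= 3 && right.size() >= 3);
        // Last interior cell of the left part -> left ghost of the right part.
        right.front() = left[left.size() - 2];
        // First interior cell of the right part -> right ghost of the left part.
        left.back() = right[1];
    }
}
```

Only interior cells are read and only ghost cells are written, so the order in which neighbouring pairs are processed does not affect the result.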

Registration, Wait List and Withdrawal Policy

Registration

Please register at the bottom of this page. Registration is open until a few days before the course starts, or until the course is fully booked.

Prices and Eligibility

This course is open to, and free of charge for, participants affiliated with academic institutions in European Union (EU) member states and Horizon 2020-associated countries.

Wait List

If the course reaches its maximum capacity, you can request to join the wait list by sending an email to nhr-training@fau.de. Please include your name and university affiliation in the message.

Withdrawal Policy

Please only register if you are committed to attending the course. No-shows will be blacklisted and excluded from future events.

If you need to withdraw your registration, please either cancel it directly through the registration system or send an email to nhr-training@fau.de.

Additional Courses

You can find an up-to-date list of all courses offered by NHR@FAU at https://hpc.fau.de/teaching/tutorials-and-courses/.

Registration
Participants
0 / 63