[Online] Choosing GPU Programming Approaches

Europe/Berlin
Online

Description

NHR@FAU

Schedule & Format

  • Date: November 9-10, 2026
  • Times:
    • Nov 9: 9:00 - 13:00 CET
    • Nov 10: 9:00 - 13:00 CET
  • Format: Two half-days
  • Location: Online via Zoom
  • Language: English

Registered participants will receive the video conferencing link via email on the day before the course.

Instructor

This course is organized by the Erlangen National High Performance Computing Center (NHR@FAU).

Course Description

The GPU programming landscape has grown from a single dominant framework (CUDA) into a diverse ecosystem of vendor-neutral and performance-portable alternatives. Choosing the right approach for a given application is a non-trivial decision that depends on hardware targets, portability requirements, team expertise, and performance goals. This course surveys the most widely used GPU programming models: CUDA/HIP, SYCL, modern C++ parallel algorithms, Thrust, OpenACC, OpenMP offloading, and Kokkos. For each approach, participants see representative code, learn the key abstractions, and assess the trade-offs in portability, expressiveness, and performance.

Prerequisites

Knowledge

  • Familiarity with modern C++ programming (templates, lambdas, and the STL)
  • Prior experience with at least one GPU programming approach is recommended but not required

Technical

  • A modern web browser (exercises run on NHR@FAU's HPC clusters via JupyterHub - no local installation required)

Course Structure

  • GPU programming landscape: hardware diversity, portability challenges, and framework taxonomy
  • Low-level, vendor-specific approaches: CUDA and HIP
  • Open standard directives: OpenACC and OpenMP target offloading
  • Performance portability libraries: Kokkos
  • C++ abstraction layers: SYCL, Thrust, and standard library parallel algorithms
  • Performance analysis: profiling with Nsight Systems and Nsight Compute, and common optimization patterns
  • Hands-on programming challenge: porting STREAM, a 2D stencil, and a conjugate-gradient solver across multiple approaches
  • Comparative evaluation: portability, performance, and practical considerations for framework selection
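
To give a feel for the kind of kernel involved in the hands-on challenge, here is a minimal sketch of the STREAM "triad" operation (a[i] = b[i] + scalar * c[i]) written as plain, serial standard C++. It is an illustration only and is not taken from the course material; the function name stream_triad is hypothetical, and the sketch assumes all three arrays have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// STREAM "triad" kernel: a[i] = b[i] + scalar * c[i].
// Hypothetical helper for illustration; not part of the course material.
void stream_triad(std::vector<double>& a,
                  const std::vector<double>& b,
                  const std::vector<double>& c,
                  double scalar)
{
    // Binary std::transform walks b and c in lockstep and writes into a.
    // Assumption: a, b and c all have the same length.
    std::transform(b.begin(), b.end(), c.begin(), a.begin(),
                   [scalar](double bi, double ci) { return bi + scalar * ci; });
}
```

The standard C++ parallel algorithms accept an execution policy (for example std::execution::par_unseq from <execution>) as an additional first argument to the same call. Whether such a call actually runs on a GPU depends on the compiler and its options, which is one of the trade-offs a comparison of approaches has to weigh.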

Learning Outcomes

After completing this course, you will be able to:

  • Describe the key abstractions and execution model of each major GPU programming framework: CUDA/HIP, SYCL, OpenACC, OpenMP offloading, Kokkos, Thrust, and standard C++ parallel algorithms
  • Implement a representative kernel in multiple frameworks and compare the resulting code
  • Evaluate each approach across the dimensions of portability, performance, and programming effort
  • Select the most appropriate GPU programming model for a given combination of hardware targets and application requirements
  • Profile a GPU application with Nsight Systems and Nsight Compute and relate observed performance to the choice of approach
  • Identify NHR@FAU courses and resources for deepening expertise in any specific framework

Registration, Wait List and Withdrawal Policy

Registration

Please register at the bottom of this page. Registration is open until a few days before the course starts, or until the course is fully booked.

Prices and Eligibility

The course is free for participants affiliated with academic institutions in EU member states and Horizon 2020-associated countries.

Wait List

To be placed on the wait list, email nhr-training@fau.de with your name and university affiliation.

Withdrawal Policy

If you need to withdraw your registration, please either cancel it directly through the registration system or send an email to nhr-training@fau.de. No-shows will be excluded from future events.

Additional Courses

You can find an up-to-date list of all courses offered by NHR@FAU at https://hpc.fau.de/teaching/tutorials-and-courses/.

Registration
Participants
0 / 40