[Online] Fundamentals of Accelerated Computing with Modern CUDA C++

Description

NHR@FAU

Schedule & Format

  • Date: October 27-29, 2026
  • Times:
    • Oct 27: 9:00 - 13:00 CET
    • Oct 28: 9:00 - 13:00 CET
    • Oct 29: 9:00 - 13:00 CET
  • Format: Three half-days
  • Location: Online via Zoom
  • Language: English

Registered participants will receive the video conferencing link via email on the day before the course.

Instructor

This course is organized by the Erlangen National High Performance Computing Center (NHR@FAU) in collaboration with the NVIDIA Deep Learning Institute (DLI).

Course Description

This course teaches GPU acceleration of C++ applications using CUDA, with an emphasis on modern C++ idioms rather than low-level GPU APIs. It starts with library-provided parallel algorithms that execute transparently on the GPU, then progresses through custom CUDA kernels, thread hierarchies, shared memory, and concurrent streams. Together these cover the full range from high-level abstractions to fine-grained GPU control. No prior CUDA or GPU programming experience is required.

Further information about this course can be found on the NVIDIA DLI course page.

Prerequisites

Knowledge

  • C++ programming experience, including lambda expressions and standard library algorithms
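
As a rough self-check for this prerequisite, the sketch below shows the kind of C++ that participants should be able to read comfortably: standard library algorithms driven by lambda expressions instead of hand-written loops. The function names are hypothetical and chosen only for illustration; this is not course material. Parallel algorithm libraries for CUDA, such as Thrust, offer calls of a similar shape (for example `thrust::transform`), although this page does not state which library the course uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: converts Celsius readings to Fahrenheit using
// std::transform and a non-capturing lambda instead of a raw loop.
std::vector<float> to_fahrenheit(const std::vector<float>& celsius) {
    std::vector<float> out(celsius.size());
    std::transform(celsius.begin(), celsius.end(), out.begin(),
                   [](float c) { return c * 9.0f / 5.0f + 32.0f; });
    return out;
}

// Hypothetical helper: counts values above a threshold. The lambda
// captures `threshold` by value.
std::size_t count_above(const std::vector<float>& values, float threshold) {
    return static_cast<std::size_t>(
        std::count_if(values.begin(), values.end(),
                      [threshold](float v) { return v > threshold; }));
}

// Hypothetical helper: arithmetic mean via std::accumulate.
// Precondition: `values` must not be empty.
float mean(const std::vector<float>& values) {
    assert(!values.empty());
    const float sum = std::accumulate(values.begin(), values.end(), 0.0f);
    return sum / static_cast<float>(values.size());
}
```

If calls such as `to_fahrenheit({0.0f, 100.0f})` and the lambdas above read naturally, the C++ prerequisite is most likely met.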

Technical

  • A free NVIDIA developer account
  • A local installation of NVIDIA Nsight Systems is recommended

Course Structure

  • GPU programming fundamentals: writing and launching CUDA-accelerated C++ code; applying parallel algorithms on the GPU
  • Concurrency and profiling: CUDA streams, asynchronous data transfers, and code analysis with NVIDIA Nsight Systems
  • Custom kernel development: thread hierarchies, shared memory, and cooperative parallel algorithms

Learning Outcomes

After completing this course, you will be able to:

  • Accelerate C++ applications by writing, compiling, and running GPU code with CUDA
  • Apply parallel algorithms to GPU workloads without writing custom kernels
  • Manage CPU-GPU data movement and optimize memory access patterns
  • Write custom CUDA kernels and manage thread hierarchies and shared memory
  • Overlap computation with data transfers using concurrent CUDA streams
  • Profile GPU code and identify performance bottlenecks with NVIDIA Nsight Systems

Registration, Wait List and Withdrawal Policy

Registration

Please register at the bottom of this page. Registration is open until a few days before the course starts, or until the course is fully booked.

Prices and Eligibility

The course is free for participants affiliated with academic institutions in EU member states and Horizon 2020-associated countries.

Wait List

To be added to the wait list, email nhr-training@fau.de with your name and university affiliation.

Withdrawal Policy

If you need to withdraw your registration, please either cancel it directly through the registration system or send an email to nhr-training@fau.de. No-shows will be excluded from future events.

Additional Courses

You can find an up-to-date list of all courses offered by NHR@FAU at https://hpc.fau.de/teaching/tutorials-and-courses/.

Registration
Participants: 0 / 40