CUDA at Scale for the Enterprise


This course helps students learn concepts for scaling the use of GPUs, and of the CPUs that manage them, beyond the most common consumer-grade GPU installations. They will learn how to manage asynchronous workflows, sending and receiving events to encapsulate data transfers and control signals. Students will also walk through applying GPUs to data sorting and image processing, implementing their own software using these techniques and libraries.

By the end of the course, you will be able to do the following:
– Develop software that can use multiple CPUs and GPUs
– Develop software that uses CUDA’s events and streams capability to create asynchronous workflows
– Use the CUDA computational model to solve canonical programming challenges, including data sorting and image processing

To be successful in this course, you should have an understanding of parallel programming and experience programming in C/C++.
This course will be extremely applicable to software developers and data scientists working in the fields of high performance computing, data processing, and machine learning.

What you will learn

Course Overview

The purpose of this module is for students to understand how the course will be run, which topics it covers, how they will be assessed, and what is expected of them.

Multiple CPU/GPU Systems

In professional settings, a single CPU managing a single GPU is often not a viable configuration for solving complex challenges. Students will apply CUDA capabilities that allow multiple CPUs to communicate and manage software kernels on multiple GPUs. This allows both the size of the input data and the computational complexity to scale. Students will learn the advantages and limitations of this form of synchronous processing.

CUDA Events and Streams

Students will learn to use CUDA events and streams in their programs to allow for asynchronous data and control flows. This enables more interactive and long-running software, including analytic user interfaces, near-live streaming of video or financial feeds, and dynamic business processing systems.

Sorting Using GPUs

The purpose of this module is for students to understand the hardware and software foundations on which CUDA is built. This understanding is required to develop software that takes full advantage of GPU resources.

What’s included