Programming w/ Massively Parallel GPU & CUDA

Instructor: Professor Paula Whitlock

Course Topics

Using GPUs for Massively Parallel Computing

  • Introduction to GPU Computing - history, architecture, massively parallel computations, data parallelism vs task parallelism

  • Introduction to CUDA

    • Review of thread programming

    • CUDA program structure

    • Device memory and data transfer

    • Host vs device functions

  • CUDA threads and GPU thread processors

    • Assignment of threads to processors

    • Blocks, grids and warps

    • Synchronization and scalability

    • Thread scheduling

  • CUDA memories

    • Problems with efficiency

    • Types of device memory

    • Strategies for using memory

  • GPU Performance Issues

    • Programming and thread execution

    • Memory coalescing techniques

    • Dynamic partitioning of resources - registers, thread block slots and thread slots

    • Prefetching data

    • Granularity of threads

    • Measuring performance

  • Floating Point Calculations

    • How floating point numbers are represented

    • Precision, accuracy and rounding

    • Algorithm considerations

  • Applications

  • Streams and task parallelism

  • CUDA on multiple GPUs

  • Available tools



"Programming Massively Parallel Processors: A Hands-on Approach," David B. Kirk and Wen-mei W. Hwu, Morgan Kaufmann, 2nd edition, 2013