
CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for optimizing CUDA kernels. It focuses on tile-based computation patterns and on optimizations that target NVIDIA Tensor Cores. The project provides an ecosystem for expressing and optimizing tiled computations on NVIDIA GPUs, and it simplifies the development of high-performance CUDA kernels through abstractions for common tiling patterns, memory hierarchy management, and GPU-specific optimizations.
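For readers new to the term, the following is a minimal sketch of what a tile-based computation pattern looks like, written in plain Python rather than in CUDA Tile IR. It is an illustration only: the function name `matmul_tiled` and the `tile` parameter are hypothetical and do not correspond to any CUDA Tile operation or API. The sketch multiplies two matrices by accumulating one small output block at a time, which is the general access pattern that tiled GPU kernels build on.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply a (m x k) by b (k x n), visiting the output in tile x tile blocks.

    Hypothetical illustration of tiling; not CUDA Tile IR syntax or API.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):              # first row of the output tile
        for j0 in range(0, n, tile):          # first column of the output tile
            for p0 in range(0, k, tile):      # step along the shared dimension
                # Accumulate the contribution of one pair of input tiles.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(matmul_tiled(a, b))
```

The `min(...)` bounds handle matrices whose dimensions are not multiples of the tile size. On a GPU, each output tile would typically be assigned to a group of threads and the input tiles staged through faster memory; those details are outside the scope of this sketch.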

This open-source release is aligned with the CUDA Toolkit 13.1 release. For more information about CUDA Tile, visit https://developer.nvidia.com/cuda/tile.

Core Components

CUDA Tile is composed of:

CUDA Tile Specification

CUDA Tile development is driven by the CUDA Tile IR specification, which defines the formal semantics, operations, and type system for tile-based computations on NVIDIA GPUs. For details on the dialect operations, type system, and transformation passes, refer to the CUDA Tile Specification.


Tags: language, mlir

Last modified 15 January 2026