To scale on modern hardware, software must be optimized both for compute (including SIMD vector parallelism) and for effective use of the memory subsystem. In this talk we present the state-of-the-art Roofline performance model automation in Intel Advisor, which helps identify memory bottlenecks and assess the balance between CPU and memory utilization. The talk covers not only the “cache-aware” Roofline implementation but also new capabilities for producing DRAM (“original”) and multi-level (L1, L2, LLC, MCDRAM and DRAM, all decoupled) Roofline model flavors, which guide the optimization of DRAM- or cache-bound applications.
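As a rough illustration of the idea behind a multi-level Roofline, the sketch below computes a separate bound for each memory level as min(peak compute, arithmetic intensity × bandwidth), where the arithmetic intensity is taken against the bytes moved at that level. It is a minimal sketch, not Intel Advisor's implementation or API: the function names `roofline_bounds` and `tightest_level`, and all numbers in the usage note, are assumptions made up for illustration.

```python
def roofline_bounds(flops, bytes_by_level, bandwidth_gbs, peak_gflops):
    """Return the attainable GFLOP/s bound for each memory level.

    flops          -- floating-point operations executed by the kernel
    bytes_by_level -- bytes transferred at each level, e.g. {"L1": ..., "DRAM": ...}
    bandwidth_gbs  -- peak bandwidth of each level in GB/s (hypothetical inputs)
    peak_gflops    -- peak compute throughput in GFLOP/s (hypothetical input)
    """
    bounds = {}
    for level, nbytes in bytes_by_level.items():
        # Arithmetic intensity measured against traffic at this level only.
        intensity = flops / nbytes  # FLOP per byte
        # Classic Roofline bound: memory roof or compute roof, whichever is lower.
        bounds[level] = min(peak_gflops, intensity * bandwidth_gbs[level])
    return bounds


def tightest_level(bounds):
    """Return the memory level whose roof limits performance the most."""
    return min(bounds, key=bounds.get)
```

For example, with assumed values of 2e9 FLOPs, 16e9 bytes of L1 traffic and 4e9 bytes of DRAM traffic, bandwidths of 1000 GB/s (L1) and 100 GB/s (DRAM), and a 500 GFLOP/s compute peak, `roofline_bounds` gives a lower bound for DRAM than for L1, so `tightest_level` reports DRAM as the level most worth optimizing.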
IXPUG Webinar Series
Roofline