Resources

NVIDIA GPUs have dominated GPU-accelerated supercomputers for over a decade, but AMD and Intel GPUs have recently begun to power cutting-edge supercomputers. Increased competition among GPU vendors has driven performance improvements; at the same time, however, platforms and programming environments have diverged. In this study, we developed and optimized $N$-body codes written in CUDA C++, HIP C++, and SYCL for NVIDIA, AMD, and Intel GPUs, respectively, to identify a promising environment for developing scientific codes. The fastest code on the NVIDIA H100 SXM, written in SYCL and compiled with Intel oneAPI, processed 2.16e+12 interactions per second. On the AMD Instinct MI210, the SYCL code compiled with AdaptiveCpp and the HIP C++ code achieved almost identical performance, with the SYCL code slightly ahead at 9.06e+11 interactions per second. On the Intel Data Center GPU Max 1100, only the SYCL code compiled with Intel oneAPI was tested; it processed 8.87e+11 interactions per second.
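To make the "interactions per second" metric concrete, the following is a minimal CPU-only sketch of a direct-summation $N$-body acceleration loop in plain C++. It is not the authors' CUDA, HIP, or SYCL code. The names `Body`, `Accel`, `direct_accel`, and `interactions_per_second` are hypothetical, and the choices of $G = 1$, Plummer softening, and counting $N^2$ pairwise interactions per step (self-interaction included) are assumptions for illustration only; the abstract does not state the counting convention used.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration: position and mass of one particle.
struct Body { double x, y, z, m; };
struct Accel { double x, y, z; };

// Direct summation of gravitational acceleration with G = 1 (assumption).
// eps2 is the squared Plummer softening length and must be positive so that
// the self-interaction term (j == i) evaluates to exactly zero without a branch.
std::vector<Accel> direct_accel(const std::vector<Body>& b, double eps2) {
    assert(eps2 > 0.0);
    std::vector<Accel> a(b.size(), Accel{0.0, 0.0, 0.0});
    for (std::size_t i = 0; i < b.size(); ++i) {
        for (std::size_t j = 0; j < b.size(); ++j) {
            const double dx = b[j].x - b[i].x;
            const double dy = b[j].y - b[i].y;
            const double dz = b[j].z - b[i].z;
            const double r2 = eps2 + dx * dx + dy * dy + dz * dz;
            const double rinv = 1.0 / std::sqrt(r2);
            const double s = b[j].m * rinv * rinv * rinv;  // m_j / r^3
            a[i].x += s * dx;
            a[i].y += s * dy;
            a[i].z += s * dz;
        }
    }
    return a;
}

// Interaction rate under the assumed convention of N * N interactions per step.
double interactions_per_second(std::size_t n, std::size_t steps, double seconds) {
    return static_cast<double>(n) * static_cast<double>(n) *
           static_cast<double>(steps) / seconds;
}
```

Under this assumed convention, a run with 1000 particles that completes 10 force evaluations in 2 seconds would report 5e+6 interactions per second. GPU implementations typically parallelize the outer loop over particles, which this serial sketch does not attempt.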

Event Name

IXPUG Workshop at HPC Asia 2025

Keywords

IXPUG Workshop at HPC Asia 2025

Video Name