Faster sorting with SIMD CUDA intrinsics (2024)
