GPU Prefix Sums: A nearly complete collection

Parallel Reduce and Scan on the GPU

anubis_offload: userscript to offload Anubis PoW to native CPU or GPU code

One GPU translates into three to five of the fastest Ethernet switch ports

Rust GPU physics engine

Dissecting the Apple M1 GPU, the end

Kioxia's 5TB, 64 GB/s flash module puts NAND toward the memory bus for AI GPU

How to stream voxel data from a 64Tree real time into GPU

Nvidia Tilus: A Tile-Level GPU Kernel Programming Language

Using large-scale search to discover fast GPU kernels in Rust

Show HN: Luminal – Open-source, search-based GPU compiler

Writing a Rust GPU kernel driver: a brief introduction on how GPU drivers work

Linux 6.16.1 Fixes a Large Intel GPU Driver Performance Regression – Up to 30%

Investigating the Nvidia AI GPU Black Market [video]

GPU-rich labs have won: What's left for the rest of us is distillation

Would a "venv" wrapper around multiprocessing be useful? (hardware-aware pools, NUMA, GPU, etc.)

How to set up rust-gpu and winit?

Pedagogical AI/GPU Compiler

Rust running on every GPU

A GPU Calculator That Helps Calculate What GPU to Use

An Introduction to GPU Profiling and Optimization

Show HN: My GPU Fan Saga – A DIY ATX Fan Controller

GPUHammer: Rowhammer attacks on GPU memories are practical

Impact of PCIe 5.0 Bandwidth on GPU Content Creation and LLM Performance

Show HN: Fixstars AIBooster – Accelerate AI Training and Cut GPU Costs

Streaming Voxels to the GPU in Rust – Visibility-Based Approach

GET gpu for your LLMs to host them...

RapidRAW: A non-destructive and GPU-accelerated RAW image editor

Rust – Low CPU and GPU Usage Despite High FPS? How to Maximize Performance?

Calculating the Fibonacci numbers on GPU
