AXIOM: Built a sparse dynamic routing architecture for LLM inference entirely in Rust. No ML frameworks, no GPU, 1.2M parameters

Track real-time GPU and LLM pricing across all cloud and inference providers

Track real-time GPU and LLM pricing across all cloud and inference providers

I built a computing environment in Rust where every program is AI-generated, compiled to WASM, and GPU-rendered via wgpu

GPU-accelerated declarative plotting in WebGL – introducing Gladly

ORE: A process manager written in Rust to schedule GPU resources and prevent security vulnerability, VRAM OOM (using Tokio Semaphores & Axum) for local LLMs

GPU Rack Power Density, 2015–2025

Async/Await on the GPU

How NVIDIA's CuTe replaces GPU index arithmetic with composable layout algebra

Testing "Raw" GPU Cache Latency