Speculative KV coding: losslessly compressing KV cache by up to ~4×

Bringing Up DeepSeek-V4-Flash on AMD MI300X

Also-RANS: Asymmetric Numeral Systems for Entropy Coding