Attention is NOT all you need: Qwerky-72B trained using only 8 AMD MI300X GPUs

EagleX 1.7T: Soaring past LLaMA-7B 2T in both English and multilingual evals