Attention is NOT all you need: Qwerky-72B trained using only 8 AMD MI300X GPUs

EagleX 1.7T: Soaring past LLaMA-7B 2T in both English and multilingual evals