Cerebras Code now supports GLM 4.6 at 1000 tokens/sec

GPT-OSS 120B Runs at 3000 tokens/sec on Cerebras

REAP: One-Shot Pruning for Trillion-Parameter Mixture-of-Experts Models

Cerebras Systems raises $1.1B Series G

Cerebras Code

Qwen3 Coder 480B is Live on Cerebras

Cerebras launches Qwen3-235B, achieving 1.5k tokens per second

Cerebras achieves 2,500 tokens/s on Llama 4 Maverick (400B)

100x defect tolerance: How we solved the yield problem

Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference

Cerebras Launches the Fastest AI Inference

Cerebras Inference: AI at Instant Speed