Writing an LLM from scratch, part 13 – attention heads are dumb
