progscrape: gilesthomas.com

10Gb/s Ethernet: switching to a Broadcom SFP+ module

6 hours ago gilesthomas.com

10Gb/s Ethernet: what I did to get it working in my home

48 days ago gilesthomas.com

Using DistributedDataParallel to train a base model from scratch in the cloud

5 months ago gilesthomas.com

LLM from scratch, part 28 – training a base model from scratch on an RTX 3090

6 months ago gilesthomas.com llm

Writing an LLM from scratch, part 22 – training our LLM

8 months ago gilesthomas.com llm

Writing an LLM from scratch, part 20 – starting training, and cross entropy loss

8 months ago gilesthomas.com llm

The maths you need to start understanding LLMs

9 months ago gilesthomas.com llm

Writing an LLM from scratch, part 13 – attention heads are dumb

13 months ago gilesthomas.com llm

Writing an LLM from scratch, part 8 – trainable self-attention

15 months ago gilesthomas.com llm

Writing an LLM from scratch, part 10 – dropout

15 months ago gilesthomas.com llm

It’s still worth blogging in the age of AI

15 months ago gilesthomas.com ai

The benefits of learning in public

15 months ago gilesthomas.com