Claude is More Anxious than GPT

Alignment faking in large language models

FrontierMath was funded by OpenAI

Some Lessons from the OpenAI FrontierMath Debacle

Human study on AI spear phishing campaigns

The Online Sports Gambling Experiment Has Failed

AIs Will Increasingly Attempt Shenanigans

o1: A Technical Primer

10-Step Anti-Procrastination Checklist (2013)

By default, capital will matter more than ever after AGI

Measuring hardware overhang (2020)

OpenAI Email Archives

The Median Researcher Problem

Laziness Death Spirals

Three subtle examples of data leakage

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream

Contra papers claiming superhuman AI forecasting

OpenAI shows 'Strawberry' to feds, races to launch it

We don't know how bad most things are nor precisely how they're bad

Superbabies: Putting the pieces together

Claude 3.5 Sonnet Reproduces BIG-Bench Canary String

I would have shit in that alley, too

Transformers Represent Belief State Geometry in Their Residual Stream

Autism as the Kolmogorov complexity phenotype

Ex-OpenAI employee reported losing 85% of his family's net worth

Refusal in LLMs is mediated by a single direction

Becoming an Amateur Polyglot

Third Time: a better way to work (2022)

My Clients, the Liars

Social status hacks from The Improv Wiki
