Tag trends are in beta. Feedback? Thoughts? Email me at [email protected]

The Cost of Our Lies to AI

British naval dominance during the age of sail

Methods of defence against AGI manipulation

Language equivariance as a way of figuring out what an AI "means"

Navigating Burnout

Playing in the Creek

How to Make Superbabies

A slow guide to confronting doom

Recent AI model progress feels mostly like bullshit

Tied Crosscoders: Tracing How Chat LLM Behavior Emerges from Base Model

Apparent signs of distress during LLM redteaming

A bear case: My predictions regarding AI progress

How much are LLMs boosting real-world programmer productivity?

Metacompilation. Making compilation more self referential.

Claude is More Anxious than GPT

Alignment faking in large language models

FrontierMath was funded by OpenAI

Some Lessons from the OpenAI FrontierMath Debacle

Human study on AI spear phishing campaigns

The Online Sports Gambling Experiment Has Failed

AIs Will Increasingly Attempt Shenanigans

o1: A Technical Primer

10-Step Anti-Procrastination Checklist (2013)

By default, capital will matter more than ever after AGI

Measuring hardware overhang (2020)

OpenAI Email Archives

The Median Researcher Problem

Laziness Death Spirals

Three subtle examples of data leakage

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
