Large language models are poor medical coders

Researchers find that GPT-4 performs as well as or better than doctors on medical tests, especially in psychiatry