LLMs Diagnose ER Patients as Well as Docs

Editorial

TechCrunch·May 3, 2026


AI models match or outperform doctors at initial triage

TL;DR

A new study finds that OpenAI's LLMs perform on par with or better than human physicians at initial ER triage. The o1 model gave the exact or very close diagnosis in 67% of cases, compared to 50-55% for doctors.

Researchers published a study this week in Science showing that large language models such as OpenAI's can diagnose patients as accurately as internal medicine physicians at initial emergency room triage. The o1 model offered the exact or very close diagnosis in 67% of cases, compared to 50-55% for two attending physicians who were unaware whether diagnoses came from AI or humans. For anyone working on medical AI applications, the result suggests LLMs could soon play a major role in triage decisions. The study also highlights the need for more real-world testing and accountability frameworks.


Key Points

1. The study was published this week in Science; researchers compared OpenAI's models to human physicians.
2. The o1 model performed nominally better than or as well as two attending physicians at each diagnostic touchpoint.
3. The o1 model gave the exact or very close diagnosis in 67% of triage cases, while doctors did so 50-55% of the time.
4. Researchers studied only how the models performed with the text-based information available in electronic medical records.
5. Current studies suggest foundation models are more limited in reasoning over non-text inputs.

Why It Matters

If you're working on medical AI applications, this study suggests LLMs could soon play a major role in triage decisions. Real-world testing and accountability frameworks are urgently needed, however, to ensure safe deployment.


