Large language model evaluation

General-Purpose LLMs Beat Specialized Clinical AI on Every Benchmark, and That Should Make You Rethink Fine-Tuning

A Nature Medicine evaluation finds that frontier general-purpose models outperform dedicated clinical AI platforms across every tested category, challenging the assumption that domain specialization always pays off.

Nature Medicine · Large Language Models · Clinical AI · Fine-Tuning · Hallucination

Free · Jun 13, 2026 · 5 min read