Advancing conversational diagnostic AI with multimodal reasoning
Khaled Saab et al. · Nature Medicine · 2026

Published May 14, 2026
Multimodal AMIE integrates skin images, ECGs, and clinical documents into diagnostic conversations, reaching diagnostic accuracy and conversation quality comparable to or better than primary care physicians.
AI Summary
Conclusion
By augmenting Gemini 2.0 Flash with a state-aware reasoning framework, multimodal AMIE achieved diagnostic accuracy and conversational quality comparable to or exceeding primary care physicians in this OSCE evaluation.
Research Question
Can a large language model request, interpret, and reason over multimodal clinical data such as skin images, ECGs, and clinical documents during diagnostic conversations, and how does its performance compare with primary care physicians?
Methods
- Design
- Exploratory randomized, blinded OSCE-style study with 105 clinical scenarios, 25 trained patient actors, 19 board-certified primary care physicians, 18 specialist evaluators, and 210 simulated text-chat telemedicine consultations.
- Measures
- Across 105 cases and 210 consultations, AMIE was rated superior on 29 of 32 specialist-rated axes; the difference in diagnostic accuracy was significant (P<0.001).
Main Results
Multimodal AMIE achieved significantly higher top-k diagnostic accuracy than PCPs (P<0.001). Specialist evaluation favored AMIE on 29 of 32 axes and on 7 of 9 multimodal reasoning dimensions. Patient satisfaction was comparable or better across evaluated dimensions.
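To make the top-k metric concrete, here is a minimal Python sketch of how top-k diagnostic accuracy can be computed from ranked differential diagnoses. It is an illustration only: the function name `top_k_accuracy`, the toy cases, and the case-insensitive string match are assumptions made for this example. The summary does not describe how matches were judged in the study, and clinical studies of this kind usually rely on expert judgment rather than string equality.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_ddx: Sequence[Sequence[str]],
    ground_truth: Sequence[str],
    k: int,
) -> float:
    """Fraction of cases whose true diagnosis appears in the first k entries
    of the ranked differential diagnosis (DDx) list for that case."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_ddx) != len(ground_truth):
        raise ValueError("need exactly one DDx list per ground-truth label")
    if not ground_truth:
        raise ValueError("need at least one case")

    hits = 0
    for ddx, truth in zip(ranked_ddx, ground_truth):
        # Assumption: a case-insensitive exact match counts as a hit.
        top_k = {d.strip().lower() for d in ddx[:k]}
        if truth.strip().lower() in top_k:
            hits += 1
    return hits / len(ground_truth)


# Toy data, invented for illustration (not taken from the study).
toy_ddx = [
    ["psoriasis", "eczema", "tinea corporis"],
    ["atrial fibrillation", "atrial flutter"],
    ["cellulitis", "deep vein thrombosis", "gout"],
]
toy_truth = ["eczema", "atrial fibrillation", "gout"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(toy_ddx, toy_truth, k):.2f}")
```

On the toy data the value rises with k, because a correct diagnosis ranked second or third only counts once k is large enough to include it. This is why top-k results are normally reported for several values of k.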
Limitations
This was an exploratory study rather than a preregistered randomized clinical trial. Text-chat consultations also limit physical examination and nonverbal clinical cues.
Implications
The system could support diagnostic care in telemedicine and in areas with limited access to primary care, but prospective validation of safety, reliability, and regulatory readiness is needed before clinical deployment.
Key Terms
- AMIE (p. 2)
- A conversational diagnostic AI system based on large language models, extended in this study to handle multimodal clinical inputs.
- State-aware reasoning (p. 3)
- A dialogue management approach that adapts history-taking, diagnostic reasoning, management, and follow-up questions to the patient state and uncertainty.
- OSCE (p. 4)
- Objective Structured Clinical Examination, a standardized clinical skills assessment using trained patient actors.
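The state-aware reasoning entry above describes a dialogue that changes behavior with the patient state and the remaining uncertainty. The following Python sketch shows one way such a controller could be structured. It is not the AMIE implementation: the phase names, the `DialogueState` fields, the 0 to 1 uncertainty scale, the threshold value, and the action strings are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Phase(Enum):
    # Hypothetical phase split, loosely following the activities named above.
    HISTORY_TAKING = "history_taking"
    DIAGNOSIS_AND_MANAGEMENT = "diagnosis_and_management"
    FOLLOW_UP = "follow_up"


@dataclass
class DialogueState:
    phase: Phase = Phase.HISTORY_TAKING
    # Hypothetical scale: 1.0 = no working diagnosis, 0.0 = fully confident.
    uncertainty: float = 1.0
    # Open items still to collect, e.g. "photo of rash" or "ECG tracing".
    knowledge_gaps: List[str] = field(default_factory=list)


def next_action(state: DialogueState, threshold: float = 0.3) -> str:
    """Choose the next dialogue move from the current state.

    Updates state.phase in place when a phase transition is warranted.
    """
    if state.phase is Phase.HISTORY_TAKING:
        if state.knowledge_gaps:
            # Close known gaps first, including requests for images or documents.
            return f"ask_about:{state.knowledge_gaps[0]}"
        if state.uncertainty > threshold:
            # No specific gap is known, but confidence is still too low.
            return "ask_open_question"
        state.phase = Phase.DIAGNOSIS_AND_MANAGEMENT
        return "present_differential_and_plan"

    if state.phase is Phase.DIAGNOSIS_AND_MANAGEMENT:
        state.phase = Phase.FOLLOW_UP
        return "invite_follow_up_questions"

    return "answer_follow_up_question"


state = DialogueState(uncertainty=0.8, knowledge_gaps=["photo of rash"])
print(next_action(state))  # still gathering information

state.knowledge_gaps.clear()
state.uncertainty = 0.2
print(next_action(state))  # confident enough to move on
```

The point of the design is that the same function yields different moves as the state changes, so a request for a skin image or an ECG happens only while it would still reduce uncertainty.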