Overview
Primary care providers in resource-limited settings face significant challenges in accurate diagnosis, particularly when managing complex clinical presentations with limited access to specialists or advanced diagnostics. This research program investigates how generative AI and large language models (LLMs) can augment clinical decision-making in these contexts.
Our approach pairs generative AI methods with rigorous evaluation, so that AI tools are judged on whether they improve clinical outcomes rather than on benchmark performance alone.
Key Research Questions
- Can LLMs generate clinically accurate synthetic vignettes for training and evaluation? We’re developing evaluation frameworks to rigorously assess the accuracy of AI-generated clinical scenarios.
- How can reinforcement learning with expert feedback (RLEF) improve AI diagnostic suggestions? By collecting structured feedback from practicing clinicians, we aim to fine-tune AI systems to provide more contextually appropriate recommendations.
- What are the barriers to AI adoption in low-resource clinical settings? Through pilot implementations in Kenya, we’re identifying practical challenges and solutions for deploying AI diagnostic support.
Current Projects
Reinforcement Learning with Collective Expert Feedback
Gillings Innovation Lab, December 2023 - April 2026
This project develops methodology for using structured clinician feedback to improve generative AI systems for primary care. Key aims include:
- Constructing evaluation frameworks for AI-generated clinical vignettes
- Collecting physician feedback on synthetic diagnostic scenarios
- Testing feedback integration within RLEF systems for iterative improvement
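As a rough illustration of how structured physician feedback might be prepared for a feedback-driven training loop, the sketch below averages ratings from several physicians and turns them into preference pairs between candidate AI suggestions for the same case. The record layout, the 1 to 5 rating scale, the `min_gap` threshold, and all names are assumptions made for this example; it is not the project's actual pipeline.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical records: (case_id, candidate_id, physician_id, rating on a 1-5 scale).
ratings = [
    ("case1", "A", "dr_x", 4), ("case1", "A", "dr_y", 5),
    ("case1", "B", "dr_x", 2), ("case1", "B", "dr_y", 3),
    ("case2", "A", "dr_x", 3), ("case2", "A", "dr_y", 3),
    ("case2", "B", "dr_x", 3), ("case2", "B", "dr_y", 4),
]


def mean_scores(records):
    """Average the physician ratings for each (case, candidate) pair."""
    grouped = defaultdict(list)
    for case_id, candidate_id, _physician_id, score in records:
        grouped[(case_id, candidate_id)].append(score)
    return {key: mean(values) for key, values in grouped.items()}


def preference_pairs(scores, min_gap=1.0):
    """Return (case_id, preferred, rejected) tuples for candidates of the
    same case whose mean ratings differ by at least min_gap (assumed threshold)."""
    by_case = defaultdict(dict)
    for (case_id, candidate_id), value in scores.items():
        by_case[case_id][candidate_id] = value

    pairs = []
    for case_id in sorted(by_case):
        candidates = by_case[case_id]
        for a, b in combinations(sorted(candidates), 2):
            gap = candidates[a] - candidates[b]
            if abs(gap) >= min_gap:
                pairs.append((case_id, a, b) if gap > 0 else (case_id, b, a))
    return pairs


scores = mean_scores(ratings)
pairs = preference_pairs(scores)
print(pairs)
```

Pairs where the panel's mean ratings are close are dropped here so that only clear preferences are kept; whether to discard, down-weight, or re-review such cases is a design choice this sketch does not settle.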
Kenya Pilot Implementation
In partnership with CFK Africa
This pilot evaluates the feasibility of deploying AI-assisted diagnostic support in Kenyan primary care settings, covering physician recruitment, platform development, and initial feedback collection.
Methods
Our research employs:
- Synthetic vignette generation using large language models
- Expert panel evaluation with multiple physician assessments
- Reinforcement learning to incorporate human feedback into AI systems
- Mixed-methods implementation research to understand adoption barriers
- Randomized pilot studies to measure real-world impact
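To make the randomization step concrete, here is a minimal sketch of permuted-block allocation, one common way to keep study arms balanced in a small pilot. The arm names, block size, seed, and clinician identifiers are all hypothetical; the source does not specify the unit of randomization or the allocation scheme actually used.

```python
import random


def block_randomize(participant_ids, arms=("ai_support", "usual_care"),
                    block_size=4, seed=0):
    """Assign participants to arms in shuffled blocks so arm sizes stay balanced.

    Arm names, block size, and seed are illustrative assumptions.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")

    rng = random.Random(seed)  # fixed seed keeps the allocation reproducible
    assignments = {}
    for start in range(0, len(participant_ids), block_size):
        # Each block contains every arm equally often, in random order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        for pid, arm in zip(participant_ids[start:start + block_size], block):
            assignments[pid] = arm
    return assignments


# Hypothetical participant identifiers.
clinicians = [f"clinician_{i:02d}" for i in range(1, 9)]
allocation = block_randomize(clinicians)
for pid, arm in allocation.items():
    print(pid, arm)
```

With eight participants and a block size of four, each arm receives exactly four participants regardless of the seed; a final incomplete block would be balanced only approximately.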