Overview
Clinical AI systems are being deployed rapidly in healthcare settings, but methods for evaluating their real-world effectiveness lag behind. This research program develops rigorous approaches to assessing AI-assisted clinical decision support in terms of clinician behavior, patient outcomes, and implementation context.
Key Research Questions
- How do we measure the true clinical impact of AI diagnostic tools? Traditional performance metrics (sensitivity, specificity, AUC) don’t capture how AI changes clinician behavior or patient outcomes.
- What role does “soft ground truth” play in AI evaluation? When even expert labels are uncertain, how should we calibrate and assess AI predictions?
- How do implementation factors affect AI effectiveness? The same algorithm may perform differently depending on workflow integration, clinician training, and patient population.
Current Projects
Evaluating AI-Assisted Radiology
A multi-site study examining how AI diagnostic aids affect radiologist performance and patient outcomes in breast cancer screening.
Soft Ground Truth Methods
Developing statistical approaches for evaluating AI systems when gold-standard labels are unavailable or uncertain.
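One way to make the idea concrete, offered as a minimal sketch rather than the program's actual method: treat the fraction of expert readers who call a case positive as a soft label, then score each AI prediction by its expected Brier score under that label. The function names and the reader votes below are hypothetical and chosen only for illustration.

```python
from statistics import mean


def soft_label(votes):
    """Fraction of expert readers who marked the case positive (votes are 0 or 1)."""
    if not votes:
        raise ValueError("each case needs at least one expert vote")
    return sum(votes) / len(votes)


def expected_brier(pred, soft):
    """Expected squared error of `pred` when the true label is 1 with
    probability `soft` and 0 otherwise."""
    return soft * (1.0 - pred) ** 2 + (1.0 - soft) * pred ** 2


def evaluate(predictions, expert_votes):
    """Mean expected Brier score over all cases (lower is better)."""
    if len(predictions) != len(expert_votes):
        raise ValueError("need one list of votes per prediction")
    soft = [soft_label(v) for v in expert_votes]
    return mean(expected_brier(p, q) for p, q in zip(predictions, soft))


if __name__ == "__main__":
    # Hypothetical data: three cases, each read by four experts.
    votes = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
    ai_probabilities = [0.9, 0.5, 0.1]
    print(evaluate(ai_probabilities, votes))
```

Under this scoring, a confident prediction on a case where readers split evenly is penalized more heavily than a hedged one, a distinction that a single hard label would hide.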
Implementation Science for Clinical AI
Qualitative and quantitative research on barriers and facilitators to effective AI implementation in clinical settings.