New $60 Million Initiative Evaluates AI Reliability in Global Health

The promise of artificial intelligence to revolutionize healthcare is, at this point, almost a cliché. Headlines routinely proclaim AI’s potential to diagnose diseases faster, personalize treatments, and alleviate burdens on overstretched healthcare systems. But beneath the hype lies a critical gap: there is little rigorous evidence that these tools actually work – particularly in the settings where they are most needed. The newly launched Evidence for AI in Health (EVAH) initiative, a US$60 million partnership between Wellcome, the Gates Foundation, and the Novo Nordisk Foundation, isn’t aiming to build the next groundbreaking AI diagnostic; it’s aiming to determine which existing and emerging tools are genuinely ready for deployment in low- and middle-income countries, and under what conditions. The goal isn’t to slow innovation, but to ensure that AI reduces, rather than exacerbates, existing health inequities.

Beyond the Hype: What EVAH Will Actually Evaluate

The core problem EVAH addresses is that many AI health applications are unproven outside the settings in which they were developed. While AI algorithms can achieve impressive results in controlled laboratory settings, their performance can falter dramatically when applied to real-world clinical environments, especially those with limited resources and diverse patient populations. The initiative will focus on evaluating tools designed for frontline health workers in Sub-Saharan Africa and South and South-East Asia, specifically those assisting with triage, diagnosis, and referral processes. This isn’t a blanket assessment of all AI in healthcare; it’s a targeted effort to assess tools poised for immediate impact. Charlotte Watts, Executive Director of Solutions at Wellcome, emphasizes the collaborative approach, stating, “Only by working in partnership, and investing in rigorous evidence generation and learning, will we be able to support decision-makers and services to meet the needs of the communities they serve.” The scope of tools under consideration is broad, encompassing prediction models for disease risk, computer vision for analyzing medical images like X-rays, large language models to aid clinical documentation, and even multimodal AI systems that integrate various data types.

Drawn from wellcome.org.

A Multifaceted Approach to Real-World Validation

EVAH’s methodology is deliberately comprehensive, moving beyond simple accuracy metrics to consider the practical realities of implementation. The initiative will employ a range of evaluation methods, including implementation research – observing how tools function within existing healthcare workflows – and randomized controlled trials, the gold standard for assessing effectiveness. Crucially, economic evaluations will assess the cost-effectiveness of these tools, a vital consideration for resource-constrained settings. Perhaps most importantly, EVAH will prioritize “acceptability studies,” investigating how patients, clinicians, and communities perceive and interact with these technologies. A highly accurate AI diagnostic is useless if it isn’t trusted or used by the people it’s intended to help. This focus on local context is bolstered by partnerships with the Abdul Latif Jameel Poverty Action Lab and the African Population and Health Research Center, organizations with established expertise in conducting impactful research that informs policy.

Prioritizing Equity and Local Leadership

The initiative isn’t simply about validating AI; it’s about validating AI that is equitably designed and deployed. Tools will be prioritized if they are specifically designed for resource-limited settings and, critically, trained on data that accurately reflects the populations they are intended to serve. This addresses a significant concern within the AI field: algorithmic bias. AI models trained on data predominantly from Western populations may perform poorly – or even perpetuate existing biases – when applied to different ethnic or socioeconomic groups. Trevor Mundel, President of Global Health at the Gates Foundation, highlights the potential for accelerated innovation, stating, “Ensuring new AI tools are backed with real-world evidence can significantly reduce the time it takes to turn promising ideas into scalable innovations.” The commitment to open access – all EVAH findings will be freely available online – further underscores the initiative’s dedication to transparency and equitable knowledge sharing. Lene Oddershede, Chief Scientific Officer at the Novo Nordisk Foundation, explains that this open access will “provide decision-makers with crucial data on efficacy, economic value and acceptability of these technologies in the contexts where they’re most needed.”

The Next Phase: What to Watch For

EVAH’s initial focus will be on AI-enabled decision support tools, with the first requests for proposals already underway. However, the long-term success of the initiative hinges on its ability to build sustainable local capacity for AI evaluation. The question now isn’t just which AI tools work, but who will be equipped to independently assess and adapt these technologies in the future. As EVAH progresses, it will be crucial to monitor not only the published findings, but also the development of local research infrastructure and the engagement of regional stakeholders. Will EVAH’s model of rigorous, locally led evaluation become the standard for AI health deployments globally, or will the rush to innovation continue to outpace the evidence needed to ensure equitable and effective outcomes? The answer to that question will determine whether AI truly lives up to its promise of transforming healthcare for all.