PROBAST+AI guidelines updated to improve AI quality assessment

The PROBAST+AI guidelines have been updated to provide clearer, more comprehensive standards for evaluating artificial intelligence (AI) models in health care research. Originally introduced to assess the risk of bias and the robustness of prediction models, they have been revised to address the growing importance of AI in health care decision-making.
The PROBAST (Prediction model Risk Of Bias ASsessment Tool) guidelines were first published in 2019 to evaluate the quality and sources of bias of prediction models in health care research, i.e., models that estimate the probability of a health outcome for individuals. Since then, the use of AI and machine learning techniques has become more widespread, creating a need for new guidance for studies that incorporate these technologies.
"As AI becomes more widely used in medical decision-making, it's vital that researchers have the tools to critically appraise these models and their potential biases or limitations," said Gary Collins, Professor of Medical Statistics at NDORMS, University of Oxford. "The original PROBAST framework was an important step, but we recognized the need for additional guidance to address the unique challenges of AI research."
Published in the BMJ, the PROBAST+AI guidelines were developed by an international working group consisting of experts in prediction model research, artificial intelligence, and systematic reviews.
The key features of the updated guidelines include:
- Comprehensive Assessment Criteria: The guidelines expand on the original framework, providing detailed criteria for evaluating AI models. This includes assessments of data quality, model development, and validation processes.
- Focus on Bias and Fairness: Recognizing that bias can lead to unequal health care outcomes, PROBAST+AI emphasizes the need to identify and mitigate biases in AI systems, which includes a thorough evaluation of the data sets used to train AI models.
- Inclusion of Stakeholder Perspectives: The guidelines encourage researchers to involve diverse stakeholders, including patients and health care providers, in the development and evaluation of AI models. This collaborative approach helps ensure that the models meet real-world needs.
- Reduced Research Waste: The framework can be used to guide the design and analysis of predictive AI studies. PROBAST+AI aligns with the TRIPOD+AI reporting guidelines to improve the accuracy, effectiveness, generalizability, and appropriate use of AI models.
- Emphasis on Real-World Impact: The guidelines stress the importance of assessing how AI models perform in actual clinical settings, rather than only in controlled environments. This focus on practicality aims to ensure that AI tools align with their intended use and are effective and beneficial in day-to-day health care.
The PROBAST+AI guidelines are expected to have a significant impact on how AI research in health care is conducted and reported, helping to ensure that AI models meet high standards of validity, fairness, and applicability, and ultimately building greater trust in the use of AI to support clinical decision-making.
More information:
Karel G M Moons et al, PROBAST+AI: an updated quality, risk of bias, and applicability assessment tool for prediction models using regression or artificial intelligence methods, BMJ (2025). DOI: 10.1136/bmj-2024-082505
Provided by University of Oxford