
BRITE Institute is building a structured database of healthcare AI failure modes. The database will translate evidence about known and potential failures into practical tools that help AI developers build safer products and help healthcare organizations understand, evaluate, and manage the risks of adopting them.
Artificial intelligence can fail in healthcare in many ways. A system may omit a critical allergy, fabricate a clinical fact, fail to recognize an urgent condition, perform poorly for a particular patient population, or present uncertain information with unjustified confidence. Harm may also arise even when the underlying model works as designed—for example, when an AI product is poorly integrated into clinical workflows or clinicians rely too heavily on its recommendations.
Information about these risks is currently scattered across academic studies, regulatory reports, safety evaluations, incident reports, technical documentation, and individual organizations. There is no widely used system that translates this evidence into practical, product-specific safety guidance.
BRITE Institute is developing a structured, searchable database that documents:
The database will support two primary goals.
First, it will help AI developers generate customized, curated safety checklists for specific products. Rather than relying on a generic list of AI risks, developers will be able to identify the failure modes most relevant to their product’s intended use, model architecture, data sources, users, patient populations, and clinical environment.
Second, it will help hospital procurement teams, clinicians, and healthcare leaders understand and plan for the risks of acquiring and implementing AI products. Users will be able to identify relevant risks, ask vendors more precise questions, evaluate proposed safeguards, and develop implementation plans that protect patients throughout the product lifecycle.
The final database will contain substantially more detail. The table below demonstrates the types of information that may be included.
Severity will depend on the product, clinical context, patient population, likelihood of detection, and availability of safeguards. The database will therefore distinguish between the existence of a failure mode and the level of risk it creates in a particular implementation.
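One way to keep this distinction explicit in a data model is to store a failure mode once and attach a separate, implementation-specific assessment to it. The sketch below is only an illustration: the class names, fields, three-level scale, and the `risk_level` rule are all hypothetical, since the description above does not specify a schema or a scoring method.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-level ordinal scale (an assumption, not from the source)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class FailureMode:
    """The failure mode itself, independent of any deployment."""
    identifier: str  # illustrative ID format
    description: str


@dataclass(frozen=True)
class ImplementationAssessment:
    """Risk of one failure mode in one specific implementation."""
    failure_mode: FailureMode
    setting: str
    severity: Level
    likelihood_of_detection: Level
    safeguards_in_place: bool

    def risk_level(self) -> Level:
        # Illustrative rule only: start from severity, raise it when the
        # failure is hard to detect, lower it when safeguards exist.
        score = int(self.severity)
        if self.likelihood_of_detection == Level.LOW:
            score += 1
        if self.safeguards_in_place:
            score -= 1
        # Clamp the result back onto the three-level scale.
        return Level(max(int(Level.LOW), min(int(Level.HIGH), score)))


# The same failure mode assessed in two hypothetical implementations.
omission = FailureMode("FM-001", "Omits a critical allergy")
emergency = ImplementationAssessment(
    omission, "emergency department, no pharmacist review",
    Level.HIGH, Level.LOW, False)
clinic = ImplementationAssessment(
    omission, "outpatient clinic with pharmacist review",
    Level.MEDIUM, Level.HIGH, True)

print(emergency.risk_level().name, clinic.risk_level().name)
```

Both assessments point at the same `FailureMode` record, so the failure mode is documented once while its risk level varies by setting.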
Healthcare organizations are adopting AI products under significant pressure to improve efficiency, reduce administrative burden, and address workforce shortages. However, procurement teams and clinicians may lack a systematic way to determine how an AI product could fail, which patients could be harmed, and whether the vendor’s safeguards are adequate.
This gap creates risks for patients. A failure that appears minor in a demonstration may become dangerous when the product is used repeatedly, connected to incomplete data, deployed in a high-pressure environment, or relied upon by clinicians who do not understand its limitations. Because AI systems can operate at scale, a single design or implementation weakness may affect hundreds or thousands of patients.
The database is intended to make patient protection a routine part of both product development and healthcare procurement. It will help organizations move beyond broad questions such as “Is this AI accurate?” and instead ask:
A structured failure-mode database may also make procurement faster and more efficient. Procurement teams often spend substantial time independently identifying risks, developing vendor questions, seeking internal expertise, and determining appropriate implementation controls. A curated framework can reduce duplication, standardize reviews, and help teams focus rapidly on the risks most relevant to a particular product.
Faster procurement should not mean weaker scrutiny. By giving purchasers a clearer and more consistent evaluation process, the database can help healthcare organizations make well-informed decisions more quickly while maintaining patient safety as the central criterion.
For developers, identifying relevant failure modes early can reduce costly redesign, prevent recurrent safety problems, and clarify testing requirements before a product reaches patients. For hospitals, early identification of implementation risks can prevent unsafe deployment, unexpected workflow disruption, and reliance on controls that are inadequate in real clinical settings.
Developers will be able to generate customized safety checklists based on a product’s purpose, users, data, clinical setting, and level of autonomy. These checklists can guide requirements development, model testing, interface design, documentation, deployment planning, and post-market monitoring.
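As a rough illustration of how checklist generation could work, the sketch below filters a small catalog of failure-mode records by product attributes. Everything in it is hypothetical: the `FailureMode` and `ProductProfile` classes, the tag vocabulary, the identifiers, and the sample entries are illustrative assumptions, not the database's actual schema or content.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FailureMode:
    """Hypothetical failure-mode record; not the actual database schema."""
    identifier: str
    description: str
    check: str             # checklist item a developer would verify
    applies_to: frozenset  # product tags that make this entry relevant


@dataclass
class ProductProfile:
    """Hypothetical product description supplied by a developer."""
    name: str
    tags: set = field(default_factory=set)


# Illustrative entries loosely based on failure types named earlier in the text.
CATALOG = [
    FailureMode("FM-001", "Omits a critical allergy from a summary",
                "Test summaries against records with documented allergies",
                frozenset({"summarization"})),
    FailureMode("FM-002", "Fabricates a clinical fact",
                "Verify generated statements against source records",
                frozenset({"summarization", "generative"})),
    FailureMode("FM-003", "Fails to recognize an urgent condition",
                "Evaluate sensitivity on urgent presentations",
                frozenset({"triage"})),
]


def build_checklist(product, catalog):
    """Return checklist items for entries sharing at least one tag with the product."""
    return [f"[{fm.identifier}] {fm.check}"
            for fm in catalog
            if fm.applies_to & product.tags]


summarizer = ProductProfile("Example note summarizer",
                            {"summarization", "generative"})
checklist = build_checklist(summarizer, CATALOG)
for item in checklist:
    print(item)
```

A real system would presumably match on richer attributes (users, clinical setting, level of autonomy) than the flat tags assumed here.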
Hospital procurement teams, clinicians, information-technology leaders, and safety officers can use the database to identify product-specific risks, develop vendor questions, compare products, evaluate safety claims, and determine which controls must be established before implementation.
Failure-mode examples can be converted into educational materials that teach clinicians how AI systems may fail, how to recognize unreliable outputs, when independent verification is required, and how to report suspected failures or near misses.
The database can support standardized benchmarks that test whether AI products are robust against known healthcare failure modes. These benchmarks may evaluate factors such as completeness, factual accuracy, uncertainty communication, subgroup performance, clinical prioritization, and resistance to misleading or incomplete inputs.
In a subsequent project, BRITE Institute will use the database to inform a healthcare AI red-teaming system. Documented failure modes will be translated into test scenarios that deliberately challenge AI products, record their responses, and score the effectiveness of their safeguards.
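The workflow described here (scenarios derived from failure modes, recorded responses, and a safeguard score) might look roughly like the following sketch. The `Scenario` structure, the `toy_product` stand-in, the identifiers, and the pass-rate score are all assumptions made for illustration; the design of the actual red-teaming system is not described in this text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Scenario:
    """Hypothetical test scenario derived from a documented failure mode."""
    failure_mode_id: str
    prompt: str
    passes: Callable[[str], bool]  # returns True if the response is acceptable


def run_red_team(product: Callable[[str], str], scenarios):
    """Run each scenario, record the response, and compute a simple pass rate."""
    results = []
    for scenario in scenarios:
        response = product(scenario.prompt)
        results.append({
            "failure_mode_id": scenario.failure_mode_id,
            "response": response,
            "passed": scenario.passes(response),
        })
    score = sum(r["passed"] for r in results) / len(results) if results else 0.0
    return results, score


def toy_product(prompt: str) -> str:
    """Stand-in for an AI product: keeps only the first sentence of its input."""
    return prompt.split(".")[0] + "."


scenarios = [
    # The allergy appears in the second sentence, which the toy product drops.
    Scenario("FM-001",
             "Patient reports chest pain. Documented penicillin allergy.",
             lambda r: "penicillin" in r.lower()),
    # The urgent symptom appears in the first sentence, which is retained.
    Scenario("FM-003",
             "Patient reports chest pain radiating to the left arm. No known allergies.",
             lambda r: "chest pain" in r.lower()),
]

results, score = run_red_team(toy_product, scenarios)
print(f"pass rate: {score:.2f}")
```

Keeping the raw response alongside each pass/fail result lets reviewers inspect why a safeguard failed instead of relying on the score alone.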
Healthcare organizations can compare incidents and near misses with known failure patterns. This can help investigators identify technical, clinical, human, workflow, and organizational causes and select interventions that address the underlying problem rather than only the immediate error.
This project is currently in development.

Did you know that many research findings are manipulated—or even outright false? Some estimates suggest that up to 90% of published research may be unreliable. Meanwhile, more than $167 billion in taxpayer money is spent annually on research and development.
At BRITE Institute, we believe research should do more than just look credible. It should be credible. That’s why we go above and beyond typical standards with rigorous practices that ensure honesty, transparency, and accuracy at every step. Below are just some of the ways we safeguard the integrity of our work:
BRITE Institute never p-hacks or manipulates data to achieve a desired outcome. If a paper relies on complex statistical analyses, we use an external statistician to ensure objectivity and validity.
BRITE Institute prioritizes transparency at every stage of the research process. Whenever possible, we publish our full data sets and use open access publishing.
BRITE Institute does not publish for the sake of publishing. Our research is built with end users in mind, whether they are policymakers, engineers, or community leaders, so that findings are not only trustworthy but also actionable.