Understanding the evidence for artificial intelligence in healthcare
| Published in: | BMJ Quality & Safety, Vol. 34, No. 7, p. 421 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | England: BMJ Publishing Group LTD, 1 July 2025 |
| Subjects: | |
| ISSN: | 2044-5415, 2044-5423 |
| Online access: | Full text |
| Abstract: | The entire evaluation sequence (technical performance, usability and workflow, and impact) should also be repeated whenever conditions change, especially if the models may learn and change their performance over time.7 Model performance can vary for a wide variety of reasons such as changes in the underlying data used for prediction or behavioural changes from use of the model itself. In general, most AI algorithms either predict or classify, so their performance is measured in a manner similar to the evaluation of diagnostic tests, using metrics such as sensitivity, specificity and area under the curve.12 13 Healthcare providers should pay particular attention to rates of false positives and false negatives, as well as their consequences, as clinical judgement is often needed in selecting performance thresholds. Studies of healthcare AI tools should explicitly report false positive and false negative rates rather than composite measures such as F1 scores (an evaluation metric that combines precision and recall) so that medical practitioners can determine their suitability for practice. The phases of clinical research used for drugs and devices are useful for framing AI evaluation, but there are important nuances, as articulated by Park and colleagues.16 AI is similar to drugs and devices, as early-phase ‘laboratory’ studies are needed to demonstrate proof-of-concept technical performance, usability of prototypes, and potential for impact. |
|---|---|
| Bibliography: | Source type: Scholarly Journals; Object type: Editorial, Commentary |
| DOI: | 10.1136/bmjqs-2025-018559 |
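
The abstract's point about reporting false positive and false negative rates rather than a composite F1 score can be illustrated with a small sketch. The confusion-matrix counts below are hypothetical values chosen for illustration, not data from the article, and the function name `diagnostic_metrics` is an assumption of this example.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute diagnostic-test style metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        # F1 is the harmonic mean of precision and recall,
        # which simplifies to 2*TP / (2*TP + FP + FN).
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Two hypothetical tools evaluated on the same 100 positive and 900 negative cases.
tool_a = diagnostic_metrics(tp=60, fp=20, tn=880, fn=40)
tool_b = diagnostic_metrics(tp=80, fp=60, tn=840, fn=20)

for name, metrics in (("Tool A", tool_a), ("Tool B", tool_b)):
    summary = ", ".join(f"{key}={value:.3f}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

In this illustration both tools have the same F1 score (2/3), yet Tool A misses 40% of true cases while Tool B misses 20% at the cost of three times as many false alarms. Which trade-off is acceptable depends on the clinical consequences of each error type, and the composite score alone cannot show that.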