Proof of concept evaluation | Ensure no overlap between development and testing cohorts. |
| Check that ground-truth quality is reasonable. |
| Provide comparison with conventional and state-of-the-art methods. |
| Choose figures of merit that motivate further clinical evaluation. |
Technical task-specific evaluation | Choose clinically relevant tasks: detection, quantification, or a combination of both. |
| Determine the right study type: simulation, phantom, or clinical. |
| Ensure that simulation studies are realistic and account for population variability. |
| Testing cohort should be external. |
| Reference standard should be high quality and correspond to the task. |
| Use a reliable strategy to extract task-specific information. |
| Choose figures of merit that quantify task performance. |
Clinical evaluation | Determine study type: retrospective, prospective observational, prospective interventional, or postdeployment real-world study. |
| Testing cohort must be external. |
| Collected data should represent the target population as stated in the claim. |
| Reference standard should be high quality and be representative of those used for clinical decision making. |
| Figure of merit should reflect performance on clinical decision making. |
Postdeployment evaluation | Monitor devices and follow reporting guidelines. |
| Consider phantom studies as sanity checks to assess routine performance. |
| Periodically monitor data drift. |
| For off-label evaluation, follow the recommendations for clinical or technical evaluation, depending on the objective. |
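
Two of the recommendations in the table, keeping development and testing cohorts disjoint and periodically monitoring data drift, lend themselves to simple automated checks. The following is a minimal sketch, not a method prescribed by the table. The patient identifiers, the monitored feature values, and both function names are hypothetical, and the standardized mean shift is only one of many possible drift indicators.

```python
from statistics import mean, stdev


def cohort_overlap(dev_ids, test_ids):
    """Return the sorted patient IDs that appear in both cohorts.

    An empty list means the cohorts are disjoint at the patient level.
    """
    return sorted(set(dev_ids) & set(test_ids))


def standardized_mean_shift(reference, current):
    """Shift of the current mean from the reference mean,
    expressed in units of the reference standard deviation.

    This is an assumed, deliberately simple drift indicator.
    """
    return (mean(current) - mean(reference)) / stdev(reference)


# Hypothetical patient identifiers for illustration only.
dev_ids = ["P001", "P002", "P003", "P004"]
test_ids = ["P004", "P005", "P006"]
leaked = cohort_overlap(dev_ids, test_ids)
if leaked:
    print("Patients present in both cohorts:", leaked)

# Hypothetical values of one monitored input feature.
reference_values = [1.0, 2.0, 3.0]  # collected at deployment time
current_values = [2.0, 3.0, 4.0]    # collected during monitoring
shift = standardized_mean_shift(reference_values, current_values)
print("Standardized mean shift:", shift)
```

Checking overlap on patient identifiers rather than on scan or image identifiers matters when one patient contributes several scans, since a scan-level split can still leak patients across cohorts. Any alert threshold on the shift would need to be chosen and justified for the specific device and feature.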