Assessing Radiology Research on Artificial Intelligence: A Brief Guide for Authors, Reviewers and Readers

In recent years, there has been a sharp increase in publications on machine learning and deep learning in radiology; some journals report that around a quarter of their 2018 publications related to these topics in one way or another. With so much research being produced, it is important to be able to assess its scientific quality. To help authors, reviewers and readers evaluate AI-related research, Radiology’s Editorial Board has published a brief guide that can serve as an interim reference until more formal guidelines on AI research become available.

So, what are the suggested items to look for in publications regarding machine learning and artificial intelligence?

1. All image sets (i.e. training, validation and test sets) should be clearly defined.
The image sets for training, internal validation and independent testing should be carefully selected and must not overlap. Inclusion and exclusion criteria should be clearly stated.
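For illustration only, a minimal sketch of what a patient-level split could look like in Python with scikit-learn; the data frame and its columns (patient_id, image_path, label) are hypothetical placeholders, and grouping by patient ensures that no patient’s images end up in more than one set:

```python
# Minimal sketch of a patient-level split into training, validation and
# test sets. The data frame and its columns are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

df = pd.DataFrame({
    "patient_id": ["P1", "P1", "P2", "P3", "P3", "P4", "P5", "P6"],
    "image_path": [f"img_{i}.png" for i in range(8)],
    "label":      [0, 0, 1, 0, 0, 1, 1, 0],
})

# Split off a held-out test set first, grouping by patient ...
outer = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=42)
dev_idx, test_idx = next(outer.split(df, groups=df["patient_id"]))
dev, test = df.iloc[dev_idx], df.iloc[test_idx]

# ... then split the remaining development data into training and validation.
inner = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=42)
train_idx, val_idx = next(inner.split(dev, groups=dev["patient_id"]))
train, val = dev.iloc[train_idx], dev.iloc[val_idx]

# Sanity check: no patient appears in more than one set.
assert set(train.patient_id).isdisjoint(val.patient_id)
assert set(train.patient_id).isdisjoint(test.patient_id)
assert set(val.patient_id).isdisjoint(test.patient_id)
```

Splitting by patient rather than by image is what actually prevents subtle overlap, since multiple images of the same patient are highly correlated.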

2. An external dataset should be used for final statistical reporting.
Validation of the AI model’s results on an external, independent dataset helps rule out overfitting and demonstrates the model’s generalizability.
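As a hedged illustration, a final evaluation on such an external set might look like the following sketch; the model is assumed to be already trained and frozen, and y_external / p_external are hypothetical placeholders for the external cohort’s labels and the model’s predicted probabilities:

```python
# Minimal sketch: final reporting on an external, independent test set.
# No tuning happens at this stage; the external cohort is touched once.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_external = rng.integers(0, 2, size=200)                                    # ground-truth labels
p_external = np.clip(y_external * 0.6 + rng.normal(0.2, 0.25, 200), 0, 1)    # frozen model outputs

auc = roc_auc_score(y_external, p_external)

# Simple bootstrap to attach a 95% confidence interval to the estimate.
boot = []
for _ in range(1000):
    idx = rng.integers(0, len(y_external), len(y_external))
    if len(np.unique(y_external[idx])) < 2:
        continue  # skip resamples containing only one class
    boot.append(roc_auc_score(y_external[idx], p_external[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"External AUC {auc:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The key point is that the external cohort is used exactly once, after all model development is finished, and that the headline figure is reported with an uncertainty estimate.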

3. Multivendor images should be used for all datasets.
Images from different vendors may look different, even to the radiologist’s eye. To prevent an AI model from becoming vendor-specific, images from a variety of vendors should be included at every step.
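One simple way to check this, sketched below under the assumption that the study keeps an image inventory with split and vendor columns (hypothetical names), is to cross-tabulate splits against vendors:

```python
# Minimal sketch: verify that every data split contains images from every
# vendor. The data frame and its "split" / "vendor" columns are
# hypothetical placeholders for a study's image inventory.
import pandas as pd

df = pd.DataFrame({
    "split":  ["train", "train", "train", "val", "val", "val", "test", "test", "test"],
    "vendor": ["A", "B", "C", "A", "B", "C", "A", "B", "C"],
})

coverage = pd.crosstab(df["split"], df["vendor"])
print(coverage)                     # image counts per split and vendor
assert (coverage > 0).all().all()   # every vendor represented in every split
```

Reporting per-vendor performance on the test set is an additional safeguard against vendor-specific behaviour.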

4. Size of training, validation and testing sets should be justified.
Estimating the number of images needed can be difficult. If a clear estimate cannot be given, authors should at least evaluate how model performance depends on the number of training images, for example by reporting a learning curve.
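A learning curve is one common way to provide such an evaluation. The sketch below uses scikit-learn with synthetic data as a stand-in for real image features and a simple classifier as a stand-in for the actual model:

```python
# Minimal sketch: estimate how performance depends on training-set size
# by fitting on increasing subsets of the data (a learning curve).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=600, n_features=20, random_state=0)

sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5, scoring="roc_auc",
)

for n, score in zip(sizes, val_scores.mean(axis=1)):
    print(f"{n:4d} training samples -> cross-validated AUC {score:.3f}")
# If the curve has not yet plateaued, the training set is probably too small.
```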

5. A widely accepted reference standard is mandatory.
An established gold standard should be used to label the images. The radiological report as produced in clinical routine may not always be optimal (e.g. an enlarged lymph node on CT may have been reported as malignant, but only histopathological analysis can reliably determine the cause of the enlargement).

6. Preparation of image data should be described.
Was manual interaction needed to prepare the images for the AI algorithm (e.g. definition of bounding boxes)? Or did the algorithm simply consume all images of a specific DICOM series? These considerations are important for estimating the usability of the algorithm in clinical routine.
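Purely as an illustration of the second scenario, the sketch below reads a whole DICOM series with pydicom; the folder name is hypothetical, and any normalisation or cropping applied at this stage is exactly the kind of preparation that should be described in the manuscript:

```python
# Minimal sketch: consuming an entire DICOM series without manual
# interaction, using pydicom. The folder name is hypothetical, and every
# preprocessing step applied here should be reported in the paper.
from pathlib import Path
import numpy as np
import pydicom

series_dir = Path("example_series")                   # hypothetical folder with one series
slices = [pydicom.dcmread(p) for p in sorted(series_dir.glob("*.dcm"))]
slices.sort(key=lambda ds: int(ds.InstanceNumber))    # order slices consistently

volume = np.stack([ds.pixel_array.astype(np.float32) for ds in slices])
volume = (volume - volume.mean()) / (volume.std() + 1e-6)   # example normalisation
print(volume.shape)   # (n_slices, rows, cols), fed to the model without cropping
```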

7. The performance of the AI system should be compared to that of a radiology expert.
To determine an algorithm’s potential impact on clinical routine, its performance should be compared with that of a radiology expert. Outperforming students, for example, might be nice, but if the algorithm is inferior to an expert in the field, it is unlikely to have any practical impact.
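When the AI algorithm and the expert read the same test cases, a paired analysis is appropriate. The sketch below uses McNemar’s test from statsmodels on a hypothetical 2×2 table of agreement counts; it is only one possible approach, not the one prescribed by the guide:

```python
# Minimal sketch: paired comparison of the AI algorithm and an expert
# reader on the same test cases using McNemar's test. The 2x2 table of
# counts is hypothetical.
from statsmodels.stats.contingency_tables import mcnemar

# Rows: expert correct / expert incorrect; columns: AI correct / AI incorrect.
table = [[80, 5],
         [12, 3]]

result = mcnemar(table, exact=True)
print(f"McNemar statistic {result.statistic:.0f}, p-value {result.pvalue:.3f}")
# A non-significant p-value alone does not establish equivalence; confidence
# intervals for the difference in performance should also be reported.
```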

8. The AI algorithm’s performance and decision making should be clear.
To alleviate the fear that an AI algorithm may be a black box, so-called saliency maps can be used to indicate which parts of the image were deemed relevant. More importantly, clinically useful performance metrics such as sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) should be reported rather than a single AUC value alone.
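As a minimal sketch (with hypothetical labels and predictions), these metrics can be derived directly from the confusion matrix at a chosen operating point:

```python
# Minimal sketch: report sensitivity, specificity, PPV and NPV from a
# confusion matrix rather than an AUC value alone. Labels and
# predictions below are hypothetical placeholders.
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # recall for the diseased class
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)           # positive predictive value
npv = tn / (tn + fn)           # negative predictive value

print(f"Sensitivity {sensitivity:.2f}, Specificity {specificity:.2f}, "
      f"PPV {ppv:.2f}, NPV {npv:.2f}")
```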

9. In order to verify claims of performance, the AI algorithm should be accessible in some form.
The AI algorithm should be made accessible in some form so that independent researchers can validate any performance claims. This does not necessarily mean it must be freely available, but researchers should be given some form of access to the algorithm so that the results can be verified.

Until more formalized reporting guidelines such as CONSORT-AI and SPIRIT-AI [1] become available, these suggestions from Radiology’s Editorial Board (RSNA) can help authors, reviewers and readers evaluate the scientific quality of AI-related research.

Read more on this topic in “Assessing Radiology Research on Artificial Intelligence: A Brief Guide for Authors, Reviewers, and Readers – From the Radiology Editorial Board”.

Key considerations

  • Are all three image sets (training, validation and test sets) defined?
  • Is an external test set used for final statistical reporting?
  • Have multivendor images been used to evaluate the AI algorithm?
  • Are the sizes of the training, validation and test sets justified?
  • Was the AI algorithm trained using a standard of reference that is widely accepted in our field?
  • Was the preparation of images for the AI algorithm adequately described?
  • Were the results of the AI algorithm compared with those of a radiology expert and/or with pathology?
  • Was the manner in which the AI algorithm makes decisions demonstrated?
  • Is the AI algorithm publicly available?

[1] Liu X, Faes L, Calvert MJ, Denniston AK, on behalf of the CONSORT/SPIRIT-AI Extension Group (2019). The Lancet. DOI: 10.1016/S0140-6736(19)31819-7
