Meeting Report: Physics, Instrumentation & Data Sciences - Data Sciences

Multimodal Large Language Model Based PET/CT Report Generation: Capabilities and Comparison of ChatGPT-4 Versus ChatGPT-3.5

Rick Wray and Randy Yeh
Journal of Nuclear Medicine June 2024, 65 (supplement 2) 242447;
Memorial Sloan Kettering Cancer Center

Abstract


Introduction: ChatGPT is a multimodal large language model created by OpenAI, available in both free and paid tiers. While it primarily functions as a chatbot imitating human conversation, it can also be prompt-engineered to accomplish machine-learning-based tasks. At our institution, most backend informatics processes require in-house software development by highly trained and skilled staff, including physicians, informaticians, programmers, and information technologists. Accomplishing a task such as auto-population of text within a structured PET/CT report can take significant time and human resources. The purpose of this study was to determine the capability of ChatGPT to generate structured PET/CT reports from unstructured data and to compare the performance of GPT-4 with GPT-3.5.

Methods: Structured, standardized PET/CT report templates with form fields are used at our institution. The empty template for an FDG PET/CT report was used for prompt engineering. PET/CT reports read by a Nuclear Medicine physician for 10 randomly selected, de-identified patients scanned within the past month were used for analysis. The clinical statement, technique, findings, and impression data were removed from the templates and used for the prompts. After a trial-and-error period of three attempts, an optimized prompt was engineered using GPT-3.5 to generate a complete PET/CT report by filling in the structured, standardized PET/CT report template with the de-identified patient scan data. GPT-4 was then compared with GPT-3.5 at increasing levels of disorganization of the de-identified patient data: organized data, clustered data, and unorganized clustered data. Outputs from GPT-4 and GPT-3.5 were compared with the original Nuclear Medicine physician report, and with each other, for errors.
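The error comparison described above was performed manually. Purely as an illustrative aid (not part of the study), a line-level diff using Python's standard difflib can surface the kinds of discrepancies the authors looked for; the two sample report strings below are hypothetical.

```python
# Illustrative sketch: surfacing discrepancies between a physician-read
# report and a model-generated report with a unified diff (Python stdlib).
import difflib

def report_diff(original: str, generated: str) -> list[str]:
    """Return unified-diff lines highlighting discrepancies between reports."""
    return list(difflib.unified_diff(
        original.splitlines(),
        generated.splitlines(),
        fromfile="physician_report",
        tofile="gpt_report",
        lineterm="",
    ))

# Hypothetical example mirroring an error type reported in the abstract:
# a punctuation error ("image 1:30" instead of "image 130").
orig = "Chest: No abnormal uptake.\nAbdomen: FDG-avid lesion, image 130."
gen = "Chest: No abnormal uptake.\nAbdomen: FDG-avid lesion, image 1:30."
for line in report_diff(orig, gen):
    print(line)
```

A diff like this only flags textual differences; deciding whether a difference is a model error or a model correction still requires physician review.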

Results: The engineered prompt was: "Please add these scan findings and interpretation, '[de-identified patient scan data here]', to this template: '[structured, standardized PET/CT report template here]', and if a section has no findings then add 'No abnormal uptake'." ChatGPT generated each report within 1 minute of prompting. GPT-3.5 generated nearly complete reports but made errors at all three levels of data disorganization. Examples included placing abnormal findings in the wrong section; adding "No abnormal uptake." to sections with abnormal findings; copying punctuation errors ("image 1:30" instead of "image 130") and dictation errors ("produce and meter" instead of "cm") from the original de-identified patient scan data; and changing the formatting of punctuation (Figure 1). GPT-4 generated complete reports without errors at all three levels of data disorganization, with only instances of changed punctuation formatting; this was resolved by extending the prompt with "do not change formatting". In addition, GPT-4 corrected the human punctuation and dictation errors: "1:30" was changed to "130" and "produce and meter" was changed to "cm" (Figure 2).
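The prompt quoted above can be assembled programmatically. The sketch below is a minimal illustration, assuming the OpenAI Python SDK; the template and scan-data strings are hypothetical placeholders, not the study's actual materials, and the commented-out API call requires a valid API key.

```python
# Minimal sketch of assembling the prompt described in the Results section.
# The "do not change formatting" suffix reflects the prompt refinement
# the abstract reports for GPT-4.

def build_report_prompt(scan_data: str, template: str) -> str:
    """Assemble the report-generation prompt quoted in the abstract."""
    return (
        "Please add these scan findings and interpretation, "
        f"'{scan_data}', to this template: '{template}', "
        "and if a section has no findings then add 'No abnormal uptake'. "
        "Do not change formatting."
    )

# Hypothetical usage with the OpenAI Python SDK (not run here):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-4",
#     messages=[{"role": "user",
#                "content": build_report_prompt(scan_data, template)}],
# )
# report = resp.choices[0].message.content

prompt = build_report_prompt(
    "[de-identified patient scan data]",
    "[structured, standardized PET/CT report template]",
)
print(prompt)
```

Keeping the template and scan data as separate arguments makes it straightforward to iterate on prompt wording, as the authors did over their three trial-and-error attempts.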

Conclusions: ChatGPT is capable of generating a PET/CT report from a structured, standardized report template and patient scan data at varying levels of organization. GPT-4 outperformed GPT-3.5, which made mistakes incompatible with clinical implementation. GPT-4 executed quickly and accurately, and added value to physician-read reports by correcting human-made punctuation and dictation mistakes.

Figure 1

Figure 2