GPT-4 for automated sequence-level determination of MRI protocols based on radiology request forms from clinical routine

Terzis, Robert ORCID: 0009-0007-1068-8477, Kaya, Kenan ORCID: 0009-0008-7625-3457, Schömig, Thomas ORCID: 0009-0001-7560-4285, Janssen, Jan Paul ORCID: 0000-0003-0980-4606, Iuga, Andra-Iza ORCID: 0000-0002-3694-0235, Kottlors, Jonathan ORCID: 0000-0001-5021-6895, Lennartz, Simon ORCID: 0009-0000-7189-3764, Gietzen, Carsten ORCID: 0000-0002-2354-3847, Gözdas, Cansin, Müller, Lukas ORCID: 0000-0002-8626-4044, Hahnfeldt, Robert ORCID: 0000-0001-7997-3216, Maintz, David ORCID: 0000-0002-8942-3776, Dratsch, Thomas ORCID: 0000-0003-4014-7763 and Pennig, Lenhard ORCID: 0000-0002-6606-9313 (2025). GPT-4 for automated sequence-level determination of MRI protocols based on radiology request forms from clinical routine. European Radiology, 36 (2). pp. 1541-1552. Springer Nature. ISSN 1432-1084 Open Access

PDF
s00330-025-11888-4.pdf
Available under the CC license: Creative Commons Attribution.

Identification Number: 10.1007/s00330-025-11888-4

Official URL: https://doi.org/10.1007/s00330-025-11888-4

Abstract

Objectives: This study evaluated GPT-4’s accuracy in MRI sequence selection based on radiology request forms (RRFs), comparing its performance to radiology residents.

Materials and methods: This retrospective study included 100 RRFs across four subspecialties (cardiac imaging, neuroradiology, musculoskeletal, and oncology). GPT-4 and two radiology residents (R1: 2 years, R2: 5 years MRI experience) selected sequences based on each patient’s medical history and clinical questions. Considering imaging society guidelines, five board-certified specialized radiologists assessed protocols based on completeness, quality, and utility in consensus, using 5-point Likert scales. Clinical applicability was rated binarily by the institution’s lead radiographer.

Results: GPT-4 achieved median scores of 3 (1–5) for completeness, 4 (1–5) for quality, and 4 (1–5) for utility, comparable to R1 (3 (1–5), 4 (1–5), 4 (1–5); each p > 0.05) but inferior to R2 (4 (1–5), 5 (1–5); p < 0.01, respectively, and 5 (1–5); p < 0.001). Subspecialty protocol quality varied: GPT-4 matched R1 (4 (2–4) vs. 4 (2–5), p = 0.20) and R2 (4 (2–5); p = 0.47) in cardiac imaging; showed no differences in neuroradiology (all 5 (1–5), p > 0.05); scored lower than R1 and R2 in musculoskeletal imaging (3 (2–5) vs. 4 (3–5); p < 0.01, and 5 (3–5); p < 0.001); and matched R1 (4 (1–5) vs. 2 (1–4), p = 0.12) as well as R2 (5 (2–5); p = 0.20) in oncology. GPT-4-based protocols were clinically applicable in 95% of cases, comparable to R1 (95%) and R2 (96%).

Conclusion: GPT-4 generated MRI protocols with notable completeness, quality, utility, and clinical applicability, excelling in standardized subspecialties like cardiac and neuroradiology imaging while yielding lower accuracy in musculoskeletal examinations.

Key Points:

Question: Long MRI acquisition times limit patient access, making accurate protocol selection crucial for efficient diagnostics, though it’s time-consuming and error-prone, especially for inexperienced residents.

Findings: GPT-4 generated MRI protocols of remarkable yet inconsistent quality, performing on par with an experienced resident in standardized fields, but moderately in musculoskeletal examinations.

Clinical relevance: The large language model can assist less experienced radiologists in determining detailed MRI protocols and counteract increasing workloads. The model could function as a semi-automatic tool, generating MRI protocols for radiologists’ confirmation, optimizing resource allocation, and improving diagnostics and cost-effectiveness.

Item Type:	Article
Creators:	Creators	Email	ORCID	ORCID Put Code
	Terzis, Robert	UNSPECIFIED	https://orcid.org/0009-0007-1068-8477	UNSPECIFIED
	Kaya, Kenan	UNSPECIFIED	https://orcid.org/0009-0008-7625-3457	UNSPECIFIED
	Schömig, Thomas	UNSPECIFIED	https://orcid.org/0009-0001-7560-4285	UNSPECIFIED
	Janssen, Jan Paul	UNSPECIFIED	https://orcid.org/0000-0003-0980-4606	UNSPECIFIED
	Iuga, Andra-Iza	UNSPECIFIED	https://orcid.org/0000-0002-3694-0235	UNSPECIFIED
	Kottlors, Jonathan	UNSPECIFIED	https://orcid.org/0000-0001-5021-6895	UNSPECIFIED
	Lennartz, Simon	UNSPECIFIED	https://orcid.org/0009-0000-7189-3764	UNSPECIFIED
	Gietzen, Carsten	UNSPECIFIED	https://orcid.org/0000-0002-2354-3847	UNSPECIFIED
	Gözdas, Cansin	UNSPECIFIED	UNSPECIFIED	UNSPECIFIED
	Müller, Lukas	UNSPECIFIED	https://orcid.org/0000-0002-8626-4044	UNSPECIFIED
	Hahnfeldt, Robert	UNSPECIFIED	https://orcid.org/0000-0001-7997-3216	UNSPECIFIED
	Maintz, David	UNSPECIFIED	https://orcid.org/0000-0002-8942-3776	UNSPECIFIED
	Dratsch, Thomas	UNSPECIFIED	https://orcid.org/0000-0003-4014-7763	UNSPECIFIED
	Pennig, Lenhard	UNSPECIFIED	https://orcid.org/0000-0002-6606-9313	UNSPECIFIED
URN:	urn:nbn:de:hbz:38-802548
Identification Number:	10.1007/s00330-025-11888-4
Journal or Publication Title:	European Radiology
Volume:	36
Number:	2
Page Range:	pp. 1541-1552
Number of Pages:	12
Date:	8 August 2025
Publisher:	Springer Nature
ISSN:	1432-1084
Language:	English
Faculty:	Faculty of Medicine
Divisions:	Faculty of Medicine > Radiologische Diagnostik > Institut und Poliklinik für Radiologische Diagnostik
Subjects:	Medical sciences Medicine
Uncontrolled Keywords:	Artificial intelligence; Large language models; Cardiac imaging; Neuroradiology; Magnetic resonance imaging (Language: English)
OA Funders:	Publikationsfonds UzK
Refereed:	Yes
URI:	http://kups.ub.uni-koeln.de/id/eprint/80254

Universität zu Köln

Kölner UniversitätsPublikationsServer
