Grothey, Bastian ORCID: 0000-0002-0883-6481, Odenkirchen, Jan, Brkic, Adnan, Schömig-Markiefka, Birgid ORCID: 0000-0003-1893-8796, Quaas, Alexander ORCID: 0000-0002-3537-6011, Büttner, Reinhard ORCID: 0000-0001-8806-4786 and Tolkach, Yuri ORCID: 0000-0001-5239-2841 (2025). Comprehensive testing of large language models for extraction of structured data in pathology. Communications Medicine, 5 (1). ISSN 2730-664X

Made available under the CC license: Creative Commons Attribution.

Identification Number: 10.1038/s43856-025-00808-8

Abstract

Background: Pathology departments generate large volumes of unstructured data as free-text diagnostic reports. Converting these reports into structured formats for analytics or artificial intelligence projects requires substantial manual effort by specialized personnel. While recent studies show promise in using advanced language models for structuring pathology data, they primarily rely on proprietary models, raising cost and privacy concerns. Additionally, important aspects such as prompt engineering and model quantization for deployment on consumer-grade hardware remain unaddressed.

Methods: We created a dataset of 579 annotated pathology reports in German and English versions. Six language models (proprietary: GPT-4; open-source: Llama2 13B, Llama2 70B, Llama3 8B, Llama3 70B, and Qwen2.5 7B) were evaluated for their ability to extract eleven key parameters from these reports. Additionally, we investigated model performance across different prompt engineering strategies and model quantization techniques to assess practical deployment scenarios.

Results: Here we show that open-source language models extract structured data from pathology reports with high precision, matching the accuracy of the proprietary GPT-4 model. Precision varies significantly across models and configurations, depending on the specific prompt engineering strategies and quantization methods used during model deployment.

Conclusions: Open-source language models demonstrate performance comparable to proprietary solutions in structuring pathology report data. This finding has significant implications for healthcare institutions seeking cost-effective, privacy-preserving data structuring solutions. The variations in model performance across different configurations provide valuable insights for practical deployment in pathology departments. Our publicly available bilingual dataset serves as both a benchmark and a resource for future research.

Item Type: Article
Creators:
Creators | Email | ORCID | ORCID Put Code
Grothey, Bastian | UNSPECIFIED | 0000-0002-0883-6481 | UNSPECIFIED
Odenkirchen, Jan | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Brkic, Adnan | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Schömig-Markiefka, Birgid | UNSPECIFIED | 0000-0003-1893-8796 | UNSPECIFIED
Quaas, Alexander | UNSPECIFIED | 0000-0002-3537-6011 | UNSPECIFIED
Büttner, Reinhard | UNSPECIFIED | 0000-0001-8806-4786 | UNSPECIFIED
Tolkach, Yuri | UNSPECIFIED | 0000-0001-5239-2841 | UNSPECIFIED
URN: urn:nbn:de:hbz:38-792734
Identification Number: 10.1038/s43856-025-00808-8
Journal or Publication Title: Communications Medicine
Volume: 5
Number: 1
Date: 31 March 2025
ISSN: 2730-664X
Language: English
Faculty: Faculty of Medicine
Divisions: Faculty of Medicine > Anatomie; Faculty of Medicine > Pathologie und Neuropathologie > Institut für Pathologie
Subjects: Medical sciences Medicine
OA Funders: Publikationsfonds UzK
Refereed: Yes
URI: http://kups.ub.uni-koeln.de/id/eprint/79273
