Large language models can extract metadata for annotation of human neuroimaging publications.

Bibliographic Details
Title:	Large language models can extract metadata for annotation of human neuroimaging publications.
Authors:	Turner, Matthew D.; Appaji, Abhishek; Ar Rakib, Nibras; Golnari, Pedram; Rajasekar, Arcot K.; K V, Anitha Rathnam; Sahoo, Satya S.; Wang, Yue; Wang, Lei; Turner, Jessica A.
Source:	Frontiers in Neuroinformatics; 2025, p1-16, 16p
Subject Terms:	LANGUAGE models, ANNOTATIONS, DATA extraction, BRAIN imaging, GENERATIVE pre-trained transformers
Company/Entity:	OPENAI Inc.
Abstract:	We show that recent (mid-to-late 2024) commercial large language models (LLMs) are capable of good quality metadata extraction and annotation with very little work on the part of investigators for several exemplar real-world annotation tasks in the neuroimaging literature. We investigated the GPT-4o LLM from OpenAI which performed comparably with several groups of specially trained and supervised human annotators. The LLM achieves similar performance to humans, between 0.91 and 0.97 on zero-shot prompts without feedback to the LLM. Reviewing the disagreements between LLM and gold standard human annotations we note that actual LLM errors are comparable to human errors in most cases, and in many cases these disagreements are not errors. Based on the specific types of annotations we tested, with exceptionally reviewed gold-standard correct values, the LLM performance is usable for metadata annotation at scale. We encourage other research groups to develop and make available more specialized "micro-benchmarks," like the ones we provide here, for testing both LLMs, and more complex agent systems annotation performance in real-world metadata annotation tasks. [ABSTRACT FROM AUTHOR]
	Copyright of Frontiers in Neuroinformatics is the property of Frontiers Media S.A. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database:	Biomedical Index

ISSN:	16625196
DOI:	10.3389/fninf.2025.1609077