Specialized AI and neurosurgeons in niche expertise: a proof-of-concept in neuromodulation with vagus nerve stimulation

Detailed bibliography
Title: Specialized AI and neurosurgeons in niche expertise: a proof-of-concept in neuromodulation with vagus nerve stimulation
Authors: Barrit, Sami; Ranuzzi, Giovanni; Fetzer, Steffen; Al Barajraji, Mejdeddine; El Hadwe, Salim; Zanello, Marc; Otler, Martin; O'Flaherty, Julieta; Massager, Nicolas; Madsen, Joseph R; Dibue, Maxine; Carron, Romain
Contributors: Apollo - University of Cambridge Repository
Source: Acta Neurochir (Wien)
Acta neurochirurgica, vol. 167, no. 1, pp. 203
Publisher information: Springer Science and Business Media LLC, 2025.
Publication year: 2025
Subjects: Artificial intelligence, Drug Resistant Epilepsy, Epilepsy, Vagus Nerve Stimulation, Neuromodulation, Research, Proof of Concept Study, Neurosurgeons, Humans, Vagus Nerve Stimulation/methods, Neurosurgeons/education, Drug Resistant Epilepsy/therapy, Artificial Intelligence, Clinical Competence, Surveys and Questionnaires, VNS, Vagus nerve stimulation
Description: OBJECTIVE: Applying large language models (LLMs) in specialized medical disciplines presents unique challenges requiring precision, reliability, and domain-specific relevance. We evaluated a specialized LLM-driven system against neurosurgeons in a knowledge assessment on vagus nerve stimulation (VNS) for drug-resistant epilepsy, a complex neuromodulation therapy requiring transdisciplinary expertise in neural anatomy, epileptic disorders, and device technology. MATERIALS AND METHODS: Thirty-six European neurosurgeons who completed a 2-day VNS masterclass were assessed using a multiple-choice questionnaire comprising 14 items with 67 binary propositions. We deployed open-source models (LLaMa 2 70B and the MXBAI embedding model) using Neura, an AI infrastructure enabling transparent grounding through advanced retrieval-augmented generation. The knowledge base consisted of 125 VNS-related publications curated by multidisciplinary faculty. Scoring ranged from -1 to +1 per question. Performance was analyzed using Wilcoxon signed-rank tests, confusion matrices, and metrics including accuracy, precision, recall, and specificity. RESULTS: The AI achieved a score of 0.75, exceeding the highest individual clinician score (0.68; median: 0.50), with statistical significance (p
Document type: Article
Other literature type
File description: application/pdf; application/zip; text/xml
Language: English
ISSN: 0942-0940
DOI: 10.1007/s00701-025-06610-8
Access URL: https://serval.unil.ch/resource/serval:BIB_969B4DF16F06.P001/REF.pdf
http://nbn-resolving.org/urn/resolver.pl?urn=urn:nbn:ch:serval-BIB_969B4DF16F065
https://serval.unil.ch/notice/serval:BIB_969B4DF16F06
https://www.repository.cam.ac.uk/handle/1810/387496
https://doi.org/10.1007/s00701-025-06610-8
Rights: CC BY NC ND
Accession number: edsair.doi.dedup.....0a629e672570ee884c9e93a24ff534e8
Database: OpenAIRE
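
Note on the metrics named in the abstract: accuracy, precision, recall, and specificity are all derived from a confusion matrix over binary propositions. The following is a minimal Python sketch of those standard definitions only. It is not the authors' code, it does not reproduce the study's -1 to +1 scoring rule (which the abstract does not specify), and the names `Confusion`, `confusion`, and `metrics` as well as the sample answers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Confusion:
    """Counts for binary propositions judged true/false (hypothetical helper)."""
    tp: int  # proposition is true, respondent marked it true
    fp: int  # proposition is false, respondent marked it true
    tn: int  # proposition is false, respondent marked it false
    fn: int  # proposition is true, respondent marked it false


def confusion(truth: Sequence[bool], answers: Sequence[bool]) -> Confusion:
    """Tally a confusion matrix from an answer key and one respondent's answers."""
    if len(truth) != len(answers):
        raise ValueError("answer key and answers must have the same length")
    pairs = list(zip(truth, answers))
    return Confusion(
        tp=sum(1 for t, a in pairs if t and a),
        fp=sum(1 for t, a in pairs if not t and a),
        tn=sum(1 for t, a in pairs if not t and not a),
        fn=sum(1 for t, a in pairs if t and not a),
    )


def metrics(c: Confusion) -> Dict[str, float]:
    """Standard definitions; a ratio with a zero denominator is reported as NaN."""
    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": ratio(c.tp + c.tn, total),
        "precision": ratio(c.tp, c.tp + c.fp),
        "recall": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
    }


# Made-up illustration data, not taken from the study.
key = [True, True, False, False, True]
given = [True, False, False, True, True]
result = metrics(confusion(key, given))
```

In this sketch each respondent (clinician or AI system) would get one confusion matrix over all propositions; how the study aggregated propositions into per-question scores is not stated in this record.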