Specialized AI and neurosurgeons in niche expertise: a proof-of-concept in neuromodulation with vagus nerve stimulation
| Title: | Specialized AI and neurosurgeons in niche expertise: a proof-of-concept in neuromodulation with vagus nerve stimulation |
|---|---|
| Authors: | Barrit, Sami; Ranuzzi, Giovanni; Fetzer, Steffen; Al Barajraji, Mejdeddine; El Hadwe, Salim; Zanello, Marc; Otler, Martin; O'Flaherty, Julieta; Massager, Nicolas; Madsen, Joseph R; Dibue, Maxine; Carron, Romain |
| Contributors: | Apollo - University of Cambridge Repository |
| Source: | Acta Neurochir (Wien); Acta neurochirurgica, vol. 167, no. 1, p. 203 |
| Publisher information: | Springer Science and Business Media LLC, 2025. |
| Year of publication: | 2025 |
| Subjects: | Artificial intelligence, Drug Resistant Epilepsy, Epilepsy, Vagus Nerve Stimulation, Neuromodulation, Research, Proof of Concept Study, Neurosurgeons, Humans, Vagus Nerve Stimulation/methods, Neurosurgeons/education, Drug Resistant Epilepsy/therapy, Artificial Intelligence, Clinical Competence, Surveys and Questionnaires, VNS, Vagus nerve stimulation |
| Description: | OBJECTIVE: Applying large language models (LLMs) in specialized medical disciplines presents unique challenges requiring precision, reliability, and domain-specific relevance. We evaluated a specialized LLM-driven system against neurosurgeons in a knowledge assessment on vagus nerve stimulation (VNS) for drug-resistant epilepsy, a complex neuromodulation therapy requiring transdisciplinary expertise in neural anatomy, epileptic disorders, and device technology. MATERIALS AND METHODS: Thirty-six European neurosurgeons who completed a 2-day VNS masterclass were assessed using a multiple-choice questionnaire comprising 14 items with 67 binary propositions. We deployed open-source models (LLaMa 2 70B and the MXBAI embedding model) using Neura, an AI infrastructure enabling transparent grounding through advanced retrieval-augmented generation. The knowledge base consisted of 125 VNS-related publications curated by multidisciplinary faculty. Scoring ranged from -1 to +1 per question. Performance was analyzed using Wilcoxon signed-rank tests, confusion matrices, and metrics including accuracy, precision, recall, and specificity. RESULTS: The AI achieved a score of 0.75, exceeding the highest individual clinician score (0.68; median: 0.50), with statistical significance (p |
| Document type: | Article; Other literature type |
| File description: | application/pdf; application/zip; text/xml |
| Language: | English |
| ISSN: | 0942-0940 |
| DOI: | 10.1007/s00701-025-06610-8 |
| Access URL: | https://serval.unil.ch/resource/serval:BIB_969B4DF16F06.P001/REF.pdf http://nbn-resolving.org/urn/resolver.pl?urn=urn:nbn:ch:serval-BIB_969B4DF16F065 https://serval.unil.ch/notice/serval:BIB_969B4DF16F06 https://www.repository.cam.ac.uk/handle/1810/387496 https://doi.org/10.1007/s00701-025-06610-8 |
| Rights: | CC BY-NC-ND |
| Accession number: | edsair.doi.dedup.....0a629e672570ee884c9e93a24ff534e8 |
| Database: | OpenAIRE |
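
The abstract names confusion matrices and the metrics accuracy, precision, recall, and specificity for answers to binary propositions. The following is a minimal sketch of how such metrics are conventionally computed; it is not the authors' analysis code. The function names and the five-proposition example are hypothetical, and the record does not specify how the -1 to +1 per-question score was derived, so that scoring is not modeled here.

```python
from typing import Dict, Sequence


def confusion_counts(truth: Sequence[bool], answers: Sequence[bool]) -> Dict[str, int]:
    """Count TP/TN/FP/FN for binary propositions (hypothetical helper)."""
    if len(truth) != len(answers):
        raise ValueError("truth and answers must have the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for t, a in zip(truth, answers):
        if t and a:
            counts["tp"] += 1      # proposition true, answered true
        elif not t and not a:
            counts["tn"] += 1      # proposition false, answered false
        elif not t and a:
            counts["fp"] += 1      # proposition false, answered true
        else:
            counts["fn"] += 1      # proposition true, answered false
    return counts


def metrics(c: Dict[str, int]) -> Dict[str, float]:
    """Standard definitions; a zero denominator yields NaN."""
    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    total = c["tp"] + c["tn"] + c["fp"] + c["fn"]
    return {
        "accuracy": ratio(c["tp"] + c["tn"], total),
        "precision": ratio(c["tp"], c["tp"] + c["fp"]),
        "recall": ratio(c["tp"], c["tp"] + c["fn"]),
        "specificity": ratio(c["tn"], c["tn"] + c["fp"]),
    }


# Made-up illustration with five propositions (not data from the study).
example_truth = [True, True, False, False, True]
example_answers = [True, False, False, True, True]
example_metrics = metrics(confusion_counts(example_truth, example_answers))
```

Applied to one respondent's answers across all propositions, this gives one set of metrics per respondent. Paired comparisons such as the Wilcoxon signed-rank test mentioned in the abstract would be run separately on per-question scores.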