What Makes a Top-Performing Precision Medicine Search Engine? Tracing Main System Features in a Systematic Way

Bibliographic Details
Published in: arXiv.org
Main Authors: Faessler, Erik; Oleynik, Michel; Hahn, Udo
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 5 June 2020
ISSN: 2331-8422
Abstract: From 2017 to 2019, the Text REtrieval Conference (TREC) held a challenge task on precision medicine using documents from medical publications (PubMed) and clinical trials. Despite the many performance measurements carried out in these evaluation campaigns, the scientific community is still uncertain about the impact that individual system features and their weights have on overall system performance. To close this explanatory gap, we first determined optimal feature configurations using the Sequential Model-based Algorithm Configuration (SMAC) program and applied its output to a BM25-based search engine. We then ran an ablation study to systematically assess the individual contributions of relevant system features: BM25 parameters, query type and weighting schema, query expansion, stop word filtering, and keyword boosting. We evaluated the effectiveness of the different features on the gold standard data from the three TREC-PM installments, using the commonly shared infNDCG metric.
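As a rough illustration of two ideas named in the abstract, the following Python sketch shows (a) BM25 scoring with its two tunable parameters and an optional stop word filter, and (b) a generic ablation loop. It is not the authors' system: the function names `bm25_scores` and `ablate`, the defaults `k1=1.2` and `b=0.75`, the non-negative IDF variant, and the caller-supplied `evaluate` function (standing in for a metric such as infNDCG) are all assumptions made for this example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75, stop_words=frozenset()):
    """Score tokenized documents against a tokenized query with Okapi BM25.

    k1 controls term-frequency saturation, b controls length normalization.
    The IDF uses the non-negative log(1 + ...) variant (an assumption here).
    """
    # Optional stop word filtering, applied to both query and documents.
    query = [t for t in query if t not in stop_words]
    docs = [[t for t in d if t not in stop_words] for d in docs]
    if not docs:
        return []
    n_docs = len(docs)
    # Fall back to 1.0 so empty collections do not divide by zero.
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def ablate(evaluate, full_config, baseline):
    """Reset one feature at a time to its baseline value.

    Returns, per feature, how much the metric drops relative to the
    full configuration (positive = the feature helps).
    """
    full = evaluate(full_config)
    return {
        name: full - evaluate({**full_config, name: baseline[name]})
        for name in full_config
    }


if __name__ == "__main__":
    docs = [
        ["brca1", "mutation", "breast", "cancer"],
        ["the", "the", "trial"],
        ["lung", "cancer", "trial"],
    ]
    print(bm25_scores(["brca1", "cancer"], docs))
```

In a real study, `evaluate` would run the search engine with the given configuration over all topics and return the chosen effectiveness metric; here it is left to the caller so the sketch makes no claim about how the paper computed its scores.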
Copyright 2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI: 10.48550/arxiv.2006.02785
Discipline: Physics
EISSN: 2331-8422
Genre: Working Paper/Pre-Print
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
OpenAccessLink https://www.proquest.com/publiccontent/docview/2409768178?pq-origsite=%requestingapplication%
Subject Terms: Ablation; Algorithms; Configurations; Performance evaluation; Precision medicine; Query expansion; Search engines; Standard data
URI https://www.proquest.com/docview/2409768178