Automatic assignment of biomedical categories: toward a generic approach
Motivation: We report on the development of a generic text categorization system designed to automatically assign biomedical categories to any input text. Unlike usual automatic text categorization systems, which rely on data-intensive models extracted from large sets of training data, our categoriz...
| Published in: | Bioinformatics, Vol. 22, No. 6, pp. 658-664 |
|---|---|
| Main author: | Ruch, Patrick |
| Format: | Journal Article |
| Language: | English |
| Published: | England: Oxford University Press, 15 March 2006; Oxford Publishing Limited (England) |
| Subjects: | |
| ISSN: | 1367-4803, 1460-2059, 1367-4811 |
| Abstract | Motivation: We report on the development of a generic text categorization system designed to automatically assign biomedical categories to any input text. Unlike usual automatic text categorization systems, which rely on data-intensive models extracted from large sets of training data, our categorizer is largely data-independent. Methods: To evaluate the robustness of our approach, we test the system on two different biomedical terminologies: the Medical Subject Headings (MeSH) and the Gene Ontology (GO). Our lightweight categorizer, based on two ranking modules, combines a pattern matcher and a vector space retrieval engine, and uses both stems and linguistically motivated indexing units. Results and Conclusion: Results show the effectiveness of phrase indexing for both GO and MeSH categorization, but we observe that the categorization power of the tool depends on the controlled vocabulary: precision at high ranks ranges from above 90% for MeSH to <20% for GO, establishing a new baseline for categorizers based on retrieval methods. Contact: Patrick.Ruch@sim.hcuge.ch |
|---|---|
| Author | Ruch, Patrick |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/16287934 (view this record in MEDLINE/PubMed) |
| CODEN | BOINFP |
| CitedBy_id | crossref_primary_10_1016_j_jbi_2007_06_004 crossref_primary_10_1186_1471_2105_10_129 crossref_primary_10_1186_1471_2105_10_148 crossref_primary_10_1186_gb_2008_9_s2_s6 crossref_primary_10_1371_journal_pone_0115892 crossref_primary_10_1002_asi_21170 crossref_primary_10_1002_asi_23351 crossref_primary_10_1016_j_artmed_2012_08_006 crossref_primary_10_1186_s13326_017_0123_3 crossref_primary_10_1186_gb_2008_9_s2_s3 crossref_primary_10_1002_asi_23290 crossref_primary_10_1038_srep32252 crossref_primary_10_1197_jamia_M2431 crossref_primary_10_1186_1471_2105_9_S3_S9 crossref_primary_10_1016_j_jbi_2008_07_003 crossref_primary_10_1016_j_ijmedinf_2006_05_002 crossref_primary_10_1016_j_ymeth_2014_10_023 crossref_primary_10_1016_j_jbi_2015_04_013 crossref_primary_10_1371_journal_pone_0209961 crossref_primary_10_1186_1471_2105_9_S5_S3 crossref_primary_10_1016_j_websem_2011_11_009 crossref_primary_10_1016_j_irbm_2012_10_002 crossref_primary_10_1186_1471_2105_14_171 crossref_primary_10_1186_s13326_016_0073_1 crossref_primary_10_1186_1471_2105_10_313 crossref_primary_10_1007_s10278_015_9792_6 crossref_primary_10_1186_1471_2105_12_S8_S2 crossref_primary_10_2196_jmir_2043 crossref_primary_10_1093_bib_bbaa394 crossref_primary_10_1155_2023_2989791 crossref_primary_10_1002_asi_23435 crossref_primary_10_1002_asi_23972 crossref_primary_10_1007_s10257_014_0259_y crossref_primary_10_1016_j_jbi_2017_07_011 crossref_primary_10_1002_jrsm_27 crossref_primary_10_1093_bioinformatics_btp249 crossref_primary_10_1186_1471_2105_14_208 crossref_primary_10_1155_2008_342746 crossref_primary_10_1136_amiajnl_2010_000055 crossref_primary_10_1186_1471_2105_9_S8_S2 crossref_primary_10_1186_s13326_016_0096_7 |
| Cites_doi | 10.1177/002383096500800404 10.1186/1471-2105-6-S1-S1 10.1186/1471-2105-6-S1-S23 10.1016/S1386-5056(02)00057-6 10.1101/gr.461403 10.1145/1067268.1067273 10.1162/coli.2003.29.2.328 10.1016/0020-0271(71)90024-6 10.1023/A:1007413511361 10.1016/0010-4825(95)00055-0 10.1186/1471-2105-4-20 10.3115/1072228.1072370 10.1007/978-3-540-31865-1_9 |
| ContentType | Journal Article |
| Copyright | Copyright Oxford University Press (England) Mar 15, 2006 |
| DOI | 10.1093/bioinformatics/bti783 |
| Discipline | Biology |
| EISSN | 1460-2059 1367-4811 |
| EndPage | 664 |
| ExternalDocumentID | 1006454661 16287934 10_1093_bioinformatics_bti783 ark_67375_HXZ_BCXB2DZW_9 |
| Genre | Evaluation Studies Research Support, Non-U.S. Gov't Journal Article Research Support, N.I.H., Extramural |
| ISICitedReferencesCount | 68 |
| ISSN | 1367-4803 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Language | English |
| Notes | Associate Editor: Alfonso Valencia. istex:D55620CA35BE6A1C4A5346A269529F8130D77ADC; ark:/67375/HXZ-BCXB2DZW-9 |
| OpenAccessLink | https://academic.oup.com/bioinformatics/article-pdf/22/6/658/48839321/bioinformatics_22_6_658.pdf |
| PMID | 16287934 |
| PQID | 198745368 |
| PQPubID | 36124 |
| PageCount | 7 |
| PublicationCentury | 2000 |
| PublicationDate | 2006-03-15 |
| PublicationDateYYYYMMDD | 2006-03-15 |
| PublicationDecade | 2000 |
| PublicationPlace | England |
| PublicationTitle | Bioinformatics |
| PublicationTitleAlternate | Bioinformatics |
| PublicationYear | 2006 |
| Publisher | Oxford University Press Oxford Publishing Limited (England) |
| StartPage | 658 |
| SubjectTerms | Abstracting and Indexing as Topic - methods; Algorithms; Artificial Intelligence; Documentation - methods; MEDLINE; Natural Language Processing; Pattern Recognition, Automated - methods; Periodicals as Topic; Proteins - classification |
| Title | Automatic assignment of biomedical categories: toward a generic approach |
| URI | https://api.istex.fr/ark:/67375/HXZ-BCXB2DZW-9/fulltext.pdf https://www.ncbi.nlm.nih.gov/pubmed/16287934 https://www.proquest.com/docview/198745368 https://www.proquest.com/docview/19431727 https://www.proquest.com/docview/67732043 |
| Volume | 22 |
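
The abstract describes a categorizer that combines two ranking modules, a pattern matcher and a vector space retrieval engine, to rank controlled-vocabulary terms for an input text. The following is only a minimal illustrative sketch of that general idea, not the author's system: the function names, the lowercase tokenization (standing in for stemming), the binary phrase-match score, and the equal `weight` between the two modules are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens; a crude stand-in for stemming (assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pattern_score(term, text):
    """Return 1.0 if the whole term occurs as a phrase in the text, else 0.0."""
    words = tokens(term)
    if not words:
        return 0.0
    pattern = r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b"
    return 1.0 if re.search(pattern, text.lower()) else 0.0


def rank_categories(text, terms, weight=0.5):
    """Rank vocabulary terms by a weighted sum of both module scores.

    `weight` balances the pattern matcher against the vector space score;
    the value 0.5 is a hypothetical default, not taken from the paper.
    """
    text_vec = Counter(tokens(text))
    scored = []
    for term in terms:
        vsm = cosine(text_vec, Counter(tokens(term)))
        pat = pattern_score(term, text)
        scored.append((weight * pat + (1 - weight) * vsm, term))
    # Highest combined score first; ties broken alphabetically.
    return [term for _, term in sorted(scored, key=lambda p: (-p[0], p[1]))]


if __name__ == "__main__":
    vocabulary = ["lipid metabolism", "protein binding", "cell cycle"]
    sentence = "Protein kinase activity regulates cell cycle progression."
    print(rank_categories(sentence, vocabulary))
```

In this toy setup a term that appears verbatim as a phrase is boosted by the pattern score, while partially overlapping terms still receive some credit from the vector space score, which is one plausible reading of why the abstract reports that phrase indexing helps.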