GHOSTX: An Improved Sequence Homology Search Algorithm Using a Query Suffix Array and a Database Suffix Array
| Published in: | PloS one, Vol. 9, No. 8, p. e103833 |
|---|---|
| Main authors: | Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: Public Library of Science (PLoS), 2014-08-06 |
| ISSN: | 1932-6203 |
| Online access: | Full text |

| Abstract | DNA sequences are translated into protein coding sequences and then further assigned to protein families in metagenomic analyses, because of the need for sensitivity. However, huge amounts of sequence data create the problem that even general homology search analyses using BLASTX become difficult in terms of computational cost. We designed a new homology search algorithm that finds seed sequences based on the suffix arrays of a query and a database, and have implemented it as GHOSTX. GHOSTX achieved approximately 131-165 times acceleration over a BLASTX search at similar levels of sensitivity. GHOSTX is distributed under the BSD 2-clause license and is available for download at http://www.bi.cs.titech.ac.jp/ghostx/. Currently, sequencing technology continues to improve, and sequencers are increasingly producing larger and larger quantities of data. This explosion of sequence data makes computational analysis with contemporary tools more difficult. We offer this tool as a potential solution to this problem. |
|---|---|
| Audience | Academic |
| Author | Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka |
| AuthorAffiliation | American University in Cairo, Egypt; Graduate School of Information Science and Engineering, Tokyo Institute of Technology, Meguro-ku, Tokyo, Japan |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/25099887$$D View this record in MEDLINE/PubMed |
| Cites_doi | 10.1186/1471-2105-12-159 10.1093/bioinformatics/btr595 10.1016/0888-7543(91)90071-L 10.1111/j.1742-4658.2005.04945.x 10.1093/nar/28.1.27 10.1093/bioinformatics/btq644 10.1093/dnares/dsm018 10.1093/nar/gkr988 10.1038/nature11234 10.1038/nature05414 10.1016/S0022-2836(05)80360-2 10.1101/gr.229202. Article published online before March 2002 10.1093/nar/25.17.3389 |
| ContentType | Journal Article |
| Copyright | COPYRIGHT 2014 Public Library of Science. 2014 Suzuki et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.1371/journal.pone.0103833 |
| Discipline | Sciences (General) Engineering |
| DocumentTitleAlternate | GHOSTX: An Improved Sequence Homology Search Algorithm |
| EISSN | 1932-6203 |
| ExternalDocumentID | 1551699784 oai_doaj_org_article_a35c7edd90f94932a0d72043a039a1e4 PMC4123905 3395002881 A418707385 25099887 10_1371_journal_pone_0103833 |
| Genre | Research Support, Non-U.S. Gov't Journal Article |
| GeographicLocations | Japan |
| ISICitedReferencesCount | 63 |
| ISSN | 1932-6203 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 8 |
| Language | English |
| License | This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. Creative Commons Attribution License |
| Notes | Competing Interests: The authors have declared that no competing interests exist. Conceived and designed the experiments: SS MK TI YA. Performed the experiments: SS. Analyzed the data: SS. Contributed reagents/materials/analysis tools: SS MK TI. Contributed to the writing of the manuscript: SS MK TI YA. |
| OpenAccessLink | http://dx.doi.org/10.1371/journal.pone.0103833 |
| PMID | 25099887 |
| PublicationCentury | 2000 |
| PublicationDate | 2014-08-06 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States – name: San Francisco – name: San Francisco, USA |
| PublicationTitle | PloS one |
| PublicationTitleAlternate | PLoS One |
| PublicationYear | 2014 |
| Publisher | Public Library of Science Public Library of Science (PLoS) |
| References | PJ Turnbaugh (ref2) 2006; 444 M Kanehisa (ref12) 2012; 40 Y Zhao (ref7) 2012; 28 M Ghodsi (ref10) 2009; 2009 SF Altschul (ref4) 1997; 25 ref8 PD Vouzis (ref9) 2011; 27 WR Pearson (ref14) 1991; 11 Y Ye (ref6) 2011; 12 M Kanehisa (ref11) 2000; 28 (ref13) 2012; 486 ref3 WJ Kent (ref5) 2002; 12 SF Altschul (ref15) 2005; 272 K Kurokawa (ref1) 2007; 14 |
| StartPage | e103833 |
| SubjectTerms | Algorithms Alternating current Analysis Arrays Bioinformatics Biology and Life Sciences Computer applications Cost analysis Data bases Data processing Databases, Genetic Deoxyribonucleic acid DNA Downloading Engineering Gene sequencing Genomes Homology Information science Nucleotide sequence Protein families Proteins Queries Research and Analysis Methods Search algorithms Seeds Sensitivity Sensitivity analysis Sequence Alignment - methods Sequence Homology Software |
| Title | GHOSTX: An Improved Sequence Homology Search Algorithm Using a Query Suffix Array and a Database Suffix Array |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/25099887 https://www.proquest.com/docview/1551699784 https://www.proquest.com/docview/1552370554 https://pubmed.ncbi.nlm.nih.gov/PMC4123905 https://doaj.org/article/a35c7edd90f94932a0d72043a039a1e4 http://dx.doi.org/10.1371/journal.pone.0103833 |
| Volume | 9 |
| WOSCitedRecordID | wos000339995100035&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVAON databaseName: DOAJ Directory of Open Access Journals customDbUrl: eissn: 1932-6203 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0053866 issn: 1932-6203 databaseCode: DOA dateStart: 20060101 isFulltext: true titleUrlDefault: https://www.doaj.org/ providerName: Directory of Open Access Journals – providerCode: PRVHPJ databaseName: ROAD: Directory of Open Access Scholarly Resources customDbUrl: eissn: 1932-6203 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0053866 issn: 1932-6203 databaseCode: M~E dateStart: 20060101 isFulltext: true titleUrlDefault: https://road.issn.org providerName: ISSN International Centre – providerCode: PRVPQU databaseName: Agriculture Science Database customDbUrl: eissn: 1932-6203 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0053866 issn: 1932-6203 databaseCode: M0K dateStart: 20061201 isFulltext: true titleUrlDefault: https://search.proquest.com/agriculturejournals providerName: ProQuest – providerCode: PRVPQU databaseName: Biological Science Database customDbUrl: eissn: 1932-6203 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0053866 issn: 1932-6203 databaseCode: M7P dateStart: 20061201 isFulltext: true titleUrlDefault: http://search.proquest.com/biologicalscijournals providerName: ProQuest – providerCode: PRVPQU databaseName: Engineering Database customDbUrl: eissn: 1932-6203 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0053866 issn: 1932-6203 databaseCode: M7S dateStart: 20061201 isFulltext: true titleUrlDefault: http://search.proquest.com providerName: ProQuest – providerCode: PRVPQU databaseName: Environmental Science Database customDbUrl: eissn: 1932-6203 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0053866 issn: 1932-6203 databaseCode: PATMY dateStart: 20061201 isFulltext: true titleUrlDefault: http://search.proquest.com/environmentalscience providerName: ProQuest – providerCode: PRVPQU databaseName: Health & Medical Collection customDbUrl: eissn: 
|
| Citation | Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka. GHOSTX: An Improved Sequence Homology Search Algorithm Using a Query Suffix Array and a Database Suffix Array. PLoS ONE 9(8), 2014-08-06. Public Library of Science. ISSN 1932-6203. doi:10.1371/journal.pone.0103833 |