Advances in Big Data Bio Analytics

Bibliographic details
Published in: arXiv.org
Main authors: Angelopoulos, Nicos; Wielemaker, Jan
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 18 September 2019
ISSN: 2331-8422
Online access: Full text
Abstract: Delivering effective data analytics is crucial to interpreting the multitude of biological datasets currently generated by an ever-increasing number of high-throughput techniques. Logic programming has much to offer in this area. Here, we detail advances that highlight two strengths of logical formalisms in developing data analytic solutions in biological settings: access to large relational databases, and building analytical pipelines that collect graph information from multiple sources. We present significant advances on the bio_db package, which serves biological databases as Prolog facts, either through in-memory loading or via database backends. These advances include modularising the underlying architecture and incorporating datasets from a second organism (mouse). In addition, we introduce a number of data analytics tools that operate on these datasets and are bundled in the analysis package bio_analytics. The emphasis in both packages is on ease of installation and use. We highlight the general architecture of our component-based approach. An experimental graphical user interface via SWISH for local installation is also available. Finally, we advocate that biological data analytics is a fertile area which can drive further innovation in applied logic programming.
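To make the idea of serving the same relational facts either from memory or from a database backend more concrete, here is a minimal sketch in Python. It is not the bio_db API (bio_db is a Prolog package); the class names `MemoryRelation` and `SqliteRelation`, the `query` method, and the rows in `TOY_FACTS` are all hypothetical and chosen only for illustration. The point is that calling code sees one query interface regardless of where the facts are stored.

```python
import sqlite3
from typing import Iterable, Iterator, Optional, Tuple

Fact = Tuple[str, str]

# Toy placeholder rows (hypothetical, not real database content).
TOY_FACTS = [("id:1", "GENE_A"), ("id:2", "GENE_B"), ("id:3", "GENE_C")]


class MemoryRelation:
    """Binary relation held in a Python list (the in-memory loading case)."""

    def __init__(self, facts: Iterable[Fact]) -> None:
        self._facts = list(facts)

    def query(self, key: Optional[str] = None,
              value: Optional[str] = None) -> Iterator[Fact]:
        # None acts like an unbound variable: it matches anything.
        for k, v in self._facts:
            if (key is None or k == key) and (value is None or v == value):
                yield (k, v)


class SqliteRelation:
    """The same relation stored in an SQLite table (the backend case)."""

    def __init__(self, facts: Iterable[Fact], path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute("CREATE TABLE rel (k TEXT, v TEXT)")
        self._conn.executemany("INSERT INTO rel VALUES (?, ?)", list(facts))
        self._conn.commit()

    def query(self, key: Optional[str] = None,
              value: Optional[str] = None) -> Iterator[Fact]:
        # A NULL parameter disables the corresponding filter.
        sql = ("SELECT k, v FROM rel "
               "WHERE (? IS NULL OR k = ?) AND (? IS NULL OR v = ?) "
               "ORDER BY rowid")
        yield from self._conn.execute(sql, (key, key, value, value))


if __name__ == "__main__":
    for relation in (MemoryRelation(TOY_FACTS), SqliteRelation(TOY_FACTS)):
        print(type(relation).__name__, list(relation.query(key="id:2")))
```

Both classes answer the same queries, so a pipeline written against `query` does not need to know which storage is in use; this is the only property of the described architecture that the sketch tries to capture.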
Copyright 2019. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI: 10.48550/arxiv.1909.08254
Subjects: Analytics; Architecture; Data analysis; Datasets; Exact solutions; Graphical user interface; Logic programming; Mathematical analysis; Prolog; Relational databases