Combinatorial Optimization Machine Learning Algorithms and Statistical Modeling in Genomics

Bibliographic Details
Main Author: Le, Thong
Format: Dissertation
Language: English
Published: ProQuest Dissertations & Theses, 01.01.2019
Subject: Computer science
ISBN: 1085587215, 9781085587211
Abstract: The dissertation addresses a broad set of algorithmic questions that arise in machine learning and combinatorics. We exploit the special combinatorial structure of the problems to improve the running time. We also use optimization techniques in statistical modeling and machine learning to solve problems in genomics and to improve the robustness of deep neural network models. There are three main results in the dissertation.
1) The matrix-chain multiplication problem is a classic problem that is widely taught to illustrate dynamic programming. The textbook solution runs in Θ(n^3) time. Based on triangulating convex polygons, we give complete, correct proofs and implementation details of an O(n^2) algorithm. We also extend the solution to a more general class of problems and give an approximation algorithm that runs in linear time.
2) Several algorithms have been developed that use high-throughput sequencing (HTS) technology to characterize structural variations (SVs). Most existing approaches focus on detecting relatively simple types of SVs, such as insertions, deletions, and short inversions. However, complex SVs are of crucial importance, and several have been associated with genomic disorders. To better understand the contribution of complex SVs to human disease, we need new algorithms to accurately discover and genotype such variants. We give a novel statistical modeling method to characterize complex SVs in the genome.
3) We study how to attack machine learning models so that we can improve the robustness of deep neural networks. We propose a novel way to formulate the hard-label black-box attack as a real-valued optimization problem that is usually continuous and can be solved by any zeroth-order optimization algorithm. We demonstrate that our proposed method outperforms the previous random-walk approach at attacking convolutional neural networks on the MNIST, CIFAR, and ImageNet datasets. More interestingly, we show that the proposed algorithm can also be used to attack other discrete and non-continuous machine learning models, such as Gradient Boosting Trees.
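For reference, the cubic-time textbook solution that the abstract cites as its baseline can be sketched as follows. This is the standard dynamic program over subchains, not the dissertation's faster polygon-triangulation algorithm, which the record does not describe in detail. The function name `matrix_chain_cost` and the `dims` convention (matrix i has shape dims[i] x dims[i+1]) are illustrative assumptions.

```python
def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications needed to multiply a chain
    of matrices, where matrix i (0-indexed) has shape dims[i] x dims[i+1].

    Standard textbook dynamic program: three nested loops over subchain
    length, start index, and split point, hence cubic time in the number
    of matrices.
    """
    n = len(dims) - 1  # number of matrices in the chain
    if n < 1:
        return 0
    # cost[i][j] = cheapest way to multiply matrices i..j (inclusive)
    cost = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # subchain length
        for i in range(n - length + 1):     # subchain start
            j = i + length - 1              # subchain end
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)        # split between k and k+1
            )
    return cost[0][n - 1]


if __name__ == "__main__":
    # Three matrices: 10x30, 30x5, 5x60
    print(matrix_chain_cost([10, 30, 5, 60]))
```

For the three-matrix chain in the usage example, multiplying the first two matrices first costs 10*30*5 + 10*5*60 = 4500 scalar multiplications, against 27000 for the other parenthesization, so the function returns 4500.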
Copyright Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Discipline Computer Science
Genre Dissertation/Thesis
URI https://www.proquest.com/docview/2278105089