Combinatorial Optimization Machine Learning Algorithms and Statistical Modeling in Genomics


Bibliographic details
Main author: Le, Thong
Format: Dissertation
Language: English
Published: ProQuest Dissertations & Theses, 01.01.2019
Subjects:
ISBN: 1085587215, 9781085587211
Description
Summary: The dissertation addresses a broad set of algorithmic questions that arise in machine learning and combinatorics. We exploit the special combinatorial structure of the problems to improve the running time. We also use optimization techniques in statistical modeling and machine learning to solve problems in genomics and to improve the robustness of deep neural network models. There are three main results in the dissertation.
1) The matrix-chain multiplication problem is a classic problem that is widely taught to illustrate dynamic programming. The textbook solution runs in Θ(n^3) time. Based on triangulating convex polygons, we give complete, correct proofs and implementation details of an O(n^2) algorithm. We also extend the solution to a more general class of problems and give an approximation algorithm that runs in linear time.
2) Several algorithms have been developed that use high-throughput sequencing (HTS) technology to characterize structural variations (SVs). Most existing approaches focus on detecting relatively simple types of SVs such as insertions, deletions, and short inversions. However, complex SVs are of crucial importance, and several have been associated with genomic disorders. To better understand the contribution of complex SVs to human disease, we need new algorithms to accurately discover and genotype such variants. We give a novel statistical modeling method to characterize complex SVs in the genome.
3) We study how to attack machine learning models so that we can improve the robustness of deep neural networks. We propose a novel way to formulate the hard-label black-box attack as a real-valued optimization problem that is usually continuous and can be solved by any zeroth-order optimization algorithm. We demonstrate that the proposed method outperforms the previous random-walk approach on attacking convolutional neural networks on the MNIST, CIFAR, and ImageNet datasets. More interestingly, we show that the proposed algorithm can also be used to attack other discrete and non-continuous machine learning models, such as Gradient Boosting Trees.
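For orientation, the textbook dynamic program referred to in result 1 can be sketched as below. This is only the standard cubic-time baseline, not the dissertation's polygon-triangulation algorithm; the function name `matrix_chain_cost` and the `dims` convention (matrix i has shape dims[i] x dims[i+1]) are assumptions made for this illustration.

```python
def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications needed to multiply a chain.

    Assumed convention: the chain has n = len(dims) - 1 matrices and
    matrix i has shape dims[i] x dims[i + 1].
    """
    n = len(dims) - 1
    if n < 1:
        raise ValueError("dims must describe at least one matrix")

    # cost[i][j] = cheapest way to multiply matrices i..j (inclusive).
    cost = [[0] * n for _ in range(n)]

    # Fill the table by increasing chain length: O(n^2) cells,
    # each trying O(n) split points, hence cubic time overall.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return cost[0][n - 1]


if __name__ == "__main__":
    # Three matrices: 10x30, 30x5, 5x60. Multiplying the first two first
    # costs 10*30*5 + 10*5*60 = 4500, which is the optimum.
    print(matrix_chain_cost([10, 30, 5, 60]))  # 4500
```

The three nested loops (chain length, start index, split point) are what give the Θ(n^3) bound quoted in the summary.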