Maximal-Sum submatrix search using a hybrid constraint programming/linear programming approach

Bibliographic details
Published in: European journal of operational research, Vol. 297, No. 3, pp. 853-865
Main authors: Derval, Guillaume; Schaus, Pierre
Format: Journal Article
Language: English
Published: Elsevier B.V, 16 March 2022
ISSN:0377-2217, 1872-6860
Description
Summary:
•The Maximum-Sum Submatrix problem aims at finding submatrices of maximum sum.
•Two upper bounds are proposed for the problem, both based on linear relaxations.
•A reduced-cost filtering algorithm is proposed for constraint programming solvers.
•Large instances are tackled using Large Neighborhood Search.
•Improved performance of CP on synthetic and real-world instances against MIP solvers.

A Maximal-Sum Submatrix (MSS) maximizes the sum of the entries corresponding to the Cartesian product of a subset of rows and a subset of columns from an original matrix (with positive and negative entries). Despite being NP-hard, this recently introduced problem has already proven useful for practical data-mining applications. It has been used to identify bi-clusters in gene expression data and to extract a submatrix that is then visualized in a circular plot. The state-of-the-art results for MSS are obtained using an advanced Constraint Programming approach that combines a custom filtering algorithm with a Large Neighborhood Search. We improve the state-of-the-art approach by introducing new upper bounds based on linear and mixed-integer programming formulations, along with dedicated pruning algorithms. We experiment on both synthetic and real-life data, and show that our approach outperforms the previous methods.
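To make the problem definition in the summary concrete, here is a minimal Python sketch of an exhaustive search for a maximal-sum submatrix. It is not the constraint programming/linear programming method of the article. It only illustrates the objective, and the function name `max_sum_submatrix` and the sample matrix are hypothetical. The sketch relies on one observation: once a subset of rows is fixed, the best column subset is exactly the set of columns whose sum over those rows is positive. Enumerating row subsets is exponential, so this is only usable on very small matrices.

```python
from itertools import combinations


def max_sum_submatrix(matrix):
    """Exhaustive search for a maximal-sum submatrix (illustration only).

    Returns (best_sum, rows, cols), where rows and cols are tuples of
    indices. The empty selection, with sum 0, is the baseline.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    best = (0, (), ())
    for size in range(1, n_rows + 1):
        for rows in combinations(range(n_rows), size):
            # Sum of each column restricted to the selected rows.
            col_sums = [sum(matrix[r][c] for r in rows) for c in range(n_cols)]
            # For fixed rows, keeping exactly the positive columns is optimal.
            cols = tuple(c for c, s in enumerate(col_sums) if s > 0)
            total = sum(col_sums[c] for c in cols)
            if total > best[0]:
                best = (total, rows, cols)
    return best


# Hypothetical 3x3 example with positive and negative entries.
example = [
    [3, -2, 4],
    [-1, 5, -6],
    [2, 1, 3],
]
print(max_sum_submatrix(example))  # (12, (0, 2), (0, 2))
```

In the example, rows 0 and 2 combined with columns 0 and 2 give 3 + 4 + 2 + 3 = 12, which beats selecting the whole matrix (sum 9) because the selection is a Cartesian product of row and column subsets and need not be contiguous.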
DOI:10.1016/j.ejor.2021.06.008