αILP: thinking visual scenes as differentiable logic programs
| Published in: | Machine Learning, Vol. 112, No. 5, pp. 1465-1497 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Springer US, 1 May 2023; Springer Nature B.V. |
| Subjects: | |
| ISSN: | 0885-6125, 1573-0565 |
| Summary: | Deep neural learning has shown remarkable performance at learning representations for visual object categorization. However, deep neural networks such as CNNs do not explicitly encode objects and relations among them. This limits their success on tasks that require a deep logical understanding of visual scenes, such as Kandinsky patterns and Bongard problems. To overcome these limitations, we introduce αILP, a novel differentiable inductive logic programming framework that learns to represent scenes as logic programs: intuitively, logical atoms correspond to objects, attributes, and relations, and clauses encode high-level scene information. αILP has an end-to-end reasoning architecture from visual inputs. Using it, αILP performs differentiable inductive logic programming on complex visual scenes, i.e., the logical rules are learned by gradient descent. Our extensive experiments on the Kandinsky patterns and CLEVR-Hans benchmarks demonstrate the accuracy and efficiency of αILP in learning complex visual-logical concepts. |
|---|---|
| DOI: | 10.1007/s10994-023-06320-1 |