Video Relationship Detection Using Mixture of Experts

Machine comprehension of visual information from images and videos by neural networks faces two primary challenges. Firstly, there exists a computational and inference gap in connecting vision and language, making it difficult to accurately determine which object a given agent acts on and represent...

Full description

Saved in:
Bibliographic Details
Published in:arXiv.org
Main Authors: Shaabana, Ala, Gharaee, Zahra, Fieguth, Paul
Format: Paper
Language:English
Published: Ithaca Cornell University Library, arXiv.org 06.03.2024
Subjects:
ISSN:2331-8422
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Be the first to leave a comment!
You must be logged in first