Masking Modalities for Cross-modal Video Retrieval (ComputerVisionFoundation Videos)
Probabilistic Embeddings for Cross-Modal Retrieval (CVPR'21) (SangHyuk Chun)
Cross-modal Adversarial Reprogramming (ComputerVisionFoundation Videos)
A Transformer-based Cross-modal Fusion Model with Adversarial Training for VQA Challenge 2021 (MLP Lab)
Florence: A New Foundation Model for Computer Vision (Tony Shin)
How cross-modal are vision and language models really 👀 Seeing past words. [Own work] (AI Coffee Break with Letitia)
ImageBERT (Connor Shorten)
[CVPR'23 Highlight] AutoAD: Movie Description in Context (Tengda Han)
Bridging Video Text Retrieval With Multiple Choice Questions | CVPR 2022 (Artificial Intelligence)
How Are Words Connected in our Minds Priming (The Ling Space)