Visual Explainability and Robustness through Language

Speaker: Riccardo Volpi - Naver Labs Europe - France
Tuesday, May 28, 2024 at 1:45 PM, Aula C (in person only)

Abstract: In recent years, the vision-and-language paradigm has revolutionized the way we learn and rely on computer vision models. A major drawback of learning visual representations has always been the lack of data: by coupling vision models with large, pre-trained language models, we can partially mitigate this issue by building on large amounts of previously learned information. In this talk, we will discuss how using language can i) broaden the comfort zone of vision models for tasks such as object detection and classification and ii) improve their interpretability. We will go through the basics of the vision-and-language paradigm, highlight some of its inherent limitations, and discuss some innovative solutions, for example making CLIP-like models robust to arbitrary vocabularies selected by the user.
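The mechanism behind "arbitrary vocabularies selected by the user" can be illustrated with a minimal, self-contained sketch of CLIP-style zero-shot classification. This is not the speaker's method: the toy encoders below (seeded random vectors) merely stand in for real pre-trained image and text encoders, and all names are illustrative. The point is the structure: class names from a user-chosen vocabulary are embedded into the same space as the image, and the image is assigned to the most similar text embedding.

```python
import zlib
import numpy as np

DIM = 8  # toy embedding dimension; real CLIP models use 512+

def normalize(v):
    # Project onto the unit sphere so dot products are cosine similarities.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def text_encoder(word):
    # Stand-in for a real text encoder: a fixed unit vector per word,
    # deterministically seeded so the same word always maps to the same point.
    seed = zlib.crc32(word.encode())
    return normalize(np.random.default_rng(seed).normal(size=DIM))

def image_encoder(content):
    # Stand-in for a real image encoder: a "photo" embeds close to the
    # text embedding of its true label, plus a little noise.
    rng = np.random.default_rng(1)
    return normalize(text_encoder(content) + 0.05 * rng.normal(size=DIM))

def zero_shot_classify(image_emb, vocabulary):
    # Embed every word of an arbitrary, user-chosen vocabulary and pick
    # the class whose text embedding is most similar to the image.
    text_embs = np.stack([text_encoder(w) for w in vocabulary])
    sims = text_embs @ image_emb
    return vocabulary[int(np.argmax(sims))]

vocab = ["cat", "dog", "bicycle"]  # chosen at test time, no retraining
pred = zero_shot_classify(image_encoder("cat"), vocab)
print(pred)
```

Because classification reduces to nearest-neighbor search over text embeddings, the vocabulary can be swapped freely at inference time; the robustness issues mentioned in the abstract arise precisely because nothing constrains what the user puts in that list.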


Programme Director
Vittorio Murino

Publication date
April 18, 2024
