Learning Visio-linguistic Embeddings for Interactive Fashion Product Retrieval
Department of Computer Science, University of Verona

Speaker: Loris Bazzani - Amazon

Monday, May 30, 2022 at 3:00 PM, Sala Verde

Interactive product retrieval is an emerging research topic that aims to integrate user inputs from multiple modalities into a single retrieval query. In this presentation, we discuss different solutions to the problem of composing images with language-based or attribute-based modifications for product retrieval in the context of fashion. We present Joint Visual Semantic Matching (JVSM), a unified model that learns image-text compositional embeddings by jointly associating visual and textual modalities in a shared discriminative embedding space via compositional losses. We also propose Visio-linguistic Attention Learning (VAL), a composite transformer that can be seamlessly plugged into a CNN to selectively preserve and transform the visual features conditioned on language semantics at different levels of granularity. Finally, we introduce Attribute-Driven Disentangled Encoder (ADDE), a model for disentangled representations based on attribute supervision, and tailor it to attribute manipulation, outfit retrieval, and conditional image retrieval.

BIO: Loris is a Principal Computer Vision Scientist at Amazon in Berlin, Germany. He prototypes and develops video understanding models for Amazon Video, novel shopping experiences based on vision and language, innovative solutions in the field of Fashion AI, and image-to-text models for improving the accessibility of images for the blind and visually impaired. He obtained his Ph.D. in Computer Science from the University of Verona (Italy) in 2012, supervised by Prof. Vittorio Murino and Prof. Marco Cristani. During his Ph.D., he spent 6 months at the University of British Columbia, supervised by Prof. Nando de Freitas. Before his current position, he was a postdoctoral fellow at Dartmouth College working with Prof. Lorenzo Torresani and a postdoctoral fellow at the Italian Institute of Technology working with Prof. Vittorio Murino.

Programme Director: Marco Cristani
Publication date: May 3, 2022

Strada le Grazie 15
37134 Verona
VAT number: 01541040232
Italian Fiscal Code: 93009870234
