Interactive product retrieval is an emerging research topic with the objective of integrating user inputs from multiple modalities into a query for retrieval. In this presentation, we discuss different solutions to the problem of composing images with language-based or attribute-based modifications for product retrieval in the context of fashion. We present Joint Visual Semantic Matching (JVSM), a unified model that learns image-text compositional embeddings by jointly associating visual and textual modalities in a shared discriminative embedding space via compositional losses. We also propose Visio-linguistic Attention Learning (VAL), a composite transformer that can be seamlessly plugged into a CNN to selectively preserve and transform visual features conditioned on language semantics at different levels of granularity. Finally, we introduce the Attribute-Driven Disentangled Encoder (ADDE), a model that learns disentangled representations from attribute supervision, and tailor it to attribute manipulation, outfit retrieval, and conditional image retrieval.
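To make the core idea concrete, the following is a minimal sketch of composed image-text retrieval: an image embedding and a text-modification embedding are fused by a gated residual composition, and the composed query is matched against a gallery by cosine similarity. This is an illustrative toy, not the JVSM, VAL, or ADDE models from the talk; the gating form, dimensions, and random weights are all assumptions for demonstration.

```python
# Illustrative sketch only: gated image-text composition for retrieval.
# Shapes, the gating form, and all weights are assumed for demonstration;
# they do not reproduce the models described in the abstract.
import numpy as np

rng = np.random.default_rng(0)
D = 8  # embedding dimension (assumed)

def l2_normalize(x):
    # Unit-normalize along the last axis so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-9)

def compose(img_emb, txt_emb, W_gate, W_res):
    """Gated residual composition: preserve some image features,
    transform others according to the text modification."""
    joint = np.concatenate([img_emb, txt_emb], axis=-1)
    gate = 1.0 / (1.0 + np.exp(-joint @ W_gate))   # sigmoid gate in [0, 1]
    residual = joint @ W_res                        # text-conditioned transform
    return l2_normalize(gate * img_emb + residual)

# Toy gallery of 5 candidate product embeddings plus one composed query.
gallery = l2_normalize(rng.normal(size=(5, D)))
img_q = l2_normalize(rng.normal(size=D))   # reference product image
txt_q = l2_normalize(rng.normal(size=D))   # encoded modification text
W_gate = rng.normal(size=(2 * D, D))
W_res = rng.normal(size=(2 * D, D))

query = compose(img_q, txt_q, W_gate, W_res)
scores = gallery @ query        # cosine similarity (all vectors unit-norm)
best = int(np.argmax(scores))   # index of the retrieved product
print(best, scores.shape)
```

In a trained system the random weights would be learned with a compositional (e.g. triplet or contrastive) loss so that the composed query lands near the target product in the shared embedding space.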
BIO: Loris is a Principal Computer Vision Scientist at Amazon in Berlin, Germany. He prototypes and develops video understanding models for Amazon Video, novel shopping experiences based on vision and language, innovative solutions in the field of Fashion AI, and image-to-text models for improving the accessibility of images for blind and visually impaired users. He obtained his Ph.D. in Computer Science from the University of Verona (Italy) in 2012, supervised by Prof. Vittorio Murino and Prof. Marco Cristani. During his Ph.D., he spent six months at the University of British Columbia supervised by Prof. Nando de Freitas. Before his current position, he was a postdoctoral fellow at Dartmouth College working with Prof. Lorenzo Torresani and a postdoctoral fellow at the Italian Institute of Technology working with Prof. Vittorio Murino.