Heterogeneous Data Integration for Structured Natural Language Processing

Starting date: March 9, 2022
Duration (months): 7
Departments: Computer Science
Managers or local contacts: Pietro Sala

The project focuses on the structural analysis of documents from heterogeneous sources that lack explicit structural information but have their own implicit structure (e.g., a photocopy of an invoice, an identity document, a business card). Starting from these inputs, the project aims to use machine learning techniques to extract structural information, such as the presence and position of the various description-value fields, so that the values of interest can be extracted and integrated into Customer Relationship Management (CRM) systems.

Sponsors:

VTENEXT s.r.l.
Funds: assigned and managed by the department
Syllabus: Commissioned research

Project participants

Sandro Bernardinello: Scholarship holder
Pietro Sala: Temporary Assistant Professor

Research areas involved in the project

Information systems
Data management systems
