Learning Small Random Networks for Molecule Classification

Classification of structured data (i.e., data represented as graphs) is a topic of interest in areas such as bioinformatics and cheminformatics. This talk presents a novel, simple approach to structured pattern recognition that relies on describing graphs in terms of algebraic binary relations. Maximum-a-posteriori decision rules over relations require the estimation of class-conditional probability density functions (pdfs) defined on graphs. A nonparametric technique for estimating the pdfs is introduced, based on a factorization of joint probabilities into individual densities that are modeled, in an unsupervised fashion, via Support Vector Machines (SVMs). SVM training is accomplished by applying support vector regression to an unbiased variant of the Parzen window. The behavior of the estimation algorithm is first demonstrated on a synthetic distribution. Finally, experiments on the Mutagenesis (friendly + unfriendly) and Biodegradability datasets are presented. These tasks are representative of the more general problem of machine learning from molecular structures. Results show a dramatic improvement over state-of-the-art approaches to the problem, namely graph neural nets, kernels for graphs, and inductive logic programming.
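As a rough illustration of the decision rule described in the abstract, the sketch below scores a graph under each class by summing per-edge log densities and adding the log prior, then picks the class with the highest score. It is only a minimal sketch under stated assumptions, not the speaker's method: each edge is assumed to be summarized by a small real-valued feature vector, the factorization is assumed to be a plain product over edges, and a standard Gaussian Parzen window is used directly in place of the SVM regression stage. All names (`parzen_log_density`, `graph_log_likelihood`, `map_classify`), the bandwidth `h`, and the toy data are hypothetical.

```python
import numpy as np


def parzen_log_density(x, samples, h):
    """Log of a Gaussian Parzen-window density estimate at point x."""
    samples = np.atleast_2d(samples)
    d = samples.shape[1]
    sq_dist = np.sum((samples - x) ** 2, axis=1)
    # Log of each Gaussian kernel value, including its normalizing constant.
    log_k = -0.5 * sq_dist / h**2 - 0.5 * d * np.log(2 * np.pi * h**2)
    # Log-mean-exp keeps the average of the kernels numerically stable.
    m = log_k.max()
    return m + np.log(np.mean(np.exp(log_k - m)))


def graph_log_likelihood(edges, class_samples, h):
    """Sum of per-edge log densities (assumed product factorization)."""
    return sum(parzen_log_density(e, class_samples, h) for e in edges)


def map_classify(edges, samples_by_class, priors, h=0.5):
    """Return the class maximizing log prior + log class-conditional density."""
    scores = {
        c: np.log(priors[c]) + graph_log_likelihood(edges, s, h)
        for c, s in samples_by_class.items()
    }
    return max(scores, key=scores.get)


# Toy data (hypothetical): 2-D edge features drawn from two separated clusters.
rng = np.random.default_rng(0)
samples_by_class = {
    "active": rng.normal(loc=2.0, scale=0.5, size=(200, 2)),
    "inactive": rng.normal(loc=-2.0, scale=0.5, size=(200, 2)),
}
priors = {"active": 0.5, "inactive": 0.5}

# A "graph" reduced to the feature vectors of its three edges.
graph_edges = np.array([[1.8, 2.1], [2.2, 1.9], [1.5, 2.4]])
label = map_classify(graph_edges, samples_by_class, priors)
```

Working in log space avoids underflow when a graph has many edges, since the product of many small densities becomes a sum of logs.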

Edmondo Trentin - Dip. Ingegneria dell'Informazione, Università di Siena

Date and time
Tuesday, April 15, 2008, at 16:15. Talk starts at 16:30; coffee and cookies at 16:15.

Ca' Vignal 3 - Piramide, Floor 0, Sala Verde

Gloria Menegaz

Publication date
April 1, 2008

