Classification of structured data (i.e., data that are represented as graphs) is a topic of interest in areas such as bioinformatics and cheminformatics. This talk presents a novel, simple approach to the problem of structured pattern recognition, relying on the description of graphs in terms of algebraic binary relations. Maximum-a-posteriori decision rules over relations require the estimation of class-conditional probability density functions (pdfs) defined on graphs. A nonparametric technique for the estimation of the pdfs is introduced, based on a factorization of joint probabilities into individual densities that are modeled, in an unsupervised fashion, by Support Vector Machines (SVMs). SVM training is accomplished by applying support vector regression to an unbiased variant of the Parzen Window. The behavior of the estimation algorithm is first demonstrated on a synthetic distribution. Finally, experiments on the Mutagenesis (friendly + unfriendly) and Biodegradability datasets are presented. These tasks are representative of the more general problem of machine learning from molecular structures. Results show a dramatic improvement over state-of-the-art approaches to the problem, namely graph neural networks, kernels for graphs, and inductive logic programming.
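
As an illustration of the density-estimation step, the following minimal sketch computes Parzen Window targets on a synthetic one-dimensional sample. It assumes that the "unbiased variant" is a leave-one-out estimate, in which the target at each sample point is computed from all the other points; the abstract does not state this. The Gaussian kernel, the bandwidth `h`, and the function names are likewise illustrative choices, not the method presented in the talk.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the window function (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen(x, sample, h):
    """Ordinary Parzen Window estimate of the density at x with bandwidth h."""
    return sum(gaussian_kernel((x - s) / h) for s in sample) / (len(sample) * h)


def leave_one_out_targets(sample, h):
    """Density target for each sample point, estimated from all the other points.

    Leaving the point itself out removes the constant self-contribution
    K(0) / (n * h) that the ordinary estimate adds at every sample point.
    """
    targets = []
    for i, x in enumerate(sample):
        others = sample[:i] + sample[i + 1:]
        targets.append(parzen(x, others, h))
    return targets


# Synthetic distribution: 200 draws from a standard normal (hypothetical data).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
h = 0.4  # hand-picked bandwidth, not tuned

targets = leave_one_out_targets(sample, h)
```

In this reading, the pairs `(sample[i], targets[i])` would serve as the training set for a support vector regressor (for example `sklearn.svm.SVR`), whose output then acts as the density model; that regression step is not shown here.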