In recent years, several methods have been proposed to combine multiple kernels instead of using a single one. These different kernels may correspond to using different notions of similarity or may be using information coming from multiple sources (different representations or different feature subsets). In trying to organize and highlight the similarities and differences between them, I will give a taxonomy of and review several multiple kernel learning algorithms. Most of the previous research on such methods is focused on the computational efficiency issue. However, it is still not feasible to combine many kernels using existing Bayesian approaches due to their high time complexity. We propose a fully conjugate Bayesian formulation and derive a deterministic variational approximation, which allows us to combine hundreds or thousands of kernels very efficiently. Experiments with large numbers of kernels on benchmark data sets show that our inference method is quite fast, requiring less than a minute. On one bioinformatics and three image recognition data sets, our method outperforms previously reported results with better generalization performance.
CSS e script comuni siti DOL - frase 9957