Towards sub-quadratic learning of probability density models in the form of mixtures of trees
Abstract
We consider randomization schemes for the Chow-Liu algorithm, ranging from weak ones (bagging, of quadratic complexity) to strong ones (full random sampling, of linear complexity), for learning probability density models in the form of mixtures of Markov trees. Our empirical study on high-dimensional synthetic problems shows that, while bagging is the most accurate scheme on average, some of the stronger randomizations remain very competitive in terms of accuracy, especially for small sample sizes.

1 Background and motivations

A Bayesian network (BN) over a finite set $X = \{X_i\}_{i=1}^{n}$ of $n$ discrete random variables is a graphical probabilistic model consisting of two parts [1]. The first is a directed acyclic graph over the variables (each variable corresponds to a vertex in this graph). The second is a set of conditional probability distributions (one for each variable $X_i$, conditionally on its parents in the graph). A BN encodes a joint distribution over $X$ as the product of these conditional distributions, and it may be exploited to perform probabilistic inference over that distribution. However, both exact inference and structure learning with BNs are NP-hard unless the skeleton of the graph is constrained [2].

Markov trees are an interesting subclass of BNs, whose skeletons are acyclic and in which each node of the graph has at most one parent [1]. With Markov trees, the computational complexity of inference is linear in the number of variables, and structure learning by the so-called Chow-Liu algorithm is essentially quadratic in the number of variables [3]. However, the restrictions on the structure also limit the modeling abilities of Markov trees, and in practice they are often not suited to represent a target distribution.

A mixture model of size $m$ consists of a set of $m$ different probability models, each of them defined over $X$. Each term of the mixture is associated with a positive weight $w_i$, under the constraint $\sum_{i=1}^{m} w_i = 1$. A Mixture of Trees (MT) [4] is a particular type of mixture where each term $T_i$ of the model is a Markov tree defined on $X$. The probability $P(x)$ of an event $x$ encoded by the MT model is the weighted sum of the probabilities assigned to $x$ by the individual trees, $P(x) = \sum_{i=1}^{m} w_i T_i(x)$.
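Since the Chow-Liu procedure and the mixture density are only described in words above, the following Python sketch illustrates them under stated assumptions: the function names, the use of numpy, and the naive Prim's search for the spanning tree are illustrative choices, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code) of the Chow-Liu step:
# estimate all pairwise mutual informations from data (the quadratic part)
# and keep a maximum-weight spanning tree over them.
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information between two discrete samples."""
    joint = np.zeros((x.max() + 1, y.max() + 1))
    np.add.at(joint, (x, y), 1)           # contingency table of counts
    joint /= joint.sum()                  # joint empirical distribution
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def chow_liu_tree(data):
    """data: (N, n) array of discrete observations; returns the n-1 edges
    of a maximum mutual-information spanning tree (Chow-Liu structure)."""
    n = data.shape[1]
    mi = np.zeros((n, n))
    for i in range(n):                    # O(n^2) pairs: the quadratic cost
        for j in range(i + 1, n):
            mi[i, j] = mi[j, i] = mutual_information(data[:, i], data[:, j])
    # Naive Prim's algorithm: greedily grow a maximum-weight spanning tree.
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        i, j = max(((a, b) for a in in_tree for b in range(n) if b not in in_tree),
                   key=lambda e: mi[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

def mixture_density(x, tree_likelihoods, weights):
    """P(x) under a mixture of trees: weighted sum of per-tree likelihoods.
    `tree_likelihoods` are callables returning T_i(x); weights sum to one."""
    return sum(w * t(x) for w, t in zip(weights, tree_likelihoods))
```

The randomization schemes compared in the paper differ in how the data (or the set of candidate edges) fed to this quadratic step is subsampled for each tree of the mixture.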