Probabilistic Modeling in Machine Learning

Bacciu, Davide; Lisboa, Paulo J. G.; Sperduti, Alessandro; Villmann, Thomas

doi:10.1007/978-3-662-43505-2_31

Probabilistic methods are the heart of machine learning. This chapter shows links between core principles of information theory and probabilistic methods, with a short overview of historical and current examples of unsupervised and inferential models. Probabilistic models are introduced as a powerful idiom to describe the world, using random variables as building blocks held together by probabilistic relationships. The chapter discusses how such probabilistic interactions can be mapped to directed and undirected graph structures, which are the Bayesian and Markov networks. We show how these networks are subsumed by the broader class of the probabilistic graphical models, a general framework that provides concepts and methodological tools to encode, manipulate and process probabilistic knowledge in a computationally efficient way. The chapter then introduces, in more detail, two topical methodologies that are central to probabilistic modeling in machine learning. First, it discusses latent variable models, a probabilistic approach to capture complex relationships between a large number of observable and measurable events (data, in general), under the assumption that these are generated by an unknown, nonobservable process. We show how the parameters of a probabilistic model involving such nonobservable information can be efficiently estimated using the concepts underlying the expectation–maximization algorithms. Second, the chapter introduces a notable example of latent variable model, that is of particular relevance for representing the time evolution of sequence data, that is the hidden Markov model . The chapter ends with a discussion on advanced approaches for modeling complex data-generating processes comprising nontrivial probabilistic interactions between latent variables and observed information.