David Danks
- Published in print:
- 2007
- Published Online:
- April 2010
- ISBN:
- 9780195176803
- eISBN:
- 9780199958511
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780195176803.003.0012
- Subject:
- Psychology, Developmental Psychology
Many different, seemingly mutually exclusive, theories of categorization have been proposed in recent years. The most notable theories have been those based on prototypes, exemplars, and causal models. This chapter provides “representation theorems” for each of these theories in the framework of probabilistic graphical models. More specifically, it shows for each of these psychological theories that the categorization judgments predicted and explained by the theory can be wholly captured using probabilistic graphical models. In other words, probabilistic graphical models provide a lingua franca for these disparate categorization theories, and so we can quite directly compare the different types of theories. These formal results are used to explain a variety of surprising empirical results, and to propose several novel theories of categorization.
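To give the flavor of such a representation theorem, here is a minimal sketch (an illustration in standard notation, not the chapter's exact construction): a prototype theory that scores an item by independent feature matches corresponds to a naive-Bayes-structured graphical model, with the category label as the sole parent of every feature.

```latex
% Prototype-style categorization as a naive-Bayes graphical model:
% features F_1, ..., F_n are independent given the category C.
P(F_1, \dots, F_n \mid C) \;=\; \prod_{i=1}^{n} P(F_i \mid C),
\qquad
P(C \mid f_1, \dots, f_n) \;\propto\; P(C) \prod_{i=1}^{n} P(f_i \mid C)
```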
David Firth
- Published in print:
- 2005
- Published Online:
- September 2007
- ISBN:
- 9780198566540
- eISBN:
- 9780191718038
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198566540.003.0008
- Subject:
- Mathematics, Probability / Statistics
This chapter summarizes recent themes and research topics in social statistics, viewed as statistical methods of particular value in substantive research fields such as criminology, demography, economics, education, geography, politics, psychology, public health, social policy, and sociology. Special emphasis is given to multi-level models, small area estimation, models for obtaining measuring instruments, and weighting problems arising in survey data. Particular areas in which further work seems likely to be fruitful are identified by discussing special features connected with incomplete data; policy evaluations; causal inquiries; event history data; aggregate data; macro-level phenomena arising from actions of individuals who influence one another; performance monitoring of public services; and open-source projects for statistical computing.
Chris Ray and Sharon K. Collinge
- Published in print:
- 2006
- Published Online:
- September 2007
- ISBN:
- 9780198567080
- eISBN:
- 9780191717871
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198567080.003.0014
- Subject:
- Biology, Disease Ecology / Epidemiology
Plague is emerging as a threat to humans and wildlife throughout western North America. Sylvatic plague, caused by the bacterium Yersinia pestis, is maintained within a network of mammal species and their fleas. No ‘classic’ reservoir has been identified; no resistant host species is known to develop sufficient bacteremia to support vector transmission. Epizootics are detected through the observation of mass mortality in conspicuous species like prairie dogs. Prairie dogs have key effects on both the ecological and epidemiological dynamics of prairie communities. The diversity of small mammals is lower in prairie dog colonies, despite higher densities of certain species on colonies relative to other grassland sites. This pattern suggests increased competition or apparent competition in colonies, perhaps through shared use of prairie dog burrows. Graphical models demonstrate how the ratio of interspecific to intraspecific interactions may be altered in colonies, affecting the potential for plague transmission in complex ways.
Christopher Meek and Ydo Wexler
- Published in print:
- 2011
- Published Online:
- January 2012
- ISBN:
- 9780199694587
- eISBN:
- 9780191731921
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780199694587.003.0015
- Subject:
- Mathematics, Probability / Statistics
We describe the Multiplicative Approximation Scheme (MAS) for approximate inference in multiplicative models. We apply this scheme to develop the DynaDecomp approximation algorithm. This algorithm can be used to obtain bounded approximations for various types of max‐sum‐product problems, including the computation of the log probability of evidence, the log‐partition function, Most Probable Explanation (MPE), and maximum a posteriori probability (MAP) inference problems. We demonstrate that this algorithm yields bounded approximations superior to existing methods using a variety of large graphical models.
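DynaDecomp itself is not reproduced here, but the underlying idea of trading exactness for a guaranteed bound by decomposing an intractable sum has a classic, simple instance: splitting a variable's bucket of factors during elimination, mini-bucket style. The sketch below illustrates that generic idea on a toy model of my own; it is not the authors' MAS/DynaDecomp algorithm.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
# Random positive pairwise factors on a triangle over binary x1, x2, x3.
f12 = rng.uniform(0.5, 2.0, size=(2, 2))
f13 = rng.uniform(0.5, 2.0, size=(2, 2))
f23 = rng.uniform(0.5, 2.0, size=(2, 2))

# Exact partition function by brute-force enumeration.
Z = sum(f12[x1, x2] * f13[x1, x3] * f23[x2, x3]
        for x1, x2, x3 in itertools.product(range(2), repeat=3))

# Mini-bucket-style upper bound: when eliminating x1, split its bucket
# {f12, f13} and bound  sum_x1 f12*f13  <=  (sum_x1 f12) * (max_x1 f13).
g2 = f12.sum(axis=0)     # g2[x2] = sum_x1 f12[x1, x2]
g3 = f13.max(axis=0)     # g3[x3] = max_x1 f13[x1, x3]
Z_ub = sum(g2[x2] * g3[x3] * f23[x2, x3]
           for x2, x3 in itertools.product(range(2), repeat=2))

print(f"log Z = {np.log(Z):.4f},  upper bound = {np.log(Z_ub):.4f}")
assert Z_ub >= Z
```

Since all factors are positive, sum_x f*g <= (sum_x f)*(max_x g), so the bound always holds; bounded-approximation schemes like MAS aim to control how loose such decompositions become.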
Haley J. Abel and Alun Thomas
- Published in print:
- 2014
- Published Online:
- December 2014
- ISBN:
- 9780198709022
- eISBN:
- 9780191779619
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198709022.003.0010
- Subject:
- Mathematics, Probability / Statistics, Biostatistics
This chapter describes the use of decomposable graphical models (DGMs) to represent the dependences within genetic data, or linkage disequilibrium (LD), prior to various downstream applications. First, general learning algorithms are reviewed: schemes based on Markov chain Monte Carlo and related simulated annealing strategies are described. However, for tractable processing of high-dimensional data, it is shown that sampling the space of DGMs can be efficiently replaced with sampling representations of DGMs — the junction trees. Then, a first application is considered: phase imputation for diploid data, which consists of inferring the latent phased haplotypes underlying the observed unphased genotypes. In particular, it is shown that, in the case of diploid data, decoupling the model estimation step from the phasing step allows the whole learning process to scale. The chapter ends with an illustration of the potential of DGMs through four applications.
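For reference, the property that makes decomposable models attractive in this setting is that their joint density factorizes in closed form over the cliques and separators of any junction tree:

```latex
% Factorization of a decomposable graphical model along a junction tree,
% with cliques \mathcal{C} and separators \mathcal{S}:
p(x) \;=\; \frac{\prod_{C \in \mathcal{C}} p(x_C)}{\prod_{S \in \mathcal{S}} p(x_S)}
```

This is why sampling junction trees, rather than arbitrary graphs, keeps both learning and downstream computation tractable.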
Guido Consonni and Luca La Rocca
- Published in print:
- 2011
- Published Online:
- January 2012
- ISBN:
- 9780199694587
- eISBN:
- 9780191731921
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780199694587.003.0004
- Subject:
- Mathematics, Probability / Statistics
We propose a new method for the objective comparison of two nested models based on non‐local priors. More specifically, starting with a default prior under each of the two models, we construct a moment prior under the larger model, and then use the fractional Bayes factor for a comparison. Non‐local priors have been recently introduced to obtain a better separation between nested models, thus accelerating the learning behaviour, relative to currently used local priors, when the smaller model holds. Although the argument showing the superior performance of non‐local priors is asymptotic, the improvement they produce is already apparent for small to moderate sample sizes, which makes them a useful and practical tool. As a by‐product, it turns out that routinely used objective methods, such as ordinary fractional Bayes factors, are alarmingly slow in learning that the smaller model holds. On the downside, when the larger model holds, non‐local priors exhibit a weaker discriminatory power against sampling distributions close to the smaller model. However, this drawback becomes rapidly negligible as the sample size grows, because the learning rate of the Bayes factor under the larger model is exponentially fast, whether one uses local or non‐local priors. We apply our methodology to directed acyclic graph models having a Gaussian distribution. Because of the recursive nature of the joint density, and the assumption of global parameter independence embodied in our prior, calculations need only be performed for individual vertices admitting a distinct parent structure under the two graphs; additionally, we obtain closed‐form expressions as in the ordinary conjugate case. We provide illustrations of our method for a simple three‐variable case, as well as for a more elaborate seven‐variable situation. Although we concentrate on pairwise comparisons of nested models, our procedure can be implemented to carry out a search over the space of all models.
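As a pointer to the two ingredients named above (standard definitions in my notation, not necessarily the chapter's): a moment prior forces the prior density to vanish at the null value, and the fractional Bayes factor uses a fraction b of the likelihood to convert default priors into usable ones.

```latex
% Moment (non-local) prior built from a default local prior \pi(\theta),
% vanishing at the smaller model's value \theta_0:
\pi_M(\theta) \;\propto\; (\theta - \theta_0)^{2h}\, \pi(\theta), \qquad h \in \{1, 2, \dots\}

% Fractional Bayes factor (O'Hagan) of model 1 versus model 0, with
% m_i^b(y) = \int f(y \mid \theta_i)^b \, \pi_i(\theta_i) \, d\theta_i:
B_{10}^{F}(y) \;=\; \frac{m_1^1(y) \,/\, m_1^b(y)}{m_0^1(y) \,/\, m_0^b(y)}
```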
Raphaël Mourad (ed.)
- Published in print:
- 2014
- Published Online:
- December 2014
- ISBN:
- 9780198709022
- eISBN:
- 9780191779619
- Item type:
- book
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198709022.001.0001
- Subject:
- Mathematics, Probability / Statistics, Biostatistics
At the crossroads between statistics and machine learning, probabilistic graphical models provide a powerful formal framework to model complex data. Probabilistic graphical models are probabilistic models whose graphical components denote conditional independence structures between random variables. The probabilistic framework makes it possible to deal with data uncertainty, while the conditional independence assumption helps process high-dimensional and complex data. Examples of probabilistic graphical models are Bayesian networks and Markov random fields, which represent two of the most popular classes of such models. With the rapid advancement of high-throughput technologies and the ever-decreasing costs of these next-generation technologies, a fast-growing volume of biological data of various types—the so-called omics—is in need of accurate and efficient methods for modeling, prior to further downstream analysis. Network reconstruction from gene expression data represents perhaps the most emblematic area of research where probabilistic graphical models have been successfully applied. However, these models have also created renewed interest in genetics, in particular in association genetics, causality discovery, prediction of outcomes, detection of copy number variations, and epigenetics. For all these reasons, it is foreseeable that such models will have a prominent role to play in advances in genome-wide analyses.
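For concreteness, these two classes encode conditional independence through two different factorizations of the joint distribution:

```latex
% Bayesian network: directed factorization over each variable's parents.
p(x_1, \dots, x_n) \;=\; \prod_{i=1}^{n} p\bigl(x_i \mid x_{\mathrm{pa}(i)}\bigr)

% Markov random field: undirected factorization over cliques C, with
% potentials \psi_C and normalizing constant Z.
p(x) \;=\; \frac{1}{Z} \prod_{C \in \mathcal{C}} \psi_C(x_C)
```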
Christine Sinoquet
- Published in print:
- 2014
- Published Online:
- December 2014
- ISBN:
- 9780198709022
- eISBN:
- 9780191779619
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198709022.003.0002
- Subject:
- Mathematics, Probability / Statistics, Biostatistics
The aim of this chapter is to offer an advanced tutorial to scientists with no background, or no deep background, in probabilistic graphical models. For readers more familiar with these models, the chapter can serve as a compendium of definitions and general methods, to browse through at will. Intentionally self-contained, the chapter begins with reminders of essential definitions, such as the distinction between marginal independence and conditional independence. Then it briefly surveys the most popular classes of probabilistic graphical models: Markov chains, Bayesian networks, and Markov random fields. Next, probabilistic inference is explained and illustrated in the Bayesian network context. Finally, parameter and structure learning are presented.
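The opening distinction can be stated in one line each (standard definitions):

```latex
% Marginal independence:
X \perp\!\!\!\perp Y \;\iff\; p(x, y) = p(x)\, p(y) \quad \text{for all } x, y

% Conditional independence given Z:
X \perp\!\!\!\perp Y \mid Z \;\iff\; p(x, y \mid z) = p(x \mid z)\, p(y \mid z)
\quad \text{whenever } p(z) > 0
```

Neither property implies the other, which is exactly why the tutorial insists on the distinction.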
Marine Jeanmougin, Camille Charbonnier, Mickaël Guedj, and Julien Chiquet
- Published in print:
- 2014
- Published Online:
- December 2014
- ISBN:
- 9780198709022
- eISBN:
- 9780191779619
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198709022.003.0005
- Subject:
- Mathematics, Probability / Statistics, Biostatistics
Clustering genes with high correlations will group genes with close expression profiles, defining clusters of co-expressed genes. However, such correlations do not provide any clue about the chain of information going from gene to gene. Partial correlation quantifies the correlation between two genes after excluding the effects of the other genes. It thus makes it possible to distinguish the correlation of two genes due to direct causal relationships from the correlation that originates via intermediate genes. In this chapter, Gaussian graphical model (GGM) learning is set up as a covariate selection problem. Two least absolute shrinkage and selection operator (LASSO)-type techniques are described: the graphical LASSO approach and neighborhood selection. Then two extensions to the classical GGM are presented: GGMs are extended into structured GGMs, to account for modularity and, more generally, heterogeneity in the gene connection features. The extension using a biological prior on the network structure is illustrated on real data.
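A minimal sketch of the graphical-LASSO route (using scikit-learn's GraphicalLasso as a stand-in for the chapter's estimators; the data and variable names are invented for illustration): fit a sparse precision matrix and read the partial correlations off it.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
# Toy stand-in for an expression matrix: n samples x p genes.
X = rng.standard_normal((200, 5))
X[:, 1] += 0.8 * X[:, 0]          # make gene 1 depend on gene 0
X[:, 2] += 0.8 * X[:, 1]          # and gene 2 on gene 1 (a chain)

# L1-penalized maximum likelihood gives a sparse precision matrix.
model = GraphicalLasso(alpha=0.1).fit(X)
omega = model.precision_

# Partial correlation between genes i and j given all the others:
# rho_ij = -omega_ij / sqrt(omega_ii * omega_jj).
d = np.sqrt(np.diag(omega))
partial_corr = -omega / np.outer(d, d)
np.fill_diagonal(partial_corr, 1.0)

print(np.round(partial_corr, 2))
# Entries near zero suggest no direct edge; e.g. genes 0 and 2 should be
# (nearly) conditionally independent given gene 1 in this chain.
```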
Christine Sinoquet
- Published in print:
- 2014
- Published Online:
- December 2014
- ISBN:
- 9780198709022
- eISBN:
- 9780191779619
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198709022.003.0001
- Subject:
- Mathematics, Probability / Statistics, Biostatistics
The explosion in omics and other types of biological data has increased the demand for solid, large-scale statistical methods. These data can be discrete or continuous, dependent or independent, from many individuals or tissue types. There might be millions of correlated observations from a single individual, observations at different scales and levels, in addition to covariates. The study of living systems encompasses a wide range of concerns, from prospective to predictive and causal questions, reflecting the multiple interests in understanding biological mechanisms, disease etiology, predicting outcome, and deciphering causal relationships in data. Probabilistic graphical models provide precisely such a flexible statistical framework, one suitable for analyzing these data. Notably, graphical models are able to handle the dependences within data, an almost defining feature of cellular and other biological data.
Harri Kiiveri
- Published in print:
- 2014
- Published Online:
- December 2014
- ISBN:
- 9780198709022
- eISBN:
- 9780191779619
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198709022.003.0003
- Subject:
- Mathematics, Probability / Statistics, Biostatistics
The usual analysis of gene expression data ignores the correlation between gene expression values. Biologically, this assumption is unreasonable. The approach presented in this chapter allows for correlation between genes through a sparse Gaussian graphical model: sparse inverse covariance matrices and their associated graphical representations are used to capture the notion of gene networks. Existing methods are limited by the difficulty of identifying the pattern of zeroes in such inverse covariance matrices. A workable solution for determining the zero pattern is provided in this chapter. Two other important contributions of this chapter are a method for very high-dimensional model fitting and a distribution-free approach to hypothesis testing. Such tests address the assessment of differential expression and of differential connection, a novel notion introduced in this chapter. An example dealing with real data is presented.
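The reason the zero pattern is the crux: in a Gaussian model, a zero entry of the inverse covariance matrix is exactly a conditional independence, and hence a missing edge in the gene network (a standard fact, stated here in generic notation):

```latex
% For X ~ N(\mu, \Sigma) with precision matrix \Omega = \Sigma^{-1}:
\Omega_{ij} = 0
\;\iff\;
X_i \perp\!\!\!\perp X_j \;\bigm|\; X_{\{1,\dots,p\} \setminus \{i,j\}}
```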
Devavrat Shah
- Published in print:
- 2015
- Published Online:
- March 2016
- ISBN:
- 9780198743736
- eISBN:
- 9780191803802
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198743736.003.0001
- Subject:
- Physics, Theoretical, Computational, and Statistical Physics
This chapter introduces graphical models as a powerful tool for deriving efficient algorithms for inference problems. When dealing with complex interdependent variables, inference can become computationally demanding, and the dependence structure among the variables is then of great interest. In this chapter, directed and undirected graphical models are first defined, before some crucial results are stated, such as the Hammersley–Clifford theorem for Markov random fields and the junction tree property, aimed at finding groupings under which a graphical model becomes a tree. Taking advantage of the structure of the variables, belief propagation is then described, including two particular instances: the sum–product and max–sum algorithms. In the final section, the learning problem is addressed in three different contexts: parameter learning, graphical model learning, and latent graphical model learning.
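As a concrete instance of the sum–product algorithm, here is a minimal sketch (my toy example, not from the chapter) that computes exact node marginals on a three-variable chain by message passing and checks them against brute-force enumeration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
# Chain x0 - x1 - x2 over binary states, with positive pairwise
# potentials psi[i] coupling x_i and x_{i+1}.
psi = [rng.uniform(0.5, 2.0, size=(2, 2)) for _ in range(2)]

# Sum-product on a chain: forward messages m_f[i](x_i) summarize
# everything left of x_i; backward messages m_b[i](x_i) everything right.
m_f = [np.ones(2) for _ in range(3)]
m_b = [np.ones(2) for _ in range(3)]
for i in [1, 2]:                        # forward pass
    m_f[i] = m_f[i - 1] @ psi[i - 1]    # sum over x_{i-1}
for i in [1, 0]:                        # backward pass
    m_b[i] = psi[i] @ m_b[i + 1]        # sum over x_{i+1}

# Node marginals are products of incoming messages, normalized.
marginals = [m_f[i] * m_b[i] for i in range(3)]
marginals = [m / m.sum() for m in marginals]

# Check against brute-force enumeration of the joint distribution.
joint = np.zeros((2, 2, 2))
for x in itertools.product(range(2), repeat=3):
    joint[x] = psi[0][x[0], x[1]] * psi[1][x[1], x[2]]
joint /= joint.sum()
assert np.allclose(marginals[0], joint.sum(axis=(1, 2)))
assert np.allclose(marginals[1], joint.sum(axis=(0, 2)))
print(np.round(np.array(marginals), 3))
```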
James M. Robins and Thomas S. Richardson
- Published in print:
- 2011
- Published Online:
- November 2020
- ISBN:
- 9780199754649
- eISBN:
- 9780197565650
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/oso/9780199754649.003.0011
- Subject:
- Clinical Medicine and Allied Health, Psychiatry
The subject-specific data from either an observational or experimental study consist of a string of numbers. These numbers represent a series of empirical measurements. Calculations are performed on these strings and causal inferences are drawn. For example, an investigator might conclude that the analysis provides strong evidence for “both an indirect effect of cigarette smoking on coronary artery disease through its effect on blood pressure and a direct effect not mediated by blood pressure.” The nature of the relationship between the sentence expressing these causal conclusions and the statistical computer calculations performed on the strings of numbers has been obscure. Since the computer algorithms are well-defined mathematical objects, it is crucial to provide formal causal models for the English sentences expressing the investigator’s causal inferences. In this chapter we restrict ourselves to causal models that can be represented by a directed acyclic graph. There are two common approaches to the construction of causal models. The first approach posits unobserved fixed ‘potential’ or ‘counterfactual’ outcomes for each unit under different possible joint treatments or exposures. The second approach relates the population distribution of outcomes under experimental interventions (with full compliance) to the set of (conditional) distributions that would be observed under passive observation (i.e., from observational data). We will refer to the former as ‘counterfactual’ causal models and the latter as ‘agnostic’ causal models (Spirtes, Glymour, & Scheines, 1993), as the second approach is agnostic as to whether unit-specific counterfactual outcomes exist, be they fixed or stochastic. The primary difference between the two approaches is ontological: the counterfactual approach assumes that counterfactual variables exist, while the agnostic approach does not require this. In fact, the counterfactual theory logically subsumes the agnostic theory, in the sense that the counterfactual approach is logically an extension of the latter approach. In particular, for a given graph, the causal contrasts (i.e., parameters) that are well-defined under the agnostic approach are also well-defined under the counterfactual approach.
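In the agnostic reading, the causal content of a DAG is commonly formalized via the manipulated (truncated) factorization, which relates interventional to observational distributions without mentioning counterfactuals (stated here in generic notation):

```latex
% Truncated factorization: intervening to set X = x removes the factors
% of the intervened variables and evaluates the remaining ones at X = x.
P\bigl(v \mid \mathrm{do}(X = x)\bigr)
\;=\;
\prod_{i \,:\, V_i \notin X} P\bigl(v_i \mid \mathrm{pa}_i\bigr)
\Bigg|_{X = x}
```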
Magy Seif El-Nasr, Truong Huy Nguyen Dinh, Alessandro Canossa, and Anders Drachen
- Published in print:
- 2021
- Published Online:
- November 2021
- ISBN:
- 9780192897879
- eISBN:
- 9780191919466
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/oso/9780192897879.003.0011
- Subject:
- Computer Science, Human-Computer Interaction, Game Studies
This chapter discusses more advanced methods for sequence analysis. These include: probabilistic methods using classical planning, Bayesian Networks (BN), Dynamic Bayesian Networks (DBNs), Hidden Markov Models (HMMs), Markov Logic Networks (MLNs), Markov Decision Process (MDP), and Recurrent Neural Networks (RNNs), specifically concentrating on LSTM (Long Short-Term Memory). These techniques are all great but, at this time, are mostly used in academia and less in the industry. Thus, the chapter takes a more academic approach, showing the work and its application to games when possible. The techniques are important as they cultivate future directions of how you can think about modeling, predicting players’ strategies, actions, and churn. We believe these methods can be leveraged in the future as the field advances and will have an impact in the industry. Please note that this chapter was developed in collaboration with several PhD students at Northeastern University, specifically Nathan Partlan, Madkour Abdelrahman Amr, and Sabbir Ahmad, who contributed greatly to this chapter and the case studies discussed.
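As a minimal illustration of the probabilistic-sequence viewpoint (a toy sketch with invented states, actions, and numbers, not a model from the chapter), here is the HMM forward algorithm scoring a short player-action sequence:

```python
import numpy as np

# Toy HMM for a player-action sequence (all numbers illustrative):
# hidden states = {explorer, fighter}, observations = {move, attack, trade}.
pi = np.array([0.6, 0.4])              # initial state distribution
A = np.array([[0.8, 0.2],              # state transition matrix
              [0.3, 0.7]])
B = np.array([[0.6, 0.1, 0.3],         # emission probabilities per state
              [0.2, 0.7, 0.1]])

obs = [0, 1, 1, 2, 0]                  # move, attack, attack, trade, move

# Forward algorithm: alpha[s] = P(observations so far, current state = s).
alpha = pi * B[:, obs[0]]
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]

print(f"likelihood of the action sequence: {alpha.sum():.6f}")
```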
Matthias Seeger
- Published in print:
- 2006
- Published Online:
- August 2013
- ISBN:
- 9780262033589
- eISBN:
- 9780262255899
- Item type:
- chapter
- Publisher:
- The MIT Press
- DOI:
- 10.7551/mitpress/9780262033589.003.0002
- Subject:
- Computer Science, Machine Learning
This chapter proposes a simple taxonomy of probabilistic graphical models for the semi-supervised learning (SSL) problem. It provides some broad classes of algorithms for each of the families and points to specific realizations in the literature. Finally, more detailed light is shed on the family of methods using input-dependent regularization or conditional prior distributions, and parallels to the co-training paradigm are shown. The SSL problem has recently attracted the attention of the machine learning community, mainly due to its significant importance in practical applications. The chapter then defines the problem and introduces the notation to be used. It is argued here that SSL is much more a practical than a theoretical problem. A useful SSL technique should be configurable to the specifics of the task in a similar way as Bayesian learning, through the choice of prior and model.
Marc Mézard
- Published in print:
- 2015
- Published Online:
- March 2016
- ISBN:
- 9780198743736
- eISBN:
- 9780191803802
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198743736.003.0004
- Subject:
- Physics, Theoretical, Computational, and Statistical Physics
The cavity method is introduced as a heuristic framework from a physics perspective to solve probabilistic graphical models and is presented at both the replica symmetry (RS) and one-step replica symmetry breaking (1RSB) levels. This technique has been applied with success to a wide range of models and problems, such as spin glasses, random constraint satisfaction problems (rCSP), and error correcting codes. First, the RS cavity solution for the Sherrington–Kirkpatrick model—a fully connected spin glass model—is derived and its equivalence to the RS solution obtained using replicas is discussed. The general cavity method for diluted graphs is then illustrated at both RS and 1RSB levels. The latter was a significant breakthrough in the last decade and has direct applications to rCSP. Finally, as an example of an actual problem, k-SAT is investigated using belief and survey propagation.
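For orientation, at the RS level on a sparse (diluted) graph the cavity equations take the standard message-passing form below (generic notation for an Ising spin glass; the chapter derives these and their 1RSB generalization):

```latex
% RS cavity recursion: cavity fields h_{i -> j} on the edges of the graph.
h_{i \to j} \;=\; \sum_{k \in \partial i \setminus j} u\bigl(J_{ki}, h_{k \to i}\bigr),
\qquad
u(J, h) \;=\; \frac{1}{\beta}\, \mathrm{artanh}\bigl(\tanh(\beta J)\, \tanh(\beta h)\bigr)
```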
Martin V. Butz and Esther F. Kutter
- Published in print:
- 2017
- Published Online:
- July 2017
- ISBN:
- 9780198739692
- eISBN:
- 9780191834462
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780198739692.003.0009
- Subject:
- Psychology, Cognitive Models and Architectures, Cognitive Psychology
While bottom-up visual processing is important, the brain integrates this information with top-down, generative expectations from very early on in the visual processing hierarchy. Indeed, our brain should not be viewed as a classification system, but rather as a generative system, which perceives something by integrating sensory evidence with the available, learned, predictive knowledge about that thing. The involved generative models continuously produce expectations over time, across space, and from abstracted encodings to more concrete encodings. Bayesian information processing is the key to understanding how information integration must work computationally – at least approximately – in the brain as well. Bayesian networks in the form of graphical models allow the modularization of information and the factorization of interactions, which can strongly improve the efficiency of generative models. The resulting generative models essentially produce state estimations in the form of probability densities, which are very well suited to integrating multiple sources of information, including top-down and bottom-up ones. A hierarchical neural visual processing architecture illustrates this point even further. Finally, some well-known visual illusions are shown, and the perceptions are explained by means of generative, information-integrating, perceptual processes, which in all cases combine top-down prior knowledge and expectations about objects and environments with the available, bottom-up visual information.
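The integration the chapter describes has a compact textbook core; for two Gaussian cues, Bayes' rule yields a precision-weighted average (a standard result, not specific to this chapter):

```latex
% Bayes' rule for perception: posterior over world state s given evidence e.
p(s \mid e) \;\propto\; p(e \mid s)\, p(s)

% Fusing two Gaussian cues with means \mu_1, \mu_2 and variances
% \sigma_1^2, \sigma_2^2:
\mu_{\mathrm{post}} \;=\; \frac{\sigma_2^2 \mu_1 + \sigma_1^2 \mu_2}{\sigma_1^2 + \sigma_2^2},
\qquad
\frac{1}{\sigma_{\mathrm{post}}^2} \;=\; \frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}
```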