Lev Ginzburg and Mark Colyvan
- Published in print:
- 2003
- Published Online:
- September 2007
- ISBN:
- 9780195168167
- eISBN:
- 9780199790159
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:oso/9780195168167.003.0008
- Subject:
- Biology, Ecology
It can be very difficult to decide between two competing theories. It is rarely a straightforward matter of appealing to evidence. Sometimes neither theory conforms perfectly with the evidence, and sometimes more than one theory can be made to agree with the evidence. The issue then is not which theory best accords with the data, but which does so in the simplest or least ad hoc way. This chapter discusses some of the philosophical issues associated with scientific theory choice. These issues in philosophy of science shed light on the choice of population model, and lend support to the inertial model proposed in this book.
A. Townsend Peterson, Jorge Soberón, Richard G. Pearson, Robert P. Anderson, Enrique Martínez-Meyer, Miguel Nakamura, and Miguel Bastos Araújo
- Published in print:
- 2011
- Published Online:
- October 2017
- ISBN:
- 9780691136868
- eISBN:
- 9781400840670
- Item type:
- chapter
- Publisher:
- Princeton University Press
- DOI:
- 10.23943/princeton/9780691136868.003.0007
- Subject:
- Biology, Ecology
This chapter explains how environmental data can be used to create models that characterize species’ ecological niches in environmental space. It introduces a model: a function constructed by means of data analysis to approximate the true relationship (that is, the niche) in the form of a function f linking the environment and species occurrences. The chapter first considers the “meaning” of the function f that is being estimated by the algorithms before discussing the modeling algorithms, the approaches used to implement ecological niche modeling, model calibration, model complexity and overfitting, and model extrapolation and transferability. The chapter concludes with an overview of differences among methods and selection of “best” models, along with strategies for characterizing ecological niches in ways that allow visualization, comparisons, definition of quantitative measures, and more.
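As a hedged illustration of what estimating such a function f might look like in practice, here is a minimal sketch: the choice of logistic regression, the predictor names, and the simulated "true" niche are all assumptions of mine, not the chapter's prescription.

```python
# A hedged sketch, not the chapter's method: approximating the
# environment-occurrence function f with logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
temp = rng.normal(20, 5, n)          # temperature (deg C), illustrative
precip = rng.normal(10, 3, n)        # precipitation (hundreds of mm)
X = np.column_stack([temp, precip])

# Hypothetical niche: occurrence most likely near 22 deg C, 1100 mm.
logit = 1.0 - ((temp - 22) / 4) ** 2 - ((precip - 11) / 2.5) ** 2
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))   # presence/absence

f_hat = LogisticRegression(max_iter=1000).fit(X, y)    # f-hat: env -> P(occurrence)
print(f_hat.predict_proba([[22.0, 11.0]])[0, 1])       # probability at niche center
```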
A. Townsend Peterson, Jorge Soberón, Richard G. Pearson, Robert P. Anderson, Enrique Martínez-Meyer, Miguel Nakamura, and Miguel Bastos Araújo
- Published in print:
- 2011
- Published Online:
- October 2017
- ISBN:
- 9780691136868
- eISBN:
- 9781400840670
- Item type:
- chapter
- Publisher:
- Princeton University Press
- DOI:
- 10.23943/princeton/9780691136868.003.0009
- Subject:
- Biology, Ecology
This chapter describes a framework for selecting appropriate strategies for evaluating model performance and significance. It begins with a review of key concepts, focusing on how primary occurrence data can be presence-only, presence/background, presence/pseudoabsence, or presence/absence, as well as factors that may contribute to apparent commission error. It then considers the availability of two pools of occurrence data: one for model calibration and another for evaluation of model predictions. It also discusses strategies for detecting overfitting or sensitivity to bias in model calibration, with particular emphasis on quantification of performance and tests of significance. Finally, it suggests directions for future research as regards model evaluation, highlighting areas in need of theoretical and/or methodological advances.
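A minimal sketch of the two-pool workflow the abstract describes, on simulated presence/absence data; the 70/30 split and the default classification threshold are illustrative choices, not the chapter's recommendations.

```python
# Calibration pool vs. evaluation pool, with omission and commission
# rates computed on the held-out pool. All data here are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))                    # environmental predictors
y = (X[:, 0] + rng.normal(0, 0.5, 300)) > 0      # presence/absence

# One pool for model calibration, a second for evaluating predictions.
X_cal, X_eval, y_cal, y_eval = train_test_split(X, y, test_size=0.3,
                                                random_state=1)
pred = LogisticRegression().fit(X_cal, y_cal).predict(X_eval)

omission = np.mean(~pred[y_eval])      # true presences predicted absent
commission = np.mean(pred[~y_eval])    # absences predicted present
print(f"omission = {omission:.2f}, commission = {commission:.2f}")
```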
Timothy Williamson
- Published in print:
- 2020
- Published Online:
- August 2020
- ISBN:
- 9780198860662
- eISBN:
- 9780191893391
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/oso/9780198860662.003.0016
- Subject:
- Philosophy, Philosophy of Language
This brief chapter draws some methodological morals from the preceding arguments. Semantics must beware of what scientists call overfitting the data: complicating the theory to fit data which may be unreliable, a process which tends not to yield equilibrium. One source of unreliability is human reliance on heuristics for assessing sentences, since their truth-value in a context is not epistemically transparent to us. This non-transparency explains why the role of heuristics cannot be reduced to standard Gricean mechanisms in pragmatics, which typically work off literal truth-conditions; sameness of truth-conditions is not transparent to speakers, and heuristics may treat sentences with the same truth-conditions differently. Semantics needs to take into account the role of heuristics as a distinctive source of error in its (apparent) data.
Dale Schuurmans, Finnegan Southey, Dana Wilkinson, and Yuhong Guo
- Published in print:
- 2006
- Published Online:
- August 2013
- ISBN:
- 9780262033589
- eISBN:
- 9780262255899
- Item type:
- chapter
- Publisher:
- The MIT Press
- DOI:
- 10.7551/mitpress/9780262033589.003.0023
- Subject:
- Computer Science, Machine Learning
Semi-supervised learning methods typically require that an explicit relationship be asserted between labeled and unlabeled data. The semi-supervised model selection and regularization methods presented in this chapter instead require only that the labeled and unlabeled data are drawn from the same distribution. From this assumption, a metric can be constructed over hypotheses based on their predictions for unlabeled data. This metric can then be used to detect untrustworthy training error estimates, leading to model selection strategies that select the richest hypothesis class while providing theoretical guarantees against overfitting. This general approach is then adapted to regularization for supervised regression and supervised classification with probabilistic classifiers. The regularization adapts not only to the hypothesis class but also to the specific data sample provided, allowing for better performance than regularizers that account only for class complexity.
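A rough sketch of the general idea, under stated assumptions rather than the chapter's exact construction: when labeled and unlabeled data share one distribution, the disagreement of two hypotheses on unlabeled points defines a metric, and near-agreement on the training set combined with divergence off it flags the richer model's low training error as untrustworthy.

```python
# Disagreement metric over hypotheses, evaluated on unlabeled data.
# The tree models and the 1-D regression setup are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X_lab = rng.uniform(-1, 1, (20, 1))
y_lab = np.sin(3 * X_lab[:, 0]) + rng.normal(0, 0.3, 20)
X_unlab = rng.uniform(-1, 1, (500, 1))      # same distribution, no labels

simple = DecisionTreeRegressor(max_depth=2).fit(X_lab, y_lab)
rich = DecisionTreeRegressor().fit(X_lab, y_lab)   # unconstrained depth

# Metric over hypotheses: mean absolute disagreement in predictions.
d_train = np.mean(np.abs(simple.predict(X_lab) - rich.predict(X_lab)))
d_unlab = np.mean(np.abs(simple.predict(X_unlab) - rich.predict(X_unlab)))

# Agreement on labeled points plus large disagreement on unlabeled
# points suggests the rich model's training error estimate is suspect.
print(f"disagreement: train = {d_train:.3f}, unlabeled = {d_unlab:.3f}")
```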
Bradley E. Alger
- Published in print:
- 2019
- Published Online:
- February 2021
- ISBN:
- 9780190881481
- eISBN:
- 9780190093761
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/oso/9780190881481.003.0006
- Subject:
- Neuroscience, Techniques
This chapter covers the basics of Bayesian statistics, emphasizing the conceptual framework for Bayes’ Theorem. It works through several iterations of the theorem to demonstrate how the same equation is applied in different circumstances, from constructing and updating models to parameter evaluation, to try to establish an intuitive feel for it. The chapter also covers the philosophical underpinnings of Bayesianism and compares them with the frequentist perspective described in Chapter 5. It addresses the question of whether Bayesians are inductivists. Finally, the chapter shows how the Bayesian procedures of model selection and comparison can be pressed into service to allow Bayesian methods to be used in hypothesis testing in essentially the same way that various p-tests are used in the frequentist hypothesis testing framework.
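For reference, the equation at the chapter's core can be stated in its parameter-estimation form (standard notation; the symbols are conventional, not necessarily the chapter's own):

```latex
% Bayes' theorem for updating a parameter \theta given data D:
% posterior \propto likelihood \times prior
P(\theta \mid D) \;=\; \frac{P(D \mid \theta)\,P(\theta)}{P(D)},
\qquad
P(D) \;=\; \int P(D \mid \theta)\,P(\theta)\,d\theta .
```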
Colin F. Camerer
- Published in print:
- 2019
- Published Online:
- January 2020
- ISBN:
- 9780226613338
- eISBN:
- 9780226613475
- Item type:
- chapter
- Publisher:
- University of Chicago Press
- DOI:
- 10.7208/chicago/9780226613475.003.0024
- Subject:
- Economics and Finance, Microeconomics
An important strand of judgment and decision research in the 1970s and 1980s, which influenced behavioral economics mostly indirectly, documented the fact that simple statistical models often predict better than experts do (beginning with Meehl 1954). This chapter revisits this phenomenon and connects it to modern machine learning (ML) debates. One view is that subjective judgment exhibits properties of “unregularized” ML which overfits, because human judgment does not naturally penalize overfitting. I also discuss how ML techniques, in large data sets, can help discover different “behavioral types” which correspond to, or extend, heterogeneous types hypothesized previously. ML applications in practice, such as recommender systems, also connect to behavioral concepts of how limited attention and preference assembly create opportunities to either benefit or harm consumers.
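The ML analogy can be made concrete with a toy sketch (mine, not the chapter's): an unregularized high-degree polynomial fit to a small noisy sample overfits, while a ridge penalty, the kind of complexity cost the chapter suggests unaided judgment lacks, pulls the fit back toward the truth.

```python
# Unregularized vs. ridge-penalized fit on 15 noisy points whose true
# relationship is linear; degree 9 and alpha=1.0 are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, 15)
y = x + rng.normal(0, 0.4, 15)              # truth is a line plus noise

poly = PolynomialFeatures(degree=9)
X = poly.fit_transform(x.reshape(-1, 1))
x_grid = np.linspace(-1, 1, 200)
X_grid = poly.transform(x_grid.reshape(-1, 1))

for name, model in [("unregularized", LinearRegression()),
                    ("ridge", Ridge(alpha=1.0))]:
    model.fit(X, y)
    err = np.mean((model.predict(X_grid) - x_grid) ** 2)  # error vs. truth
    print(f"{name:13s} error vs. true line: {err:.3f}")
```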
Roger Arditi and Lev R. Ginzburg
- Published in print:
- 2012
- Published Online:
- May 2015
- ISBN:
- 9780199913831
- eISBN:
- 9780190267902
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/acprof:osobl/9780199913831.003.0007
- Subject:
- Biology, Ecology
This chapter explains the role of ratio dependence as an invariance in ecology. It analyzes the importance of scaling invariance in mathematics and proposes that all models of interacting species must be fundamentally invariant to a proportional change in the system. It uses Newton's law of inertia as a metaphor for the Malthusian law of population growth and extends this to include interacting species. It looks into Andrey Kolmogorov's insight on the possibility of ratio-dependent interaction, and examines H. Reşit Akçakaya's well-known study of the Canadian lynx and hare cycles. It compares the number of unsupported parameters in Akçakaya's work with those in other studies as a measure of overfitting. It also discusses idealizations, or limit myths, of theories and applies this to the ratio-dependence model.
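The invariance claim can be stated compactly. In standard predator-prey notation (N prey, P predators; the symbols and the specific model form below are illustrative, not lifted from the chapter), a prey-dependent functional response g(N) changes under the proportional rescaling (N, P) → (kN, kP), whereas a ratio-dependent one g(N/P) does not:

```latex
% Ratio-dependent predator-prey model (illustrative form):
\frac{dN}{dt} = rN - g\!\left(\frac{N}{P}\right)P,
\qquad
\frac{dP}{dt} = e\,g\!\left(\frac{N}{P}\right)P - \mu P
% Under (N,P) \to (kN,kP), both equations scale by k, so the
% per-capita rates are unchanged: the scaling invariance at issue.
```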
Lionel Raff, Ranga Komanduri, Martin Hagan, and Satish Bukkapatnam
- Published in print:
- 2012
- Published Online:
- November 2020
- ISBN:
- 9780199765652
- eISBN:
- 9780197563113
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/oso/9780199765652.003.0007
- Subject:
- Chemistry, Physical Chemistry
In this section, we want to give a brief introduction to neural networks (NNs). It is written for readers who are not familiar with neural networks but are curious about how they can be applied to practical problems in chemical reaction dynamics. The field of neural networks covers a very broad area, and it is not possible to discuss all types of neural networks here. Instead, we will concentrate on the most common neural network architecture, namely, the multilayer perceptron (MLP). We will describe the basics of this architecture, discuss its capabilities, and show how it has been used on several different chemical reaction dynamics problems (for introductions to other types of networks, the reader is referred to References 105-107).

For the purposes of this chapter, we will look at neural networks as function approximators. As shown in Figure 3-1, we have some unknown function that we wish to approximate. We want to adjust the parameters of the network so that it will produce the same response as the unknown function, if the same input is applied to both systems. For our applications, the unknown function may correspond to the relationship between the atomic structure variables and the resulting potential energy and forces.

The multilayer perceptron neural network is built up of simple components. We will begin with a single-input neuron, which we will then extend to multiple inputs. We will next stack these neurons together to produce layers. Finally, we will cascade the layers together to form the network. A single-input neuron is shown in Figure 3-2. The scalar input p is multiplied by the scalar weight w to form wp, one of the terms that is sent to the summer. The other input, 1, is multiplied by a bias b and then passed to the summer. The summer output n, often referred to as the net input, goes into a transfer function f, which produces the scalar neuron output a.
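The single-input neuron described above reduces to one line of arithmetic, a = f(wp + b). A minimal transcription in Python, with tanh as an illustrative transfer function (the text does not fix a particular f):

```python
# Single-input neuron: a = f(wp + b), with tanh chosen as f here
# purely for illustration.
import math

def neuron(p: float, w: float, b: float) -> float:
    n = w * p + b          # net input: weighted input plus bias
    return math.tanh(n)    # transfer function f yields the output a

print(neuron(p=0.5, w=2.0, b=-0.3))   # a = tanh(2.0*0.5 - 0.3) ≈ 0.604
```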