Andrew J. Connolly, Jacob T. VanderPlas, and Alexander Gray
- Published in print:
- 2014
- Published Online:
- October 2017
- ISBN:
- 9780691151687
- eISBN:
- 9781400848911
- Item type:
- chapter
- Publisher:
- Princeton University Press
- DOI:
- 10.23943/princeton/9780691151687.003.0007
- Subject:
- Physics, Particle Physics / Astrophysics / Cosmology
With the dramatic increase in data available from a new generation of astronomical telescopes and instruments, many analyses must address the question of the complexity as well as the size of the data set. This chapter deals with how we can learn which measurements, properties, or combinations thereof carry the most information within a data set. It describes techniques related to concepts introduced earlier in the context of Gaussian distributions, density estimation, and information content. The chapter begins with an exploration of the problems posed by high-dimensional data. It then describes the data sets used in this chapter and introduces perhaps the most important and widely used dimensionality reduction technique, principal component analysis (PCA). The remainder of the chapter discusses several alternative techniques that address some of the weaknesses of PCA.
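As a rough illustration of the kind of dimensionality reduction the chapter introduces, here is a minimal PCA sketch using scikit-learn; the library choice and the toy data are assumptions for illustration, not the book's own code:

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy stand-in for an astronomical data set: 1000 "spectra" with 50 "wavelength bins".
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))
X[:, :3] *= 10  # give a few directions much larger variance

# Project onto the directions of maximal variance.
pca = PCA(n_components=3)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # (1000, 3)
print(pca.explained_variance_ratio_)  # fraction of total variance per component
```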
Masashi Sugiyama and Motoaki Kawanabe
- Published in print:
- 2012
- Published Online:
- September 2013
- ISBN:
- 9780262017091
- eISBN:
- 9780262301220
- Item type:
- chapter
- Publisher:
- The MIT Press
- DOI:
- 10.7551/mitpress/9780262017091.003.0005
- Subject:
- Computer Science, Machine Learning
This chapter discusses a dimensionality reduction scheme for density-ratio estimation, called direct density-ratio estimation with dimensionality reduction (D3; pronounced as “D-cube”). The basic idea of D3 is to find a low-dimensional subspace in which training and test densities are significantly different, and estimate the density ratio only in this subspace. A supervised dimensionality reduction technique called local Fisher discriminant analysis (LFDA) is employed for identifying such a subspace. The usefulness of the D3 approach is illustrated through numerical experiments.
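The chapter's D3 procedure is not reproduced here, but the underlying idea (estimate the density ratio only in a subspace where the two samples actually differ) can be sketched as follows. Note the hedges: the crude mean-difference direction stands in for LFDA, and the classifier-based ratio estimator stands in for the chapter's direct estimator; neither is the book's actual algorithm.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
d = 20
X_train = rng.normal(size=(500, d))
X_test = rng.normal(size=(500, d))
X_test[:, 0] += 1.0  # densities differ only along the first coordinate

# Crude stand-in for LFDA: the direction along which the sample means differ most.
w = X_test.mean(axis=0) - X_train.mean(axis=0)
w /= np.linalg.norm(w)
z_train, z_test = X_train @ w, X_test @ w  # 1-D subspace where densities differ

# Classifier-based ratio estimation in that subspace:
# r(z) = P(test|z)/P(train|z) * n_train/n_test approximates p_test(z)/p_train(z).
z = np.concatenate([z_train, z_test]).reshape(-1, 1)
y = np.concatenate([np.zeros(len(z_train)), np.ones(len(z_test))])
clf = LogisticRegression().fit(z, y)
p = clf.predict_proba(z_train.reshape(-1, 1))[:, 1]
ratio = (p / (1 - p)) * (len(z_train) / len(z_test))
print(ratio[:5])  # importance weights for the first few training points
```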
Lawrence K. Saul, Kilian Q. Weinberger, Fei Sha, Jihun Ham, and Daniel D. Lee
- Published in print:
- 2006
- Published Online:
- August 2013
- ISBN:
- 9780262033589
- eISBN:
- 9780262255899
- Item type:
- chapter
- Publisher:
- The MIT Press
- DOI:
- 10.7551/mitpress/9780262033589.003.0016
- Subject:
- Computer Science, Machine Learning
This chapter provides an overview of unsupervised learning algorithms that can be viewed as spectral methods for linear and nonlinear dimensionality reduction. Spectral methods have recently emerged as a powerful tool for nonlinear dimensionality reduction and manifold learning. These methods are able to reveal low-dimensional structure in high-dimensional data from the top or bottom eigenvectors of specially constructed matrices. To analyze data that lie on a low-dimensional submanifold, the matrices are constructed from sparse weighted graphs whose vertices represent input patterns and whose edges indicate neighborhood relations. The main computations for manifold learning are based on tractable, polynomial-time optimizations, such as shortest-path problems, least-squares fits, semi-definite programming, and matrix diagonalization.
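For one concrete instance of such a spectral method, here is a short sketch using scikit-learn's locally linear embedding on a synthetic manifold; the library and the data set are assumptions for illustration, since the chapter surveys a whole family of such algorithms:

```python
from sklearn.datasets import make_s_curve
from sklearn.manifold import LocallyLinearEmbedding

# 3-D points sampled from a 2-D submanifold (an "S"-shaped sheet).
X, color = make_s_curve(n_samples=1000, random_state=0)

# Build a sparse neighborhood graph, solve a least-squares reconstruction
# per point, then take bottom eigenvectors of the resulting matrix.
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2)
X_unrolled = lle.fit_transform(X)
print(X_unrolled.shape)  # (1000, 2): the recovered low-dimensional coordinates
```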
Max A. Little
- Published in print:
- 2019
- Published Online:
- October 2019
- ISBN:
- 9780198714934
- eISBN:
- 9780191879180
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/oso/9780198714934.003.0006
- Subject:
- Mathematics, Logic / Computer Science / Mathematical Philosophy, Mathematical Physics
This chapter describes in detail how the main techniques of statistical machine learning can be constructed from the components described in earlier chapters. It presents these concepts in a way that demonstrates how these techniques can be viewed as special cases of a more general probabilistic model that we fit to data.
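As a hedged illustration of that framing (not the book's own example): ordinary least squares is the maximum-likelihood fit of a linear model with Gaussian noise, a classic case of a familiar technique falling out of a probabilistic model. The sketch below checks this numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=200)
y = 3.0 * x + 0.5 + rng.normal(scale=0.2, size=200)
A = np.column_stack([x, np.ones_like(x)])

# Least-squares fit ...
beta_ls, *_ = np.linalg.lstsq(A, y, rcond=None)

# ... equals the maximizer of the Gaussian log-likelihood
# sum_i log N(y_i | A_i beta, sigma^2), whose gradient vanishes
# at the normal equations A^T A beta = A^T y.
beta_mle = np.linalg.solve(A.T @ A, A.T @ y)

print(np.allclose(beta_ls, beta_mle))  # True
```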
Thomas P. Trappenberg
- Published in print:
- 2019
- Published Online:
- January 2020
- ISBN:
- 9780198828044
- eISBN:
- 9780191883873
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/oso/9780198828044.003.0003
- Subject:
- Neuroscience, Behavioral Neuroscience
This chapter’s goal is to show how to apply machine learning algorithms in a general setting using some classic methods. In particular, it demonstrates how to apply three important machine learning algorithms: a support vector classifier (SVC), a random forest classifier (RFC), and a multilayer perceptron (MLP). While many of the methods studied later go beyond these now classic algorithms, this does not mean they are obsolete; the algorithms discussed here also provide a baseline against which to discuss advanced methods such as probabilistic reasoning and deep learning. The aim is to demonstrate that applying machine learning methods through machine learning libraries is not very difficult, and doing so offers an opportunity to discuss evaluation techniques that are very important in practice.
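The chapter's point that library-based application is straightforward can be sketched in a few lines of scikit-learn; the library and the toy data set are assumptions here, and the book's own worked examples may differ:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# The three classic classifiers the chapter demonstrates, behind one interface.
models = {
    "SVC": SVC(),
    "RFC": RandomForestClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=2000, random_state=0),
}

# Cross-validation as a basic evaluation technique.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```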
Bryan C. Daniels
- Published in print:
- 2020
- Published Online:
- December 2020
- ISBN:
- 9780190636685
- eISBN:
- 9780190636722
- Item type:
- chapter
- Publisher:
- Oxford University Press
- DOI:
- 10.1093/oso/9780190636685.003.0003
- Subject:
- Economics and Finance, Microeconomics
From neurons to insects to societies, across biology we see impressive feats of collective information processing. What strategies do these systems use to perform useful computations? Moving toward an answer to this question, this chapter focuses on common challenges in inferring models of complicated distributed systems and how the perspective of information theory and statistical physics is useful for understanding collective behavior.