Storkey, Amos
- Published in print:
- 2008
- Published Online:
- August 2013
- ISBN:
- 9780262170055
- eISBN:
- 9780262255103
- Item type:
- chapter
- Publisher:
- The MIT Press
- DOI:
- 10.7551/mitpress/9780262170055.003.0001
- Subject:
- Computer Science, Machine Learning
This chapter introduces the general learning transfer problem and formulates it in terms of a change of scenario. Standard regression and classification models can be characterized as conditional models. Assuming that the conditional model is true, covariate shift is not an issue. However, if this assumption does not hold, conditional modeling will fail. The chapter then characterizes a number of different cases of dataset shift, including simple covariate shift, prior probability shift, sample selection bias, imbalanced data, domain shift, and source component shift. Each of these situations is cast within the framework of graphical models, and a number of approaches to addressing each of these problems are reviewed. The chapter also presents a framework for multiple dataset learning that prompts the possibility of using hierarchical dataset linkage.
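The abstract's observation that covariate shift is harmless when the conditional model is correctly specified can be illustrated with a small simulation. The sketch below is purely illustrative (all data, distributions, and coefficients are invented, not taken from the chapter): training and test covariates are drawn from different Gaussians while p(y|x) stays fixed, and the training set is reweighted by the density ratio p_test(x)/p_train(x), the standard importance-weighting correction for covariate shift.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simple covariate shift: p(x) differs between training and test,
# but the conditional p(y|x) is shared.
x_tr = rng.normal(0.0, 1.0, 200)  # training covariates ~ N(0, 1)
x_te = rng.normal(1.0, 1.0, 200)  # test covariates ~ N(1, 1)

def true_mean(x):
    # Hypothetical true conditional mean, identical in both datasets.
    return 1.0 + 2.0 * x

y_tr = true_mean(x_tr) + rng.normal(0.0, 0.1, x_tr.size)

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Importance weights p_test(x) / p_train(x), known here by construction.
w = gauss_pdf(x_tr, 1.0, 1.0) / gauss_pdf(x_tr, 0.0, 1.0)

# Importance-weighted least squares for y = a + b * x.
X = np.column_stack([np.ones_like(x_tr), x_tr])
W = np.diag(w)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y_tr)
```

Because the linear conditional model is correctly specified here, the weighted fit recovers the true coefficients (a = 1, b = 2), and an unweighted fit would too; the weighting only starts to matter when the conditional model is misspecified, which is exactly the distinction the abstract draws.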
Ben-David, Shai
- Published in print:
- 2008
- Published Online:
- August 2013
- ISBN:
- 9780262170055
- eISBN:
- 9780262255103
- Item type:
- chapter
- Publisher:
- The MIT Press
- DOI:
- 10.7551/mitpress/9780262170055.003.0005
- Subject:
- Computer Science, Machine Learning
This chapter discusses some dataset shift learning problems from a formal, statistical point of view. It provides definitions for “multitask learning,” “inductive transfer,” and “domain adaptation,” and discusses the parameters along which such learning scenarios may be taxonomized. The chapter then focuses on one concrete setting of domain adaptation and demonstrates how error bounds can be derived for it. These bounds can be reliably estimated from finite samples of training data, and do not rely on any assumptions concerning similarity between the domain from which the labeled training data is sampled and the target (or test) data. However, they are relative to the performance of some optimal classifier, rather than providing any absolute performance guarantee.
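Bounds of the kind the abstract describes typically contain a term measuring how distinguishable the source and target domains are, which can be estimated from unlabeled samples by seeing how well a classifier separates them. The sketch below is a loose illustration of that idea, not the chapter's own construction: the data, the one-dimensional threshold classifier, and the "proxy distance" formula 2(1 − 2·err) are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical unlabeled 1-D samples from a source and a target domain.
src = rng.normal(0.0, 1.0, 500)
tgt = rng.normal(2.0, 1.0, 500)

# Label points by domain (0 = source, 1 = target) and find the best
# simple threshold classifier at telling the domains apart.
xs = np.concatenate([src, tgt])
ys = np.concatenate([np.zeros(500), np.ones(500)])

thresholds = np.linspace(xs.min(), xs.max(), 200)
errs = [np.mean((xs > t).astype(float) != ys) for t in thresholds]
domain_err = min(errs)

# Proxy distance between domains: near 2 when the domains are easily
# separated (adaptation is risky), near 0 when they are indistinguishable.
proxy_distance = 2.0 * (1.0 - 2.0 * domain_err)
```

A low domain-classification error signals that the source and target samples look alike, so the domain-dependent term in the bound is small; a near-zero error (highly separable domains) makes the bound loose, matching the abstract's caveat that the guarantees are relative rather than absolute.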