Boosting for transfer learning
- 20 June 2007
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 193-200
- https://doi.org/10.1145/1273496.1273521
Abstract
Traditional machine learning makes a basic assumption: the training and test data should be drawn from the same distribution. In many cases, however, this identical-distribution assumption does not hold. It may be violated when a task arrives from a new domain while labeled data exist only for a similar old domain. Labeling the new data can be costly, and it would also be a waste to throw away all the old data. In this paper, we present a novel transfer learning framework called TrAdaBoost, which extends boosting-based learning algorithms (Freund & Schapire, 1997). TrAdaBoost allows users to utilize a small amount of newly labeled data to leverage the old data to construct a high-quality classification model for the new data. We show that this method can learn an accurate model using only a tiny amount of new data and a large amount of old data, even when the new data alone are not sufficient to train a model. We show that TrAdaBoost allows knowledge to be effectively transferred from the old data to the new. The effectiveness of our algorithm is analyzed both theoretically and empirically, showing that our iterative algorithm converges to an accurate model.
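The mechanism the abstract describes can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: it uses a weighted decision stump as the base learner, binary labels in {0, 1}, and function names (`tradaboost`, `fit_stump`, `predict`) that are my own. The key idea is the asymmetric reweighting: misclassified old-domain (source) examples are shrunk by a fixed factor beta, while misclassified new-domain (target) examples are up-weighted as in AdaBoost, so later rounds focus on data that still looks relevant to the new task.

```python
import numpy as np

def tradaboost(X_src, y_src, X_tgt, y_tgt, n_iters=10):
    """Sketch of TrAdaBoost with a decision stump as base learner.

    X_src/y_src: labeled old-domain data; X_tgt/y_tgt: a small amount
    of labeled new-domain data. Labels are in {0, 1}.
    """
    n, m = len(X_src), len(X_tgt)
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt]).astype(float)
    w = np.ones(n + m)
    # fixed shrink factor for misclassified source examples (per the paper)
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / n_iters))
    stumps, beta_ts = [], []
    for _ in range(n_iters):
        p = w / w.sum()
        stump = fit_stump(X, y, p)
        err = np.abs(predict_stump(stump, X) - y)
        # weighted error measured on the *target* portion only
        eps = float((p[n:] * err[n:]).sum() / p[n:].sum())
        eps = min(eps, 0.499)  # keep beta_t well defined
        beta_t = eps / (1.0 - eps)
        # down-weight misclassified source, up-weight misclassified target
        w[:n] *= beta ** err[:n]
        w[n:] *= beta_t ** -err[n:]
        stumps.append(stump)
        beta_ts.append(beta_t)
    return stumps, beta_ts

def fit_stump(X, y, p):
    # exhaustive search for the (feature, threshold, polarity)
    # minimizing the weighted 0/1 error
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (0, 1):
                pred = (X[:, j] >= thr).astype(float)
                if pol:
                    pred = 1.0 - pred
                e = float((p * np.abs(pred - y)).sum())
                if best is None or e < best[0]:
                    best = (e, j, thr, pol)
    return best[1:]

def predict_stump(stump, X):
    j, thr, pol = stump
    pred = (X[:, j] >= thr).astype(float)
    return 1.0 - pred if pol else pred

def predict(stumps, beta_ts, X, n_iters):
    # weighted vote over the second half of the rounds, as in the
    # paper's final hypothesis
    half = n_iters // 2
    lhs = np.zeros(len(X))
    rhs = np.zeros(len(X))
    for stump, bt in zip(stumps[half:], beta_ts[half:]):
        a = -np.log(bt + 1e-12)
        lhs += a * predict_stump(stump, X)
        rhs += a * 0.5
    return (lhs >= rhs).astype(int)
```

In use, one would train on many labeled source examples plus a handful of target examples, then classify new target points with `predict`; as rounds progress, source examples that contradict the target labels lose weight and stop influencing the base learner.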
This publication has 11 references indexed in Scilit:
- Logistic regression with an auxiliary data source. Published by Association for Computing Machinery (ACM), 2005
- Improving SVM accuracy by training on auxiliary data sources. Published by Association for Computing Machinery (ACM), 2004
- Learning and evaluating classifiers under sample selection bias. Published by Association for Computing Machinery (ACM), 2004
- Exploiting Task Relatedness for Multiple Task Learning. Lecture Notes in Computer Science, 2003
- Learning to Classify Text Using Support Vector Machines. Published by Springer Science and Business Media LLC, 2002
- Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 2000
- A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 1997
- A training algorithm for optimal margin classifiers. Published by Association for Computing Machinery (ACM), 1992
- Sample Selection Bias as a Specification Error. Econometrica, 1979
- On Information and Sufficiency. The Annals of Mathematical Statistics, 1951