Boosting for Transfer Learning

Abstract
Traditional machine learning makes a basic assumption: the training and test data should be drawn from the same distribution. In many cases, however, this identical-distribution assumption does not hold. It may be violated when a task arrives in a new domain while labeled data are available only from a similar old domain. Labeling the new data can be costly, and it would also be wasteful to throw away all the old data. In this paper, we present a novel transfer learning framework called TrAdaBoost, which extends boosting-based learning algorithms (Freund & Schapire, 1997). TrAdaBoost allows users to utilize a small amount of newly labeled data to leverage the old data in constructing a high-quality classification model for the new data. We show that this method allows us to learn an accurate model using only a tiny amount of new data and a large amount of old data, even when the new data alone are insufficient to train a model. We show that TrAdaBoost allows knowledge to be effectively transferred from the old data to the new. The effectiveness of our algorithm is analyzed theoretically and empirically to show that our iterative algorithm converges to an accurate model.
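To make the framework concrete, the sketch below illustrates the core TrAdaBoost weight-update idea described in the abstract: misclassified "old" (diff-distribution) examples have their weights shrunk, while misclassified "new" (same-distribution) examples are boosted as in standard AdaBoost. This is a minimal illustration, not the authors' reference implementation; it assumes binary labels in {0, 1}, uses a scikit-learn decision stump as the base learner, and the function name `tradaboost` and its parameters are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost(X_src, y_src, X_tgt, y_tgt, X_test, n_iters=20):
    """Sketch of TrAdaBoost for binary labels in {0, 1}.

    X_src/y_src: plentiful "old" (diff-distribution) data.
    X_tgt/y_tgt: scarce "new" (same-distribution) data.
    Returns predicted labels for X_test.
    """
    n, m = len(X_src), len(X_tgt)
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt])

    w = np.ones(n + m)                       # one weight per training example
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / n_iters))
    beta_t = np.zeros(n_iters)
    votes = np.zeros(len(X_test))

    for t in range(n_iters):
        p = w / w.sum()                      # normalize weights to a distribution
        h = DecisionTreeClassifier(max_depth=1)   # weak learner (decision stump)
        h.fit(X, y, sample_weight=p)

        err = np.abs(h.predict(X) - y)       # 0/1 loss per example
        # Weighted error measured only on the "new" (target) data.
        eps = np.sum(w[n:] * err[n:]) / np.sum(w[n:])
        eps = min(max(eps, 1e-10), 0.499)    # keep beta_t well defined
        beta_t[t] = eps / (1.0 - eps)

        # Old examples the learner got wrong look "too different" from the
        # new distribution: shrink their weight. Misclassified new examples
        # get boosted, exactly as in AdaBoost.
        w[:n] *= beta_src ** err[:n]
        w[n:] *= beta_t[t] ** -err[n:]

        # Final hypothesis: weighted vote over the later half of the rounds.
        if t >= n_iters // 2:
            votes += -np.log(beta_t[t]) * (h.predict(X_test) - 0.5)

    return (votes >= 0).astype(int)
```

The key design choice, per the abstract's framing, is that the two parts of the training set are reweighted by different rules: old-domain weights only ever decrease (down-weighting instances that conflict with the new distribution), while new-domain weights follow the usual AdaBoost boosting of hard examples.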
