• جعفر طهمورث نژاد

  • Associate Professor
  • Department of Information Technology Engineering
Morvarid Karimpour

Visual Domain Adaptation via Sample Selection



2018

Abstract

Most machine learning and data mining algorithms rely heavily on the assumption that the training (source domain) and test (target domain) data share the same feature space and distribution. However, this condition is violated in many real-world applications. In recent years, domain adaptation has emerged as a promising technique for dealing with such problems. Most existing approaches attempt to bring the source and target distributions closer together using all samples, yet not all instances are equally suitable for adaptation. Sample selection is a key technique for addressing the domain shift problem: it attempts to find a subset of source samples that is distributed most similarly to the target domain. In this research, three novel approaches based on sample selection are proposed, each of which also draws on other domain adaptation techniques to improve the quality of the results. All three proposed methods tackle the unsupervised domain shift problem by using instances that are similarly distributed across domains. The first approach, Domain-Invariant Cluster-based Adaptation for visual classification (DICA), employs feature matching to map the source and target data onto a new embedded subspace and adapts the joint mismatch of the marginal and conditional distributions across domains. Moreover, DICA constructs condensed domain-invariant clusters to project the source and target data into a discriminative shared subspace. In addition, DICA adapts the sample mismatch between the source and target domains by exploiting instance reweighting techniques. The second approach, Optimal Couple Projection and Classification for Domain Adaptive Cluster-based Representation for multiple sources (OCPC-DACR), is an extended form of DICA that benefits from model matching in addition to sample selection and feature matching. Moreover, OCPC-DACR aims to improve the performance of DICA on multi-source domain adaptation problems by introducing an adaptive classifier. The third approach, landmark-based unsupervised Domain adaptation with Genetic search (DAG), uses a genetic algorithm to identify optimal subsets of source samples, i.e., landmarks whose distributions are more similar to the target data. DAG then constructs easier auxiliary tasks from the selected landmarks and solves them via GFK. The proposed approaches are evaluated on four benchmark domain adaptation datasets: Office+Caltech, PIE, Digit, and COIL. Our comprehensive experiments indicate that all three proposed methods outperform other machine learning methods and state-of-the-art domain adaptation methods by a considerable margin.
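
The idea of searching for source landmarks that resemble the target can be illustrated with a small sketch. This is not the DAG method from the thesis. It is a toy genetic search over binary masks, and the names `linear_mmd` and `select_landmarks`, the fitness function (squared distance between domain means, i.e. MMD with a linear kernel), and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def linear_mmd(source, target):
    """Squared distance between domain means (MMD with a linear kernel)."""
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(diff @ diff)

def select_landmarks(source, target, pop_size=30, generations=40,
                     mutation_rate=0.05, min_size=5, seed=0):
    """Toy genetic search for a source subset whose mean matches the target.

    Hypothetical helper: the operators and defaults are illustrative
    assumptions, not the procedure used in the thesis.
    """
    rng = np.random.default_rng(seed)
    n = source.shape[0]

    def fitness(mask):
        # Subsets that are too small get the worst possible score.
        if mask.sum() < min_size:
            return np.inf
        return linear_mmd(source[mask], target)

    # Each individual is a boolean mask over the source samples.
    population = rng.random((pop_size, n)) < 0.5
    population[0] = True  # baseline individual: keep every source sample

    for _ in range(generations):
        scores = np.array([fitness(m) for m in population])
        order = np.argsort(scores)
        parents = population[order[: pop_size // 2]]  # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cross = rng.random(n) < 0.5            # uniform crossover
            child = np.where(cross, a, b)
            flip = rng.random(n) < mutation_rate   # bit-flip mutation
            children.append(child ^ flip)
        population = np.vstack([parents, np.array(children)])

    scores = np.array([fitness(m) for m in population])
    return population[np.argmin(scores)]
```

Because the best individuals survive each generation and the initial population contains the mask that keeps every sample, the returned subset is never further from the target (under this fitness) than the full source set. A real landmark selector would likely use a richer discrepancy measure and take class labels into account.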

Keywords: Unsupervised Domain Adaptation, Transfer Learning, Invariant Feature Representation, Domain Invariant Clustering, Instance Reweighting, Multi-source Learning, Landmark Selection, Genetic Algorithm




---