Optimizing Training Data for Image Classifiers

Authors: Matthew Hagen, Ala Eddine Ayadi, Jiaqi Wang, Nikolaos Vasiloglou, Estelle Afshar. 2019.

In KDD 2019 Workshop on Data Collection, Curation, and Labeling for Mining and Learning (DCCL, KDD '19).

In this paper, we propose a robust method for outlier removal to improve the performance of image classification. Increasing the size of the training data does not necessarily raise prediction accuracy, because some instances may be poor representatives of their respective classes. Four separate experiments are conducted to evaluate the effectiveness of outlier removal for several classifiers. Embeddings are generated from a pre-trained neural network, a fine-tuned network, and a Siamese network. Subsequently, outlier detection is evaluated based on clustering quality and on the classification performance of a fully-connected feed-forward network, a K-Nearest Neighbors classifier, and a gradient boosting model.
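To make the pipeline described above concrete, the sketch below illustrates one plausible way to embed images with a pre-trained network, drop per-class outliers in embedding space, and retrain a simple classifier on the cleaned data. The paper does not publish code; the specific library choices here (a torchvision ResNet-50 backbone, scikit-learn's LocalOutlierFactor and KNeighborsClassifier) are illustrative assumptions rather than the authors' exact setup.

```python
# Minimal sketch of embedding-based outlier removal, loosely following the
# abstract. Library choices are assumptions, not the authors' implementation.
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from sklearn.neighbors import LocalOutlierFactor, KNeighborsClassifier

# 1. Embed images with a pre-trained network (use the pooled features,
#    dropping the classification head).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed(paths):
    """Return an (n, 2048) array of embeddings for a list of image paths."""
    feats = []
    with torch.no_grad():
        for p in paths:
            x = preprocess(Image.open(p).convert("RGB")).unsqueeze(0)
            feats.append(backbone(x).squeeze(0).numpy())
    return np.stack(feats)

# 2. Flag outliers within each class in embedding space and drop them.
def remove_outliers(X, y, contamination=0.05):
    keep = np.ones(len(y), dtype=bool)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        lof = LocalOutlierFactor(
            n_neighbors=min(20, max(1, len(idx) - 1)),
            contamination=contamination,
        )
        keep[idx] = lof.fit_predict(X[idx]) == 1  # -1 marks outliers
    return X[keep], y[keep]

# 3. Train a simple classifier on the cleaned embeddings (KNN here; the paper
#    also evaluates a feed-forward network and a gradient boosting model).
# train_paths and train_labels below are hypothetical inputs:
# X_train, y_train = embed(train_paths), np.array(train_labels)
# X_clean, y_clean = remove_outliers(X_train, y_train)
# clf = KNeighborsClassifier(n_neighbors=5).fit(X_clean, y_clean)
```

In this setup, the same train/evaluate comparison can be run with and without the outlier-removal step to measure its effect on test accuracy, which mirrors the kind of evaluation the abstract describes.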

Read the PDF: Optimizing Training Data for Image Classifiers
