Training data used in machine learning applications are often assumed to be perfect, i.e., to contain no errors; in practice this is almost never the case, and label errors may limit the performance of the resulting model. In this paper, the effect of label errors in training data is studied quantitatively and in relation to model overfitting. When label errors are artificially introduced, a constrained (small) CNN model is observed to exhibit remarkable generalizability, retaining high accuracy even when most of the data are mislabelled. Test accuracy falls catastrophically only at unrealistically high label error rates, at a point related to the number of classes present in the data. These preliminary experiments pave the way towards further studies of model robustness, possibly offering a quantitative method through which to compare models.
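As a rough illustration of the kind of label-corruption scheme described above, the sketch below reassigns a fraction p of training labels to a uniformly random different class (symmetric label noise). The function name, array shapes, and the uniform-noise assumption are illustrative only and are not taken from the paper.

    import numpy as np

    def corrupt_labels(labels, p, num_classes, rng=None):
        """Return a copy of `labels` with a fraction `p` replaced by a
        uniformly random *different* class (symmetric label noise)."""
        rng = rng or np.random.default_rng()
        corrupted = labels.copy()
        n = len(labels)
        # Choose which examples to mislabel.
        idx = rng.choice(n, size=int(p * n), replace=False)
        # Draw a random offset in [1, num_classes - 1] so the new label
        # always differs from the original one.
        offsets = rng.integers(1, num_classes, size=len(idx))
        corrupted[idx] = (labels[idx] + offsets) % num_classes
        return corrupted

    # Example: mislabel 60% of a hypothetical 10-class training set.
    y = np.random.default_rng(0).integers(0, 10, size=50_000)
    y_noisy = corrupt_labels(y, p=0.6, num_classes=10)
    print((y_noisy != y).mean())  # ~0.6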