Abstract: |
Class imbalance is a pervasive problem in applications of classification models, including deep neural networks. Standard countermeasures, such as re-sampling,
place additional emphasis on learning from the relatively small minority class. However, when the minority class falls far short of the data scale required for deep learning, such a strategy can induce over-fitting. In this paper, we propose an adversarial learning framework for re-sampling synthetic minority-class samples in class-imbalanced sequence classification problems.
We train a generator network to produce synthetic feature vectors from stochastically modified sequences, such that the classifier network is likely to predict them as coming from the minority class.
The classifier network, in turn, is trained to discriminate the synthetic vectors from genuine minority-class vectors, and is prevented from relying on a few limited features, which would otherwise make the generator's deception easier.
We further seek robustness by producing a diverse set of synthetic feature vectors, using both minority- and majority-class sequences as input to the generator. We present proof-of-concept experiments on the classification of texts and caption sequences. The results show that the proposed framework can substantially improve recall for the minority class as well as overall retrieval performance. |