Data augmentation is one of the ways to deal with labeled data scarcity and overfitting. Both of these problems are crucial for modern deep-learning algorithms, which require massive amounts of data. The problem is better explored in the context of image analysis than for text; this work is a step forward to help close this gap. We propose a method for augmenting textual data when training convolutional neural networks for sentence classification. The augmentation is based on the substitution of words using a thesaurus as well as Princeton University's WordNet. Our method improves upon the baseline in most of the cases. In terms of accuracy, the best of the variants is 1.2% (pp.) better than the baseline.
This paper describes the process for processing reports from rescue and firefighting. To reports processing methods and techniques used in the field of textual data mining (text mining). This paper also presents the classification and analysis methods section of text which is considered a potential use in the proposed process.