Research on document image analysis is actively pursued in the last few decades and services like OCR, vectorization of drawings/graphics and various types of form processing are very common. Handwritten documents, old historical documents and documents captured through camera are now being the subjects of active research. However, another very important type of paper document, namely the map document image processing research suffers due to the inherent complexities of the map document and also for nonavailability of benchmark public data-sets. This paper presents a new data-set, namely, the Land Map Image Database (LMIDb) that consists of a variety of land maps images (446 images at present and growing; scanned at 200/300 dpi in TIF format) and the corresponding ground-truth. Using semiautomatic tools non-text part of the images are deleted and the text-only ground-truth is also kept in the database. This paper also presents a classification strategy for map images using which the maps in the database are automatically classified into Political (Po), Physical (Ph), Resource (R) and Topographic (T) maps. The automatic classification of maps help indexing of the images in LMIDb for archival and easy retrieval of the right maps to get the appropriate geographical information. Classification accuracy is also tested on the proposed data-set and the result is encouraging.
JavaScript jest wyłączony w Twojej przeglądarce internetowej. Włącz go, a następnie odśwież stronę, aby móc w pełni z niej korzystać.