Libraries and other archives throughout the world hold a large number of historical documents. Most of them are handwritten. In many cases they exist as a single copy and are hard to access. Digitizing such artifacts can make them available to the community. But even when digitized they remain unsearchable, so an important task is to extract their contents in computer-readable form. One of the first steps in this direction is to recognize where the lines of text are. Computational intelligence algorithms can be used to solve this problem. In the present paper, two groups of algorithms, namely projection-based and tensor voting-based, are compared. Performance is evaluated on the data set and with the procedure proposed by the organizers of the ICDAR 2009 competition.
Segmenting a document image into text lines is one of the stages of unconstrained handwritten document recognition. This paper presents a new algorithm for text line separation in handwriting. The algorithm is based on the projection profile method. It employs thresholding, but with a variable threshold value, which permits detection of low or overlapping peaks in the profile graph. The proposed technique is shown to improve the recognition rate relative to traditional methods. The algorithm detects text lines robustly across different text line lengths.
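The idea of a projection profile with a variable threshold can be illustrated with a minimal sketch. This is not the paper's algorithm, whose details the abstract does not give. It assumes a binary image stored as a list of rows (1 for ink, 0 for background), and it assumes one simple way to vary the threshold: a fixed fraction of the largest profile value within a sliding window of rows. The names `window` and `fraction` are hypothetical parameters chosen for this illustration.

```python
from typing import List, Tuple


def horizontal_profile(image: List[List[int]]) -> List[int]:
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def local_thresholds(profile: List[int], window: int = 5,
                     fraction: float = 0.3) -> List[float]:
    """Assumed rule: threshold at row i is a fraction of the largest
    profile value within a window of rows centred on i."""
    half = window // 2
    thresholds = []
    for i in range(len(profile)):
        lo = max(0, i - half)
        hi = min(len(profile), i + half + 1)
        thresholds.append(fraction * max(profile[lo:hi]))
    return thresholds


def segment_lines(image: List[List[int]], window: int = 5,
                  fraction: float = 0.3) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) for each run of rows whose profile
    value exceeds the local threshold."""
    profile = horizontal_profile(image)
    thresholds = local_thresholds(profile, window, fraction)
    lines = []
    start = None
    for y, (value, thr) in enumerate(zip(profile, thresholds)):
        is_text = value > thr
        if is_text and start is None:
            start = y                      # a text line begins here
        elif not is_text and start is not None:
            lines.append((start, y - 1))   # the line ended on the previous row
            start = None
    if start is not None:                  # line runs to the bottom of the image
        lines.append((start, len(profile) - 1))
    return lines


if __name__ == "__main__":
    # Toy image, 10 pixels wide: a dense line (rows 1-2) and a faint one (rows 5-6).
    counts = [0, 10, 10, 0, 0, 2, 2, 0]
    image = [[1] * n + [0] * (10 - n) for n in counts]
    print(segment_lines(image, window=3, fraction=0.3))
```

In this toy image the faint line has a profile value of only 2, so a single global threshold of 3 (30% of the global peak) would miss it. The windowed threshold drops to 0.6 around rows 5 and 6, so both lines are reported, as rows 1 to 2 and rows 5 to 6.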