• U Pal

      Articles written in Sadhana

    • Automatic recognition of printed Oriya script

      B B Chaudhuri U Pal M Mitra

      More Details Abstract Fulltext PDF

      This paper deals with an Optical Character Recognition (OCR) system for printedOriya script. The development of OCR for this script is difficult because a large number of character shapes in the script have to be recognized. In the proposed system, the document image is first captured using a flat-bed scanner and then passed through different preprocessing modules like skew correction, line segmentation, zone detection, word and character segmentation etc. These modules have been developed by combining some conventional techniques with some newly proposed ones. Next, individual characters are recognized using a combination of stroke and run-number based features, along with features obtained from the concept of water overflow from a reservoir. The feature detection methods are simple and robust, and do not require preprocessing steps like thinning and pruning. A prototype of the system has been tested on a variety of printed Oriya material, and currently achieves 96.3% character level accuracy on average.

    • Handwriting segmentation of unconstrained Oriya text

      N Tripathy U Pal

      More Details Abstract Fulltext PDF

      Segmentation of handwritten text into lines, words and characters is one of the important steps in the handwritten text recognition process. In this paper we propose a water reservoir concept-based scheme for segmentation of unconstrained Oriya handwritten text into individual characters. Here, at first, the text image is segmented into lines, and the lines are then segmented into individual words. For line segmentation, the document is divided into vertical stripes. Analysing the heights of the water reservoirs obtained from different components of the document, the width of a stripe is calculated. Stripe-wise horizontal histograms are then computed and the relationship of the peak-valley points of the histograms is used for line segmentation. Based on vertical projection profiles and structural features of Oriya characters, text lines are segmented into words. For character segmentation, at first, the isolated and connected (touching) characters in a word are detected. Using structural, topological and water reservoir concept-based features, characters of the word that touch are then segmented. From experiments we have observed that the proposed “touching character” segmentation module has 96.7% accuracy for two-character touching strings.

    • A contour distance-based approach for multi-oriented and multi-sized character recognition

      U Pal N Tripathy

      More Details Abstract Fulltext PDF

      In this paper, we propose a novel scheme towards the recognition of multi-oriented and multi-sized isolated characters of printed script. For recognition, at first, distances of the outer contour points from the centroid of the individual characters are calculated and these contour distances are then arranged in a particular order to get size and rotation invariant feature. Next, based on the arranged contour distances, the features are derived from different class of characters. Finally, we use these derived features of the characters to statistically compare the features of the input character for recognition. We have tested our scheme on printed Bangla and Devnagari multi-oriented characters and we obtained encouraging results.

  • Sadhana | News

    • Editorial Note on Continuous Article Publication

      Posted on July 25, 2019

      Click here for Editorial Note on CAP Mode

© 2021-2022 Indian Academy of Sciences, Bengaluru.