manthanchauhan/Image-Processor-For-Text-Extraction

Image-Processor-For-Text-Extraction

Aim : The aim of the project is to develop an image processor whose output can be used by neural networks to extract text from the input image.
Input : An image containing a single paragraph of black text on a white background, as shown in the sample.
Output : The bounding rectangle of each character in the input. For presentation purposes, the output is a binary image in which every character is highlighted by enclosing it in a white rectangle.
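The presentation output described above can be sketched with NumPy alone. This is a minimal illustration, not the project's own code: the helper name `outline_boxes` and the box layout `(x0, y0, x1, y1)` with exclusive end indices are assumptions made for this example. In the project itself, OpenCV's rectangle() (listed in the pre-requisites) would be the natural tool for this step.

```python
import numpy as np


def outline_boxes(binary, boxes, value=255):
    """Return a copy of `binary` with a 1-pixel outline drawn around each box.

    Hypothetical helper. Each box is (x0, y0, x1, y1) with x1 and y1 exclusive,
    so the outline is drawn one pixel outside the glyph on every side.
    """
    out = binary.copy()
    height, width = out.shape
    for x0, y0, x1, y1 in boxes:
        # Grow the box by one pixel, clamped to the image borders.
        left, top = max(x0 - 1, 0), max(y0 - 1, 0)
        right, bottom = min(x1, width - 1), min(y1, height - 1)
        out[top, left:right + 1] = value       # top edge
        out[bottom, left:right + 1] = value    # bottom edge
        out[top:bottom + 1, left] = value      # left edge
        out[top:bottom + 1, right] = value     # right edge
    return out


# A 6x6 black image with one 2x2 white glyph in the middle.
image = np.zeros((6, 6), dtype=np.uint8)
image[2:4, 2:4] = 255
highlighted = outline_boxes(image, [(2, 2, 4, 4)])
```

Note that when a glyph touches the image border, the clamped outline overlaps the glyph's own edge pixels; padding the image first would avoid that.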

Pre-requisites :
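Modules - cv2, numpy and sys (cv2 and numpy can be installed using pip, where cv2 is provided by the opencv-python package; sys is part of the Python standard library)
Python - Basic syntax, the functions str(), len(), range() and sum() (you can read about these functions here), and handling command-line arguments in Python.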
OpenCV - imread(), dilate(), findContours(), contourArea(), Canny(), adaptiveThreshold(), erode(), bitwise_and(), minAreaRect(), getRotationMatrix2D(), warpAffine(), rectangle(), resize(), imwrite(), waitKey() and destroyAllWindows() (you can read about these functions here)
Numpy - ones(), where(), column_stack() (you can read about these functions here)

Brief algorithm:
1) Read the input image in grayscale format.
2) Convert the image to binary format.
3) Crop the image to minimize the non-text region.
4) Remove any tilt from the text.
5) Determine the y-indices of the top and bottom of each line of text (line segmentation).
6) In each line, determine the x-indices of the beginning and end of each character (character segmentation).
7) Find the bounding rectangle of each character from its x-indices and the y-indices of the line containing it.
(For the complete working of the project, you can refer to the docs.)
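Steps 2, 3 and 5 to 7 above can be sketched with row and column sums (projection profiles). This is an illustrative sketch under stated assumptions, not the project's actual implementation: the function names are hypothetical, a fixed threshold of 128 stands in for whatever thresholding the project uses (adaptiveThreshold() appears in the pre-requisites), tilt removal (step 4) is skipped, and characters are assumed to be separated by at least one empty column.

```python
import numpy as np


def binarize(gray, threshold=128):
    """Step 2 (assumed fixed threshold): dark text becomes 1, background 0."""
    return (gray < threshold).astype(np.uint8)


def crop_to_text(binary):
    """Step 3: keep the smallest window that contains every text pixel."""
    rows, cols = np.where(binary > 0)
    if rows.size == 0:
        return binary  # blank page, nothing to crop
    return binary[rows.min():rows.max() + 1, cols.min():cols.max() + 1]


def find_runs(profile):
    """Return (start, end) pairs, end exclusive, for each run of non-zero values."""
    occupied = (np.asarray(profile) > 0).astype(np.int8)
    # Pad with zeros so runs touching either end still produce both edges.
    changes = np.diff(np.concatenate(([0], occupied, [0])))
    starts = np.where(changes == 1)[0]
    ends = np.where(changes == -1)[0]
    return list(zip(starts.tolist(), ends.tolist()))


def character_boxes(binary):
    """Steps 5-7: return (x0, y0, x1, y1) boxes, end exclusive, in reading order."""
    boxes = []
    for y0, y1 in find_runs(binary.sum(axis=1)):      # step 5: text lines
        line = binary[y0:y1]
        for x0, x1 in find_runs(line.sum(axis=0)):    # step 6: characters
            boxes.append((x0, y0, x1, y1))            # step 7: combine indices
    return boxes


# Synthetic "page": two characters on the first line, one on the second.
page = np.full((7, 9), 255, dtype=np.uint8)
page[1:3, 1:3] = 0
page[1:3, 4:6] = 0
page[4:6, 2:5] = 0
print(character_boxes(crop_to_text(binarize(page))))
# [(0, 0, 2, 2), (3, 0, 5, 2), (1, 3, 4, 5)]
```

The coordinates are relative to the cropped image. Because the split relies on empty rows and columns, any residual tilt or touching characters would merge neighbouring lines or glyphs, which is why the tilt is removed before segmentation.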

Result - The bounding rectangle of each character can be fed to a neural network to convert the character from an image into text.
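One way the bounding rectangles might be turned into network input is to cut each character out and centre it on a fixed-size canvas. The sketch below is an assumption for illustration only: the helper name `character_patches`, the 32x32 canvas size and the `(x0, y0, x1, y1)` box layout with exclusive end indices are not taken from the project.

```python
import numpy as np


def character_patches(binary, boxes, size=32):
    """Cut each box out of `binary` and centre it on a size x size canvas.

    Hypothetical helper. Returns an array of shape (len(boxes), size, size).
    """
    patches = []
    for x0, y0, x1, y1 in boxes:
        glyph = binary[y0:y1, x0:x1]
        h, w = glyph.shape
        if h > size or w > size:
            raise ValueError("glyph larger than the canvas; rescale it first")
        canvas = np.zeros((size, size), dtype=binary.dtype)
        top, left = (size - h) // 2, (size - w) // 2
        canvas[top:top + h, left:left + w] = glyph
        patches.append(canvas)
    if not patches:
        return np.zeros((0, size, size), dtype=binary.dtype)
    return np.stack(patches)


# One 3x2 glyph on a small binary image, placed on an 8x8 canvas.
binary = np.zeros((5, 8), dtype=np.uint8)
binary[1:4, 2:4] = 1
batch = character_patches(binary, [(2, 1, 4, 4)], size=8)
```

Glyphs larger than the canvas are rejected here rather than scaled; OpenCV's resize(), listed in the pre-requisites, could be used to rescale them first.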

About

some documentation to be added
