Development of an OCR System

Slides: 13
Provided by: tjh5
Transcript and Presenter's Notes

1
Development of an OCR System
Nathan Harmata
TJHSST Computer Systems Lab, 2007-2008
2
What is OCR?
Optical Character Recognition
Font and handwriting based
3
Goals of My Project
Generic recognition for Latin-based fonts
System built from scratch
Proper handling of most formatting
4
Overview of Idocrase System
5
Image Processing
6
Transformations
Attribute Character Model
7
Transformations
Sector Vector: the image is parsed into parts that pass the vertical line test, then each part is transformed into a collection of line segments.
Gap Vector: gaps, if any, are found on the four sides of the image.
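The two transformations on this slide can be sketched roughly as follows. This is a minimal illustration, not the Idocrase implementation: the slides do not define either test precisely, so the sketch assumes that a part "passes the vertical line test" when every pixel column crosses ink at most once, and that a "gap" on a side is a stretch of background enclosed by ink along that edge of a tightly cropped glyph. The function names and the 0/1 list-of-rows image encoding are hypothetical.

```python
def ink_runs(column):
    """Count separate runs of ink (1) in a sequence of 0/1 pixels."""
    runs, prev = 0, 0
    for value in column:
        if value and not prev:
            runs += 1
        prev = value
    return runs


def passes_vertical_line_test(img):
    """Assumed meaning: every column crosses ink at most once."""
    return all(ink_runs(col) <= 1 for col in zip(*img))


def has_gap(border):
    """True if background (0) is enclosed by ink (1) along this border."""
    ink = [i for i, value in enumerate(border) if value]
    if not ink:
        return False
    return any(value == 0 for value in border[ink[0]:ink[-1] + 1])


def gap_vector(img):
    """Return the sides of a tightly cropped glyph that contain a gap."""
    sides = {
        "top": img[0],
        "bottom": img[-1],
        "left": [row[0] for row in img],
        "right": [row[-1] for row in img],
    }
    return [name for name, border in sides.items() if has_gap(border)]


# A crude 3x3 "C": open on the right-hand side.
C = [
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
]
print(gap_vector(C))                 # ['right']
print(passes_vertical_line_test(C))  # False: the middle column crosses ink twice
```

Under these assumptions a closed shape such as a crude "O" yields an empty gap vector, while a "U" yields a gap on the top side.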
8
Transformations
Pixel Concentration Vector: records which sides, if any, have a higher concentration of pixels.
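One way to compute such a vector is to compare ink counts in opposite halves of the glyph. The sketch below is an assumption-laden illustration rather than the project's code: it reports one value for the left/right comparison and one for the top/bottom comparison, and the `tolerance` parameter (the fraction of total ink below which a difference counts as "balanced") is hypothetical.

```python
def pixel_concentration_vector(img, tolerance=0.1):
    """Compare ink in opposite halves of a binary glyph (list of 0/1 rows).

    Returns (horizontal, vertical), where horizontal is 'left', 'right' or
    'balanced' and vertical is 'top', 'bottom' or 'balanced'.
    For odd sizes the middle column/row is left out of both halves.
    """
    height, width = len(img), len(img[0])
    total = sum(map(sum, img))
    if total == 0:
        return ("balanced", "balanced")

    left = sum(sum(row[: width // 2]) for row in img)
    right = sum(sum(row[(width + 1) // 2:]) for row in img)
    top = sum(sum(row) for row in img[: height // 2])
    bottom = sum(sum(row) for row in img[(height + 1) // 2:])

    def compare(a, b, a_name, b_name):
        # Differences smaller than tolerance * total ink count as balanced.
        if abs(a - b) <= tolerance * total:
            return "balanced"
        return a_name if a > b else b_name

    return (
        compare(left, right, "left", "right"),
        compare(top, bottom, "top", "bottom"),
    )


# A crude 3x3 "L": ink is concentrated on the left and at the bottom.
L_GLYPH = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
print(pixel_concentration_vector(L_GLYPH))  # ('left', 'bottom')
```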
9
Character Recognition
GCDD: Generic Character Definition Database
Averages of Character Models for every character, taken from many different fonts
0 | PixelConcentrationVector: balanced balanced | SectorVector: 4 3 | GapVector:
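The idea of averaging character models across fonts and matching against those averages can be sketched as below. The slides do not say how models are encoded or compared, so this example assumes each model is a fixed-length numeric feature vector and uses Euclidean nearest-neighbour matching; `build_gcdd`, `recognize`, and the sample numbers are all hypothetical.

```python
import math
from collections import defaultdict


def build_gcdd(samples):
    """Average the feature vectors seen for each character.

    samples: iterable of (character, feature_vector) pairs, typically one
    pair per character per font.
    """
    grouped = defaultdict(list)
    for char, vector in samples:
        grouped[char].append(vector)
    return {
        char: [sum(column) / len(vectors) for column in zip(*vectors)]
        for char, vectors in grouped.items()
    }


def recognize(gcdd, vector):
    """Return the character whose averaged model is closest to vector."""
    return min(gcdd, key=lambda char: math.dist(gcdd[char], vector))


# Made-up two-feature models for two characters in two fonts.
samples = [
    ("0", [4, 3]), ("0", [4, 5]),
    ("1", [1, 1]), ("1", [1, 3]),
]
gcdd = build_gcdd(samples)
print(gcdd["0"])                    # [4.0, 4.0]
print(recognize(gcdd, [3.5, 4.0]))  # 0
```

Averaging keeps the database small (one entry per character rather than one per font), at the cost of blurring fonts whose shapes differ strongly from the mean.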
10
Character Recognition
For a single character
For words, dictionary and grammar references are used.
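A dictionary reference for word-level recognition might look like the sketch below, which snaps a raw recognized word to its closest dictionary entry using Python's standard `difflib`. This covers only the dictionary half of the slide (no grammar), and the word list, the `correct_word` name, and the `cutoff` value are assumptions for illustration.

```python
import difflib

# Hypothetical stand-in for a real dictionary file.
DICTIONARY = ["character", "recognition", "optical"]


def correct_word(raw, dictionary=DICTIONARY, cutoff=0.6):
    """Return the closest dictionary word, or raw if nothing is close enough."""
    matches = difflib.get_close_matches(raw, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else raw


print(correct_word("recogmtion"))  # recognition
print(correct_word("xyz"))         # xyz (no dictionary word is similar enough)
```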
11
Idocrase Application
12
Results
- Mediocre word recognition
- Doesn't handle formatting well
- Doesn't handle small letters well
- Fairly accurate single character recognition (93.7)?