Title: Pitfalls of Optical Character Recognition OCR Scanning for the Mississippi Uniform Crash Report Appl
1 Pitfalls of Optical Character Recognition (OCR)
Scanning for the Mississippi Uniform Crash Report
Application
2005 Traffic Records Forum
Donna Smith, Programmer Analyst
Mississippi Department of Public Safety
2 Overview
- In 2003, the Mississippi Department of Public Safety
and the Mississippi Department of Transportation
recognized a need to modernize their crash data
collection mechanisms.
- Initially, the State of MS chose to pursue an
Optical Character Recognition (OCR) solution for
electronically scanning data off of the paper
reports and storing the information.
3 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
4 The Officer on the Scene Fills in the Mississippi
Uniform Crash Report (MUCR)
- All officers in MS have paper versions of the
report.
- The form was redesigned to capture information
efficiently and completely through the use of
pre-defined fields wherever possible.
- The intent is to let the officer make quick
selections through check boxes and option buttons
in a multiple-choice fashion.
5 State of Mississippi Uniform Crash Report
- Always Page 1.
- Includes First Harmful Event, Crash location,
Roadway System, etc.
- Reporting Officer Badge Number and Name.
- Reviewing Officer Badge Number and Name.
6 Mississippi Uniform Crash Report: Diagram /
Narrative
- Top part is dedicated to the hand-drawn diagram.
- Bottom part is dedicated to the handwritten
narrative.
7 Mississippi Uniform Crash Report: Person / Occupant
- Includes information about the driver.
- Includes information for up to 2 occupants.
- Items included are Driver Condition, Contributing
Circumstances, Safety Equipment, Position of the
driver, etc.
- Occupant information is the same as the driver
information, with the exception of Condition.
8 Mississippi Uniform Crash Report: Vehicle
- Includes information about the vehicle involved
in the crash.
- Fields included are Sequence of Events, Vehicle
Action, Vehicle Configuration, Insurance
Information, Owner of Vehicle information,
Direction of travel, etc.
- Also includes Commercial Vehicle information.
This information is required based on the Vehicle
Configuration selection.
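
The conditional requirement in the last bullet above can be written as a small business rule. The sketch below is illustrative only: the field names and the set of configurations treated as commercial are hypothetical and are not taken from the MUCR itself.

```python
# Hypothetical configuration codes that trigger the Commercial Vehicle section.
COMMERCIAL_CONFIGURATIONS = {"TRUCK_TRACTOR", "BUS", "SINGLE_UNIT_TRUCK"}

# Hypothetical Commercial Vehicle fields on the Vehicle page.
COMMERCIAL_FIELDS = ("carrier_name", "usdot_number")


def missing_commercial_fields(vehicle: dict) -> list[str]:
    """Return the commercial fields that are required but left blank."""
    if vehicle.get("vehicle_configuration") not in COMMERCIAL_CONFIGURATIONS:
        return []  # The section is not required for this configuration.
    return [name for name in COMMERCIAL_FIELDS if not vehicle.get(name)]
```

A rule of this shape can only be checked by a reviewer by eye on paper, but it is cheap to run automatically once the data is electronic.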
9 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police
office
10 Officer Takes Completed Form to the Police Office
- Reporting officer carries the completed report to
his / her police station or Highway Patrol Troop and
gives it to the supervising officer.
11 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is accurate
12 Reviewing Officer Ensures Data on Form Is Accurate
- The review process puts a second set of eyes on
the report to catch any errors that may have been
overlooked by the submitting officer.
- The objective of recording accurate data is to ensure
that the crash can be entirely recreated from the
data captured.
13 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is accurate
- Forms are sent to be scanned
14 Forms Are Sent to Be Scanned
- Reporting agency sends the completed report to
the Mississippi Department of Public Safety,
Safety Responsibility Division.
- Safety Responsibility Division is responsible for
scanning the forms into the system,
electronically pulling data from the scanned
forms, and data verification.
- Department of Public Safety Troop Secretaries
actually scan the forms in-house and are
responsible for the data verification process
themselves.
15 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is
accurate
- Forms are sent to be scanned
- Forms are scanned, checked for scan quality
16 Forms Are Scanned, Checked for Scan Quality
- Operator receiving the crash report is
responsible for the scan quality, page
orientation, and page order.
- Operators were trained on the equipment,
software, and the techniques to ensure quality.
- Page order is verified manually from the image:
it is imperative that Page 01, with the unique
Agency Number and Agency Case Number, is the
first page scanned of the crash report.
- Order of the subsequent pages does not affect the
electronic data pull.
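
The page-order rule above amounts to inspecting only the first page of each scanned report. A minimal sketch, assuming each page has already been reduced to a small record; the `ScannedPage` type and its field names are hypothetical and are not part of Teleform.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScannedPage:
    page_number: int                          # page identifier read from the form
    agency_number: Optional[str] = None       # only present on Page 01
    agency_case_number: Optional[str] = None  # only present on Page 01


def first_page_is_valid(pages: list[ScannedPage]) -> bool:
    """True when Page 01, carrying both identifying numbers, was scanned first."""
    if not pages:
        return False
    first = pages[0]
    return (
        first.page_number == 1
        and bool(first.agency_number)
        and bool(first.agency_case_number)
    )
```

The order of the remaining pages is deliberately not checked, matching the last bullet above.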
17 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is
accurate
- Forms are sent to be scanned
- Forms are scanned, checked for scan quality
- Teleform Scanning Station
18 Teleform Scanning Station
- Teleform Scanning Station electronically creates
an image file of the crash report in .tif
format.
- System creates and assigns batches of crash
reports for electronic data extraction from
the images and for data verification.
- This is an automated process and does not require
human intervention.
19 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is
accurate
- Forms are sent to be scanned
- Forms are scanned, checked for scan quality
- Teleform Scanning Station
20 Verification Process
- Teleform Reader software then uses Optical
Character Recognition (OCR) technology to read
the scanned information.
- This automated process pulls data from the
scanned forms and creates an electronic version.
21 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is
accurate
- Forms are sent to be scanned
- Forms are scanned, checked for scan quality
- Teleform Scanning Station
- Verification process
- Verifier reviews scanned information
22 Verifier Reviews Scanned Information
- Verifier reviews the electronic copy of the form
to ensure that accurate data was captured.
- The verifier must then correct data where there
were Optical Character Recognition (OCR) /
scanning errors.
23 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is
accurate
- Forms are sent to be scanned
- Forms are scanned, checked for scan quality
- Teleform Scanning Station
- Verification process
- Verifier reviews scanned information
- Teleform Verifier exports data to the staging
database
24 Teleform Verifier Exports Data to the Staging
Database
- Staging database is the last checkpoint before
data is entered into the production database.
- During the export, there is an initial data
validation check, and errors identified are
published in the error tables.
25 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is
accurate
- Forms are sent to be scanned
- Forms are scanned, checked for scan quality
- Teleform Scanning Station
- Verification process
- Verifier reviews scanned information
- Teleform Verifier exports data to the staging
database
- Electronic data validation
26 Electronic Data Validation
- After all data has been entered into the staging
database, more stringent electronic validation
applies greater constraints to the data before it
is moved over to the production database.
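
One way to picture this validation step is as a list of rules applied to every staged record, with each failure recorded as a row for an error table. This is a sketch under stated assumptions: the rule codes, field names, and record layout are hypothetical, and the actual system used database-side checks that are not shown in this presentation.

```python
from datetime import date

# Each rule is (error code, predicate returning True when the record passes).
# Codes and field names are hypothetical.
RULES = [
    ("MISSING_CASE_NUMBER", lambda r: bool(r.get("agency_case_number"))),
    ("CRASH_DATE_IN_FUTURE",
     lambda r: r.get("crash_date") is not None and r["crash_date"] <= date.today()),
    ("NO_VEHICLES", lambda r: r.get("vehicle_count", 0) >= 1),
]


def validate(record: dict) -> list[str]:
    """Return the codes of every rule the staged record fails."""
    return [code for code, passes in RULES if not passes(record)]


def split_staging(records: list[dict]) -> tuple[list[dict], list[tuple[str, str]]]:
    """Separate clean records from (case number, error code) error rows."""
    clean, errors = [], []
    for record in records:
        failed = validate(record)
        if failed:
            case_number = record.get("agency_case_number", "")
            errors.extend((case_number, code) for code in failed)
        else:
            clean.append(record)
    return clean, errors
```

Keeping the rules in one list makes it straightforward to tighten the constraints at this stage without touching the code that moves the data.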
27 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is
accurate
- Forms are sent to be scanned
- Forms are scanned, checked for scan quality
- Teleform Scanning Station
- Verification process
- Verifier reviews scanned information
- Teleform Verifier exports data to the staging
database
- Electronic data validation
- Return of error data for correction and
reprocessing
28 Return of Error Data for Correction and
Reprocessing
- Electronic process pulls data from the error
database and uses a publishing agent to recreate
the Mississippi Uniform Crash Report for problem
reports.
- The form can be emailed or physically printed and
sent to the originating agency for review and
correction.
29 Mississippi Uniform Crash Report Optical
Character Recognition Process Description
- The officer on the scene fills in the MS Uniform
Crash Report
- Officer takes completed form to the police office
- Reviewing officer ensures data on form is
accurate
- Forms are sent to be scanned
- Forms are scanned, checked for scan quality
- Teleform Scanning Station
- Verification process
- Verifier reviews scanned information
- Teleform Verifier exports data to the staging
database
- Electronic data validation
- Return of error data for correction and
reprocessing
- Data in production database
30 Data in Production Database
- For reports in the staging database that
pass all verification checks, the data is ported
over to the production database.
- Data is now ready for reporting and analysis.
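
The move from staging to production can be sketched with an in-memory SQLite database. Everything here is an assumption made for illustration: the table and column names are hypothetical, the source does not say which database engine was used, and removing promoted rows from staging is a design choice of this sketch rather than a documented behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging (case_number TEXT PRIMARY KEY, crash_date TEXT);
    CREATE TABLE errors (case_number TEXT, error_code TEXT);
    CREATE TABLE production (case_number TEXT PRIMARY KEY, crash_date TEXT);
""")
conn.executemany(
    "INSERT INTO staging VALUES (?, ?)",
    [("C1", "2004-03-01"), ("C2", "2004-03-02")],
)
conn.execute("INSERT INTO errors VALUES ('C2', 'NO_VEHICLES')")


def promote_clean_reports(conn: sqlite3.Connection) -> None:
    """Copy staged reports with no recorded errors into production."""
    with conn:  # one transaction: both statements commit together or not at all
        conn.execute("""
            INSERT INTO production
            SELECT s.case_number, s.crash_date
            FROM staging AS s
            WHERE NOT EXISTS (
                SELECT 1 FROM errors AS e WHERE e.case_number = s.case_number
            )
        """)
        # Assumption: promoted rows leave staging so they are not copied twice.
        conn.execute("""
            DELETE FROM staging
            WHERE case_number IN (SELECT case_number FROM production)
        """)
```

Reports with outstanding errors stay in staging until they are corrected and reprocessed.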
31 Overall Pitfalls
- Dirty scanner bed or dirty documents
- Optical Character Recognition (OCR) software has
a hard time recognizing characters that were
printed too heavily, so that they bleed together,
or too lightly, so that the thinner portions of
the characters are broken.
32 Pitfalls of the Officer Filling Out the Report on
the Scene
- Reporting Officers' handwriting was a huge
factor in Optical Character Recognition (OCR)
Scanning because every officer's handwriting is
unique.
33 Pitfalls of Reviewing Officer Ensuring Data on
Form Is Accurate
- Over 100 business rules are attached to the fields
on the different forms. It was impossible for the
Reviewing Officer to catch every mistake on each and
every report that was reviewed.
34 Pitfalls for Forms Are Sent to Be Scanned
- Safety Responsibility (SR) Division would
receive large numbers of reports in the mail
from law enforcement agencies on a daily or
weekly basis.
35 Pitfalls of Forms Are Scanned, Checked for Scan
Quality
- Safety Responsibility Division (SR) supervisor
would separate reports into batches and assign them to
one of three people who were in charge of
scanning reports.
36 Pitfalls of Verifier Reviews Scanned Information
- Optical Character Recognition (OCR) accuracy
was so poor that the operator basically
had to re-key the entire report.
37 Pitfalls on Teleform Verifier Exporting Data to
Staging Database
- Once reports were verified, the master table
verification stored procedure ran to check for
errors between related forms. This would generate
literally thousands of errors. After several
months of error accumulation, it was impossible
for the operators to see the light at the end of
the tunnel.
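
The stored procedure itself is not shown in this presentation. As a rough illustration of what a check "between related forms" can look like, the sketch below verifies that every Vehicle page belongs to a known crash and every Person page points at a known vehicle. The record layout and key names are hypothetical.

```python
def cross_form_errors(
    crashes: list[dict], vehicles: list[dict], persons: list[dict]
) -> list[str]:
    """List inconsistencies between related forms of the same report."""
    errors = []
    case_numbers = {c["case_number"] for c in crashes}
    vehicle_keys = {(v["case_number"], v["vehicle_number"]) for v in vehicles}

    # Every Vehicle page must belong to a crash that has a Page 1.
    for v in vehicles:
        if v["case_number"] not in case_numbers:
            errors.append(
                f"vehicle {v['vehicle_number']}: no crash page for case {v['case_number']}"
            )
    # Every Person page must reference a vehicle recorded for the same case.
    for p in persons:
        if (p["case_number"], p["vehicle_number"]) not in vehicle_keys:
            errors.append(
                f"person {p['name']}: no vehicle {p['vehicle_number']} in case {p['case_number']}"
            )
    return errors
```

A single misread case number or vehicle number breaks these links for every related page, which is one plausible reason poor recognition produced errors in such volume.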
38 Solutions to Optical Character Recognition (OCR)
Scanning Problems
- Traffic Records Committee began seeking an end
to this nightmare resulting from Optical
Character Recognition (OCR) Scanning.
- Optical Character Recognition (OCR) Scanning
process had started in January 2004, and a Request
for Proposal (RFP) for a new automated crash
report system went out shortly after that date.
- Bid was awarded to Visual Statement, Inc. for its
electronic collision reporting software,
ReportBeam, in late June of 2004.
- Department of Public Safety (DPS) began working
with Visual Statement to customize the ReportBeam
software for the State of Mississippi's specific
needs.
39 Solutions Continued
- Several law enforcement agencies outside the
Department of Public Safety umbrella were chosen
to participate in a pilot of ReportBeam in
October of 2004.
- Once the testing phase with the pilot agencies
was completed, the DECISION was made to have the
Safety Responsibility Division (SR) start
using ReportBeam to enter the crash reports that
were submitted to the Department of Public Safety
(DPS) for data entry, since they were already
manually entering the entire report with the OCR
system anyway.
- At least with ReportBeam, the errors could be
corrected at the time of entry through the
extensive run-time business rules, instead of
the reports having to be rescanned and corrected
several times before the data was entered into
the Department of Public Safety database.
- To this day, the initial data captured by the OCR
system is being scrubbed for import into the
State's ReportBeam data repository.
40 Example of Error at Time of Entry From ReportBeam
41 Solutions Conclusion
- Once ReportBeam was officially accepted by the
Commissioner of the Department of Public Safety,
training began for all law enforcement agencies
in February of 2005 and continues even today.
- Once a law enforcement agency is trained, its
officers will enter their own crash reports into
the system and submit them directly to the state
server via the Internet.
- State of Mississippi has approximately 300 Law
Enforcement Agencies, and over 200 of them are
currently using ReportBeam.
- Department of Public Safety is very happy to see
the number of Agencies committed to using an
Automated Crash Reporting System like ReportBeam.
42 Questions?