1
Two-dimensional Context-Free Grammars: Mathematical Formulae Recognition
Daniel Průša, Václav Hlaváč
Center for Machine Perception, Faculty of
Electrical Engineering, Czech Technical
University, Prague
2
Presentation Overview
  • Formulae recognition, problem formulation
  • Known methods
  • General idea of structural recognition
  • Two-dimensional context-free grammars
  • Extension of the grammars
  • Recognition tool, pilot implementation
  • Results, future plans

3
Motivation for this work
  • To test a theoretical construct on a practical
    pilot problem with explicit structure:
    mathematical formulae.
  • The group of Schlesinger and Savchynskyy from Kiev
    works on music score recognition. We cooperate in
    a joint research project.

4
Math. formulae, off-line or on-line
  • Formulae recognition can be divided into two
    groups by the type of input:
  • - Off-line recognition: a formula is depicted in a
    raster image.
  • - On-line recognition: a formula is represented by a
    sequence of pen strokes (growing importance due
    to tablet PCs).

5
Math. formulae recognition, usage
  • Off-line recognition: conversion of scanned
    printed mathematical texts into an electronic
    form.
  • On-line recognition: connected to pen-based
    computing technologies (electronic tablets).
  • There are many papers on formulae recognition,
    but only a few commercial products (e.g.,
    xMathJournal by xThink)

6
Usual architecture
  • Two independent layers:
  • - Symbol detection and recognition.
  • - Structural analysis.

Pipeline: image or sequence of strokes → symbol recognition → symbols (with coordinates and font size) → error corrections (optional) → structural analysis → derivation tree
7
Symbol recognition methods
  • Image segmentation followed by an OCR tool.
  • Image segmentation and character recognition
    performed simultaneously (e.g., by Hidden Markov
    Models).
  • It is very difficult to recover from errors made
    in the segmentation phase.
  • Semantics are not taken into account.

8
Structural analysis methods
  • Grammar based:
  • - geometric grammars,
  • - graph grammars.
  • Non-grammar based:
  • - minimum spanning tree,
  • - hard-coded rules.

9
Our approach to structural recognition
  • Based on the general structural constructions of M.I.
    Schlesinger and V. Hlaváč in Ten Lectures on
    Statistical and Syntactic Pattern Recognition
    (Kluwer Academic Publishers, 2002).
  • Do not separate segmentation and parsing; perform
    them simultaneously.
  • Suitable for recognition of objects with rich
    structure.
  • Already successfully applied to music scores and
    electric circuit diagrams.

10
Structural Recognition: General Idea
Assumptions: input image, set of derivation rules.
Recognition:
  • The algorithm starts with regions labeled by
    terminals:
  • - squares corresponding to one symbol,
  • - regions detected by an external tool.
  • Bigger regions labeled by non-terminals are
    derived by applying the rules; each derivation is
    assigned a penalty.
  • Result: the region matching the whole picture with
    the smallest penalty.

[Figure: region N is derived by a rule from regions A, B, C, D]
11
Structural Recognition Applied to Formulae Using
2D Context-free Grammars
  • Uniform shapes of regions are considered: rectangles.
  • A 2D grammar for mathematical formulae has been designed.
  • Terminal detection: detect all possible
    occurrences of elementary symbols using an OCR
    tool and assign each occurrence a penalty
    (computed by the OCR tool).

[Figure: detected terminal candidates labeled "fraction line, minus sign" and "symbol 5"]
12
Structural Recognition Applied to Formulae Using
2D Context-free Grammars
Parsing: let the structural analysis decide what
is the best segmentation and interpretation of
the elementary symbols, i.e. find the derivation tree
that covers the whole image and has the
smallest penalty.
[Figure: example image containing the symbols -, 5 and 2]
13
Two-dimensional Context-free Grammars
A 2D context-free grammar G is given by:
  • a set of terminals,
  • a set of non-terminals,
  • an initial non-terminal S,
  • a set of productions P.
Three basic types of productions in P
Generalized form of productions
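
The basic and generalized productions named above can be written in one common notation as follows. This is a sketch under assumptions, not the slide's exact notation: the symbols V_N, V_T, a, A, B, A_ij, m and n are introduced here for illustration, and only G, S and P appear in the talk.

```latex
% Assumed notation: V_N = non-terminals, V_T = terminals.
G = (V_N, V_T, S, P)

% Three basic production types: a single terminal,
% a horizontal concatenation, and a vertical concatenation.
N \rightarrow a, \qquad
N \rightarrow A\,B, \qquad
N \rightarrow \begin{array}{c} A \\ B \end{array}
\qquad (a \in V_T,\; N, A, B \in V_N)

% Generalized form: the right-hand side is an m x n matrix of symbols.
N \rightarrow
\begin{array}{ccc}
A_{11} & \cdots & A_{1n} \\
\vdots &        & \vdots \\
A_{m1} & \cdots & A_{mn}
\end{array}
\qquad (A_{ij} \in V_N \cup V_T)
```

In this reading, m and n are the numbers of rows and columns on the right-hand side of a production.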
14
Interpretation of Productions
G generates pictures that can be named by the
initial non-terminal S
15
Theoretical Results on 2D CF Languages
L(2CFG): the class of languages that can be
generated by a 2D CF grammar.
  • L(2CFG) includes 1D context-free languages.
  • L(2CFG) and L(2FSA) are not comparable.
  • There is no analogy to the Chomsky normal form of
    productions.
  • The basic form of productions is weaker than the general
    one.
  • The emptiness problem is not decidable.
  • Languages in L(2CFG) can be recognized in
    polynomial time.

Observation: a natural generalization, but the
properties of L(2CFG) differ from the properties
of the class of 1D context-free languages.
16
Recognition in Polynomial Time
2D CF grammars with productions in the basic form:
the generated languages can be recognized in time
polynomial in the picture size (M.I. Schlesinger).
The algorithm can be generalized to all languages in
L(2CFG):
  • the degree of the polynomial depends on the size of the
    productions, i.e. on the maximal number of rows and
    the maximal number of columns on the right-hand side
    of a production.
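
A minimal sketch of polynomial-time recognition for basic-form productions, written as a CYK-style dynamic program over sub-rectangles. It is not taken from the talk: the function name `recognizes`, the rule encoding as tuples, and the toy grammar in the usage note are assumptions made for illustration.

```python
from functools import lru_cache


def recognizes(picture, terminal_rules, horizontal_rules, vertical_rules, start="S"):
    """Return True if the whole picture can be derived from `start`.

    picture          -- list of equal-length strings (rows of symbols)
    terminal_rules   -- set of (N, a):    N -> a       (a single cell)
    horizontal_rules -- set of (N, A, B): N -> A B     (A left of B)
    vertical_rules   -- set of (N, A, B): N -> A над B is written A above B
    """
    rows, cols = len(picture), len(picture[0])

    @lru_cache(maxsize=None)
    def labels(r1, r2, c1, c2):
        """Non-terminals deriving rows r1..r2-1, columns c1..c2-1."""
        found = set()
        if r2 - r1 == 1 and c2 - c1 == 1:
            found |= {n for (n, a) in terminal_rules if a == picture[r1][c1]}
        # Try every vertical cut: left part | right part.
        for c in range(c1 + 1, c2):
            left, right = labels(r1, r2, c1, c), labels(r1, r2, c, c2)
            found |= {n for (n, a, b) in horizontal_rules if a in left and b in right}
        # Try every horizontal cut: top part over bottom part.
        for r in range(r1 + 1, r2):
            top, bottom = labels(r1, r, c1, c2), labels(r, r2, c1, c2)
            found |= {n for (n, a, b) in vertical_rules if a in top and b in bottom}
        return frozenset(found)

    return start in labels(0, rows, 0, cols)
```

For an m × n picture the sketch visits O(m²n²) rectangles and tries O(m + n) cuts for each, so its running time is polynomial in the picture size. With the toy grammar A → a, A → A A, B → b, B → B B, S → A above B, the call `recognizes(["aaa", "bbb"], {("A", "a"), ("B", "b")}, {("A", "A", "A"), ("B", "B", "B")}, {("S", "A", "B")})` accepts a row of a's stacked on a row of b's.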

17
Extension of 2D CF Grammars
2D context-free grammars are not powerful enough to
express the complex structure of mathematical
formulae.
We need a formalism that makes it easy to work with
relative positions and sizes of symbols, e.g. to
express relationships such as "a symbol is a
superscript of another symbol".
18
Extension of 2D CF Grammars
  • Regions are still rectangles.
  • Each derived region is assigned a feature
    point (logical center). The feature point of a
    derived region is determined by the applied
    production.

19
Extension of 2D CF Grammars
  • The use of productions is not limited to directly
    neighboring (touching) rectangles.
  • Productions can specify a rectangular area in which
    some specific point of a rectangle has to be
    contained.
  • Positions and sizes can be given relative to one
    of the rectangles.
  • Restrictions on the relative sizes of rectangles are
    also possible.
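
The constraints listed above can be illustrated with a small check for a superscript relation. This is only a sketch: the `Region` class, the function `is_superscript`, and all numeric window and size factors are hypothetical values chosen for the example, not parameters from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Rectangle in image coordinates (y grows downward) with a feature point."""
    left: float
    top: float
    right: float
    bottom: float
    fx: float  # feature point (logical center), x coordinate
    fy: float  # feature point (logical center), y coordinate

    @property
    def width(self):
        return self.right - self.left

    @property
    def height(self):
        return self.bottom - self.top


def is_superscript(base, cand, max_size_ratio=0.8):
    """Hypothetical production constraint: `cand` is a superscript of `base`.

    The feature point of `cand` must fall inside a window positioned and
    scaled relative to `base`, and `cand` must not be too tall.
    """
    x_min, x_max = base.right, base.right + base.width
    y_min, y_max = base.top - 0.5 * base.height, base.top + 0.3 * base.height
    in_window = x_min <= cand.fx <= x_max and y_min <= cand.fy <= y_max
    small_enough = cand.height <= max_size_ratio * base.height
    return in_window and small_enough
```

Because the window is expressed in units of the base rectangle, the same check applies to small and large handwriting; the two rectangles do not need to touch.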

20
Penalty Computation
Based on summing partial penalties determined by
the following criteria:
  • The production used.
  • Relative sizes and positions of the regions the
    production is applied to (original regions).
  • Number of black pixels in the new region that are
    not in the original regions.
  • Penalties of the original regions.
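
A minimal sketch of how the four partial penalties could be summed. The function names, the half-open rectangle encoding `(top, left, bottom, right)`, and the weighting factor `pixel_weight` are assumptions for illustration; the talk does not specify how the terms are weighted.

```python
def uncovered_black_pixels(image, new_region, original_regions):
    """Count black pixels in `new_region` not covered by any original region.

    image   -- list of rows, 1 = black pixel, 0 = white pixel
    regions -- (top, left, bottom, right), bottom and right exclusive
    """
    top, left, bottom, right = new_region
    count = 0
    for r in range(top, bottom):
        for c in range(left, right):
            covered = any(t <= r < b and l <= c < rt
                          for (t, l, b, rt) in original_regions)
            if image[r][c] and not covered:
                count += 1
    return count


def derivation_penalty(production_penalty, geometry_penalty,
                       new_black_pixels, child_penalties, pixel_weight=1.0):
    """Sum the partial penalties of one derivation step."""
    return (production_penalty                 # the production used
            + geometry_penalty                 # relative sizes and positions
            + pixel_weight * new_black_pixels  # unexplained ink
            + sum(child_penalties))            # penalties of original regions
```

The pixel term penalizes derivations that enclose ink which none of the combined regions accounts for.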

21
Implementation of the Recognition Tool
  • Off-line recognition.
  • Implemented in Java.
  • Trained and tuned for hand-written formulae.
  • Black and white images (but can be extended to
    gray-scale images).
  • The following constructs are supported:
  • - variables, numbers, parentheses,
  • - common unary and binary operators, the power
    operator,
  • - fractions, square roots, subscripts, superscripts,
  • - sums, integrals.
  • Can deal with noise, ambiguities, touching or
    split symbols, etc., and also with misplaced
    symbols.

22
Tool Architecture
[Diagram: terminals detection (using the OCR tool) → parsing (using the 2D grammar)]
23
Terminals Detection
Ideally, all regions should be scanned for the
presence of an elementary symbol, but this is too
time-consuming. Two smarter strategies are implemented:
  • Scanning rectangular windows of some predefined
    sizes (not all sizes).
  • Detection based on connected components.
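
The second strategy can be sketched as a breadth-first search that collects the bounding box of each connected component of black pixels. The function name `component_boxes` and the choice of 8-connectivity are assumptions for this example; each resulting box would then be handed to the OCR tool as a candidate terminal region.

```python
from collections import deque


def component_boxes(image):
    """Bounding boxes (top, left, bottom, right), inclusive, of the
    8-connected components of black pixels (1 = black, 0 = white)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not image[r0][c0] or seen[r0][c0]:
                continue
            # Start a new component and grow it breadth-first.
            top, left, bottom, right = r0, c0, r0, c0
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and image[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            boxes.append((top, left, bottom, right))
    return boxes
```

A symbol drawn with several separate strokes yields several boxes, and two touching symbols yield one.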

Limitations of the method: overlapping bounding boxes
of symbols, symbols that intersect.
OCR tool used: a simple method is implemented. A
feature vector is extracted from the image and a k-nearest
neighbor classifier is used to classify the vector.
Trained for all supported elementary symbols.
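
A minimal sketch of a k-nearest neighbor classifier over feature vectors. The function name `knn_classify`, the Euclidean distance, and the way the penalty is derived from the vote share are assumptions for illustration; the talk does not say which features or which penalty formula the tool uses.

```python
import math
from collections import Counter


def knn_classify(vector, training_set, k=3):
    """Classify `vector` by majority vote among its k nearest neighbors.

    training_set -- list of (feature_vector, label) pairs
    Returns (label, penalty); the penalty is a hypothetical score that is
    0.0 when all k neighbors agree and grows as the vote gets less clear.
    """
    nearest = sorted(training_set,
                     key=lambda item: math.dist(vector, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    penalty = 1.0 - count / len(nearest)
    return label, penalty
```

Returning a penalty together with the label lets the parser weigh competing interpretations of the same region instead of committing to one early.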
24
Remarks on Terminals Detection
  • Symbols that do not have size limited by a
    constant are not treated as terminal symbols
    (e.g., fraction line, square root).
  • In addition, square root cannot be separated from
    an image by a rectangle (it surrounds its
    argument).
  • Solution: treat these cases as symbols composed
    of several terminal symbols and extend the grammar with
    related productions.

25
Parsing Algorithm
  • Bottom-up approach, as described in the general
    structural recognition.
  • Complexity depends on the number of terminals
    detected during the first phase. In general it can
    be exponential, but it is substantially reduced
    by production restrictions and the use of suitable
    data structures.
  • Data structures for orthogonal range queries
    (searching for points located in a
    rectangle) are used to speed up the algorithm.
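
A minimal sketch of an orthogonal range query, assuming the indexed points are feature points of already derived regions. The class name `PointIndex` and the sorted-list design are choices made for this example; the talk does not name the data structure used, and a range tree or k-d tree would give better worst-case bounds.

```python
from bisect import bisect_left, bisect_right


class PointIndex:
    """Static index answering 'which points lie in this rectangle?'."""

    def __init__(self, points):
        self._points = sorted(points)            # sorted by x, then by y
        self._xs = [p[0] for p in self._points]  # x coordinates for bisection

    def query(self, x_min, x_max, y_min, y_max):
        """Return the points with x_min <= x <= x_max and y_min <= y <= y_max."""
        lo = bisect_left(self._xs, x_min)    # first point with x >= x_min
        hi = bisect_right(self._xs, x_max)   # one past the last point with x <= x_max
        return [p for p in self._points[lo:hi] if y_min <= p[1] <= y_max]
```

With such an index, a production that requires a feature point inside a given rectangle only inspects candidates in the matching x-range instead of every derived region.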

26
Future Plans
  • Focus on printed formulae.
  • Collect a sufficiently large set of annotated
    printed formulae.
  • Apply learning methods: learn templates (etalons) of
    elementary symbols and production parameters.