Constraint Propagation vs Syntactical Analysis for the Logical Structure of Library References

1
Constraint Propagation vs Syntactical Analysis
for the Logical Structure of Library References
A. Belaïd LORIA-CNRS Nancy France
Y. Chenevoy CRID Univ. Bourgogne Dijon, France

Outline
Structure Modeling
Syntactical Analysis
Constraint Propagation
Results Conclusion

2
Model: generic structure
3
Model: Attribute Grammar
Object: Constructor, subordinate objects
  Constructor: sequence, aggregate, choice
  Qualifier: required, optional, repetitive
  Separator: space, graphic line / punctuation
Attributes:
  Physical: typographical, position, typeface
  Logical: lexicon
Weights:
  Attributes: Imp / Reco.
  Sub-objects: Imp / Hyp. Ambig.
4
Syntactical Analysis: the approach

Top-down: model driven
Bottom-up: data driven
Mixed:
- Anchor points extraction ($o$)
- Bottom-up: choice of a rule $A \to \alpha_o \, o \, \beta_o$
- Top-down: verification of the left context $\alpha_o$ and the right context $\beta_o$
- Add $A$ to the anchor points
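
The mixed strategy can be sketched in a few lines of Python. This is only an illustration under strong assumptions: rules are plain (head, body) pairs, contexts are checked by exact symbol matching, and the attributes and weights of the model are ignored. The name `reduce_at_anchor` and the toy symbols are hypothetical and do not come from the presentation.

```python
def reduce_at_anchor(tokens, anchor_pos, rules):
    """Try to reduce around tokens[anchor_pos].

    tokens: list of symbols already recognised (anchor points and others).
    rules:  list of (head, body) pairs, body being a tuple of symbols.
    Returns the new token list after one reduction, or None if no rule fits.
    """
    o = tokens[anchor_pos]
    for head, body in rules:
        for k, sym in enumerate(body):
            if sym != o:
                continue                      # bottom-up: keep rules containing the anchor
            start = anchor_pos - k            # where the rule body would begin
            end = start + len(body)           # and where it would end
            if start < 0 or end > len(tokens):
                continue
            # top-down: body[:k] is the left context, body[k + 1:] the right context
            if tokens[start:end] == list(body):
                # the head becomes a new anchor point in place of the body
                return tokens[:start] + [head] + tokens[end:]
    return None


# Toy usage with hypothetical symbols.
rules = [("Ref", ("Author", ".", "Title"))]
tokens = ["Author", ".", "Title", ".", "Year"]
reduced = reduce_at_anchor(tokens, 1, rules)
```

Starting from the first "." as anchor, the rule fits on both sides and the three symbols are replaced by `Ref`. Starting from the second ".", the left context fails and the function returns `None`.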

5
Syntactical Analysis: left context verification
6
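Initials and Finals
Model: $G = (V_n, V_t, P, S)$

Finals
$O = \mathrm{Cho}\ A\ B\ C$: $F(O) = \{A, B, C\}$
$O = \mathrm{Seq}\ A\ B\ C$: $F(O) = \{C\}$
$O = \mathrm{Seq}\ A\ B\ C?$: $F(O) = \{B, C\}$
$O \in V_t$: $F(O) = \{O\}$
$O \in V_n$: $F(O) = F(O) \cup \bigl(\bigcup_{i \in F(O)} F(i)\bigr)$

Initials
$O = \mathrm{Cho}\ A\ B\ C$: $I(O) = \{A, B, C\}$
$O = \mathrm{Seq}\ A\ B\ C$: $I(O) = \{A\}$
$O = \mathrm{Seq}\ A?\ B\ C$: $I(O) = \{A, B\}$
$O \in V_t$: $I(O) = \{O\}$
$O \in V_n$: $I(O) = I(O) \cup \bigl(\bigcup_{i \in I(O)} I(i)\bigr)$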
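
The rules for initials and finals can be computed with a small fixed-point loop. The following Python sketch assumes a model stored as a dictionary from object name to a `Rule` with a constructor ("Cho" or "Seq"), its ordered parts and the set of parts marked optional ("?"). The names `Rule`, `direct_initials`, `direct_finals` and `close` are hypothetical, and the aggregate constructor is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    constructor: str                    # "Cho" (choice) or "Seq" (sequence)
    parts: tuple                        # subordinate objects, in order
    optional: frozenset = frozenset()   # parts carrying the "?" qualifier


def direct_initials(rule):
    """Objects that can start the rule, before any closure."""
    if rule.constructor == "Cho":
        return set(rule.parts)
    out = set()
    for part in rule.parts:             # Seq: walk up to the first required part
        out.add(part)
        if part not in rule.optional:
            break
    return out


def direct_finals(rule):
    """Objects that can end the rule: initials of the mirrored rule."""
    return direct_initials(Rule(rule.constructor, rule.parts[::-1], rule.optional))


def close(model, direct):
    """Apply X(O) = X(O) U (U_{i in X(O)} X(i)) until nothing changes."""
    symbols = set(model) | {p for r in model.values() for p in r.parts}
    # objects without a rule are treated as terminals: X(O) = {O}
    table = {s: direct(model[s]) if s in model else {s} for s in symbols}
    changed = True
    while changed:
        changed = False
        for s in symbols:
            extra = set().union(*(table[i] for i in table[s])) - table[s]
            if extra:
                table[s] |= extra
                changed = True
    return table


# Toy model: O = Seq A B C?, A = Cho x y
model = {
    "O": Rule("Seq", ("A", "B", "C"), frozenset({"C"})),
    "A": Rule("Cho", ("x", "y")),
}
finals = close(model, direct_finals)
initials = close(model, direct_initials)
```

With this toy model, `finals["O"]` contains B and C, as in the third Finals rule, and `initials["O"]` contains A together with the initials of A.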

7
Indices Extraction without OCR
Specific problems
[Table with two "Corr. with" columns; row and column labels were lost in extraction. Values as extracted: 4.7, 16.1, 76.7, 43.3, 37.5, 91.0, 55.5, 31.5, 37.5, 61]
8
Indices Extraction: the approaches
Index                                        Technique
( )                                          Masks
_ -                                          Bounding Box, Baseline, Profile Projection
. ,                                          Bounding Box, Baseline
Particular words                             Sound Lines
Text style (bold, italic, underlined),       Projection, Spacing, Bounding Box
  spaced text, small text
9
Constraint Propagation
10
Neighbors (Example)
11
Propagation Results
Frag.   Possible labels   After Cons. Prop.
1       2                 1
2       23                1
3       23                2
4       23                3
5       2                 1
6       7                 1
7       10                1
8       7                 1
9       3                 1
...
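
How label sets shrink in this way can be sketched with a simple filtering loop. The Python code below is not the authors' algorithm. It assumes that fragments are ordered left to right, that each one carries a set of candidate labels, and that a hypothetical predicate `compatible(a, b)` says whether label `a` may stand immediately to the left of label `b`.

```python
def propagate(candidates, compatible):
    """Filter candidate labels until every survivor has a compatible neighbor.

    candidates: list of label sets, one per fragment, in reading order.
    compatible: function (left_label, right_label) -> bool.
    """
    labels = [set(c) for c in candidates]
    changed = True
    while changed:
        changed = False
        for i, current in enumerate(labels):
            keep = set()
            for lab in current:
                # a label survives if some label of each neighbor supports it
                ok_left = i == 0 or any(
                    compatible(left, lab) for left in labels[i - 1])
                ok_right = i == len(labels) - 1 or any(
                    compatible(lab, right) for right in labels[i + 1])
                if ok_left and ok_right:
                    keep.add(lab)
            if keep != current:
                labels[i] = keep
                changed = True
    return labels


# Toy usage with hypothetical labels and neighbor pairs.
allowed = {("Author", "Title"), ("Title", "Year")}
result = propagate(
    [{"Author", "Title"}, {"Title", "Year"}, {"Year"}],
    lambda a, b: (a, b) in allowed,
)
```

In this toy case every fragment ends with a single label. In general a fragment may keep several labels, as fragments 3 and 4 do in the table.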
12
Model Compilation

Pre-processing of the model: find initials, finals and neighbors.

Let $LN_{a,p}$ be the set of possible neighbors at the left of $a$ in the rule
p?? a ? ? ? (Vt ? Vn) ? ? ((Vt ? Vn) - a) if a ? ? then LNa,p F ? else LNa,p F ? ? LNa ?

By extension, $ln_{a,p} = \bigcup_{l \in LN_{a,p}} F_l$, and $LN_a = \bigcup_{p \in P_a} ln_{a,p}$ is the left neighborhood of $a$ in the model.

A is left compatible with B if $B \in LN_A$ or $A \in RN_B$ or (A ? B) ? PA ? PA and ? PB ? PB / PA ? PB
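
The idea of a left neighborhood can be illustrated for sequence rules only. The Python sketch below is an assumption-laden reading of the slide, not a transcription of its formulas. For each object it collects the objects that may sit directly to its left inside a rule (skipping back over optional parts), extended with their finals. It does not handle neighbors inherited from the parent rule when an object comes first in a rule. The function name and data layout are hypothetical.

```python
def left_neighbors(rules, finals):
    """Collect left neighbors inside sequence rules.

    rules:  list of (parts, optional) pairs; parts is a tuple of objects in
            order, optional is the set of parts that may be absent.
    finals: dict object -> set of its finals; objects missing from the dict
            are treated as terminals whose only final is themselves.
    Returns a dict object -> set of objects that may end just left of it.
    """
    ln = {}
    for parts, optional in rules:
        for i, a in enumerate(parts):
            j = i - 1
            while j >= 0:
                left = parts[j]
                # the neighbor itself, plus anything that can end it
                ln.setdefault(a, set()).update({left} | finals.get(left, {left}))
                if left not in optional:
                    break               # a required part blocks anything further left
                j -= 1
    return ln


# Toy usage: one rule "Seq A B? C", where A can end with x.
ln = left_neighbors([(("A", "B", "C"), {"B"})], {"A": {"A", "x"}})
```

Here C can be preceded by B, or by A and its final x when the optional B is absent.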

13
Results
Group Vedette
Area Title
Principal Title
Crossing Title
End of the title
Cros. Formulae
Area Address / Date
Crossing Title
Address
Date
Area Collection
200 references 75
Group Cote
14
Results: scientific references
400 references 99.8
15
Results
Yua 95 J. Juan, Y. Y. Tang, and C. Y. Suen.
Four Directional Adjacency Graphs (fdag) and
their Application in Locating fields in Forms.
In Third International Conference on Document
Analysis and Recognition (ICDAR95), pages 752-755.
IEEE Computer Society Press, Aug. 1995.

Author (3): J. Juan, Y. Y. Tang, and C. Y. Suen
Title: Four Directional Adjacency Graphs (fdag) and their Application in Locating fields in Forms
Editor (0):
Month: Aug
Year: 1995
Volume:
Number:
Publisher: IEEE Computer Society Press
Address:
Pages: 752-755
Organization:
Booktitle: Third International Conference on Document Analysis and Recognition (ICDAR95)
Series:
Note:

16
Conclusion