Multilingual Cataloguing of Product Information of Specific Domains: Case Mkbeem System

Aarno Lehtola, Jarno Tenni and Tuula Käpylä, VTT Information Technology

Transcript and Presenter's Notes



1
Multilingual Cataloguing of Product Information
of Specific Domains: Case Mkbeem System
  • Aarno Lehtola, Jarno Tenni and Tuula Käpylä
  • VTT Information Technology
Contents
  • Motivation
  • Mkbeem in a nutshell
  • Multilingual Cataloguing Tool
  • Meaning extraction
  • Experiences of test users
  • Future
  • DEMO

2
Online Language Challenges for eCommerce
Native English speakers comprise less than 9%
of the world population.
"If I'm selling to you, I speak your language. If
I'm buying, dann müssen Sie Deutsch sprechen".
(Willy Brandt)
Ref: Global Reach, http://www.glreach.com/
3
An Answer: MKBEEM and Multilingual eCommerce
Mediation
MKBEEM Mediation System
Monolingual CP/SP
Multilingual cataloguing: write once, publish many
Customer language information retrieval and trading
  • Language adaptation via automatic HL translation
    and interpretation
  • Natural dialogues combining HL and navigation
  • Harmonised ontologies enabling localised views to
    products and trading contracts

CP/SP User
Customer
CP/SP eCom Service
Transactions with contract adaptation
  • Generic solutions proved by trials in Finnish,
    French and English in the domains of travel and
    mail-order sales
  • More information: www.mkbeem.com
  • EC FP 5 IST/HLT project in 2000-2002, budget 4.9
    M€
  • Goal: Develop intelligent knowledge-based key
    components (HLP and KRR) for applications in
    multilingual eCommerce

4
Generic Architecture of Mkbeem
[Architecture diagram. Recoverable labels: Customer; Content/Service
Provider (CP); User Interface; CP Interface; CP E-Commerce platform;
CP Information System; Human Language Processing Server; Trading
Ontology Server; Domain Ontology Server; MKBEEM System Manager;
Manager Interface; Rational Agent; several Agent boxes labelled with
User, CP, System and Manager]

5
Mkbeem: Bridging Languages via Language Neutral
Ontologies
Extracting Product Properties
Meaning extraction Machine translation Dialogue
processing ...
User Information Request Proc.
"Toppatakki. Muhkea malli, olkapäissä vahvikkeet.
Painonapeilla kiinnitetty huppu, jossa joustava
nyöri. Vetoketjun alla suojalista. Kaksi
kannellista taskua..."
"A brown jacket made of natural material"
Ontological Formula in CARIN:
(c_colour)(X), (r_name)(X,brown), (c_product)(Y),
(r_name)(Y,jacket), (c_material)(Z), (r_name)(Z,nat_mat).
"14 products found:
1. Beige winterjacket of wool
2. Ochre quilted jacket of cotton
...
Any further requirements?"
"Toppatakki. Muhkea malli..." "Quilted jacket.
Puffy model with reinforcements on the
shoulder..." jacket(X,quilted_jacket),
model(X,puffy), part(X,Y,sleeves),
property(Y,Z,reinforcement)...
Multilingual Product Data
One with a hood
Product Model
Material Ontology
Colour Ontology
6
Mkbeem Multilingual Cataloguing Tool
  • Starting point
  • The new product belongs to the supported product
    domains
  • A textual product description in one of the
    supported languages and a photograph are available
  • Basic functionalities
  • Text checking
  • Property extraction
  • Product Categorisation
  • Machine Translation
  • NL Query Processing
  • Technical key challenge
  • Formalising the relationship between ontologies
    and HL, and
  • Extracting the meaning of input HL texts with
    respect to the provided ontologies in the form of
    Ontological Formulas

7
Meaning Extraction Example in Clothing Domain
  • Long skirt with cargo pockets
  • Jupe longue avec des poches battle-dress
  • Pitkä hame, jossa reisitaskut
  • (c_MKBEEM81007clothingProduct)(H6641),
  • (r_name)(H6641,H6989),
  • (c_MKBEEM83383property)(H6552),
  • (r_name)(H6552,H6889),
  • (c_MKBEEM81011part)(H6730),
  • (r_name)(H6730,H7295),
  • (l_dependency)(H6989,adjAttr,H6889),
  • (l_dependency)(H6989,prepAttr,H7295),
  • (l_constituent)(H6889,0,long,en,long,adj,nom,sg,property),
  • (l_constituent)(H6989,1,skirt,en,skirt,noun,nom,sg,product),
  • (l_constituent)(H7295,4,cargopockets,en,cargopocket,noun,nom,pl,prodpart)

Concept Bindings
Linguistic Dependencies Lexical info
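
The formula on this slide can be read mechanically into the three groups named above. A minimal Python sketch, assuming every atom has the shape `(predicate)(arg1,arg2,...)`; the names `parse_atoms` and `group_atoms` and the dictionary layout are illustrative and not part of the Mkbeem system:

```python
import re

# Assumed atom shape: "(predicate)(arg1,arg2,...)".
ATOM = re.compile(r"\((\w+)\)\(([^()]*)\)")


def parse_atoms(formula):
    """Split a formula string into (predicate, [args]) pairs."""
    return [(pred, [a.strip() for a in args.split(",")])
            for pred, args in ATOM.findall(formula)]


def group_atoms(atoms):
    """Group atoms into concept bindings, dependencies and lexical info."""
    groups = {"bindings": [], "dependencies": [], "constituents": []}
    for pred, args in atoms:
        if pred == "l_dependency":
            groups["dependencies"].append(tuple(args))
        elif pred == "l_constituent":
            groups["constituents"].append(tuple(args))
        else:  # c_... concept atoms and r_... relation atoms
            groups["bindings"].append((pred, tuple(args)))
    return groups


# A shortened excerpt of the "long skirt with cargo pockets" formula.
formula = (
    "(c_MKBEEM81007clothingProduct)(H6641), (r_name)(H6641,H6989), "
    "(l_dependency)(H6989,adjAttr,H6889), "
    "(l_constituent)(H6989,1,skirt,en,skirt,noun,nom,sg,product)"
)
groups = group_atoms(parse_atoms(formula))
```

Here `groups["dependencies"]` holds the head, relation and dependent of each linguistic dependency, and `groups["constituents"]` holds one tuple of lexical information per word.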
8
VTT's implementation of HLP Services in Mkbeem
Functions (each takes an HL string as input):
  • checkText: returns OK or correction
  • extractMeaning: returns Ontologic Formula
  • translateText: returns Translated string
Components in the diagram:
  • Linguistic Services
  • Meaning Extractor
  • Unifier Text Correction S/W
  • Webtran MT System
  • Webtran Dependency Parser
  • Verification
  • Concept Bindings
  • Linguistic Ontology
  • Cone Onto S/W Inference
  • KBs
  • ALEs for MT (965 between Finnish, French and English)
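
To make the three function signatures concrete, here is a toy Python stand-in. It is only a sketch: the class names, the language-code parameters and the lookup tables are assumptions, and the real services use the parser, MT system and knowledge bases listed on this slide. The sample data is copied from examples elsewhere in this presentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    ok: bool
    correction: Optional[str] = None  # suggested text when ok is False


class ToyHLPServices:
    """Toy stand-in for checkText, extractMeaning and translateText."""

    # Disallowed Swedish variant -> preferred term (correction ALE example).
    CORRECTIONS = {"kardborrknäppning": "kardborrestängning"}
    # (source, target, phrase) -> translation (footwear.word.27 example).
    TRANSLATIONS = {("se", "fi", "allväderskänga"): "jokasäänkenkä"}
    # Phrase -> ontological formula ("brown jacket" example).
    FORMULAS = {
        "A brown jacket made of natural material":
            "(c_colour)(X), (r_name)(X,brown), (c_product)(Y), "
            "(r_name)(Y,jacket), (c_material)(Z), (r_name)(Z,nat_mat).",
    }

    def checkText(self, text: str) -> CheckResult:
        fixed = text
        for wrong, right in self.CORRECTIONS.items():
            fixed = fixed.replace(wrong, right)
        if fixed == text:
            return CheckResult(ok=True)
        return CheckResult(ok=False, correction=fixed)

    def extractMeaning(self, text: str) -> Optional[str]:
        return self.FORMULAS.get(text)

    def translateText(self, text: str, source: str, target: str) -> Optional[str]:
        return self.TRANSLATIONS.get((source, target, text))


services = ToyHLPServices()
result = services.checkText("sko med kardborrknäppning")
```

Each call takes an HL string and returns one of the three result types named on the slide: OK or a correction, an ontological formula, or a translated string.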
9
Augmented Lexical Entries
  • Augmented Lexical Entries (ALE) rules (see MT
    Summit 99)
  • Bilingual or multilingual non-directed entries
    representing phrase and sentence structures and
    possibly their translation relations.
  • Both surface form entries and generalised rules
  • Possible to declare multidirectional entries
  • Declarative and intuitive formalism - to be used
    by translators
  • Uniform way of representing phenomena on
    different levels of language
  • Designed to be suitable for automated or machine
    supported language modelling (see SMC 99 paper on
    learning translation grammars)
  • Can be viewed as a forest of partial dependency
    parse trees
  • A close relationship to the corresponding
    conceptual structures is obtainable (concept
    bindings to ontologies)
  • Lexicon
  • All the allowed words
  • Monolingual and bilingual entries

10
Meaning Extraction: A Product Ontology with ALEs
Embedded
11
Syntax of ALEs
  • augmented_lexical_entry ::= entry_name
    pattern.. opt_message opt_repair
  • entry_name ::= name . number_index
  • name ::= hierarchical_name_w_dots_betw_parts
  • pattern ::= opt_language_id constituent_def..
  • opt_message ::= ε | message
    string_w_opt_binding
  • opt_repair ::= ε | repair string_w_opt_binding
  • constituent_def ::= constituent_def
  • constituent_def ::= constituent_def..
  • constituent_def ::= < constituent_def.. >
  • constituent_def ::= opt_regent_mark opt_lexeme
    opt_binding opt_feature_constraint
  • opt_language_id ::= ε ISO_std_lang_identifier
    ISO_std_lang_identifier
  • ISO_std_lang_identifier ::= ee | en | fi | fr |
    se | ...
  • opt_regent_mark ::= ε
  • opt_lexeme ::= ε | lexeme | tag name
  • opt_binding ::= ε | binding
  • opt_feature_constraint ::= ε | feature..
  • binding ::= ( variable_name ) | ()
  • feature ::= feature_value property_type
    binding

12
Examples of ALEs - 1/3
Basic word correspondence definition:
footwear.word.27
    se allväderskänga
    fi jokasäänkenkä
    en all weather shoe
Specific idiom correspondence:
price.tax.4
    se inkl. moms tag_price(X)
    fi sis. alv tag_price(X)
    en incl. VAT tag_price(X)
Generalised ALE, e.g. "shirt of 100% cotton":
cloth.material.composition
    fi (A)clothProd tag_percentage(X) (B)textileMaterial ptv
    fr (A)clothProd en tag_percentage(X) (B)textileMaterial
    en (A)clothProd of tag_percentage(X) (B)textileMaterial
    se (A)clothProd av tag_percentage(X) (B)textileMaterial
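
Because the entries are non-directed, any language in an entry can act as source or target. A minimal Python sketch of that idea, assuming a plain dictionary representation; the names `ALES`, `COMPOSITION`, `translate` and `render_composition` are hypothetical, the `tag_price(X)` slot is left out for brevity, and the percent sign in the sample value is an assumption:

```python
# Non-directed word and idiom entries: language code -> surface form.
ALES = {
    "footwear.word.27": {
        "se": "allväderskänga",
        "fi": "jokasäänkenkä",
        "en": "all weather shoe",
    },
    "price.tax.4": {"se": "inkl. moms", "fi": "sis. alv", "en": "incl. VAT"},
}

# Generalised entry as per-language templates with the slots A, X and B.
# Finnish is omitted: its pattern requires the partitive case (ptv), i.e.
# morphological generation that this sketch does not attempt.
COMPOSITION = {
    "fr": "{A} en {X} {B}",
    "en": "{A} of {X} {B}",
    "se": "{A} av {X} {B}",
}


def translate(phrase, source, target):
    """Look up a phrase in any entry; either direction works."""
    for forms in ALES.values():
        if forms.get(source) == phrase and target in forms:
            return forms[target]
    return None


def render_composition(lang, product, percentage, material):
    """Fill the slots of the generalised composition entry."""
    return COMPOSITION[lang].format(A=product, X=percentage, B=material)


print(translate("allväderskänga", "se", "fi"))
print(render_composition("en", "shirt", "100%", "cotton"))
```

The same table answers `translate("jokasäänkenkä", "fi", "en")` without a second, reversed entry, which is the point of declaring entries as non-directed.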

13
Examples of ALEs - 2/3
cloth.property.1
    se (A)adj clothProp gender(B) number(B) (B)noun clothProd
    fi (A)adj clothProp case(B) number(B) (B)noun clothProd
    en (A)adj clothProp (B)noun clothProd
property.expr.1
    se (A)adj prop gender() number()
    fi (A)adj prop number() case()
    en (A)adj prop
property.expr.2
    property.expr.2 tag_comma property.expr.3
property.expr.3
    property.expr.1 conjAND property.expr.1
  • Semantic and grammatical restrictions, e.g.
    agreement in "miellyttävä pusero" or
    "miellyttävää puseroa" (comfortable blouse)
  • An iterative phrase; note the tree flattening
cloth.property.2
    se property.exprclothProp (B)cloth
    fi property.exprclothProp (B)cloth
    en property.exprclothProp (B)cloth

14
Examples of ALEs - 3/3
  • Negative Instances - Correction ALEs
  • correct.ellos.3
  • se kardborrstängning(A)
  • se kardborreförslutning(A)
  • se kardborrknäppning(A)
  • se kardborreknäppning(A)
  • message Use the correct synonym
    kardborrestängning instead of word(A)
  • repair kardborrestängning(A)
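
A correction ALE of this kind can be applied as a whole-word search and replace. The following Python sketch assumes that reading; the class `CorrectionALE` and its `apply` method are hypothetical names, and only the entry data comes from the slide:

```python
import re
from dataclasses import dataclass


@dataclass
class CorrectionALE:
    name: str
    lang: str
    variants: tuple  # disallowed surface forms
    repair: str      # preferred form

    def apply(self, text):
        """Return (corrected_text, messages); messages is empty if nothing matched."""
        messages = []
        for variant in self.variants:
            pattern = rf"\b{re.escape(variant)}\b"  # whole words only
            text, count = re.subn(pattern, self.repair, text)
            if count:
                messages.append(
                    f"Use the correct synonym {self.repair} instead of {variant}"
                )
        return text, messages


ale = CorrectionALE(
    name="correct.ellos.3",
    lang="se",
    variants=("kardborrstängning", "kardborreförslutning",
              "kardborrknäppning", "kardborreknäppning"),
    repair="kardborrestängning",
)
fixed, notes = ale.apply("sko med kardborrknäppning")
```

Text that already uses the preferred term passes through unchanged with an empty message list, which corresponds to the "OK" outcome of text checking.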

15
Meaning Extraction Process
Input phrase
-> Syntactico-semantic analysis
-> Set of approved lexical-semantic graphs with
   concepts identified
-> Inference of CARIN formulas
-> Set of CARIN formulas
16
Meaning Extraction Process Example
Input phrase:
musta hame, jossa halkio ja taskut
une jupe noire avec fente et poches
a black skirt with split and pockets

Syntactico-semantic analysis yields the set of approved
lexical-semantic graphs with concepts identified:
(c_MKBEEM81098colour)(H1017), (r_name)(H1017,H641),
(c_MKBEEM84731clothingProduct)(H984), (r_name)(H984,H684),
(c_MKBEEM81011part)(H951), (r_name)(H951,H813),
(c_MKBEEM81011part)(H918), (r_name)(H918,H899),
(l_dependency)(H684,adjAttr,H641),
(l_dependency)(H684,prepAttr,H813),
(l_dependency)(H684,prepAttr,H899),
(l_constituent)(H641,0,musta,fi,colour,musta,adj,nom,sg),
(l_constituent)(H684,1,hame,fi,product,hame,noun,nom,sg),
(l_constituent)(H727,2,tag_comma,fi),
(l_constituent)(H770,3,jossa,fi,jossa,pron,ine,sg),
(l_constituent)(H813,4,halkio,fi,prodpart,halkio,noun,nom,sg),
(l_constituent)(H856,5,ja,fi,conj,ja,coord_c),
(l_constituent)(H899,6,taskut,fi,prodpart,taskut,noun,nom,pl)

Inference of CARIN formulas yields the set of CARIN formulas:
(c_product)(H1606), (r_name)(H1606,skirt),
(c_colour)(H1573), (r_name)(H1573,black),
(c_part)(H1540), (r_name)(H1540,split),
(c_part)(H1506), (r_name)(H1506,pocket)
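
The step from the lexical-semantic graph to the compact CARIN formulas in this example can be imitated with two lookup tables. This is only a toy sketch: the tables `GENERAL_CONCEPT` and `LEXICON_FI_EN`, the function `simplify` and the fresh variable names `X0`, `X1`, ... are hypothetical, and the real system performs ontology inference rather than table lookups.

```python
import re

# Hypothetical tables standing in for ontology inference and the lexicon.
GENERAL_CONCEPT = {"colour": "colour", "clothingProduct": "product", "part": "part"}
LEXICON_FI_EN = {"musta": "black", "hame": "skirt", "halkio": "split", "taskut": "pocket"}


def simplify(bindings, names, word_forms):
    """Build a compact formula from concept bindings.

    bindings:   [(concept_predicate, variable)]
    names:      {variable: lexical_variable}, taken from the r_name atoms
    word_forms: {lexical_variable: Finnish word}, from the l_constituent atoms
    """
    atoms = []
    for i, (pred, var) in enumerate(bindings):
        local = re.sub(r"^c_MKBEEM\d+", "", pred)  # drop the ontology id prefix
        new_var = f"X{i}"
        atoms.append(f"(c_{GENERAL_CONCEPT[local]})({new_var})")
        english = LEXICON_FI_EN[word_forms[names[var]]]
        atoms.append(f"(r_name)({new_var},{english})")
    return ", ".join(atoms)


bindings = [
    ("c_MKBEEM81098colour", "H1017"),
    ("c_MKBEEM84731clothingProduct", "H984"),
    ("c_MKBEEM81011part", "H951"),
    ("c_MKBEEM81011part", "H918"),
]
names = {"H1017": "H641", "H984": "H684", "H951": "H813", "H918": "H899"}
word_forms = {"H641": "musta", "H684": "hame", "H813": "halkio", "H899": "taskut"}

compact = simplify(bindings, names, word_forms)
```

The result has the same shape as the final formula on the slide, apart from the variable names and the order of the atoms.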
17
Cataloguing Tool Testing by End-Users
  • Goals
  • Proof of concept (Swiss army knife of a
    cataloguer)
  • Usability in real working environment
  • Ellos' test group consisted of 8 persons
    (translators, cataloguers and call-centre
    workers)
  • familiar with the Internet: 5 yes, 1 almost yes, 2
    yes at home
  • languages used: 8 Finnish, 6 English, 4 Swedish,
    1 French
  • familiar with catalogue maintenance: 6 yes, 2 no
  • Schedule
  • Short training and preliminary interviews on
    August 30, 2002
  • Interviews of experiences and summary of the
    results ready by October 14, 2002

18
... Trial experiences of the Ellos test group
  • Cataloguing tool considered to be useful
  • cataloguing process as a whole was seen as an
    easy and efficient way of producing and
    classifying product information
  • each of the main features was considered good
  • very important: semi-automatic translation into
    target languages
  • property extraction and inference with colours
    and materials seen as important in bringing
    value-adding services to customers
  • helps in producing consistent and uniform
    information
  • can make the working process faster and reduce
    the amount of manual, repeated routine procedures
  • KB management tools considered suitable to their
    task
  • Reported difficulty
  • occasionally long response times -> the user gets
    bored and, e.g., repeats queries
  • e.g. "hourglass" or provision of partial results
    could bring quick help
  • will be eventually solved by continued product
    development

19
MT Part (Webtran) in Production Use at Ellos
since 2000
[Workflow diagram. Recoverable labels: Ellos Sweden; Ellos Finland;
Catalogue author; Cataloguer; Mac QuarkXPress (twice); Swedish;
Finnish; Source DB; Localised DB; Automatic Sw -> Fi Translation;
Language Modeller; PC Server; Webtran Machine Translation Software]
  • About 2000 translated catalogue pages and
    10000-15000 product descriptions per year
  • Benchmark by CSC Inc. reports over 30% time
    savings after one year of use
  • Language technology solutions need to be embedded
    into business processes and IT infrastructure
20
Work Needed for Adding Domain and Languages
  • Marginal cost of adding a new domain or a new
    language is reasonable with respect to the
    added-value gained
  • Based on experience from modelling the vacation
    cottage domain in the system (fi, fr, en), we have
    estimated that introducing a comparable new
    domain would require:
  • semantic lexicon: 2 man-months
  • translation and meaning extraction rules: 1
    man-month
  • product models: 2-4 man-weeks
  • We also estimate that adding a language to a
    pre-existing domain would need:
  • semantic lexicon: 1-2 man-months
  • translation and meaning extraction rules: 2-4
    man-weeks
  • product models: 1 man-week
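
The estimates above can be added up into one range per case. A small Python sketch, assuming 4 man-weeks per man-month (the slides do not state a conversion); the helper names are illustrative:

```python
WEEKS_PER_MONTH = 4  # assumption: the slides do not define the conversion


def months(n):
    """Convert man-months to man-weeks."""
    return n * WEEKS_PER_MONTH


def total(ranges):
    """Sum (low, high) effort ranges given in man-weeks."""
    return sum(lo for lo, _ in ranges), sum(hi for _, hi in ranges)


new_domain = [
    (months(2), months(2)),  # semantic lexicon: 2 man-months
    (months(1), months(1)),  # translation and meaning extraction rules: 1 man-month
    (2, 4),                  # product models: 2-4 man-weeks
]
new_language = [
    (months(1), months(2)),  # semantic lexicon: 1-2 man-months
    (2, 4),                  # translation and meaning extraction rules: 2-4 man-weeks
    (1, 1),                  # product models: 1 man-week
]

print(total(new_domain))    # (14, 16) man-weeks
print(total(new_language))  # (7, 13) man-weeks
```

Under that assumption a comparable new domain comes to roughly 3.5 to 4 man-months, and a new language for an existing domain to roughly 1.75 to 3.25 man-months.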

21
Future Development Recommendations
  • Further development could focus on the
    following issues
  • information request processing dialogues
  • question answering capabilities (e.g. qualitative
    questions about the goods selection)
  • proper way of handling null queries (e.g.
    graceful relaxation of the search constraints
    based on the ontology models and the actual goods
    selection)
  • new languages for the system: Russian, Norwegian,
    Estonian, German ...
  • more user-friendly ways of acquiring and
    maintaining language models and product models
    (knowledge acquisition bottleneck), e.g. machine
    learning
  • special requirements of mobile terminals (e.g.
    automatic text abstraction)
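
As an illustration of the recommended handling of null queries, the sketch below relaxes a material constraint by walking up a concept hierarchy until some product matches. Everything here is hypothetical: the `PARENT` fragment, the `silk` concept and the function names are assumptions, and only the two sample products and the `nat_mat` concept are taken from earlier slides.

```python
# Hypothetical fragment of a material ontology: concept -> parent concept.
PARENT = {"wool": "nat_mat", "cotton": "nat_mat", "silk": "nat_mat"}

PRODUCTS = [
    {"name": "Beige winterjacket", "material": "wool"},
    {"name": "Ochre quilted jacket", "material": "cotton"},
]


def ancestors(concept):
    """Return the concept followed by all of its ancestors."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def search(material):
    """Return (matched_concept, products), relaxing the constraint upward."""
    for concept in ancestors(material):
        hits = [p for p in PRODUCTS if concept in ancestors(p["material"])]
        if hits:
            return concept, hits
    return None, []


concept, hits = search("silk")
```

A query for silk finds nothing in this selection, so the constraint is relaxed to `nat_mat` and both jackets are offered instead of an empty result. A dialogue component could then tell the customer which constraint was relaxed.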

22
DEMO