A Motivating Scenario for Designing an Extensible Audio-Visual Description Language

1
A Motivating Scenario for Designing an Extensible
Audio-Visual Description Language
Raphaël Troncy, Jean Carrive, Steffen Lalande
and Jean-Philippe Poli
  • Monday 25th of October, 2004

2
Description of the AV content
  • Various uses / Different granularity
  • identification of the content creator and the
    content provider: Dublin Core metadata, VRA Core
    categories, TV-Anytime metadata
  • feature extraction from the video signal: storing
    and exchanging the results of automatic tools (MPEG-7)
  • structural decomposition into video segments
    corresponding to the logical structure of the
    program: time-code, spatial coordinates
  • semantic description of these segments:
    controlled vocabulary, thesaurus, free-text
    annotation

3
Description of the AV content (cultural heritage
point of view)
  • Segmentation
  • locate and date some events
  • Description
  • type each segment with an AV genre
  • type each segment with a general thematic
  • give hints on the production
  • describe the scene (who, when, where, what, ...)

→ needs a powerful description language
4
Motivating scenario
  • Generic application for manually describing TV
    programs w.r.t.
  • structural constraints: patterns represent the
    logical structure of a document
  • semantic constraints: the description of the
    content is machine-understandable
  • Let us define the temporal structure of a Sports
    Magazine

5
MPEG-7, the natural candidate description
language?
  • ISO standard since December of 2001
  • Main components
  • Descriptors (Ds) and Description Schemes (DSs)
  • DDL (XML Schema extensions)
  • Concerns all types of media

Part 5 - MDS
6
MPEG-7: an unsuitable description language for
this scenario
  • A non-extensible language
  • closed set of descriptors
  • An exchange syntax rather than a real
    machine-processable multimedia description language
  • non-object-based data model
  • non-modular language (universal approach)
  • No formal semantics provided
  • applications cannot access the meaning of
    the documents

→ Is the DDL (XML Schema) at fault?
7
MPEG-7: an unsuitable description language for
this scenario
  • How to define new descriptors?
  • How to define new description schemes?
  • How to make the description
    machine-understandable?

→ How to reconcile the critical issue:
object-oriented semantic expression versus
structural validation?
8
Our proposal: AVDL
  • AVDL: a reduced yet extensible audio-visual
    description language
  • an object meta-model (an instance model specifies
    the vocabulary for, and the rules followed by, the
    descriptions)
  • an XML syntax
  • a semantics (close to description logics, DL, for
    the descriptors)
  • Description Schemes
  • Descriptors
  • Properties
  • Structures
  • Descriptions
  • valid instances w.r.t. description schemes

9
The meta class level
10
The class level
11
Location
12
Document, Content and Media
  • Distinction
  • Document vs Content vs Media
  • Virtual content vs physical content
  • Media: a content abstraction for decomposition
  • audio tracks, subtitles

13
Defining Structures
  • A structure defines how the descriptors may and
    must be combined
  • allows control over the description
  • allows automatic completion of the
    descriptions
  • AVDL provides some predefined structure models
  • containment: gives the list of the possible
    sub-segments of an AV segment (in space and in
    time)
  • regular expression: by analogy with a grammar, for
    temporal succession
  • Other models are currently being studied: temporal
    constraints, etc.
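
The regular-expression structure model can be illustrated with a minimal sketch. It is written in Python rather than in the .NET environment the slides describe, and the segment types and the grammar are hypothetical, chosen only to resemble a sports magazine. Each segment type is mapped to one letter, so the temporal succession of segments becomes a string that the standard `re` module can check.

```python
import re

# Hypothetical segment types for a sports magazine (not taken from the slides).
SYMBOLS = {
    "Opening": "O",
    "Studio": "S",
    "Report": "R",
    "Interview": "I",
    "Credits": "C",
}

# Assumed temporal grammar: an opening, then one or more blocks made of a
# studio segment followed by at least one report or interview, then credits.
GRAMMAR = re.compile(r"O(S[RI]+)+C")


def encode(segments):
    """Map a list of segment type names to a string of one-letter symbols."""
    return "".join(SYMBOLS[name] for name in segments)


def is_valid(segments):
    """Return True if the succession of segments matches the grammar."""
    try:
        return GRAMMAR.fullmatch(encode(segments)) is not None
    except KeyError:
        # Unknown segment type: outside the vocabulary of the structure.
        return False


ok = is_valid(["Opening", "Studio", "Report", "Interview", "Credits"])
missing_studio = is_valid(["Opening", "Report", "Credits"])
```

Here `ok` is true and `missing_studio` is false, because a report may only follow a studio segment in this assumed grammar. The sketch covers validation only, not the automatic completion of descriptions.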

14
AVDL Implementation
  • XML Serialization
  • Independent of any schema language
  • Uses XML Schema validation (mainly for datatypes)
  • C#
  • Object inheritance
  • Use of .NET reflection

15
XML Serialization
[Figure: XML serialization workflow. Labels: avdl.xsd (Audio-Visual Description Language); ds-17.xml (Description Schemes); ds-17.xsd; d-162.xml (Descriptions); arrows labelled "transformation" and "partial control" (twice).]
16
XML Syntax (DS)
<Descriptor xsi:type="LocatedDescriptorType" id="id-d2" name="Tracking">
  <Property ref="id-p2"/>
  <Structure ref="id-s2"/>
  <DescriptionRelationship characterization="string">
    <Location type="TemporalInterval"/>
    <Media type="Media"/>
  </DescriptionRelationship>
</Descriptor>

<Property id="id-p2" name="nbDetection">
  <Domain descriptor="id-d2"/>
  <Range>
    <Primitive nameType="int"/>
  </Range>
</Property>

<Structure id="id-s2" name="TrackingStructure">
  <FormalModel>
    <Constraint type="temporal" validation="full" method="system" parser="XMLSchema">
      <xsd:sequence minOccurs="0" maxOccurs="unbounded">
        <xsd:element name="Detection" type="DetectionType"/>
      </xsd:sequence>
    </Constraint>
  </FormalModel>
</Structure>
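
As an illustration of how a tool might read such a structure definition, the following Python sketch parses a trimmed copy of the TrackingStructure element with the standard `xml.etree.ElementTree` module and checks a list of child descriptors against it. The function names are hypothetical, the `xsd` prefix is assumed to be bound to the W3C XML Schema namespace, and the check only handles a sequence containing a single element, as in the slide. It is not the AVDL validator.

```python
import xml.etree.ElementTree as ET

# The xsd prefix is assumed to denote the W3C XML Schema namespace.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Trimmed copy of the structure definition, with the namespace declared.
STRUCTURE_XML = f"""
<Structure xmlns:xsd="{XSD_NS}" id="id-s2" name="TrackingStructure">
  <FormalModel>
    <Constraint type="temporal" validation="full" parser="XMLSchema">
      <xsd:sequence minOccurs="0" maxOccurs="unbounded">
        <xsd:element name="Detection" type="DetectionType"/>
      </xsd:sequence>
    </Constraint>
  </FormalModel>
</Structure>
"""


def read_sequence_constraint(xml_text):
    """Extract the child names and occurrence bounds of the xsd:sequence."""
    root = ET.fromstring(xml_text)
    seq = root.find(f".//{{{XSD_NS}}}sequence")
    if seq is None:
        raise ValueError("no xsd:sequence found in the structure")
    max_occurs = seq.get("maxOccurs", "1")
    return {
        "structure": root.get("name"),
        "children": [e.get("name") for e in seq.findall(f"{{{XSD_NS}}}element")],
        "min": int(seq.get("minOccurs", "1")),
        "max": None if max_occurs == "unbounded" else int(max_occurs),
    }


def satisfies(constraint, child_names):
    """Check children against a single-element sequence (one child per repetition)."""
    if any(name not in constraint["children"] for name in child_names):
        return False
    repetitions = len(child_names)
    if repetitions < constraint["min"]:
        return False
    return constraint["max"] is None or repetitions <= constraint["max"]


constraint = read_sequence_constraint(STRUCTURE_XML)
three_detections_ok = satisfies(constraint, ["Detection"] * 3)
```

With `minOccurs="0"` and `maxOccurs="unbounded"`, any number of Detection children is accepted, including none.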
17
XML Syntax (Descriptions)
<Tracking type="LocatedDescriptorType" nbDetection="1">
  <DescriptionRelationship>
    <Location>
      <avdl:Begin timeRef="147329280"/><avdl:End timeRef="147329280"/>
    </Location>
    <Media id="CPB86006610.mpg" name="CPB86006610.mpg" contentID="CPB86006610.mpg"/>
  </DescriptionRelationship>
  <Structure constraintType="temporal">
    <Detection type="LocatedDescriptorType" nbFeature="1">
      <DescriptionRelationship>
        <Location>
          <avdl:Instant timeRef="147329280"/>
        </Location>
        <Media id="CPB86006610.mpg" name="CPB86006610.mpg"
               contentID="CPB86006610.mpg"
               frameHeight="288" frameWidth="352"/>
      </DescriptionRelationship>
      <Structure constraintType="spatial">
        <Feature xsi:type="FaceType">
          <DescriptionRelationship>
            <Location>
              <avdl:BoundingBox>
                <avdl:NE numX="92" denX="352" numY="217" denY="288"/>
                <avdl:NW numX="92" denX="352" numY="267" denY="288"/>
                <avdl:SE numX="136" denX="352" numY="217" denY="288"/>
                <avdl:SW numX="136" denX="352" numY="267" denY="288"/>
              </avdl:BoundingBox>
            </Location>
...
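
The bounding-box corners in this description are given as pairs of rational coordinates (numX/denX, numY/denY). The short Python sketch below converts them to pixel positions. It assumes that each fraction is relative to the frame size, which is consistent with the denominators matching the frameWidth and frameHeight attributes but is not stated in the slides. The function name is hypothetical.

```python
from fractions import Fraction

# Frame size taken from the Media element of the description.
FRAME_WIDTH, FRAME_HEIGHT = 352, 288

# Corner values copied from the BoundingBox element: (numX, denX, numY, denY).
CORNERS = {
    "NE": (92, 352, 217, 288),
    "NW": (92, 352, 267, 288),
    "SE": (136, 352, 217, 288),
    "SW": (136, 352, 267, 288),
}


def corner_to_pixels(num_x, den_x, num_y, den_y,
                     width=FRAME_WIDTH, height=FRAME_HEIGHT):
    """Scale a rational coordinate pair to integer pixel coordinates."""
    x = Fraction(num_x, den_x) * width
    y = Fraction(num_y, den_y) * height
    return round(x), round(y)


pixels = {name: corner_to_pixels(*values) for name, values in CORNERS.items()}

# Extent of the box along each axis, in pixels.
xs = [x for x, _ in pixels.values()]
ys = [y for _, y in pixels.values()]
box_width = max(xs) - min(xs)    # 136 - 92 = 44
box_height = max(ys) - min(ys)   # 267 - 217 = 50
```

Storing numerators and denominators separately keeps the coordinates exact and independent of the resolution at which the media is later decoded.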
18
.NET implementation
[Figure: .NET implementation. Labels: ds-17.xml (Description Schemes) and d-162.xml (Descriptions), each with a "parsing" arrow; ds-17.dll; ".NET instantiation"; Memory; "read/write".]
19
Two kinds of applications
  • Static Description Schemes
  • DSs are well known
  • The developer uses generated libraries
  • Dynamic Description Schemes
  • DSs are created by the application
  • Use of the dynamic instantiation mechanism
    (reflection) of .NET
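
The dynamic case can be sketched as follows. The slides rely on .NET reflection; this Python analogue uses the built-in `type()` constructor to create a descriptor class at run time. The `Descriptor` base class, the `make_descriptor_class` helper and the property table are hypothetical names introduced for illustration, not part of AVDL.

```python
class Descriptor:
    """Hypothetical base class standing in for an AVDL descriptor."""

    # Maps each property name to the Python type expected for its value.
    properties = {}

    def __init__(self, **values):
        for name, value in values.items():
            expected = self.properties.get(name)
            if expected is None:
                raise AttributeError(
                    f"{type(self).__name__} has no property {name!r}")
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be of type {expected.__name__}")
            setattr(self, name, value)


def make_descriptor_class(name, properties):
    """Create a new descriptor class at run time from a property table."""
    return type(name, (Descriptor,), {"properties": dict(properties)})


# A description scheme created by the application, not known in advance.
Tracking = make_descriptor_class("Tracking", {"nbDetection": int})
track = Tracking(nbDetection=1)

# Introspection of the new class, in the spirit of reflection.
class_name = type(track).__name__
declared = sorted(Tracking.properties)
```

In the static case the equivalent classes would instead come from a library generated ahead of time, so the developer gets compile-time checking at the cost of flexibility.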

20
Carrying out the scenario
  • Definition of new descriptors and properties
  • associating behavior with the corresponding
    classes
  • performing reasoning on the descriptions with the
    formal definitions in OWL
  • Definition of logical and temporal structures
  • the description is controlled and validated by a
    grammar

21
Conclusion and Future Work
  • AVDL: a reduced yet extensible Audio-Visual
    Description Language
  • descriptors, properties, structures
  • XML syntax and DL semantics
  • .NET implementation and APIs
  • About structure validation
  • which constructors should be used? with which semantics?
  • Trade-off: expressivity vs. computability
  • OWL Full is undecidable
  • constraint satisfaction problems can be complex