Title: A Proposal for a Video Modeling for Composing Multimedia Document
1 A Proposal for a Video Modeling for Composing Multimedia Document
- Cécile ROISIN - Tien TRAN THUONG - Lionel VILLARD
- Presented by Tien TRAN THUONG
- Project OPERA - INRIA
- Grenoble - France
2 Work Context
- Theme: multimedia documents (Madeus)
  - Authoring system for structured multimedia documents
  - Basic media: sound, video, text, image, etc.
  - Documents composed by relations
- Need: composition of semantic video fragments with other basic media elements (image, text, sound, ...)
3 Temporal Synchronization Example: INRIA's positions document
Pictures and titles synchronized with video parts
Video Presentation
4 Logical organization of the document
InriaIntroduction
Buildings
Overview
Video Presentation
Video Frames
5 Time line view of the document
Rocquencourt Title Picture
Texts grow up
Rennes Title Picture
Sophia-Antipolis Title Picture
Lorraine Title Picture
Rhône-Alpes Title Picture
Raw video
Time
Video fragments
6 Spatial Synchronization examples
7 Spatial layout of the document in which a text follows a video object
Document Region (Left, Top, Width, Height)
Text Region (Width, Height)
Video Region (Left, Top, Width, Height)
- Location of the video object region, which is a moving region inside the video region
8 Objective and plan of this work
- Research and development on video modeling for describing the video content relevant to multimedia applications
  - Video modeling: video description for multimedia composition,
  - Multimedia application: our VideoMadeus, an editing and presentation system.
9 Video Description
- Dublin Core: a semantic indexing schema for video content description.
- MPEG-7: the future standard, whose tools will make it possible to define semantic schemas for describing audiovisual information.
- Our video modeling for composing multimedia documents.
10 Methodology
- Specification of a model for the description of video content
  - multi-level structuring,
  - temporal and spatial relations,
  - interactive actions on the video elements.
- Specification in XML
- Experimentation in Madeus (VideoMadeus)
11 Multi-level Structuring
<!-- XML schema for the description of VideoContent -->
<!ELEMENT VideoContent (MetaInfo, MediaInfo, Summary, Structure, Semantic, Thesaurus)>
<!ELEMENT Structure (Sequence, Relation?)>
<!ELEMENT Semantic (VideoObject, EventSemantics)>
<!ELEMENT Thesaurus (ReferenceDictionary, UserDictionary)>
12 Video Structure Description
- Motivation: for composition, the basis is the Structure description level.
- Semantic and Thesaurus are needed more for retrieval applications, or as a support for the structuring level.
- The first step is therefore the Structure description.
13 High Level Description
<!-- XML schema for the description of the high-level structure -->
<!ELEMENT Structure (Sequence, Relation?)>
<!ELEMENT Sequence (Scene, Relation?)>
<!ELEMENT Scene (Shot, Event, Relation?)>
<!ELEMENT Shot (Transition?, Event, Occurrence, Background?, Relation?)>
14 Shot Content Description
<!-- XML Shot Description -->
<!ELEMENT Shot (Transition?, Event, Occurrence, Background?, Relation?)>
<!ELEMENT Transition EMPTY>
<!ELEMENT Event EMPTY>
<!ELEMENT Background (Region)>
<!ELEMENT Occurrence (Region, Trajectory?, Occurrence)>
<!ELEMENT SpatialLayout (2DBStringDS)>
<!ELEMENT CameraWork (CameraMotion?)>
Semantic
Shot
Trans.
CameraWork
Event
Occurrence
SpatialLayout
Background
Reference
15 Occurrence Content Description
<!-- XML Occurrence description -->
<!ELEMENT Occurrence (Trajectory, Region, Occurrence)>
<!ELEMENT Trajectory >
<!ELEMENT Region (Contour, Color, Texture, Centroid, Region)>
<!ELEMENT Contour >
<!ELEMENT Color >
<!ELEMENT Texture >
<!ELEMENT Centroid >
<!ELEMENT Region >
Occurrence
16 Model summary
- The model focuses on describing the video elements that are useful for composing a multimedia document (shot, scene, occurrence, event, relation, etc.).
- It has an XML specification, which makes it independent and easy to apply to multimedia applications (e.g. our VideoMadeus).
17 Experimentation of the model in Madeus - VideoMadeus
18 Madeus Architecture
Editor/Presentation Tools
EXECUTION View
TIME LINE View
HIERARCHICAL View
VIDEO STRUCTURED View
. . .
MODEL MANAGEMENT
PARSERS
LOGIC STRUCTURATION
EVENT MANAGEMENT
TEMPORAL STRUCTURATION
SAVE
Madeus document
SPATIAL STRUCTURATION
MADEUS
TOOLS
- To extend Madeus to VideoMadeus, video content
description is handled both in composition and in
presentation parts.
19 Madeus Document Model
- Structured document organized according to the dimensions: logical, temporal, spatial.
Internal Document
Madeus Document
Logical
Content
Actor
Temporal
Spatial
20 Relations
- Temporal relations (Allen extension)
  - meets, starts, equals, during, overlaps, parmin, etc.
- Spatial relations
  - left_align, right_align, center_v, center_h, top_align, bottom_align, etc.
<Temporal>
  <Relations>
    <start Interval1="a" Interval2="b" />
    <meet Interval1="b" Interval2="d" />
  </Relations>
</Temporal>
<Spatial>
  <Relations>
    <left-align Region1="b" Region2="d" />
  </Relations>
</Spatial>
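As an illustration of how relations like these could place intervals on a time line, here is a minimal Python sketch. It assumes simplified semantics for only two relation kinds (`start`: both intervals begin together; `meet`: the second interval begins when the first one ends). The durations and the function name `schedule` are hypothetical and are not taken from Madeus.

```python
def schedule(durations, relations, origin="a"):
    """Compute (begin, end) for each interval from 'start' and 'meet' relations.

    durations: dict mapping interval name -> duration in seconds (hypothetical values)
    relations: list of (kind, interval1, interval2) tuples
    origin:    interval assumed to begin at time 0
    """
    begin = {origin: 0.0}
    pending = list(relations)
    while pending:
        progressed = False
        for rel in list(pending):
            kind, i1, i2 = rel
            if i1 not in begin:
                continue  # first interval not placed yet; retry on a later pass
            if kind == "start":
                begin[i2] = begin[i1]
            elif kind == "meet":
                begin[i2] = begin[i1] + durations[i1]
            else:
                raise ValueError(f"unsupported relation: {kind}")
            pending.remove(rel)
            progressed = True
        if not progressed:
            raise ValueError("relations do not connect to the origin interval")
    return {name: (t, t + durations[name]) for name, t in begin.items()}


# Same relations as in the XML fragment: a starts b, b meets d.
durations = {"a": 5.0, "b": 8.0, "d": 3.0}
relations = [("start", "a", "b"), ("meet", "b", "d")]
timeline = schedule(durations, relations)
print(timeline)
```

A real temporal formatter would also have to handle the other relations (equals, during, overlaps, ...) and flexible durations; this sketch only shows the propagation idea.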
21 Overview of VideoMadeus
Editing and Presentation Tools
Video edition View
Execution View
Synchronization Management
Behavior Management
- Structure View
- Semantic View
- Thesaurus View
Element Management
Synchronization
- Hyperlink
- Follow-up
- Erase
- Display, etc...
Requested descriptions
Requested descriptions
Modified description
Data Management
XML Description of video content
Internal Structure (MODEL)
Modify
Parser
Index on video
22 VideoMadeus document
<Madeus>
  <Content>
    . . .
    <VideoContentDS>
      . . .
      <Scene ID="MyScene" ... > . . . </Scene>
    </VideoContentDS>
  </Content>
  <Actor>
    . . .
    <VideoElement ID="SceneVideo" Content="MyScene" . . . > . . . </VideoElement>
  </Actor>
  <Temporal>
    <Interval ID="ScenceInt" Actor="SceneVideo" Duration="..." />
    <Relations> . . . </Relations>
  </Temporal>
  <Spatial>
    <Region ID="ScenceReg" Actor="SceneVideo" Height="288" Width="352" />
    <Relations> . . . </Relations>
  </Spatial>
</Madeus>
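The document above links its parts by name: a VideoElement points to a described scene through `Content`, and intervals and regions point to an actor through `Actor`. The following Python sketch shows one possible way to check that these references resolve. The sample document is a simplified, hypothetical instance (the `Duration` value is invented, and the elided parts are left out), and `dangling_references` is an illustrative helper, not part of VideoMadeus.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical instance of the document skeleton shown above.
DOC = """
<Madeus>
  <Content>
    <VideoContentDS>
      <Scene ID="MyScene"/>
    </VideoContentDS>
  </Content>
  <Actor>
    <VideoElement ID="SceneVideo" Content="MyScene"/>
  </Actor>
  <Temporal>
    <Interval ID="ScenceInt" Actor="SceneVideo" Duration="20s"/>
  </Temporal>
  <Spatial>
    <Region ID="ScenceReg" Actor="SceneVideo" Height="288" Width="352"/>
  </Spatial>
</Madeus>
"""


def dangling_references(root):
    """Return (element ID, attribute, target) for references that resolve to nothing."""
    content_ids = {e.get("ID") for e in root.find("Content").iter() if e.get("ID")}
    actor_ids = {e.get("ID") for e in root.find("Actor").iter() if e.get("ID")}
    problems = []
    # Each VideoElement must point to a described video fragment.
    for elem in root.find("Actor").iter("VideoElement"):
        if elem.get("Content") not in content_ids:
            problems.append((elem.get("ID"), "Content", elem.get("Content")))
    # Each interval or region must point to a declared actor.
    for section in ("Temporal", "Spatial"):
        for elem in root.find(section).iter():
            actor = elem.get("Actor")
            if actor is not None and actor not in actor_ids:
                problems.append((elem.get("ID"), "Actor", actor))
    return problems


root = ET.fromstring(DOC)
print(dangling_references(root))
```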
23 Editing features
- Editing of the video description
  - shot detection (automatic or manual)
  - manual extraction of video objects, events, spatialLayout, etc.
- Creation of semantic groups (manual)
  - grouping shots into a scene, grouping scenes into a sequence
  - detecting the occurrences of a character (grouping occurrences into objects)
  - creation of other semantic indexes
  - classification of the video elements (thesaurus)
- Scenario editing (composing)
  - Set temporal and spatial relations between video elements and other media
  - Set actions on the video elements
24 Conclusion
- Provide support for deeper access to video data in the multimedia authoring system
  - temporal/spatial synchronization with the other media elements (image, text, sound, etc.),
  - actions on the video elements (hyperlink, follow-up, erasing, etc.)
- Experimental development of the video editing view, to help the user create and modify descriptions of video data in accordance with our video model.
25 Perspectives
- More experimentation on spatial synchronization,
- Extension and experimentation of the semantic parts (Semantic and Thesaurus) -> semantic queries,
- Use the MPEG-7 tools to specify our video model,
- Develop the video content description editing tool
  - Integration and adaptation of video analysis algorithms to generate the video elements as automatically as possible,
  - Timeline editing view for video structure, etc.
- Semantic queries for playing a part of a video through the network.
26 Thanks for your attention
27 Video content description in Madeus document
<?xml version="1.0"?>
<Madeus Name="DocMadeus" Version="2.0" Width="800" Height="600">
  <Content>
    <VideoContent ID="InriaInfoco">
      <Structure ID="InriaInfocoStruc">
        <Sequence ID="Seq" Start_Time="0" Stop_Time="76.69">
          <Scene ID="Scene1" Start_Time="0" Stop_Time="4.91"> </Scene>
          <Scene ID="Scene2" Start_Time="4.91" Stop_Time="11.09">
            <Shot ID="Shoti" Start_Time="4.91" Stop_Time="8.71" />
            <Shot ID="Shotii" Start_Time="8.71" Stop_Time="11.09" />
          </Scene>
          <Scene ID="Scene3" Start_Time="11.09" Stop_Time="29.07"> </Scene>
        </Sequence>
      </Structure>
    </VideoContent>
    <VideoContent ID="InriaGen"> </VideoContent>
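To show how such a structure description can be exploited for composition, the Python sketch below indexes every timed fragment by a dotted path and checks that each child lies inside its parent's time span. It reuses the IDs and times of the excerpt above; the dotted-path convention and the helper names `fragments` and `children_outside_parent` are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Structure part of the excerpt above, closed so that it parses on its own.
STRUCTURE = """
<Structure ID="InriaInfocoStruc">
  <Sequence ID="Seq" Start_Time="0" Stop_Time="76.69">
    <Scene ID="Scene1" Start_Time="0" Stop_Time="4.91"/>
    <Scene ID="Scene2" Start_Time="4.91" Stop_Time="11.09">
      <Shot ID="Shoti" Start_Time="4.91" Stop_Time="8.71"/>
      <Shot ID="Shotii" Start_Time="8.71" Stop_Time="11.09"/>
    </Scene>
    <Scene ID="Scene3" Start_Time="11.09" Stop_Time="29.07"/>
  </Sequence>
</Structure>
"""


def fragments(elem, path=""):
    """Yield (dotted path, start, stop) for every timed element below elem."""
    for child in elem:
        child_path = f"{path}.{child.get('ID')}" if path else child.get("ID")
        if child.get("Start_Time") is not None:
            yield child_path, float(child.get("Start_Time")), float(child.get("Stop_Time"))
        yield from fragments(child, child_path)


def children_outside_parent(elem):
    """Return IDs of children whose time span is not contained in their parent's."""
    bad = []
    for parent in elem.iter():
        if parent.get("Start_Time") is None:
            continue
        p0, p1 = float(parent.get("Start_Time")), float(parent.get("Stop_Time"))
        for child in parent:
            if child.get("Start_Time") is None:
                continue
            c0, c1 = float(child.get("Start_Time")), float(child.get("Stop_Time"))
            if c0 < p0 or c1 > p1:
                bad.append(child.get("ID"))
    return bad


root = ET.fromstring(STRUCTURE)
index = {path: (start, stop) for path, start, stop in fragments(root)}
print(index["Seq.Scene2.Shoti"])
print(children_outside_parent(root))
```

With an index like this, a composition tool could look up the time span of a fragment before synchronizing another media element with it.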
28 Video element definition
<?xml version="1.0"?>
<Madeus Name="DocMadeus" Version="2.0" Width="800" Height="600">
  <Content> . . . </Content>
  <Actor>
    <VideoElement ID="WesternScene" Content="WesternDS.Seq.Scene1" Type="RendererLightWeight" . . . >
      <VideoObject ID="VO1" Object="Shot2.ActorOcc1" Actions="Follow-upHyperlink..." HRef="file///C/Users/ttran/Multimedia/Madeus/opera.html" />
      . . .
    </VideoElement>
    . . .
  </Actor>
  <Temporal> . . . </Temporal>
  <Spatial> . . . </Spatial>
</Madeus>
- The operations can be defined in the instance of the described video: Hyperlink, Tracking, Erasing, Jumping, etc.
29 Temporal part of Inria introduction document
<?xml version="1.0"?>
<Madeus Name="DocMadeus" Version="2.0" Width="800" Height="600">
  ...
  <Temporal>
    <T-Group ID="Temporal" Duration="pref20s min15s max22s">
      <!-- Interval of three hypertexts -->
      <Interval ID="ControlOperaInterval" Actor="ControlOperaInfo" Duration="pref20s min15s max22s"/>
      <!-- Interval of the video element -->
      <Interval ID="MovieInriaURScene4" Actor="InriaURScene4" Fill="freeze" Duration="pref20s min15s max22s"/>
      <!-- Interval of the texts -->
      <Interval ID="txtInriaURScene4Shotii" Actor="TxtInriaURScene4" Duration="pref20s min15s max22s" />
      ...
      <Interval ID="txtInriaURScene4Shotvi" Actor="TxtInriaURScene4Shotvi" Duration="pref20s min15s max22s" />
      <!-- Interval of the images -->
      <Interval ID="ImgInriaURScene4Shotii" Actor="ImgRoc" Duration="pref20s min15s max22s" />
      ...
      <Interval ID="ImgInriaURScene4Shotvi" Actor="ImgRA" Duration="pref20s min15s max22s"/>
      <Relations>
        <!-- Equals relations of the texts with the video elements -->
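The Duration attributes above carry three values per interval. As a hedged illustration, the Python sketch below parses such a string and clamps a requested duration into the allowed range. It assumes that `pref`, `min`, and `max` mean the preferred, minimum, and maximum durations in seconds, and it accepts an optional colon after each keyword because the exact separator is not visible in the excerpt. The names `parse_duration` and `fit` are hypothetical.

```python
import re

# Keyword, optional colon, number, then the unit "s" (seconds).
_PART = re.compile(r"(pref|min|max)\s*:?\s*(\d+(?:\.\d+)?)s")


def parse_duration(text):
    """Parse a string such as 'pref20s min15s max22s' into a dict of seconds."""
    parts = {key: float(value) for key, value in _PART.findall(text)}
    missing = {"pref", "min", "max"} - parts.keys()
    if missing:
        raise ValueError(f"missing duration parts: {sorted(missing)}")
    if not parts["min"] <= parts["pref"] <= parts["max"]:
        raise ValueError("expected min <= pref <= max")
    return parts


def fit(duration, wanted):
    """Clamp a wanted duration (seconds) into the [min, max] range."""
    return max(duration["min"], min(duration["max"], wanted))


d = parse_duration("pref20s min15s max22s")
print(d)
print(fit(d, 30.0))  # longer than allowed, so it is clamped to the maximum
```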
30 Spatial part of Spatio-Temporal Relation Demo document
<?xml version="1.0"?>
<Madeus Name="DocMadeus" Version="2.0" Width="800" Height="600">
  ...
  <Spatial>
    <S-Group ID="TOTOSpatial">
      <!-- Video region -->
      <Region ID="WesternVideoRegion" Actor="WesternVideo" Left="206" Top="140" Height="288" Width="352" Depth="1"/>
      <!-- Three hypertext regions -->
      <Region ID="LinkOperaInfoRegion" Actor="ControlOperaInfo" Left="236.0" Top="492.0" Width="210.0" Depth="2.0"/>
      <Region ID="LinkAutoCitroenRegion" Actor="ControlAutoCitroen" Left="36.0" Top="492.0" Width="210.0" Depth="2.0"/>
      <Region ID="LinkSTRST" Actor="ControlSpatioTemp" Left="472.0" Top="492.0" Width="236.0" Depth="2.0"/>
      <Region ID="TxtOperaIntroRegion" Actor="TxtOperaIntro" Left="168" Top="46" Height="42" Width="429" Depth="2.0"/>
      <!-- Regions of the text following the video object -->
      <Region ID="TxtMotionRegion" Actor="TxtMotion" Height="16.0" Width="69" Depth="2.0"/>
      <Relations>
        <Top_align Region1="WesternVideoRegion.Shot1.Obj" Region2="TxtMotionRegion" />
      </Relations>
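The Top_align relation above ties a text region to a video object that moves inside the video region. The Python sketch below illustrates one way this could be evaluated: the object's region is first translated from video-local to document coordinates, then the text region's top edge is aligned with it. The video and text dimensions come from the excerpt; the object trajectory, the text's left position, and the names `Region`, `to_document`, and `top_align` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Region:
    left: float
    top: float
    width: float
    height: float


def to_document(video: Region, obj_in_video: Region) -> Region:
    """Translate an object region from video-local to document coordinates."""
    return Region(video.left + obj_in_video.left,
                  video.top + obj_in_video.top,
                  obj_in_video.width, obj_in_video.height)


def top_align(reference: Region, target: Region) -> Region:
    """Return a copy of target whose top edge matches the reference's top edge."""
    return Region(target.left, reference.top, target.width, target.height)


video = Region(left=206, top=140, width=352, height=288)  # WesternVideoRegion
text = Region(left=0, top=0, width=69, height=16)         # TxtMotionRegion; left/top not given in the source

# Hypothetical positions of the object inside the video region, one per frame.
trajectory = [Region(40, 100, 30, 60), Region(60, 90, 30, 60), Region(80, 85, 30, 60)]

tops = []
for obj_local in trajectory:
    obj = to_document(video, obj_local)
    text = top_align(obj, text)  # the text follows the object vertically
    tops.append(text.top)
print(tops)
```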