Document Type Definitions (DTDs)

Provided by: ccNct
1
Document Type Definitions (DTDs)
2
What's a DTD for?
  • A DTD explains precisely which elements and
    entities may appear where in the document and
    what the elements' contents and attributes are
  • A DTD defines the rules for the valid structure
    of an XML document
  • Valid element names
  • <BOOK>, <TITLE>, <AUTHOR>, <PRICE>, <ISBN>
  • Valid attribute names and values
  • <AUTHOR id="234" sex="F">
  • Relationship between elements
  • <BOOK><TITLE></TITLE></BOOK>
  • Different XML applications can use different DTDs
    to specify what they do and do not allow

3
It's Complex, but More Powerful
  • With a DTD you can
  • Define your own meaningful tags
  • Validate whether your document is structurally
    correct
  • An XML document is valid if it has a DTD and the
    document conforms to the DTD
  • The document type declaration must appear before
    the first element in an XML document

4
Getting Started with DTD
  • 3-1.dtd, 3-1.txt
  • DTD can be stored in a separate file from the
    document it describes
  • DTD can be included inside the document it
    describes
  • DTDs allow forward, backward, and circular
    references to other declarations
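
A minimal sketch of forward and circular references, assuming hypothetical element names (section, title, para) that do not come from the slides' example files:

```xml
<?xml version="1.0"?>
<!DOCTYPE section [
  <!-- Forward reference: title and para are used here but declared below.
       Circular reference: a section may contain further sections. -->
  <!ELEMENT section (title, (para | section)*)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT para (#PCDATA)>
]>
<section>
  <title>Outer</title>
  <para>Some text.</para>
  <section>
    <title>Inner</title>
  </section>
</section>
```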

5
Document Type Declaration
  • A valid document includes a reference to the DTD
    to which it should be compared
  • <!DOCTYPE person SYSTEM "http://../person.dtd">
  • The root element is person
  • The DTD of this document can be found at
    "http://../person.dtd"
  • The document type declaration is included in the
    prolog of the XML document after the XML
    declaration but before the root element
  • You can use a relative URL (or just the filename)
    instead of the absolute form if the document
    resides on the same server (or in the same
    directory) as the DTD
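
As a rough illustration of where the declaration sits, here is a sketch that assumes a file named person.dtd stored in the same directory as the document. The name child element is an assumption; whether the document is valid depends on what that DTD actually declares.

```xml
<?xml version="1.0"?>
<!-- The document type declaration sits in the prolog: after the
     XML declaration and before the root element. -->
<!-- "person.dtd" is a relative URL; the file is assumed to sit
     next to this document. -->
<!DOCTYPE person SYSTEM "person.dtd">
<person>
  <name>Alan Turing</name>
</person>
```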

6
Internal DTD Subsets
  • DTD can be included inside the document it
    describes
  • 3-4.xml
  • Some document type declarations contain some
    declarations directly but link in others
    (3-5.xml)
  • The part of the DTD between the brackets is
    called the internal DTD subset
  • All the parts that come from outside the
    document make up the external DTD subset
  • As a general rule, the two subsets must be
    compatible. Neither can override the element
    declarations the other makes
  • When you use an external DTD subset, you should
    give the standalone attribute of the XML
    declaration the value no
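
The combination of both subsets might look like the sketch below. The file person.dtd and its contents (shown in the first comment) are assumptions made for illustration, not the contents of 3-5.xml.

```xml
<?xml version="1.0" standalone="no"?>
<!-- Assumed content of the external subset, person.dtd:
       <!ELEMENT person (name, nickname?)>
       <!ELEMENT name (#PCDATA)>                          -->
<!DOCTYPE person SYSTEM "person.dtd" [
  <!-- Internal subset: must not redeclare person or name -->
  <!ELEMENT nickname (#PCDATA)>
]>
<person>
  <name>Grace Hopper</name>
  <nickname>Amazing Grace</nickname>
</person>
```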

7
Internal and External DTD
  • Internal DTD: put the DTD and the XML in one file
  • <!DOCTYPE RootElement [ <!ELEMENT author
    (#PCDATA)> .. ]>

8
Element Declaration
  • A DTD is a mechanism to describe every object
    (element, attribute, ...) that can appear in an
    XML document
  • Element Declaration
  • <!ELEMENT element-name content-specification>
  • The content specification specifies what children
    the element may or must have, and in what order
  • <!ELEMENT address-book (entry)>
  • Parentheses are used to group elements in a
    content specification
  • <!ELEMENT name (lname, (fname | title))>

9
Special Keywords in Content Model
  • Special Keywords in Content Model
  • #PCDATA: parsed character data (text)
  • <!ELEMENT phone_number (#PCDATA)>
  • CDATA sections can appear anywhere #PCDATA can
  • EMPTY: empty element
  • ANY: can contain any combination of character
    data and elements declared in the DTD (mixed
    content or child elements)
  • Mixed Content
  • Element contents that include #PCDATA
  • Element Content
  • Element contents that contain only elements

10
The Secret of +, *, ?
  • +, *, ?: Occurrence indicators
  • No occurrence indicator: appears once and only
    once
  • +: appears one or more times
  • *: appears zero or more times
  • ?: appears once or not at all
  • Example
  • <!ELEMENT entry (name,address*,tel*,fax*,email*)>
  • <!ELEMENT address (street,region?,postal-code,
    locality,country)>
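
To make the occurrence indicators concrete, here is a small sketch with an internal DTD subset and a document that conforms to it. The element names (library, book, title, author, price) are assumptions chosen for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE library [
  <!-- +  : one or more book children -->
  <!ELEMENT library (book+)>
  <!-- no indicator: exactly one title
       *  : zero or more authors
       ?  : an optional price -->
  <!ELEMENT book (title, author*, price?)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
]>
<library>
  <book>
    <title>XML Basics</title>
    <author>A. Writer</author>
    <author>B. Editor</author>
  </book>
  <book>
    <title>Untitled Draft</title>
    <price>12.50</price>
  </book>
</library>
```

Leaving out a title, or placing price before title, would make the document invalid against this DTD.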

11
The Secret of , and |
  • , and |: Connectors
  • ,: Elements must appear in the listed order
  • |: Only one of the elements must appear
  • Examples
  • <!ELEMENT name (#PCDATA | fname | lname)*>
  • <!ELEMENT entry (name,address*,tel*,fax*,email*)>
  • Mixed Content: components must always be separated
    by a |, and the model must repeat (*)
  • Invalid: <!ELEMENT name (#PCDATA, fname, lname)*>
  • Valid: <!ELEMENT name (#PCDATA | fname | lname)*>
  • #PCDATA must be the first child in the list
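
A short sketch of a mixed-content declaration with a conforming instance, reusing the name, fname, and lname elements from the slide. The sample text is invented.

```xml
<?xml version="1.0"?>
<!DOCTYPE name [
  <!-- Mixed content: #PCDATA comes first, the alternatives are
       joined by |, and the whole group repeats with * -->
  <!ELEMENT name (#PCDATA | fname | lname)*>
  <!ELEMENT fname (#PCDATA)>
  <!ELEMENT lname (#PCDATA)>
]>
<name>Dr. <fname>John</fname> <lname>Doe</lname>, Jr.</name>
```

Because the group repeats, text and child elements may appear in any order and any number of times; a mixed-content model cannot enforce that fname precedes lname.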

12
More Examples
  • <!ELEMENT cover (title, (author | subtitle))>
  • <!ELEMENT circle (center, (radius | diameter))>
  • <!ELEMENT name (last_name | (first_name,
    ((middle_name, last_name) | (last_name?))))>
  • <!ELEMENT paragraph (#PCDATA | name | profession |
    footnote | emphasize | date)*>
  • <!ELEMENT image EMPTY>
  • <image source="bus.jpg" width="152" />

13
Attribute Declaration
  • Attribute Declaration
  • <!ATTLIST element-name attribute-name
    attribute-type default-value>
  • <!ATTLIST tel preferred (true | false) "false">
  • <!ATTLIST email href CDATA #REQUIRED
    preferred (true | false) "false">
  • Can appear anywhere in the DTD
  • Best to list attributes immediately after the
    element declaration
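
The sketch below places each attribute list right after its element declaration and shows a matching instance. The content models for entry, tel, and email are assumptions; only the two ATTLIST declarations come from the slide.

```xml
<?xml version="1.0"?>
<!DOCTYPE entry [
  <!ELEMENT entry (tel*, email*)>
  <!ELEMENT tel (#PCDATA)>
  <!ATTLIST tel preferred (true | false) "false">
  <!ELEMENT email EMPTY>
  <!ATTLIST email href      CDATA          #REQUIRED
                  preferred (true | false) "false">
]>
<entry>
  <tel preferred="true">555-0100</tel>
  <tel>555-0199</tel>   <!-- preferred defaults to "false" -->
  <email href="mailto:jdoe@example.com"/>
</entry>
```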

14
Attribute Type
  • CDATA: String
  • ID: Identifier unique in the document
  • IDREF: Value of an ID
  • IDREFS: List of IDREF values separated by spaces
  • ENTITY: Name of an unparsed external entity
  • ENTITIES: List of ENTITY names
  • NMTOKEN: Word without spaces
  • NMTOKENS: List of NMTOKEN values
  • Enumerated-type list: Closed list of NMTOKEN
    values separated by |
  • NOTATION: Name of a notation declared in the DTD

15
Attribute Types (Cont.)
  • CDATA
  • Any text string acceptable in a well-formed XML
    attribute value
  • The most general attribute type
  • Can be used for prices, URIs, email addresses,
    citations
  • NMTOKEN (name token)
  • Consists of the same characters as an XML name,
    but any of the allowed characters can be the
    first character of a name token
  • 12, .cshrc
  • NMTOKENS
  • One or more XML name tokens separated by
    whitespace
  • <performances dates="08-21-2001 08-23-2001
    08-27-2001"> Kat and the Kings </performances>

16
Attribute Types (Cont.)
  • Enumeration
  • A list of all possible values for the attribute,
    separated by |
  • Each possible value must be an XML name token
  • <!ATTLIST date year (2000 | 2001 | 2002 | 2003)
    #REQUIRED>
  • ID: assigns unique identifiers to elements
  • An XML name that is unique within the XML
    document (no other ID type attribute in the
    document can have the same value)
  • Each element can have no more than one ID type
    attribute
  • <!ATTLIST employee ssn ID #REQUIRED>
  • <employee ssn="_078-05_123" />

17
Attribute Types (Cont.)
  • IDREF
  • Refers to the ID type attribute of some element in
    the document
  • Used to establish relationships between elements
    when simple containment won't suffice
  • 3-6.xml
  • The DTD cannot constrain the person attribute of
    the team_member element to match only employee
    IDs, or constrain the project_id attribute of the
    assignment element to match only project IDs
  • IDREFS
  • Contains a whitespace-separated list of XML names,
    each of which must be the ID of an element in the
    document
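
The following sketch shows ID and IDREF working together. It is not the contents of 3-6.xml; the company root element, the content models, and the sample values are assumptions built around the element and attribute names mentioned above.

```xml
<?xml version="1.0"?>
<!DOCTYPE company [
  <!ELEMENT company (employee+, project+)>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee ssn ID #REQUIRED>
  <!ELEMENT project (team_member+)>
  <!ATTLIST project project_id ID #REQUIRED>
  <!ELEMENT team_member EMPTY>
  <!ATTLIST team_member person IDREF #REQUIRED>
]>
<company>
  <employee ssn="_078-05_123">A. Worker</employee>
  <employee ssn="_078-05_124">B. Worker</employee>
  <project project_id="p1">
    <team_member person="_078-05_123"/>
    <!-- Also valid, although "p1" is a project, not an employee:
         IDREF only requires that the value match some ID. -->
    <team_member person="p1"/>
  </project>
</company>
```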

18
Attribute Types (Cont.)
  • ENTITY
  • Contains the name of an unparsed entity declared
    elsewhere in the DTD
  • <!ATTLIST move source ENTITY #REQUIRED>
  • <move source="X-Men-trailer" />
  • ENTITIES
  • Contains the names of one or more unparsed
    entities declared elsewhere in the DTD, separated
    by whitespace
  • <!ATTLIST slide_show slides ENTITIES #REQUIRED>
  • <slide_show slides="slide1 slide2 slide3 slide4" />
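
For an ENTITIES attribute to be valid, each name it lists has to be declared as an unparsed entity. A minimal sketch, assuming hypothetical image files slide1.jpg and slide2.jpg and a jpeg notation:

```xml
<?xml version="1.0"?>
<!DOCTYPE slide_show [
  <!NOTATION jpeg SYSTEM "image/jpeg">
  <!-- Unparsed entities: NDATA names the notation of the external file -->
  <!ENTITY slide1 SYSTEM "slide1.jpg" NDATA jpeg>
  <!ENTITY slide2 SYSTEM "slide2.jpg" NDATA jpeg>
  <!ELEMENT slide_show EMPTY>
  <!ATTLIST slide_show slides ENTITIES #REQUIRED>
]>
<slide_show slides="slide1 slide2"/>
```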

19
Attribute Types (Cont.)
  • NOTATION
  • Contains the name of a notation declared in the
    document's DTD
  • It can be used to associate types with
    particular elements, as well as to limit the
    types associated with the element
  • <!NOTATION gif SYSTEM "image/gif">
  • <!NOTATION tiff SYSTEM "image/tiff">
  • <!NOTATION jpeg SYSTEM "image/jpeg">
  • <!NOTATION png SYSTEM "image/png">
  • <!ATTLIST image type NOTATION (gif | tiff | jpeg |
    png) #REQUIRED>
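
A reduced sketch of a NOTATION attribute in use. The #PCDATA content model and the placeholder text are assumptions for this example: XML 1.0 does not allow a NOTATION attribute on an element declared EMPTY, so the element here is given text content.

```xml
<?xml version="1.0"?>
<!DOCTYPE image [
  <!NOTATION gif SYSTEM "image/gif">
  <!NOTATION png SYSTEM "image/png">
  <!ELEMENT image (#PCDATA)>
  <!-- The value of type must be one of the declared notations listed here -->
  <!ATTLIST image type NOTATION (gif | png) #REQUIRED>
]>
<image type="png">(encoded image data would go here)</image>
```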

20
Default Value
  • #REQUIRED
  • The attribute is required. Each instance of the
    element must provide a value for the attribute
  • No default value is provided
  • <!ATTLIST person name CDATA #REQUIRED>
  • #IMPLIED
  • The attribute is optional. Each instance of the
    element may or may not provide a value for the
    attribute
  • No default value is provided
  • <!ATTLIST person born CDATA #IMPLIED>

21
Default Value (Cont.)
  • #FIXED (followed by a value)
  • The attribute value is constant and immutable.
    The attribute has the specified value regardless
    of whether the attribute is explicitly noted on
    an individual instance of the element
  • If it is included, it must have the specified
    value
  • <!ATTLIST biography xmlns:xlink CDATA #FIXED
    "http://www.w3.org/1999/xlink">
  • Literal Value
  • The attribute will take this value if no value is
    given in the document
  • <!ATTLIST web_page protocol NMTOKEN "http">
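
The four kinds of default can be seen side by side in the sketch below. Only the protocol attribute comes from the slide; url, title, and version are hypothetical attributes added for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE web_page [
  <!ELEMENT web_page (#PCDATA)>
  <!ATTLIST web_page
      url      CDATA   #REQUIRED
      title    CDATA   #IMPLIED
      version  CDATA   #FIXED "1.0"
      protocol NMTOKEN "http">
]>
<!-- url must be given; title is omitted (allowed); version is
     fixed at "1.0"; protocol defaults to "http" -->
<web_page url="www.example.com">Home</web_page>
```

Writing version="2.0" on the element would be a validity error, while protocol="ftp" would simply override the default.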

22
General Entity Declaration
  • Defines replacement text in the DTD for entity
    references used in the XML document
  • <!ENTITY entity_name "replacement_text">
  • <!ENTITY super "superman in the jungle">
  • &super; → superman in the jungle
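
A complete document using this entity might look like the following sketch. The story element and its text are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE story [
  <!ELEMENT story (#PCDATA)>
  <!ENTITY super "superman in the jungle">
]>
<!-- A parser replaces the reference below with the declared text -->
<story>Tonight's feature: &super;.</story>
```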

23
Limitations of DTD
  • Limitations of DTD
  • Content is limited to text
  • Difficult to express repetition constraints
  • DTD does not use XML syntax
  • A DTD does not say the following
  • What the root element of the document is
  • How many instances of each kind of element
    appear in the document
  • What the character data inside the elements looks
    like
  • The semantic meaning of an element
  • A DTD never says anything about the length,
    structure, meaning, allowed values, or other
    aspects of the text content of an element

24
Advanced DTD
  • General Entity Declaration
  • External Parsed General Entities
  • External Unparsed Entities and Notations
  • Parameter Entities
  • Conditional Inclusion

25
Validation
  • A validating parser compares a document to its
    DTD and lists any places where the document
    differs from the constraints specified in the DTD
  • The parser can decide what it wants to do about
    any violations
  • A validity error is not necessarily a fatal error
  • Validation is an optional step in processing XML
  • Everything not permitted in the DTD is forbidden
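
To illustrate the difference between well-formed and valid, here is a sketch of a document that a non-validating parser would accept but a validating parser should flag. The element names are assumptions.

```xml
<?xml version="1.0"?>
<!DOCTYPE person [
  <!ELEMENT person (name)>
  <!ELEMENT name (#PCDATA)>
]>
<!-- Well-formed but not valid: nickname is not declared, and the
     content model for person allows only a single name child -->
<person>
  <name>Ada Lovelace</name>
  <nickname>Countess</nickname>
</person>
```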

26
Validating a Document
  • Online validating parsers
  • http://www.stg.brown.edu/service/xmlvalid
  • http://www.cogsci.ed.ac.uk/%7Erichard/xml-check.html
  • Validating parser software
  • Most XML parser class libraries include a simple
    program to validate documents
  • http://www.cogsci.ed.ac.uk/%7Erichard/xml-check.html
  • http://www.topologi.com