Introduction to the course and XML

1
Introduction to the course and XML
  • Shiyong Lu

2
Course Info
  • CSC8710 Seminar on Database Management 
  • Time: 6:00-7:20 PM TuTh
  • Place: State Hall 213
  • Webpage: http://www.cs.wayne.edu/csc8710

3
What is this course about?
  • Reading XML papers
  • Doing XML projects
  • Conducting XML research (writing research papers
    on XML)

4
Reading XML papers
  • Each student will present one paper selected from
    the list of papers covering the following topics:
  • Using an RDBMS to store and query XML
  • Publishing relational data as XML
  • XML constraints
  • XML integration
  • Querying and searching XML data
  • XML algebra, XPath, type checking, etc.

5
Doing XML projects
  • As a warm-up for the final research project,
    two small programming projects will be given.
  • They are simple, but good practice in XML
    programming.
  • Detailed specifications will be given.

6
Conducting XML research
  • Each group selects one project in consultation
    with the instructor
  • Teamwork: each group has two students
  • Work closely with your partner and the instructor
  • To promote progress, a series of assignments will
    be given. They will not be graded, but will
    affect the instructor's impression of your
    progress.

7
Goal of the course
  • Have a broad knowledge of the XML literature
  • Improve presentation skills
  • Have deep knowledge of and experience with a
    specific research topic
  • Be ready to do XML research

8
Course Info
  • Prerequisites
  • CSC6710 and CSC7710, or permission of the
    instructor.
  • Instructor
  • Shiyong Lu (shiyong@cs.wayne.edu)
  • Office: 430 State Hall
  • Telephone: 577-1667
  • Office hours: Tu, Th 4:00-5:00 PM or by
    appointment.

9
Prerequisites
  • CSC6710 and CSC7710, or permission of the
    instructor.

10
Course load and grading
  • (0%) A series of assignments will be given,
    although they will not be graded.
  • (30%) Two programming projects (15 pts each)
  • (15%) Lecture presentation
  • (15%) Final project demonstration
  • (40%) Final professional publication-quality
    research paper.
  • Grades will be given on a group basis (groups of
    two students) except for the individual lecture
    presentation.

11
Late work penalty
  • You can have one late assignment submission, up to
    one week late, without any penalty. Please indicate
    on the cover page of your submission when you use
    your late excuse. If the late excuse is not used, a
    penalty of 2% per day will be assessed for up to one
    week. No credit will be given for work handed in
    more than one week after the due date. The late
    excuse cannot be used for the final project.

12
Academic Honesty
  • Copying an assignment from another student in
    this class or obtaining a solution from some
    other source will lead to an automatic failure
    for this course and to disciplinary action.
    Allowing another student to copy one's work will
    be treated as an act of academic dishonesty,
    leading to the same penalty as copying (a
    failure).

13
What is XML?
  • Example:
    <?xml version="1.0"?>
    <email>
      <from>smith@cs.wayne.edu</from>
      <to>shiyong@cs.wayne.edu</to>
      <subject>What is XML</subject>
      <body>
        Can you tell me what XML is all about?
      </body>
    </email>

14
What is XML (cont)
  • XML: eXtensible Markup Language
  • No fixed collection of markup tags (a meta
    language)
  • Semi-structured, self-describing
  • Separates syntax from semantics
  • The standard for representing and exchanging
    information on the WWW

15
HTML example
  • <h1>Rhubarb Cobbler</h1>
    <h2>Maggie.Herrick@bbs.mhv.net</h2>
    <h3>Wed, 14 Jun 95</h3>
    Rhubarb Cobbler made with bananas as the
    main sweetener. It was delicious. Basically it was
    <table>
    <tr><td> 2 1/2 cups <td> diced rhubarb
    <tr><td> 2 tablespoons <td> sugar
    <tr><td> 2 <td> fairly ripe bananas
    <tr><td> 1/4 teaspoon <td> cinnamon
    <tr><td> dash of <td> nutmeg
    </table>
    Combine all and use as cobbler, pie, or crisp.
    Related recipes: <a href="GardenQuiche">Garden
    Quiche</a>

16
A corresponding XML doc
  • <recipe id="117" category="dessert">
      <title>Rhubarb Cobbler</title>
      <author><email>Maggie.Herrick@bbs.mhv.net</email></author>
      <date>Wed, 14 Jun 95</date>
      <description>
        Rhubarb Cobbler made with bananas as the main
        sweetener. It was delicious.
      </description>
      <ingredients>
        <item><amount>2 1/2 cups</amount><type>diced rhubarb</type></item>
        <item><amount>2 tablespoons</amount><type>sugar</type></item>
        <item><amount>2</amount><type>fairly ripe bananas</type></item>
        <item><amount>1/4 teaspoon</amount><type>cinnamon</type></item>
        <item><amount>dash of</amount><type>nutmeg</type></item>
      </ingredients>
      <preparation>
        Combine all and use as cobbler, pie, or crisp.
      </preparation>
      <related url="GardenQuiche">Garden Quiche</related>
    </recipe>

17
HTML vs XML
  • the markup tags are chosen purely for logical
    structure; this is just one choice of markup
    detail level
  • we need to define which XML documents we regard
    as "recipe collections" (XML Schema)
  • we need a stylesheet to define browser
    presentation semantics (XSL)
  • we need to express queries in a general way
    (XQuery)
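
To illustrate why logical markup matters, here is a minimal sketch in Python that pulls the ingredient list out of an abbreviated copy of the recipe document. It uses the standard library's xml.etree.ElementTree; the names RECIPE_XML and list_ingredients are made up for this example and are not part of the course material.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the recipe document shown on the earlier slide.
RECIPE_XML = """
<recipe id="117" category="dessert">
  <title>Rhubarb Cobbler</title>
  <ingredients>
    <item><amount>2 1/2 cups</amount><type>diced rhubarb</type></item>
    <item><amount>2 tablespoons</amount><type>sugar</type></item>
    <item><amount>dash of</amount><type>nutmeg</type></item>
  </ingredients>
</recipe>
"""


def list_ingredients(xml_text):
    """Return (amount, type) pairs for every <item> element in the recipe."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("amount"), item.findtext("type"))
            for item in root.iter("item")]


ingredients = list_ingredients(RECIPE_XML)
for amount, kind in ingredients:
    print(f"{amount}: {kind}")
```

The same question asked of the HTML version would require guessing which table cells hold amounts and which hold ingredient names.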

18
A conceptual view of XML
  • Character data (XML content)
  • XML elements
  • XML attributes

19
A concrete view of XML
  • Markup tags denote elements:
    ...<foo attr="val" ...>...</foo>...
  • <foo attr="val" ...> is an element start tag
    with name foo
  • attr="val" is an attribute with name attr and
    value val; values are enclosed by ' or "
  • the text between the tags is the contents of the
    element
  • </foo> is a matching element end tag
  • There is a short-hand notation for empty
    elements: ...<foo attr="val".../>...

20
A concrete view of XML (cont)
  • An XML document must be well-formed
  • start and end tags must match
  • element tags must be properly nested
  • some more subtle syntactical requirements
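
As a rough illustration of the well-formedness rules, the sketch below feeds a few small documents to Python's standard XML parser and records whether each one parses. The helper name is_well_formed and the sample strings are invented for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "matched tags": "<a><b>text</b></a>",
    "mismatched tags": "<a><b>text</c></a>",
    "improper nesting": "<a><b>text</a></b>",
    "missing end tag": "<a><b>text</b>",
}

results = {label: is_well_formed(doc) for label, doc in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```

Only the first sample parses; the other three each break one of the rules listed above.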

21
A concrete view of XML (cont)
  • XML is case sensitive!
  • Special characters can be escaped using Unicode
    character references
  • &#60; and &lt; both yield <
  • &#38; and &amp; both yield &
  • <!-- comment -->
  • <!DOCTYPE ...> document type declaration
    (described later...)
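
A small sketch of the escaping rules, assuming Python's standard xml.etree.ElementTree parser: numeric character references and the predefined entities decode to the same characters. The element name expr is arbitrary.

```python
import xml.etree.ElementTree as ET

# The same two characters written as numeric references and as named entities.
doc = "<expr>a &#60; b &lt; c &#38; d &amp; e</expr>"

text = ET.fromstring(doc).text
print(text)  # the parser hands back the decoded characters
```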

22
XML application examples
  • XHTML, W3C's XMLization of HTML 4.0.
  • CML, Chemical Markup Language
  • WML, Wireless Markup Language for WAP services
  • ThML, Theological Markup Language
  • Many more

23
Why XML
  • It is hot (both in industry and academia)!
  • Syntax alone is not enough; we need tools and
    languages to process XML
  • For database people: how to manage XML data
    (storage, update, query, exchange,
    transformation, etc.)

24
XML techniques
  • common extensions to the core XML specification:
    a namespace mechanism, document inclusion, etc.
  • schemas: grammars to define classes of documents
  • linking between documents: a generalization of
    HTML anchors and links
  • addressing parts of read-only documents: flexible
    and robust pointers into documents
  • transformation: conversion from one document class
    to another
  • querying: extraction of information, generalizing
    relational databases

25
XML namespaces
  • <widget type="gadget">
      <head size="medium"/>
      <big><subwidget ref="gizmo"/></big>
      <info>
        <head>
          <title>Description of gadget</title>
        </head>
        <body>
          <h1>Gadget</h1>
          A gadget contains a big gizmo
        </body>
      </info>
    </widget>
  • Problem: the meaning of head and big depends on
    the context!

26
XML namespaces (cont)
  • Simple solution: qualify names with URIs
    (Uniform Resource Identifiers):
    <{http://www.w3.org/TR/xhtml1}head>
      \_________________________/ \__/
            qualifying URI        local name
  • Do not be confused by the use of URIs for
    namespaces:
  • they are not supposed to point to anything
  • it is simply the cheapest way of getting unique
    names
  • we rely on existing organizations that control
    domain names

27
XML namespaces (cont)
  • <... xmlns:foo="http://www.w3.org/TR/xhtml1">
      ...
      <foo:head>...</foo:head>
      ...
    </...>

28
XML namespaces (cont)
  • xmlns:prefix="URI" declares a namespace with a
    prefix and a URI
  • the scope of a declaration is lexical: it covers
    the element containing the declaration and all its
    descendants, and can be overridden by a nested
    declaration
  • both element and attribute names can be qualified
    with namespaces
  • the name of the prefix is irrelevant;
    applications should use only the URI

29
XML namespaces (cont)
  • <widget xmlns="http://www.widget.org"
            xmlns:xhtml="http://www.w3.org/TR/xhtml1"
            type="gadget">
      <head size="medium"/>
      <big><subwidget ref="gizmo"/></big>
      <info>
        <xhtml:head>
          <xhtml:title>Description of gadget</xhtml:title>
        </xhtml:head>
        <xhtml:body>
          <xhtml:h1>Gadget</xhtml:h1>
          A gadget contains a big gizmo
        </xhtml:body>
      </info>
    </widget>
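
The following Python sketch shows how a namespace-aware parser sees an abbreviated version of the widget document. It assumes the standard xml.etree.ElementTree module, which reports every element name as {URI}local-name. The constants WIDGET_XML and RENAMED, and the prefixes w, x and h, are chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Abbreviated version of the namespaced widget document.
WIDGET_XML = """
<widget xmlns="http://www.widget.org"
        xmlns:xhtml="http://www.w3.org/TR/xhtml1"
        type="gadget">
  <head size="medium"/>
  <info>
    <xhtml:head>
      <xhtml:title>Description of gadget</xhtml:title>
    </xhtml:head>
  </info>
</widget>
"""

root = ET.fromstring(WIDGET_XML)

# The two "head" elements now have different qualified names.
head_tags = [el.tag for el in root.iter() if el.tag.endswith("}head")]
print(head_tags)

# Queries bind their own prefixes to the URIs; the document's prefixes are ignored.
ns = {"w": "http://www.widget.org", "x": "http://www.w3.org/TR/xhtml1"}
title = root.find("w:info/x:head/x:title", ns).text

# Renaming the prefix in the document does not change the parsed names.
RENAMED = WIDGET_XML.replace("xhtml:", "h:").replace("xmlns:xhtml", "xmlns:h")
same_names = ([el.tag for el in ET.fromstring(RENAMED).iter()]
              == [el.tag for el in root.iter()])
print(title, same_names)
```

The last check reflects the point that applications should rely on the URI, never on the prefix.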

30
XML schemas
  • A schema is a definition of the syntax of an
    XML-based language (i.e. a class of XML
    documents).
  • A schema language is a formal language for
    expressing schemas. (DTD, XML Schema)

31
DTD: Document Type Definition
  • <!DOCTYPE root-element [ doctype-declaration... ]>
  • <!ELEMENT element-name content-model>, with
    content-model operators | (choice), , (sequence),
    * (zero or more), + (one or more), ? (optional)
  • <!ATTLIST element-name attr-name attr-type
    attr-default ...>

32
Element Type Declaration
  • elementdecl ::= '<!ELEMENT' Name
    contentspec '>'
  • contentspec ::= 'EMPTY' | 'ANY' | Mixed
    | Children
  • No element type may be declared more than once

33
Element Type Declaration Example
  • <!ELEMENT br EMPTY>
  • <!ELEMENT p (#PCDATA|emph)* >
  • <!ELEMENT %name.para; %content.para; >
  • <!ELEMENT container ANY>

34
Empty Elements
  • EmptyElemTag ::= '<' Name (Attribute)* '/>'
  • Example:
  • <IMG align="left"
    src="http://www.w3.org/Icons/WWW/w3c_home" />
  • <br/>
  • Question: is <a></a> an empty element?

35
DTD (cont)
  • <!ATTLIST element-name attr-name attr-type
    attr-default ...> declares which attributes are
    allowed or required in which elements
  • attribute types:
  • CDATA: any value is allowed (the default)
  • (value|...): enumeration of allowed values
  • ID, IDREF, IDREFS: ID attribute values must be
    unique (contain "element identity"), IDREF
    attribute values must match some ID (reference to
    an element)
  • ENTITY, ENTITIES, NMTOKEN, NMTOKENS, NOTATION:
    just forget these... (consider them deprecated)
  • attribute defaults:
  • #REQUIRED: the attribute must be explicitly
    provided
  • #IMPLIED: the attribute is optional, no default
    provided
  • "value": if not explicitly provided, this value is
    inserted by default
  • #FIXED "value": as above, but only this value is
    allowed
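
The ID and IDREF rules can be sketched as a small check in Python. Python's standard library does not validate against DTDs, so this example does the check by hand and simply assumes that an attribute named id was declared as ID and an attribute named ref was declared as IDREF; both names, the helper check_ids, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_ids(root, id_attr="id", idref_attr="ref"):
    """Return (duplicate ID values, IDREF values that match no ID)."""
    seen, duplicates = set(), []
    for el in root.iter():
        value = el.get(id_attr)
        if value is not None:
            if value in seen:
                duplicates.append(value)
            seen.add(value)
    # References may point forwards, so resolve them after collecting all IDs.
    dangling = [el.get(idref_attr) for el in root.iter()
                if el.get(idref_attr) is not None
                and el.get(idref_attr) not in seen]
    return duplicates, dangling


doc = ET.fromstring(
    '<doc><term id="t1"/><term id="t2"/><term id="t1"/>'
    '<use ref="t2"/><use ref="t9"/></doc>'
)
duplicates, dangling = check_ids(doc)
print("duplicate IDs:", duplicates)
print("dangling IDREFs:", dangling)
```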

36
Attribute-list declaration example
  • <!ATTLIST termdef
              id      ID      #REQUIRED
              name    CDATA   #IMPLIED>
  • <!ATTLIST list
              type    (bullets|ordered|glossary)  "ordered">
  • <!ATTLIST form
              method  CDATA   #FIXED "POST">
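
One way to read these three declarations is as rules that turn the attributes written in a document into the attributes an application sees. The sketch below encodes them in a hand-written Python table (ATTLISTS, a hypothetical in-memory form, not something a real parser exposes) and applies the #REQUIRED, default, and #FIXED rules. It does not reject undeclared attributes.

```python
# element -> attribute -> (type, default kind, default value)
# An enumerated type is written as a tuple of allowed values.
ATTLISTS = {
    "termdef": {"id": ("ID", "#REQUIRED", None),
                "name": ("CDATA", "#IMPLIED", None)},
    "list": {"type": (("bullets", "ordered", "glossary"), "default", "ordered")},
    "form": {"method": ("CDATA", "#FIXED", "POST")},
}


def effective_attributes(element, attrs):
    """Apply the ATTLIST rules for element to the attributes given in attrs."""
    result = dict(attrs)
    for name, (attr_type, kind, value) in ATTLISTS[element].items():
        if kind == "#REQUIRED" and name not in result:
            raise ValueError(f"{element}: missing required attribute {name!r}")
        if kind in ("default", "#FIXED"):
            result.setdefault(name, value)  # insert the declared default
        if kind == "#FIXED" and result[name] != value:
            raise ValueError(f"{element}: {name!r} must be {value!r}")
        if isinstance(attr_type, tuple) and name in result \
                and result[name] not in attr_type:
            raise ValueError(f"{element}: {name!r} must be one of {attr_type}")
    return result


print(effective_attributes("list", {}))
print(effective_attributes("form", {}))
print(effective_attributes("termdef", {"id": "t1"}))
```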

37
A DTD example
  • <!ELEMENT collection (description,recipe*)>
  • <!ELEMENT description ANY>
  • <!ELEMENT recipe (title,ingredient*,preparation,
    comment?,nutrition)>
  • <!ELEMENT title (#PCDATA)>
  • <!ELEMENT ingredient (ingredient*,preparation)?>
  • <!ATTLIST ingredient name CDATA #REQUIRED
                         amount CDATA #IMPLIED
                         unit CDATA #IMPLIED>
  • <!ELEMENT preparation (step*)>
  • <!ELEMENT step (#PCDATA)>
  • <!ELEMENT comment (#PCDATA)>
  • <!ELEMENT nutrition EMPTY>
  • <!ATTLIST nutrition protein CDATA #REQUIRED
                        carbohydrates CDATA #REQUIRED
                        fat CDATA #REQUIRED
                        calories CDATA #REQUIRED
                        alcohol CDATA #IMPLIED>
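
A content model is essentially a regular expression over child element names. The sketch below makes that concrete for the recipe element, assuming its content model reads (title,ingredient*,preparation,comment?,nutrition) with the repetition operators in place. It translates the model into a Python regular expression and matches it against a list of child names. The helper names are invented, and the translation covers only element names with , | * + ? and parentheses, not #PCDATA, EMPTY, or ANY.

```python
import re

RECIPE_MODEL = "(title,ingredient*,preparation,comment?,nutrition)"


def content_model_to_regex(model):
    """Translate a DTD element-content model into a compiled regex."""
    model = model.replace(" ", "")
    # Each element name becomes a group matching that name plus a separator.
    pattern = re.sub(r"[A-Za-z_][\w.-]*",
                     lambda m: f"(?:{re.escape(m.group())} )", model)
    # In a DTD ',' means sequence, which is plain concatenation in a regex;
    # '|', '*', '+', '?' and parentheses already mean the same thing.
    return re.compile(pattern.replace(",", ""))


def children_match(model, child_names):
    """Check whether a sequence of child element names satisfies the model."""
    sequence = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(sequence) is not None


ok = children_match(RECIPE_MODEL,
                    ["title", "ingredient", "ingredient", "preparation", "nutrition"])
missing = children_match(RECIPE_MODEL, ["title", "nutrition"])
print(ok, missing)
```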

38
XML Schema language requirements
  • more expressive than XML DTDs
  • expressed in XML
  • self-describing
  • simple enough to be implemented with modest
    design and runtime resources
  • Structures and data types

39
XML doc and schema examples
  • <card xmlns="http://businesscard.org">
      <name>John Doe</name>
      <title>CEO, Widget Inc.</title>
      <email>john.doe@widget.com</email>
      <phone>(202) 456-1414</phone>
      <logo url="widget.gif"/>
    </card>

40
XML doc and schema examples (business_card.xsd,
cont)
  • <schema xmlns="http://www.w3.org/2001/XMLSchema"
            xmlns:b="http://businesscard.org"
            targetNamespace="http://businesscard.org">
      <element name="card" type="b:card_type"/>
      <element name="name" type="string"/>
      <element name="title" type="string"/>
      <element name="email" type="string"/>
      <element name="phone" type="string"/>
      <element name="logo" type="b:logo_type"/>
      <complexType name="card_type">
        <sequence>
          <element ref="b:name"/>
          <element ref="b:title"/>
          <element ref="b:email"/>
          <element ref="b:phone" minOccurs="0"/>
          <element ref="b:logo" minOccurs="0"/>
        </sequence>
      </complexType>
      <complexType name="logo_type">
        <attribute name="url" type="anyURI"/>
      </complexType>
    </schema>
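
Because a schema is itself an XML document, it can be read with an ordinary XML parser. The sketch below uses Python's standard xml.etree.ElementTree (which does not validate against XML Schema) to list the global element declarations and then hand-checks a card against the sequence in card_type. The function matches_sequence is a deliberately simplified, hypothetical stand-in for a real validator: it assumes every particle occurs at most once and that the b prefix maps to http://businesscard.org.

```python
import xml.etree.ElementTree as ET

XSD_NS = "{http://www.w3.org/2001/XMLSchema}"
CARD_NS = "{http://businesscard.org}"

SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:b="http://businesscard.org"
        targetNamespace="http://businesscard.org">
  <element name="card" type="b:card_type"/>
  <element name="name" type="string"/>
  <element name="title" type="string"/>
  <element name="email" type="string"/>
  <element name="phone" type="string"/>
  <element name="logo" type="b:logo_type"/>
  <complexType name="card_type">
    <sequence>
      <element ref="b:name"/>
      <element ref="b:title"/>
      <element ref="b:email"/>
      <element ref="b:phone" minOccurs="0"/>
      <element ref="b:logo" minOccurs="0"/>
    </sequence>
  </complexType>
  <complexType name="logo_type">
    <attribute name="url" type="anyURI"/>
  </complexType>
</schema>"""

CARD = """<card xmlns="http://businesscard.org">
  <name>John Doe</name>
  <title>CEO, Widget Inc.</title>
  <email>john.doe@widget.com</email>
  <logo url="widget.gif"/>
</card>"""

schema = ET.fromstring(SCHEMA)

# Global element declarations: element name -> type name.
global_elements = {el.get("name"): el.get("type")
                   for el in schema.findall(XSD_NS + "element")}

# Element references inside card_type, with their minimum occurrence counts.
card_type = schema.find(XSD_NS + "complexType[@name='card_type']")
particles = [(el.get("ref"), int(el.get("minOccurs", "1")))
             for el in card_type.iter(XSD_NS + "element")]


def matches_sequence(card, particles):
    """Check the children of card against (ref, minOccurs) pairs, in order."""
    names = [child.tag for child in card]
    pos = 0
    for ref, min_occurs in particles:
        expected = CARD_NS + ref.split(":")[1]  # assumes prefix b = CARD_NS
        if pos < len(names) and names[pos] == expected:
            pos += 1
        elif min_occurs > 0:
            return False
    return pos == len(names)


card_ok = matches_sequence(ET.fromstring(CARD), particles)
print(global_elements)
print(particles, card_ok)
```

The sample card omits the optional phone element and still matches, since that reference has minOccurs="0".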

41
Overview of XML Schema
  • a (global) element declaration associates an
    element name with a type
  • a complex type definition defines requirements
    for attributes, sub-elements, and character data
    in elements of that type
  • attribute declarations describe which attributes
    may or must appear
  • element references describe which sub-elements
    may or must appear, how many, and in which
    order
  • a simple type definition defines a set of strings
    to be used as attribute values or character data