Introduction to XML: www.w3schools.com - PowerPoint PPT Presentation

Introduction to XML: www.w3schools.com Yong Choi CSU Bakersfield What is XML? XML stands for EXtensible Markup Language XML is a markup language much like HTML XML ... – PowerPoint PPT presentation

Number of Views:300
Avg rating:3.0/5.0
Slides: 37
Provided by: Yon74
Learn more at: https://www.csub.edu
Transcript and Presenter's Notes

1
Introduction to XML: www.w3schools.com
  • Yong Choi
  • CSU
  • Bakersfield

2
What is XML?
  • XML stands for EXtensible Markup Language
  • XML is a markup language much like HTML
  • XML was designed to describe data
  • XML is intended as a common tool for data
    manipulation and data transmission.
  • XML tags are not predefined in XML. You must
    define your own tags
  • XML uses a Document Type Definition (DTD) or an
    XML Schema to describe the data
  • XML with a DTD or XML Schema is designed to be
    self-descriptive

3
What is DTD?
  • It defines the document structure with a list of
    legal elements.
  • A DTD can be declared inline in your XML
    document, or as an external reference.
  • Why use a DTD?
  • With a DTD, each of your XML files can carry a
    description of its own format.
  • With a DTD, independent groups of people can
    agree to use a common DTD for interchanging data.
  • Your application can use a standard DTD to verify
    that the data you receive from the outside world
    is valid.
  • You can also use a DTD to verify your own data.

4
The main difference between XML and HTML
  • XML is not a replacement for HTML.
  • XML and HTML were designed with different goals
  • XML was designed to describe data and to focus on
    what data is.
  • HTML was designed to display data and to focus on
    how data looks.
  • HTML is about displaying information, while XML
    is about describing information.

5
Example: A note to Tove from Jani
  • <note>
  •   <to>Tove</to>
  •   <from>Jani</from>
  •   <heading>Reminder</heading>
  •   <body>Don't forget me this weekend!</body>
  • </note>
  • Just pure information wrapped in XML tags
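
As a minimal sketch of how a program might read the note above, the following uses Python's standard xml.etree.ElementTree module. The variable names are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The note from the slide, held in a Python string for illustration.
NOTE_XML = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

note = ET.fromstring(NOTE_XML)

# Each child element is looked up by the tag name the author invented.
recipient = note.findtext("to")
sender = note.findtext("from")
message = note.findtext("body")

print(f"{sender} -> {recipient}: {message}")
```

The XML itself only carries the information; what to do with it is left to the program that reads it.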

6
XML is free and extensible
  • XML tags are not predefined. You must "invent"
    your own tags.
  • The tags used to mark up HTML documents and the
    structure of HTML documents are predefined. The
    author of HTML documents can only use tags that
    are defined in the HTML standard (like <p>, <h1>,
    etc.).
  • XML allows the author to define his own tags and
    his own document structure.
  • The tags in the example above (like <to> and
    <from>) are not defined in any XML standard.
    These tags are "invented" by the author of the
    XML document.

7
XML is a complement to HTML
  • XML is not a replacement for HTML.
  • It is important to understand that XML is not a
    replacement for HTML. In future Web development
    it is most likely that XML will be used to
    describe the data, while HTML will be used to
    format and display the same data.
  • Best description of XML
  • XML is a cross-platform, software and hardware
    independent tool for transmitting information.

8
XML can Separate Data from HTML
  • With XML, your data is stored outside your HTML.
  • When HTML is used to display data, the data is
    stored inside your HTML. With XML, data can be
    stored in separate XML files. This way you can
    concentrate on using HTML for data layout and
    display, and be sure that changes in the
    underlying data will not require any changes to
    your HTML.
  • XML data can also be stored inside HTML pages as
    "Data Islands". You can still concentrate on
    using HTML only for formatting and displaying the
    data.
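
One possible way to picture this separation is sketched below: the data lives in an XML string (standing in for a separate XML file) and a small function produces the HTML. The function name render_note and the chosen HTML layout are assumptions made for illustration.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical data source; in practice this could be a separate .xml file.
NOTE_XML = (
    "<note><to>Tove</to><from>Jani</from>"
    "<heading>Reminder</heading>"
    "<body>Don't forget me this weekend!</body></note>"
)

def render_note(xml_text: str) -> str:
    """Build an HTML fragment from note data; the layout lives only here."""
    note = ET.fromstring(xml_text)
    heading = html.escape(note.findtext("heading", default=""), quote=False)
    body = html.escape(note.findtext("body", default=""), quote=False)
    return f"<h1>{heading}</h1>\n<p>{body}</p>"

page = render_note(NOTE_XML)
print(page)
```

Changing the text in the XML changes the rendered page without touching render_note, which is the point the slide makes.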

9
XML is used to Exchange Data
  • With XML, data can be exchanged between
    incompatible systems.
  • In the real world, computer systems and databases
    contain data in incompatible formats. One of the
    most time-consuming challenges for developers has
    been to exchange data between such systems over
    the Internet.
  • Converting the data to XML can greatly reduce
    this complexity and create data that can be read
    by many different types of applications.
  • This is especially useful in B2B exchanges (for
    example, between financial institutions)
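
A rough sketch of such an exchange: one side serializes a record to XML, the other side reads it back. The payment element and its field names are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def to_xml(record: dict) -> str:
    """Serialize a flat record as <payment> with one child element per field."""
    root = ET.Element("payment")  # hypothetical element name
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

def from_xml(xml_text: str) -> dict:
    """Read the record back; every value arrives as text."""
    return {child.tag: child.text for child in ET.fromstring(xml_text)}

sent = {"payer": "Jani", "payee": "Tove", "amount": "125.00"}
wire = to_xml(sent)        # the text that would travel between systems
received = from_xml(wire)

print(wire)
```

Neither side needs to know how the other stores its data internally; both only need to agree on the element names.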

10
An example XML document
  • <?xml version="1.0" encoding="ISO-8859-1"?>
  • <note>
  •   <to>Tove</to>
  •   <from>Jani</from>
  •   <heading>Reminder</heading>
  •   <body>Don't forget me this weekend!</body>
  • </note>

11
First Line
  • The XML declaration
  • Defines the XML version and the character
    encoding used in the document.
  • Conforms to the 1.0 specification of XML
  • Uses the ISO-8859-1 (Latin-1/West European)
    character set.

12
Second and other lines
  • The second line describes the root element of the
    document, <note>
  • (as if it were saying "this document is a note")
  • Lines 3 to 6 describe the four child elements of
    the root:
  • to, from, heading, and body
  • The last line defines the end of the root element
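
The structure described above can be checked with a short sketch. It assumes the example document is available as bytes (as it would be when read from a file) and uses Python's standard ElementTree parser.

```python
import xml.etree.ElementTree as ET

# Bytes, as they would be read from a file; the declaration names the encoding.
DOC = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    b"<note>\n"
    b"  <to>Tove</to>\n"
    b"  <from>Jani</from>\n"
    b"  <heading>Reminder</heading>\n"
    b"  <body>Don't forget me this weekend!</body>\n"
    b"</note>"
)

root = ET.fromstring(DOC)
child_tags = [child.tag for child in root]

print(root.tag)     # the root element
print(child_tags)   # its child elements, in document order
```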

13
XML Editors
This figure shows available XML editors

14
XML Parsers
  • An XML processor (also called XML parser)
    evaluates the document to make sure it conforms
    to all XML specifications for structure and
    syntax.
  • XML parsers are strict and ensure XML code
    accepted by the parser will work the same
    everywhere.
  • Microsoft's parser is called MSXML and is built
    directly into Internet Explorer versions 5.0 and
    above.
  • Netscape's browser, version 6.0 and above, has a
    built-in XML parser that comes from the Mozilla
    project.

15
Well-Formed and Valid XML Documents
  • There are two categories of XML documents
  • Well-formed
  • Valid
  • An XML document is well-formed if it contains no
    syntax errors and fulfills all of the
    specifications for XML code as defined by the
    W3C.
  • An XML document is valid if it is well-formed and
    also satisfies the rules laid out in the DTD or
    schema attached to the document.
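
The two categories can be illustrated with a small sketch. Python's standard ElementTree parser checks well-formedness only and does not validate against a DTD, so the validity rule below (REQUIRED_CHILDREN) is a hand-written stand-in for a DTD, assumed purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule standing in for a DTD: the children a <note> must have, in order.
REQUIRED_CHILDREN = ["to", "from", "heading", "body"]

def classify(xml_text: str) -> str:
    """Return 'not well-formed', 'well-formed', or 'valid' for a note document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return "not well-formed"      # syntax error: the parser rejects it
    children = [child.tag for child in root]
    if root.tag == "note" and children == REQUIRED_CHILDREN:
        return "valid"                # well-formed and follows the stand-in rule
    return "well-formed"              # correct syntax, but not the agreed structure

print(classify("<note><to>Tove</to>"))
print(classify("<note><to>Tove</to></note>"))
print(classify("<note><to>T</to><from>J</from><heading>H</heading><body>B</body></note>"))
```

A validating parser would take the rule from the DTD or schema itself rather than from code.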

16
The Document Creation Process
This figure shows the document creation process

17
XML Applications
This figure shows some XML applications

18
XML syntax vs. HTML syntax
  • With XML, it is illegal to omit the closing tag.
  • In HTML, some elements, such as the paragraph tag
    <p>, may be left unclosed.
  • XML tags are case sensitive
  • Unlike HTML, XML tags are case sensitive
  • <Message>This is incorrect</message>
  • <message>This is correct</message>
  • All XML elements must be properly nested
  • Improper nesting of tags makes no sense to XML.
  • In HTML some elements can be improperly nested
  • In XML, tags must close in the reverse of the
    order in which they were opened:
  • <b><i>This text is bold and italic</i></b>
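
These rules can be tried out with a short sketch that feeds several fragments to Python's standard XML parser and records which ones it accepts. The helper name is_well_formed and the sample labels are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """True if the parser accepts the text, False if it reports a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

samples = {
    "missing closing tag": "<p>This is a paragraph",
    "mismatched case": "<Message>This is incorrect</message>",
    "matching case": "<message>This is correct</message>",
    "improper nesting": "<b><i>This text is bold and italic</b></i>",
    "proper nesting": "<b><i>This text is bold and italic</i></b>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```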

19
XML syntax vs. HTML syntax
  • All XML documents must have a root element
  • All XML documents must contain a single tag pair
    to define a root element.
  • All other elements must be within this root
    element.
  • All elements can have sub elements (child
    elements). Sub elements must be correctly nested
    within their parent element
  • <root>
  •   <child>
  •     <subchild>.....</subchild>
  •   </child>
  • </root>

20
XML syntax vs. HTML syntax
  • Attribute values must always be quoted
  • With XML, it is illegal to omit quotation marks
    around attribute values. 
  • <?xml version="1.0" encoding="ISO-8859-1"?>
  • <note date="12/11/2002">
  •   <to>Tove</to>
  •   <from>Jani</from>
  • </note>
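
A brief sketch of the quoting rule, using Python's standard parser: the quoted form parses and the attribute can be read back, while the unquoted form is rejected. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

QUOTED = '<note date="12/11/2002"><to>Tove</to><from>Jani</from></note>'
UNQUOTED = "<note date=12/11/2002><to>Tove</to><from>Jani</from></note>"

# Quoted attribute value: parses, and the value can be read with get().
note = ET.fromstring(QUOTED)
date = note.get("date")

# Unquoted attribute value: the parser reports a syntax error.
try:
    ET.fromstring(UNQUOTED)
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(date, unquoted_accepted)
```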

21
XML syntax vs. HTML syntax
  • With XML, white space is preserved
  • With XML, the white space in your document is not
    truncated.
  • Using XML
  • Hello              my name is Tove,
  • Using HTML
  • Hello my name is Tove,
  • HTML strips off the white space.
  • The syntax for writing comments in XML is similar
    to that of HTML.
  • <!-- This is a comment -->
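
The white-space behaviour can be observed in a short sketch. The greeting element name is a hypothetical wrapper, and collapsing the spaces with split/join is only an approximation of what an HTML renderer displays.

```python
import xml.etree.ElementTree as ET

# A run of spaces inside element content, with a comment ahead of the root element.
TEXT = "Hello" + " " * 14 + "my name is Tove,"
DOC = f"<!-- This is a comment -->\n<greeting>{TEXT}</greeting>"

greeting = ET.fromstring(DOC)

xml_text = greeting.text                      # spaces kept exactly as written
collapsed = " ".join(greeting.text.split())   # roughly what an HTML page would show

print(repr(xml_text))
print(repr(collapsed))
```

By default, this parser skips the comment, so it does not show up in the element's content.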

22
XML Elements
  • XML Elements are Extensible
  • XML documents can be extended to carry more
    information.
  • XML Elements have Relationships
  • Elements are related as parents and children.
  • Elements have Content
  • Elements can have different content types.

23
Extensible
  • <note>
  •   <to>Tove</to>
  •   <from>Jani</from>
  •   <heading>Reminder</heading>
  •   <body>Don't forget me this weekend!</body>
  • </note>
  • Extracted output by an application
  • MESSAGE
  • To Tove
  • From Jani
  • Don't forget me this weekend!

24
Extensible
  • <note>
  •   <date>2002-08-01</date>   (newly added information)
  •   <to>Tove</to>
  •   <from>Jani</from>
  •   <heading>Reminder</heading>
  •   <body>Don't forget me this weekend!</body>
  • </note>
  • Should the application break or crash because we
    added new information?
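
It need not break, provided the application looks elements up by name instead of by position. The sketch below assumes a hypothetical extract_message function that reads only the to, from, and body elements.

```python
import xml.etree.ElementTree as ET

def extract_message(xml_text: str) -> str:
    """Hypothetical application: looks up only the elements it needs, by name."""
    note = ET.fromstring(xml_text)
    return "\n".join([
        "MESSAGE",
        f"To: {note.findtext('to')}",
        f"From: {note.findtext('from')}",
        note.findtext("body"),
    ])

ORIGINAL = (
    "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading>"
    "<body>Don't forget me this weekend!</body></note>"
)
# The same note with the extra <date> element added at the front.
EXTENDED = ORIGINAL.replace("<note>", "<note><date>2002-08-01</date>")

print(extract_message(EXTENDED))
```

Both versions of the note produce the same output, because the added element is simply never asked for.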

25
Relationships
  • A description of a book
  • My First XML
  •   Introduction to XML
  •     What is HTML
  •     What is XML
  •   XML Syntax
  •     Elements must have a closing tag
  •     Elements must be properly nested

26
Relationships
  • A description of the book in XML
  • <book>
  •   <title>My First XML</title>
  •   <prod id="33-657" media="paper"></prod>
  •   <chapter>Introduction to XML
  •     <para>What is HTML</para>
  •     <para>What is XML</para>
  •   </chapter>
  •   <chapter>XML Syntax
  •     <para>Elements must have a closing tag</para>
  •     <para>Elements must be properly nested</para>
  •   </chapter>
  • </book>

27
Relationships
  • Book is the root element.
  • Title, prod, and chapter are child elements of
    book.
  • Book is the parent element of title, prod, and
    chapter.
  • Title, prod, and chapter are siblings (or sister
    elements) because they have the same parent.
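
These relationships can be explored with a short sketch. ElementTree elements do not keep a reference to their parent, so the parent_of map built below is an assumption of this example, not a feature of XML itself.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>"""

book = ET.fromstring(BOOK_XML)

# Build a child -> parent lookup by walking every element in the tree.
parent_of = {child: parent for parent in book.iter() for child in parent}

children = [child.tag for child in book]
title = book.find("title")
siblings = [el.tag for el in parent_of[title] if el is not title]

print(book.tag, children, siblings)
```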

28
Content
  • In the previous example:
  • book has element content, because it contains
    other elements.
  • Chapter has mixed content because it contains
    both text and other elements.
  • Para has simple content (or text content) because
    it contains only text.
  • Prod has empty content, because it carries no
    information.
  • Only the prod element has attributes.
  • The attribute named id has the value "33-657".
  • The attribute named media has the value "paper". 
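
The four content types can be told apart programmatically. The following is a sketch under the assumption that white-space-only text does not count as content; the function name content_type is hypothetical.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
</book>"""

def content_type(el: ET.Element) -> str:
    """Classify an element as 'element', 'mixed', 'simple', or 'empty' content."""
    has_children = len(el) > 0
    # Text can sit before the first child (el.text) or after any child (child.tail).
    pieces = [el.text] + [child.tail for child in el]
    has_text = any(piece and piece.strip() for piece in pieces)
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "element"
    if has_text:
        return "simple"
    return "empty"

book = ET.fromstring(BOOK_XML)
types = {el.tag: content_type(el) for el in book.iter()}
prod_attributes = dict(book.find("prod").attrib)

print(types)
print(prod_attributes)
```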

29
Element Naming
  • Names can contain letters, numbers, and other
    characters
  • Names must not start with a number or punctuation
    character
  • Names must not start with the letters xml (or XML
    or Xml ..)
  • Names cannot contain spaces
  • Any name can be used, no words are reserved, but
    the idea is to make names descriptive. Names with
    an underscore separator are nice.
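
The naming rules above can be approximated with a small check. This is a deliberately simplified, ASCII-only sketch: real XML names also permit many non-ASCII characters and colons, and the function name is_acceptable_name is an assumption for illustration.

```python
import re

# Simplified pattern: a letter or underscore first, then letters, digits,
# underscores, hyphens, or periods. No spaces are allowed anywhere.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_acceptable_name(name: str) -> bool:
    """Apply the slide's rules to a candidate element name."""
    if name.lower().startswith("xml"):
        return False  # names must not start with xml, XML, Xml, ...
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["first_name", "1st", "xmlData", "first name", "-name"]:
    print(candidate, is_acceptable_name(candidate))
```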

30
XML Attributes
  • XML elements can have attributes.
  • From HTML: <IMG SRC="computer.gif">. The SRC
    attribute provides additional information about
    the IMG element.
  • <file type="gif">computer.gif</file>
  • Attribute values must always be enclosed in
    quotes
  • either single or double quotes can be used
  • If the attribute value itself contains double
    quotes, it is necessary to use single quotes or
    vice versa.
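
Both quoting styles can be seen in a brief sketch. The gangster element and its value are hypothetical, chosen only because the value contains double quotes.

```python
import xml.etree.ElementTree as ET

# The value contains double quotes, so single quotes delimit it (hypothetical element).
single_outside = ET.fromstring("""<gangster name='George "Shotgun" Ziegler'/>""")

# An ordinary value delimited by double quotes.
double_outside = ET.fromstring('<file type="gif">computer.gif</file>')

name = single_outside.get("name")
file_type = double_outside.get("type")

print(name)
print(file_type)
```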

31
Use of Elements vs. Attributes
  • Data can be stored in child elements or in
    attributes. Both examples below provide the same
    information.
  • Sex is an attribute
  • <person sex="female">
  •   <firstname>Anna</firstname>
  •   <lastname>Smith</lastname>
  • </person>
  • Sex is an element
  • <person>
  •   <sex>female</sex>
  •   <firstname>Anna</firstname>
  •   <lastname>Smith</lastname>
  • </person>
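
That the two forms carry the same information can be shown with a short sketch. The helper person_record is a hypothetical function that flattens attributes and child elements into one dictionary.

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = (
    '<person sex="female"><firstname>Anna</firstname>'
    "<lastname>Smith</lastname></person>"
)
AS_ELEMENT = (
    "<person><sex>female</sex><firstname>Anna</firstname>"
    "<lastname>Smith</lastname></person>"
)

def person_record(xml_text: str) -> dict:
    """Collect attributes and child elements into one flat dictionary."""
    person = ET.fromstring(xml_text)
    record = dict(person.attrib)
    record.update({child.tag: child.text for child in person})
    return record

print(person_record(AS_ATTRIBUTE))
print(person_record(AS_ELEMENT))
```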
32
Problems using Attributes
  • attributes cannot contain multiple values (child
    elements can)
  • attributes are not easily expandable (for future
    changes)
  • attributes cannot describe structures (child
    elements can)
  • attributes are more difficult to manipulate by
    program code
  • attribute values are not easy to test against a
    DTD
  • <note day="12" month="11" year="2002" to="Tove"
    from="Jani" heading="Reminder"
    body="Don't forget me this weekend!"></note>
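
One way out of an attribute-heavy document like the one above is to move each attribute into a child element. The sketch below assumes a hypothetical helper, attributes_to_elements, that does this for the root element only.

```python
import xml.etree.ElementTree as ET

ATTRIBUTE_NOTE = (
    '<note day="12" month="11" year="2002" to="Tove" from="Jani" '
    'heading="Reminder" body="Don\'t forget me this weekend!"></note>'
)

def attributes_to_elements(xml_text: str) -> str:
    """Move every attribute of the root into a child element of the same name."""
    source = ET.fromstring(xml_text)
    target = ET.Element(source.tag)
    for name, value in source.attrib.items():
        ET.SubElement(target, name).text = value
    return ET.tostring(target, encoding="unicode")

converted = attributes_to_elements(ATTRIBUTE_NOTE)
converted_note = ET.fromstring(converted)

print(converted)
```

Once the data sits in elements, it can later be given structure (for example, grouping day, month, and year under a date element), which attributes cannot express.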

33
Linking to a Style Sheet
  • There are two main style sheet languages used
    with XML
  • Cascading Style Sheets (CSS)
  • Extensible Stylesheet Language (XSL)

34
CSS
  • Because HTML uses predefined tags, the meanings
    of these tags are well understood
  • The <p> element defines a paragraph and the <h1>
    element defines a heading and the browser knows
    how to display them.
  • Adding styles to HTML elements with CSS is
    simple.
  • Telling a browser to display each element in a
    special font or color, is easy to do and easy for
    a browser to understand. 

35
XSL
  • Because XML does not use predefined tags (we can
    use any tags we want), the meanings of these tags
    are not understood
  • <table> could mean an HTML table, a piece of
    furniture, or something else. A browser does not
    know how to display an XML document.
  • Therefore there must be something in addition to
    the XML document that describes how the document
    should be displayed, and that is XSL!

36
XSL
  • XSL is the preferred style sheet language of XML.
  • XSL (the eXtensible Stylesheet Language) is far
    more sophisticated than CSS.
  • Below is a fraction of the XML file, with an
    added XSL reference.
  • <?xml-stylesheet type="text/xsl"
    href="simple.xsl"?>
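
The stylesheet reference is a processing instruction that sits ahead of the root element. As a sketch, Python's standard xml.dom.minidom can be used to find it; the surrounding note document is an assumption made so the example is complete.

```python
from xml.dom import minidom

# Bytes, as read from a file; the stylesheet reference precedes the root element.
DOC = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>'
    b'<?xml-stylesheet type="text/xsl" href="simple.xsl"?>'
    b"<note><to>Tove</to></note>"
)

document = minidom.parseString(DOC)

# Collect the processing instructions that are direct children of the document.
instructions = [
    node for node in document.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

for pi in instructions:
    print(pi.target, "|", pi.data)
```

This only locates the reference; applying the XSL transformation itself is left to a browser or an XSLT processor.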