XML-based Web Publishing and Content Management at Seattle University School of Law


1
XML-based Web Publishing and Content Management
at Seattle University School of Law
  • James Cooper
  • Director of Technology Media Services
  • jcooper_at_seattleu.edu
  • Evan Lenz
  • Content Management Architect
  • lenze_at_seattleu.edu

2
Contents
  • Web site requirements and architecture
  • Web site management with Cocoon
  • URI design discussion
  • Redhawk CMS
  • An acronym you should know: XSLT
  • Q&A

3
1. Web site requirements and architecture

4
SU Law Web site requirements (summer 2002)
  • Must include a Flash-enhanced version
  • Must include an HTML-based version that
    approximates the look-and-feel and navigational
    structure of the Flash-enhanced version
  • Must include a version of the site that is
    designed for accessibility
  • Must employ the separation of presentation and
    content through the use of XML technologies.
    Multiple published versions of the same content
    must originate in an automatic way from the same
    source.
  • The publishing framework must employ a single
    point of control over navigational structure,
    e.g. using an XML configuration file.

5
Web site requirements, cont.
  • Must allow an average Web developer to easily
    author new content, edit existing content, etc.
  • Must accommodate the continued use of existing
    tools for authoring content, e.g. Dreamweaver.
  • Particular kinds of content that have
    predictable, repeating structure should be
    converted into custom XML vocabularies to
    increase their flexibility and ease of
    management.
  • The Web site must include search functionality
    integrated into all versions of the site.

6
Web content strategy today
  • Static pages were converted to style-free XHTML
    and are stored in that form in Visual SourceSafe
    (VSS), with the latest versions shadowed on the
    staging server.
  • Apache Ant is invoked on the staging server to
    incrementally build all versions (Flash,
    Standard, Text-only, and crawler) of each static
    page, using the page source, as well as global
    navigation and sidebar configuration files, as
    input.
  • Cocoon powers the core functionality of the site,
    including setting the user's version preferences
    and serving dynamic content. All static pages and
    files are served directly by Apache.
  • Dynamic content pieces are identified by URI in
    the Cocoon sitemap, which is configured to
    assemble corresponding pages on-the-fly. Dynamic
    content examples include:
      • Specialized content in our home-grown CMS called
        Redhawk, which provides end-user WYSIWYG
        editing of certain kinds of content
      • Google search results
      • Legacy ASP pages
  • Traditional Web content management, e.g. WYSIWYG
    editing of all pages, is being considered, but
    not sorely missed at this time.
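
The incremental build described on this slide can be pictured as a staleness check: a page version is regenerated only when its source or one of the shared configuration files is newer than the existing output. The following is a minimal Python sketch of that idea, not the actual Ant build file. The function names and the timestamp-based comparison are assumptions made for illustration; only the four version names come from the slide.

```python
from typing import Dict, Iterable, List, Optional

# Version names as listed on the slide.
VERSIONS = ("flash", "standard", "text-only", "crawler")


def needs_rebuild(output_mtime: Optional[float], input_mtimes: Iterable[float]) -> bool:
    """Return True when the output is missing or older than any input."""
    if output_mtime is None:  # this version was never built
        return True
    return any(m > output_mtime for m in input_mtimes)


def stale_versions(page_mtime: float,
                   config_mtimes: Iterable[float],
                   output_mtimes: Dict[str, float]) -> List[str]:
    """List the versions of one page that must be regenerated.

    output_mtimes maps a version name to its last build time; a version
    that has never been built is simply absent from the mapping.
    """
    inputs = [page_mtime, *config_mtimes]
    return [v for v in VERSIONS if needs_rebuild(output_mtimes.get(v), inputs)]
```

Because the navigation and sidebar files are inputs to every page, touching either one would mark every page as stale in this model, which is the expected cost of keeping navigation in a single place.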

7
Benefits of using XML
  • Separation of presentation from content
  • Ensures consistency of presentation across all
    pages (eliminates layout errors)
  • Enables publication to multiple channels
  • Content re-use
  • Many commercial and open-source tools available
    for processing/creating XML
  • Integration between disparate systems (including
    legacy ASP pages, Google, Redhawk, etc.)
  • Great for configuration files

8
Primary tools used in our Web site
  • Run-time
      • Apache Cocoon (Java-based)
      • Apache Web server on Linux
      • mod_rewrite (for rewriting incoming URLs, e.g.
        path?mode=flash, to /flash-html/path.html)
      • Google Appliance (for integrated search inside
        our site template)
      • IIS/ASP (legacy database access scripts, e-mail
        forms, etc.)
      • 4Suite, for exporting content from the Redhawk
        CMS (based on 4Suite)
  • Build-time
      • MS Visual SourceSafe (for versioning of static
        content)
      • Samba (for mounting a VSS shadow folder on the
        Linux staging server)
      • Dreamweaver MX (includes XHTML support and VSS
        integration)
      • Apache Ant (for building the bulk of the site
        statically)
      • 4Suite, for end-user content management of
        specialized document types, aka Redhawk
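
The URL rewriting mentioned in the mod_rewrite item maps a version preference carried in the query string onto a directory of pre-built static pages. The Python sketch below mimics that mapping for illustration only; it is not the site's actual mod_rewrite rule set. The function name is hypothetical, and the assumption that other mode values follow the same `/<mode>-html/` pattern is not confirmed by the slides, which show only `mode=flash`.

```python
from urllib.parse import parse_qs, urlsplit


def rewrite(url: str) -> str:
    """Map e.g. /path?mode=flash to /flash-html/path.html.

    URLs without a mode parameter are returned unchanged. The site root
    and directory-style URLs are not handled by this sketch.
    """
    parts = urlsplit(url)
    mode = parse_qs(parts.query).get("mode", [None])[0]
    if mode is None:
        return url
    return f"/{mode}-html{parts.path}.html"
```

For example, `rewrite("/path?mode=flash")` yields `/flash-html/path.html`, so the extension-free public URL never has to change even though the file served does.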

9
2. Web site management with Cocoon

10
Introduction to Cocoon
  • Cocoon is an open-source, Java-based XML Web
    publishing framework
  • Recently gained status as a top-level Apache
    project, at http://cocoon.apache.org
  • Designed to enable the separation of concerns
    between content, logic, and style

11
The Cocoon sitemap
  • SAX-based pipeline mechanism allows XML content
    to go through a series of transformations,
    configurable by the sitemap, Cocoon's central
    point of configuration
  • Each pipeline consists of:
      • Exactly one generator
          • Produces XML content using any number of
            mechanisms: reading a file, submitting an
            HTTP request, querying a database, invoking
            a server page script, etc.
      • Followed by zero or more transformers
          • Each processes the XML, e.g. via XSLT or
            XInclude, for subsequent handling by either
            another transformer or the serializer
      • Followed by exactly one serializer
          • Serializes into a particular format, e.g.
            well-formed XML, browser-compatible XHTML,
            SVG, PDF (via XSL-FO and FOP), rasterized
            images (via SVG and Batik), etc.
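
The generator, transformer, serializer shape of a pipeline can be sketched in a few lines of Python. This is a conceptual illustration only: Cocoon streams SAX events between Java components, whereas the sketch passes in-memory ElementTree documents between plain functions. All function names and the sample case data are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable

Generator = Callable[[], ET.Element]
Transformer = Callable[[ET.Element], ET.Element]
Serializer = Callable[[ET.Element], str]


def run_pipeline(generate: Generator,
                 transformers: Iterable[Transformer],
                 serialize: Serializer) -> str:
    """Exactly one generator, zero or more transformers, exactly one serializer."""
    doc = generate()
    for transform in transformers:
        doc = transform(doc)
    return serialize(doc)


# Hypothetical stages, for illustration only.
def generate_cases() -> ET.Element:
    """Stand-in generator: produces XML from a hard-coded string."""
    return ET.fromstring("<cases><case>A v. B</case><case>C v. D</case></cases>")


def cases_to_list(doc: ET.Element) -> ET.Element:
    """Stand-in transformer: turns each <case> into an <li> inside a <ul>."""
    ul = ET.Element("ul")
    for case in doc.iter("case"):
        ET.SubElement(ul, "li").text = case.text
    return ul


def serialize_xml(doc: ET.Element) -> str:
    """Stand-in serializer: writes the tree out as an XML string."""
    return ET.tostring(doc, encoding="unicode")
```

Calling `run_pipeline(generate_cases, [cases_to_list], serialize_xml)` runs all three stages, while passing an empty transformer list serializes the generator's output untouched, which corresponds to the "zero or more transformers" rule.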

12
Simplified Cocoon sitemap excerpt
<map:match pattern="accesstojustice/hague/cases">
  <map:generate src="http://redhawk/?xslt=getCases.xsl"/>
  <map:transform src="stylesheets/case2html.xsl"/>
  <map:serialize type="xhtml"/>
</map:match>

13
Another sitemap excerpt
<map:resource name="front-door">
  <map:select type="request-parameter">
    <map:parameter name="parameter-name"
                   value="set-version"/>
    <map:when test="flash">
      <map:call resource="check-flash"/>
    </map:when>
    <map:when test="flash-confirmed">
      <map:call resource="set-preference-to-flash"/>
    </map:when>
    <map:when test="standard">
      <map:call resource="set-preference-to-standard"/>
    </map:when>
    <map:when test="simple">
      <map:call resource="set-preference-to-simple"/>
    </map:when>
    <map:otherwise>
      <!-- more logic -->
    </map:otherwise>
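
Read as ordinary control flow, the selector in this excerpt is a dispatch on the value of the `set-version` request parameter. The Python sketch below restates that dispatch using the resource names from the excerpt. The function name is hypothetical, and returning `None` stands in for the elided "more logic" branch, whose behavior the slide does not show.

```python
from typing import Mapping, Optional

# Parameter value -> sitemap resource to call, as in the excerpt.
VERSION_RESOURCES = {
    "flash": "check-flash",
    "flash-confirmed": "set-preference-to-flash",
    "standard": "set-preference-to-standard",
    "simple": "set-preference-to-simple",
}


def front_door(params: Mapping[str, str]) -> Optional[str]:
    """Return the resource to call for this request.

    None means the request falls through to the otherwise branch,
    which is not reproduced here.
    """
    return VERSION_RESOURCES.get(params.get("set-version"))
```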

14
(No Transcript)

15
URI design considerations
  • The URI design of the SU Law Web site was
    inspired by Tim Berners-Lee's 1998 essay "Cool
    URIs don't change":
    http://www.w3.org/Provider/Style/URI.html
  • Aims to follow two of the essay's suggestions:
      • Leave out file extensions
      • Leave out topic/classification by subject

16
Leave out file extensions
  • Cocoon makes it easy to map external URIs to
    internal filenames or other content generators
  • In the SU Law Web site, no HTML page URL
    includes a file extension
  • Other types of content use standard file
    extensions, e.g. JPG, GIF, Flash, Word

17
Leave out topic/classification by subject
  • Difficult problem
  • Design URIs such that they are meaningfully
    mnemonic and will never change, even though the
    corresponding pages may be classified into
    different topics later
  • Berners-Lee: "Because the relationships between
    subjects are web-like rather than tree-like,
    even...people who agree on a web may pick a
    different tree representation."

18
Decouple navigational structure from URI structure
  • URI structure is, of necessity, hierarchical
  • Site navigation tends to be hierarchical,
    classifying pages into topics or subjects
  • To help in following the original suggestion, we
    formulated the following mandate:
      • Decouple navigational structure from URI
        structure.
  • We met this goal through the use of a custom XML
    configuration file (navigation.xml) that maps
    between the two independent hierarchies
    (navigation and URI structure)

19
Excerpt from navigation.xml
<navigation xmlns="http://law.seattleu.edu">
  <menu display="Welcome" sectionId="welcome">
    <link href="/" display="SU Law Home"/>
    <link display="Contact Information" href="/contactus"/>
    <link display="Directions" href="/directions"/>
    <link href="/welcome" display="From the Dean"/>
    <link href="/history" display="History"/>
    <link href="/calendar" display="Master Calendar"/>
    <link href="/mission" display="Mission"/>
    <link href="/search" display="Search"/>
    <link href="/sitemap" display="Site Map"/>
    <link href="http://www.seattleu.edu"
          display="Seattle University Home"/>
    <hidden href="/news" display="News"/>
    <hidden pattern="/news"/>
    <hidden href="/privacy" display="Privacy Statement"/>
  </menu>
  <menu display="Students" sectionId="students">
  <menu display="Academics">

20
The benefits of URI-navigation independence
  • Pages can be moved from one section of the site
    to another by simply editing one file
    (navigation.xml)
  • Navigation structure can change without needing
    to update any links or change any URIs (thereby
    rendering them uncool)
  • Files do not need to be moved around just because
    corresponding pages move around the site
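
The independence of navigation and URI structure can be demonstrated with a short Python sketch that looks up which section a URI belongs to and then re-files the page under another section. The element and attribute names follow the navigation.xml excerpt; the trimmed sample document, the empty Students menu, and both function names are assumptions made for illustration. The site itself would do this with XSLT over the real file.

```python
import xml.etree.ElementTree as ET
from typing import Optional

NS = {"n": "http://law.seattleu.edu"}

# Trimmed sample modeled on the navigation.xml excerpt.
NAVIGATION_XML = """\
<navigation xmlns="http://law.seattleu.edu">
  <menu display="Welcome" sectionId="welcome">
    <link href="/history" display="History"/>
    <link href="/mission" display="Mission"/>
  </menu>
  <menu display="Students" sectionId="students"/>
</navigation>
"""


def section_of(nav: ET.Element, uri: str) -> Optional[str]:
    """Return the sectionId of the top-level menu that lists this URI."""
    for menu in nav.findall("n:menu", NS):
        for link in menu.findall("n:link", NS):
            if link.get("href") == uri:
                return menu.get("sectionId")
    return None


def move_page(nav: ET.Element, uri: str, target_section: str) -> None:
    """Re-file a page under another menu; the URI itself is untouched."""
    target = nav.find(f"n:menu[@sectionId='{target_section}']", NS)
    if target is None:
        raise KeyError(target_section)
    for menu in nav.findall("n:menu", NS):
        for link in menu.findall("n:link", NS):
            if link.get("href") == uri:
                menu.remove(link)
                target.append(link)
                return
```

After `move_page(nav, "/history", "students")`, the page appears under Students in the navigation, yet its URI is still `/history` and no file on disk has moved.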

21
XML-based configuration of the Web site sidebar
<sidebar xmlns="http://law.seattleu.edu">
  <allButtons>
    <promotion id="laptop" img="laptoppurchase.gif"
               alt="Student Laptop Purchase Program (Dell)"
               href="/technology/purchase"/>
    <profile id="cmhall" alt="Christian Halliburton Video"
             movie="cmhall.rm"/>
    <quote id="cumbow" img="cumbow.gif"
           alt="Cumbow Quote"/>
    ...
  </allButtons>
  ...
  <section id="faculty">
    <profile idref="cmhall"/>
    <quote idref="cumbow"/>
    <promotion idref="giving"/>
    <promotion idref="newfaculty"/>
    <promotion idref="laptop"/>
  </section>
  ...
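
The sidebar file defines each button once under allButtons and then refers to it by idref from each section. Resolving those references might look like the Python sketch below. The sample document is trimmed from the excerpt and omits the `giving` and `newfaculty` references because their definitions are elided on the slide; the function name is hypothetical, and skipping unresolved references is an assumption about desired behavior.

```python
import xml.etree.ElementTree as ET
from typing import List

NS = {"s": "http://law.seattleu.edu"}

# Trimmed sample modeled on the sidebar excerpt.
SIDEBAR_XML = """\
<sidebar xmlns="http://law.seattleu.edu">
  <allButtons>
    <promotion id="laptop" img="laptoppurchase.gif"
               alt="Student Laptop Purchase Program (Dell)"
               href="/technology/purchase"/>
    <profile id="cmhall" alt="Christian Halliburton Video" movie="cmhall.rm"/>
    <quote id="cumbow" img="cumbow.gif" alt="Cumbow Quote"/>
  </allButtons>
  <section id="faculty">
    <profile idref="cmhall"/>
    <quote idref="cumbow"/>
    <promotion idref="laptop"/>
  </section>
</sidebar>
"""


def buttons_for(sidebar: ET.Element, section_id: str) -> List[ET.Element]:
    """Resolve each idref in a section to its definition under allButtons."""
    all_buttons = sidebar.find("s:allButtons", NS)
    by_id = {} if all_buttons is None else {b.get("id"): b for b in all_buttons}
    section = sidebar.find(f"s:section[@id='{section_id}']", NS)
    if section is None:
        return []
    # References with no matching definition are skipped in this sketch.
    return [by_id[ref.get("idref")] for ref in section if ref.get("idref") in by_id]
```

Defining a button once and referencing it many times means that a change to its image or link is made in one place and picked up by every section that uses it.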

22
3. Redhawk CMS

23
Redhawk, home-grown CMS
  • Redhawk is a specialized XML content management
    system, based on 4Suite, an open-source platform
    for XML and RDF processing
  • Named after the SU mascot
  • Basic unit of storage is an XML document
  • Supports development of custom Redhawk "document
    classes", which correspond to XML document types
    (or schemas)
  • Provides basic CRUD (Create, Read, Update,
    Delete) and role-based workflow functionality
  • Two types of users for each document class:
    Author and Editor
      • Any Create, Update, or Delete requests by an
        Author must be approved by an Editor before
        taking effect
  • Pluggable WYSIWYG editing environments; so far we
    have developed support for Altova's free
    browser-based XML editor, Authentic 5
  • Future plans to support Microsoft InfoPath and
    Word 2003
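
The Author and Editor workflow described on this slide amounts to a queue of pending changes that take effect only on approval. The Python sketch below models that rule. It is not Redhawk's implementation, which the slides describe as built on 4Suite and XSLT; the class names, method names, and in-memory storage are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Request:
    action: str                    # "create", "update", or "delete"
    doc_id: str
    content: Optional[str] = None  # XML text for create and update


@dataclass
class DocumentClass:
    """Hypothetical stand-in for one Redhawk document class."""
    documents: Dict[str, str] = field(default_factory=dict)
    pending: List[Request] = field(default_factory=list)

    def submit(self, request: Request) -> None:
        """Author side: queue the change; nothing is published yet."""
        if request.action not in ("create", "update", "delete"):
            raise ValueError(f"unknown action: {request.action}")
        self.pending.append(request)

    def approve(self, request: Request) -> None:
        """Editor side: apply a queued change so that it takes effect."""
        self.pending.remove(request)
        if request.action == "delete":
            self.documents.pop(request.doc_id, None)
        else:
            self.documents[request.doc_id] = request.content or ""
```

Until `approve` is called, the published `documents` mapping is unchanged, which mirrors the requirement that an Author's request must be approved by an Editor before taking effect.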

24
Create New Announcement form

25
Current Redhawk applications
  • Announcements and events for the Docket
    (migration from custom production application in
    process)
  • Access to Justice Institute's Hague Project, for
    managing Hague Convention-related case
    information (in production)

26
4. An acronym you should know: XSLT

27
The common denominator: XSLT (Extensible
Stylesheet Language Transformations)
  • Used in Cocoon to assemble all pages (XSLT is the
    default type of "Transformer")
  • Used in our site build process, via Ant's <xslt>
    task for collectively applying transformations
    over multiple files
  • Built into 4Suite and used throughout Redhawk to
    assemble pages, create documents, and implement
    the core CMS logic (with the help of extensions)
  • Used in the Google Appliance to style the output
    of search results
  • Used in Redhawk in the browser to apply
    supplemental "clean-up" transformations to the
    XML resulting from Authentic editing
  • Growing abundance of conformant XSLT processors,
    including IE6 and Mozilla support, as well as a
    growing number of powerful tools
  • And XSLT is reaching mainstream technology
    status: Microsoft Office 2003 will pervasively
    employ XSLT for the development of custom XML
    solutions, particularly in Word, Excel, Access,
    and InfoPath.

28
References
  • http://cocoon.apache.org
  • http://4suite.org
  • http://ant.apache.org
  • "Cool URIs don't change":
    http://www.w3.org/Provider/Style/URI.html
  • "Cocoon and 4Suite for Content Management: The
    Best of Both Worlds at Seattle University School
    of Law":
    http://www.xmlportfolio.com/xmleurope2003/

29
Questions?