XML Web Services: XML Metadata in Acrobat PDF Files

Slides: 18
Provided by: brandn
Tags: pdf | xml | acrobat | files | metadata | services | web

Transcript and Presenter's Notes


1
XML Web Services: XML Metadata in Acrobat PDF Files
  • Brand Niemann
  • XML Web Services Evangelist (My Internet
    Handle)
  • US EPA Office of Environmental Information
  • May 2, 2002

2
Overview
  • 1. E-Gov E-Records Management Project
  • 2. Adobe 5.0 Document Metadata
  • 3. Repurposing Adobe PDF Documents
  • 4. Large Legacy Document Collections
  • 5. Contact Information

3
1. E-Gov E-Records Management Project
  • NARA
  • Has the lead on the E-Gov E-Records Management
    Project.
  • Identified two study areas
  • Accepting new electronic record formats such as
    scanned images, PDF, etc.
  • Develop archival and records management metadata
    to facilitate transfer of these records.
  • EPA
  • Interested in leading/co-leading the second study
    area as part of ERDMS work.

4
1. E-Gov E-Records Management Project
  • Two milestones
  • September 30, 2002
  • Define standardized organizational strategic
    approaches to phased implementation of electronic
    recordkeeping in a federal agency.
  • December 31, 2002
  • Develop metadata record needed to use XML as the
    common language for transferring permanent
    e-records to NARA.
  • Goal
  • Propose and pilot-test common metadata
    stylesheets that NARA and federal agencies can
    use to transfer electronic records.

5
1. E-Gov E-Records Management Project
  • Problem with Acrobat PDF files
  • NARA will not accept them because they are not
    XML.
  • NARA is in discussions with Adobe, Inc. about
    this problem.
  • Adobe, Inc. position on XML
  • XML and PDF complement one another.
  • My comments
  • Actually one can produce a PDF from XML by using
    an XML Formatting Objects Stylesheet (XSL-FO).
  • An Acrobat Plug-in provides PDF-to-XML
    conversion
  • My comments
  • This only works if the PDF is of the third type
    (tagged) and it still needs additional work to
    make a useful XML document.
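
Because the conversion depends on the PDF being tagged, a quick pre-check can save time on a large batch. The following is a minimal sketch, not the plug-in's own logic: it assumes the document catalog is stored uncompressed (as in PDF 1.4 files written by Acrobat 5.0) and looks for the /StructTreeRoot and /Marked true entries that the PDF specification associates with Tagged PDF. The function name looks_tagged is hypothetical, and the check is a heuristic that will miss files whose catalog sits inside a compressed object stream.

```python
import re


def looks_tagged(pdf_bytes):
    """Heuristically report whether raw PDF bytes appear to be Tagged PDF.

    Assumption: the catalog dictionary is stored as plain text, so its keys
    can be found by scanning the bytes. This is not a full PDF parse.
    """
    # A tagged PDF carries a structure tree referenced from the catalog.
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    # The mark information dictionary flags the file as tagged.
    is_marked = re.search(rb"/Marked\s+true", pdf_bytes) is not None
    return has_struct_tree and is_marked
```

A file that fails this check would first need to be tagged in Acrobat before a PDF-to-XML conversion is attempted.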

6
2. Adobe 5.0 Document Metadata
  • Viewing Document Metadata: In Acrobat 5.0, Adobe
    PDF files contain Document Metadata in XML
    format. This Document Metadata contains (but is
    not limited to) information that is also in the
    Document Properties. Any changes made in the
    Acrobat Document Properties dialog box are
    reflected in the Document Metadata. Because
    Document Metadata is in XML format, it can be
    extended and modified using third-party products.
    You can copy and paste the Document Metadata XML
    source code.
  • To view the Document Metadata:
  • 1. Choose File > Document Properties > Document
    Metadata.
  • 2. The Document Metadata dialog box displays all
    the metadata embedded in the document. (Metadata
    is displayed by schema, that is, in predefined
    groups of related information.) The information
    associated with each schema is visible by
    default; it can be hidden by clicking the
    triangle next to the schema name. If a schema
    doesn't have a recognized name, it is listed as
    Unknown. The XML namespace is shown in
    parentheses after the schema name.
  • 3. To view the XML code, click View Source. You
    can cut, copy, and paste XML code from the
    Metadata Source View dialog box. Click OK to
    return to the Document Metadata dialog box.
  • 4. Click OK to close the Document Metadata dialog
    box, or click Cancel to close the dialog box
    without making any changes.
  • See next slides.
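
The dialog steps above are manual. For batch work, the same XML can often be pulled straight out of the file. The sketch below makes an assumption the slides do not state: that the metadata is stored as an uncompressed, UTF-8 packet bracketed by `<?xpacket begin ...?>` and `<?xpacket end ...?>` processing instructions, the wrapper convention described in Adobe's XMP documentation. The function name find_metadata_packet is hypothetical.

```python
def find_metadata_packet(pdf_bytes):
    """Return the first xpacket-wrapped metadata packet found, or None.

    Assumption: the packet is stored unfiltered and in a byte-compatible
    encoding such as UTF-8, so the wrapper markers can be found by a
    plain byte scan rather than by parsing the PDF object structure.
    """
    start = pdf_bytes.find(b"<?xpacket begin")
    if start == -1:
        return None
    end_marker = pdf_bytes.find(b"<?xpacket end", start)
    if end_marker == -1:
        return None
    # The trailing processing instruction closes with "?>".
    close = pdf_bytes.find(b"?>", end_marker)
    if close == -1:
        return None
    return pdf_bytes[start:close + 2]
```

Typical use would be to read the file in binary mode, for example `open(path, "rb").read()`, and pass the bytes in. A None result means only that no packet was found under these assumptions, not that the file has no metadata.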

7
2. Adobe 5.0 Document Metadata
8
2. Adobe 5.0 Document Metadata
9
2. Adobe 5.0 Document Metadata
10
2. Adobe 5.0 Document Metadata
  • <rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  • xmlns:iX='http://ns.adobe.com/iX/1.0/'>
  • <rdf:Description about=''
  • xmlns='http://ns.adobe.com/pdf/1.3/'
  • xmlns:pdf='http://ns.adobe.com/pdf/1.3/'>
  • <pdf:ModDate>2001-07-30T17:32:38-06:00</pdf:ModDate>
  • <pdf:CreationDate>2001-07-30T17:32:04-06:00</pdf:CreationDate>
  • <pdf:Producer>Acrobat Distiller 4.05 for
    Windows</pdf:Producer>
  • </rdf:Description>
  • <rdf:Description about=''
  • xmlns='http://ns.adobe.com/xap/1.0/'
  • xmlns:xap='http://ns.adobe.com/xap/1.0/'>
  • <xap:ModifyDate>2001-07-30T17:32:38-06:00</xap:ModifyDate>
  • <xap:CreateDate>2001-07-30T17:32:04-06:00</xap:CreateDate>
  • </rdf:Description>
  • </rdf:RDF>
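
Since the metadata listed on this slide is RDF/XML, any namespace-aware XML parser can read it. The following is a minimal sketch using Python's standard xml.etree.ElementTree. The SAMPLE string is this slide's listing with its markup (angle brackets, colons, equals signs) restored; the function name read_properties is hypothetical, and the sketch only handles simple text-valued properties like the ones shown.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The metadata from this slide, with its XML markup restored.
SAMPLE = """<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:iX='http://ns.adobe.com/iX/1.0/'>
  <rdf:Description about=''
      xmlns='http://ns.adobe.com/pdf/1.3/'
      xmlns:pdf='http://ns.adobe.com/pdf/1.3/'>
    <pdf:ModDate>2001-07-30T17:32:38-06:00</pdf:ModDate>
    <pdf:CreationDate>2001-07-30T17:32:04-06:00</pdf:CreationDate>
    <pdf:Producer>Acrobat Distiller 4.05 for Windows</pdf:Producer>
  </rdf:Description>
  <rdf:Description about=''
      xmlns='http://ns.adobe.com/xap/1.0/'
      xmlns:xap='http://ns.adobe.com/xap/1.0/'>
    <xap:ModifyDate>2001-07-30T17:32:38-06:00</xap:ModifyDate>
    <xap:CreateDate>2001-07-30T17:32:04-06:00</xap:CreateDate>
  </rdf:Description>
</rdf:RDF>"""


def read_properties(rdf_xml):
    """Map (namespace URI, property name) to the property's text value."""
    root = ET.fromstring(rdf_xml)
    props = {}
    # Each rdf:Description groups the properties of one schema.
    for desc in root.iter("{%s}Description" % RDF_NS):
        for child in desc:
            # ElementTree expands tags to the form "{namespace}LocalName".
            namespace, _, local = child.tag[1:].partition("}")
            props[(namespace, local)] = (child.text or "").strip()
    return props


if __name__ == "__main__":
    for (namespace, name), value in read_properties(SAMPLE).items():
        print(namespace, name, value)
```

Keying the result by namespace URI rather than by prefix mirrors how the Document Metadata dialog groups properties by schema, and it keeps pdf:ModDate and xap:ModifyDate distinct even though they carry the same timestamp.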

11
3. Repurposing Adobe PDF Documents
  • See Acrobat 5.0 Help
  • Repurposing Adobe PDF Documents (pages 82-90) and
    Working with PDF (pages 103-107)
  • Creating tagged Adobe PDF documents (needed for
    accessibility anyway).
  • Saving Adobe PDF documents to other formats (RTF
    and XML). See next slides.
  • But you still need XML authoring tools and
    expertise.
  • I have done this for lots of EPA documents in my
    XML Web Services training.

12
3. Repurposing Adobe PDF Documents
13
3. Repurposing Adobe PDF Documents
14
4. Large Legacy Document Collections
  • Need industrial-strength XML tools and software
    platforms for efficient, cost-effective
    electronic document management solutions (cf.):
  • eXtensible Markup Language (XML) Web Services for
    Legacy Document Collections, Brand Niemann and
    David Eng, April 5, 2002, to appear in
    InfoAccess.
  • XML Web Services Training (cf.):
  • Unit 14 Toxics Release Data.
  • Unit 18 Superfund Data (see next slides).

15
4. Large Legacy Document Collections
16
4. Large Legacy Document Collections
17
5. Contact Information
  • Brand Niemann, Ph.D.
  • USEPA Headquarters, EPA West, Room 6143D
  • Office of Environmental Information, MC 2822T
  • 1200 Pennsylvania Avenue, NW, Washington, DC
    20460
  • 202-566-1657
  • niemann.brand_at_epa.gov
  • EPA: http://161.80.70.167
  • Outside EPA: http://130.11.44.140