Taxonomies and Metadata for Content Management

Slides: 56
Provided by: hkla

Transcript and Presenter's Notes

1
Taxonomies and Metadata for Content Management
Michael Huff, Information Resource Officer
U.S. Department of State
2
E-Government Act of 2002
  • The use of computers and the Internet is rapidly
    transforming societal interactions and the
    relationships among citizens, private businesses,
    and the Government.
  • The Federal Government has had uneven success in
    applying advances in information technology to
    enhance governmental functions and services,
    achieve more efficient performance, increase
    access to Government information, and increase
    citizen participation in Government.
  • Most Internet-based services of the Federal
    Government are developed and presented
    separately, according to the jurisdictional
    boundaries of an individual department or agency,
    rather than being integrated cooperatively
    according to function or topic.

3
Which U.S. Government organizations are
experienced in using metadata taxonomy tools?
  • Defense Intelligence Agency
  • USDA Economic Research Service (ERS)
  • Federal Aviation Administration
  • FirstGov
  • NASA
  • Small Business Administration
  • Social Security Administration
  • Department of State

4
(No Transcript)
5
Taxonomy
Metadata
6
Why use metadata?
  • Adding metadata to unstructured content allows it
    to be managed like structured content.
  • Enriching content with structured metadata is
    critical for supporting search and personalized
    content delivery.
  • Content that has been adequately tagged with
    metadata can be leveraged in usage tracking,
    personalization and improved searching.

7
Where does metadata fit in the information system
architecture?
  • User experience. How content is presented and how
    users experience and interact with it dictates
    its perceived and actual value.
  • Content architecture. Scalable metadata framework
    to enable content reuse, and handle changes in
    organization goals, user needs, and retrieval
    concerns.
  • Tools and technology. The information
    supply-chain platform that enables workflows, and
    supports organizational and operational concerns.
8
(No Transcript)
9
What is Dublin Core?
  • Dublin Core is the metadata standard for
    describing Internet resources so they are easy to
    find.

  • 1995: Original workshop held in Dublin, Ohio.
  • 2003: Dublin Core approved as ISO 15836.
  • 2004: Shanghai meeting.

For more information: http://www.dublincore.org
10
Why is metadata important?
Better navigation and discovery
More efficient editorial process
http://dublincore.org/documents/dcmi-terms/
11
What is a taxonomy?
The specification of the names of people, places,
things and everything else that is needed to
allow search engines and other content
applications to work better.

Linnaeus
  Kingdom    Animalia
  Phylum     Chordata
  Class      Mammalia
  Order      Carnivora
  Family     Canidae
  Genus      Canis
  Species    C. familiaris

UNSPSC
  Segment    44 Office Equipment and Accessories and Supplies
  Family     .12 Office Supplies
  Class      .17 Writing Instruments
  Commodity  .05 Mechanical pencils, .06 Wooden pencils,
             .07 Colored pencils
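
The UNSPSC example on this slide (44, .12, .17, .05) reads as four two-digit levels. The following is a minimal Python sketch, assuming the levels concatenate into a single 8-digit code (44121705 for mechanical pencils). The name `split_unspsc` is illustrative and not part of any UNSPSC tooling.

```python
from typing import NamedTuple


class UnspscCode(NamedTuple):
    segment: str    # first two digits, e.g. "44"
    family: str     # first four digits
    class_: str     # first six digits
    commodity: str  # all eight digits


def split_unspsc(code: str) -> UnspscCode:
    """Split an 8-digit code into its four cumulative hierarchy levels."""
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"expected 8 digits, got {code!r}")
    return UnspscCode(code[:2], code[:4], code[:6], code)


levels = split_unspsc("44121705")
print(levels)
```

Keeping each level as a cumulative prefix means a search on the class "441217" would retrieve all three pencil commodities.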
12
Sample Recipe Taxonomy
Controlled Vocabularies
  • Main Ingredients: Chocolate, Dairy, Fruits, Grains,
    Meat, Seafood, Nuts, Olives, Pasta, Spices,
    Seasonings, Vegetables
  • Cooking Methods: Advanced, Bake, Broil, Fry, Grill,
    Marinade, Microwave, No Cooking, Poach, Quick,
    Roast, Sauté, Slow Cooking, Steam, Stir-fry
  • Courses: Appetizers, Beverages, Breads, Cheese,
    Cocktails, Desserts, Fish, Shellfish, Fruit,
    Hors d'Oeuvres, Meat, Pasta, Salad, Sandwiches,
    Soup, Vegetables
  • Meal Type: Breakfast, Brunch, Lunch, Supper,
    Dinner, Snack
  • Cuisines: African, American, Asian, Caribbean,
    Continental, Eclectic/Fusion/International,
    Jewish, Latin American, Mediterranean,
    Middle Eastern, Vegetarian
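
One way to put controlled vocabularies like these to work is to reject tags that are not in the approved list. The sketch below uses a subset of the slide's terms; the `VOCABULARIES` structure, the `invalid_tags` helper, and the sample recipe are assumptions made for illustration.

```python
# Controlled vocabularies taken from a subset of the slide's facets.
VOCABULARIES = {
    "Cooking Methods": {"Bake", "Broil", "Fry", "Grill", "Roast", "Steam"},
    "Meal Type": {"Breakfast", "Brunch", "Lunch", "Supper", "Dinner", "Snack"},
    "Cuisines": {"African", "American", "Asian", "Caribbean", "Mediterranean"},
}


def invalid_tags(tags: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (facet, term) pairs that are not in the controlled vocabulary."""
    problems = []
    for facet, terms in tags.items():
        allowed = VOCABULARIES.get(facet, set())
        problems.extend((facet, t) for t in terms if t not in allowed)
    return problems


# Hypothetical tagged recipe.
recipe_tags = {
    "Cooking Methods": ["Grill"],
    "Meal Type": ["Dinner"],
    "Cuisines": ["Mediteranean"],  # misspelled on purpose
}
print(invalid_tags(recipe_tags))
```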
13
The power of taxonomy facets
  • 4 independent categories of 10 nodes each have
    the same discriminatory power as one hierarchy of
    10,000 nodes (10^4)
  • Easier to maintain
  • Can be easier to navigate
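
The arithmetic behind the first bullet can be checked directly. This is a small sketch that enumerates every combination of one term per facet; the constant names are illustrative.

```python
from itertools import product

FACETS = 4            # independent facets
NODES_PER_FACET = 10  # terms in each facet

# Every combination of one term per facet is a distinct category.
combinations = sum(1 for _ in product(range(NODES_PER_FACET), repeat=FACETS))
terms_to_maintain = FACETS * NODES_PER_FACET

print(combinations)       # 10 ** 4 combinations
print(terms_to_maintain)  # 4 * 10 terms
```

The faceted design distinguishes as many categories as the single hierarchy while leaving only 40 terms to maintain.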

14
7 Common taxonomy facets
Personalized content delivery requires defining
taxonomy facets and re-using existing vocabulary
sources
15
Applying the facets to the Dublin Core metadata
elements
Applied taxonomy metadata facilitates a
multi-faceted view of content
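
As a rough illustration of facet values carried in Dublin Core elements, the sketch below renders a record as HTML `<meta>` tags using the common `DC.element` naming. The record, the assignment of facets to elements (for example, cuisine to `coverage`), and the `to_meta_tags` helper are all assumptions; the slide's own mapping is not in the transcript.

```python
from html import escape

# Hypothetical record: facet values assigned to Dublin Core elements.
record = {
    "title": "Grilled Salmon with Olives",
    "subject": ["Seafood", "Olives"],  # assumed: Main Ingredients facet
    "type": "Recipe",
    "coverage": "Mediterranean",       # assumed: Cuisines facet
}


def to_meta_tags(rec: dict) -> list[str]:
    """Render each element value as an HTML <meta name="DC.element"> tag."""
    tags = []
    for element, value in rec.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            tags.append(
                f'<meta name="DC.{element}" content="{escape(v, quote=True)}">'
            )
    return tags


for tag in to_meta_tags(record):
    print(tag)
```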
16
Facets at work on FirstGov site
http://www.firstgov.gov
17
Powered by
  • Guided Navigation
  • 2-3 clicks to product
  • No dead ends

http://www.tesco.com/winestore
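
The "no dead ends" property of guided navigation can be sketched as follows: only offer refinement values that still have matching items. The product data, facet names, and the `refinements` function are hypothetical and do not describe how the site above is built.

```python
from collections import Counter

# Hypothetical catalog: each product is tagged with one value per facet.
PRODUCTS = [
    {"colour": "Red", "country": "France", "price": "Under 10"},
    {"colour": "Red", "country": "Chile", "price": "Under 10"},
    {"colour": "White", "country": "France", "price": "10 to 20"},
]


def matches(product: dict, selected: dict) -> bool:
    """True if the product has every facet value the user has selected."""
    return all(product.get(f) == v for f, v in selected.items())


def refinements(selected: dict) -> dict[str, Counter]:
    """Count remaining products for each facet value not yet selected.

    Only values with at least one matching product are returned, so every
    link offered to the user leads to a non-empty result set.
    """
    remaining = [p for p in PRODUCTS if matches(p, selected)]
    options: dict[str, Counter] = {}
    for p in remaining:
        for facet, value in p.items():
            if facet not in selected:
                options.setdefault(facet, Counter())[value] += 1
    return options


print(refinements({"colour": "Red"}))
```

After selecting red wines, the price value "10 to 20" is not offered, because choosing it would return nothing.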
18
http://www.towerrecords.com
19
Powered by
http://www.fortunoff.com
20
Seven practical rules for taxonomies
  • Incremental, extensible process that identifies
    and enables owners, and engages stakeholders.
  • Quick implementation that provides measurable
    results as quickly as possible.
  • Not monolithic; has separately maintainable
    facets.
  • Re-uses existing IP as much as possible.
  • A means to an end, and not the end in itself.
  • Not perfect, but it does the job it is supposed
    to do, such as improving search and navigation.
  • Improved over time, and maintained.

21
(No Transcript)
22
  • Creating a taxonomy is only part of the job
  • How will it be put to use?
  • In a new application, or by modifying an existing
    application?
  • What's the effort around that?
  • Additional issues
  • Tagging: Who will add the metadata and how?

23
(No Transcript)
24
Task 1: Identify objectives
  • What do you do?
  • What kinds of digital assets are being produced?
    For what audiences?
  • What is the business process for submitting,
    selecting, editing, maintaining digital assets?
  • How many digital assets are there? How fast is
    this growing?
  • Are there particular industry or other standards
    that are important?
  • What types of assets are hard to search for (that
    should be easier to find)?
  • What tools would be helpful in locating assets?
    Acronyms? Abbreviations? Nicknames? Glossary?
    Thesaurus? Taxonomy?
  • Who else should we be talking to?
25
Task 2: Inventory content
26
Task 3: Specify metadata
Legend ? 1 or more - 0 or more
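
The legend distinguishes elements that need "1 or more" values from those that allow "0 or more". A minimal sketch of checking such cardinalities is shown below; the element names and the `SPEC` layout are assumptions, since the slide's specification table is not in the transcript.

```python
# Hypothetical metadata specification: (minimum occurrences, repeatable?).
SPEC = {
    "title": (1, False),    # exactly one
    "subject": (1, True),   # one or more
    "audience": (0, True),  # zero or more
}


def check_record(record: dict[str, list[str]]) -> list[str]:
    """Return a list of cardinality violations for one tagged asset."""
    errors = []
    for element, (minimum, repeatable) in SPEC.items():
        count = len(record.get(element, []))
        if count < minimum:
            errors.append(f"{element}: required, but missing")
        if count > 1 and not repeatable:
            errors.append(f"{element}: only one value allowed, got {count}")
    for element in record:
        if element not in SPEC:
            errors.append(f"{element}: not in the specification")
    return errors


print(check_record({"title": ["Visa FAQ"], "subject": []}))
```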
27
Task 4: Model content
  • Factor asset types from inventory into canonical
    types.
  • Select examples from inventory (possibly with
    spider).
  • Identify useful chunks for each asset type.
  • Factor chunks into element superset.
  • Identify relationships between chunks.
  • Iterate until agree on asset types, elements, and
    relationships.

Page areas shown in the diagram: Header area, Left
navigation area, Main content area, Footer area
28
Task 5: Specify vocabularies
  • Develop broad taxonomy outline (1-3 levels deep).
  • Review, revise, and approve taxonomy outline with
    stakeholders and subject matter experts.
  • Fill in taxonomy outline.
  • Tag random samples from content inventory.
  • Review, revise, and approve draft taxonomy with
    stakeholders and subject matter experts.
29
Task 6: Specify procedures
  • Develop taxonomy style rules, ensure that the
    taxonomy follows them.
  • Develop tagging rules and procedures, along with
    software to assist in the task.
  • Specify taxonomy maintenance process and the
    update procedures to follow.
30
Task 6: Governance and Maintenance
The taxonomy must be changed over time. Suggestions
for changes can come from users, through query log
analysis, and from staff, through a feedback form.
A governance structure is needed to make sure
changes are justified.

Diagram labels: Content, Taxonomy, End User, Query
log analysis, Staff notes missing concepts, Steering
Committee

Recommendations by Editor
  1. Small taxonomy changes (labels, synonyms)
  2. Large taxonomy changes (retagging, application
     changes)
  3. New best bets content

Committee considerations
  1. Business Goals
  2. Change in user experience
  3. Retagging cost
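
The query log analysis mentioned on this slide could be sketched roughly as follows: count the searches that match no taxonomy label or synonym, and hand the most frequent ones to the editor. The taxonomy, the log, and the `unmatched_queries` function are hypothetical examples.

```python
from collections import Counter

# Hypothetical taxonomy labels with synonyms, and a hypothetical query log.
TAXONOMY = {
    "passports": {"passport", "passports"},
    "visas": {"visa", "visas"},
}
QUERY_LOG = ["visa", "passport", "travel warning", "Travel Warning", "visas"]


def unmatched_queries(log: list[str]) -> Counter:
    """Count normalized queries that match no taxonomy label or synonym."""
    known = set().union(*TAXONOMY.values())
    normalized = (q.strip().lower() for q in log)
    return Counter(q for q in normalized if q not in known)


# Frequent unmatched queries are candidates for the editor to review.
for query, count in unmatched_queries(QUERY_LOG).most_common(5):
    print(count, query)
```

A small change (adding a synonym) or a large one (adding a new category and retagging) would then go through the governance process described above.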
31
Task 6: Steering Committee Roles
Business Lead
  • Keeps committee on track with larger business
    objectives
  • Balances cost/benefit issues to decide appropriate
    levels of effort
  • Specialists help in estimating costs
  • Obtains needed resources if those in committee
    can't accomplish a particular task
Technical Specialist
  • Estimates costs of proposed changes in terms of
    amount of data to be retagged, additional storage
    and processing burden, software changes, etc.
  • Helps obtain data from various systems
Content Specialist
  • Committee's liaison to content creators
  • Estimates costs of proposed changes in terms of
    editorial process changes, additional or reduced
    workload, etc.
Taxonomy Specialist
  • Suggests potential taxonomy changes based on
    analysis of query logs, indexer feedback
  • Makes edits to taxonomy, installs into system
    with aid of IT specialist
Content Owner
  • Reality check on process change suggestions
32
Task 7: Train staff
Staff will require training on:
  • The UI they use to tag the content
  • The rules to follow when deciding what codes to
    apply
  • The end-effect of the codes they apply
  • The structure of the taxonomy
Tagging examples come from the content inventory.
Hard copies of the taxonomy, and yellow
highlighters, are helpful during training.
Indexing UI
33
What about Automatic Categorization?
  • Automatic vs. Manual Categorization is a
    cost/benefit tradeoff
  • Semi-automated recommended over pure manual in
    production situations.
  • Automatic performance not bad, but not equal to
    trained manual tagging.
  • Software is not sane, so errors look crazy.
  • Large backlogs of content can't justify
    investment in high-quality manual tagging
  • Old articles rarely accessed.
  • Recommend automated bulk tagging with error
    reporting and correction process.
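
A semi-automated process of the kind recommended here might look like the sketch below: software suggests a tag, and weak or missing suggestions are routed to a person. The keyword rules, the threshold, and the `suggest_tag` function are assumptions for illustration; production categorizers are typically trained on tagged examples and do not rely on hand-written keyword lists.

```python
from typing import Optional

# Hypothetical keyword rules per category; real categorizers are trained.
RULES = {
    "Visas": {"visa", "consulate", "application"},
    "Aviation": {"flight", "airport", "pilot"},
}
REVIEW_THRESHOLD = 2  # fewer keyword hits than this goes to a human


def suggest_tag(text: str) -> tuple[Optional[str], bool]:
    """Return (best category, needs_review) from simple keyword overlap."""
    words = set(text.lower().split())
    scores = {cat: len(words & kws) for cat, kws in RULES.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        return None, True  # nothing matched: always send to review
    return best, scores[best] < REVIEW_THRESHOLD


docs = [
    "visa application steps at the consulate",
    "airport parking",
    "annual budget summary",
]
for doc in docs:
    print(doc, "->", suggest_tag(doc))
```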

34
What about automatically-created taxonomies?
  • Typically a single hierarchy with no overall plan
  • Results hard for people to navigate
What about automatic categorization?
  • Accuracy close to human levels, but errors are
    very different
  • Cost/benefit tradeoff
  • Semi-automation is best practice
35
Enterprise taxonomy maintenance workflow
[Workflow diagram. Roles and systems: Analyst,
Editor, Copywriter, Sys Admin, Taxonomy Tool,
Taxonomy. Steps: Suggest new name/category; Review
new name; Problem? (Yes/No); Copy edit new name;
Problem? (Yes/No); Add to enterprise taxonomy.]
36
Categorize with a purpose
What is the problem you are trying to solve?
  • Improve search
  • Browse for content on an enterprise-wide portal
  • Enable users to syndicate content
  • Otherwise provide the basis for content re-use
How will you control the cost of creating and
maintaining the metadata needed to solve these
problems?
  • CMS with metadata tagging products
  • Semi-automated classification
  • Taxonomy editing tools
  • Guided navigation tools
37
How do you sell it?
  • Don't sell the taxonomy, sell the vision of what
    you want to be able to do
  • Clearly understand what the problem is and
    what the opportunities are
  • Costs and benefits
  • Design the taxonomy in relation to the value at
    hand

38
Internet Resources
39
U.S. Government Resources
40
http://www.nasa.gov/home/index.html
41
http://pub-lib.jpl.nasa.gov/pub-lib/dscgi/ds.py/View/Collection-10
42
http://www.loc.gov/flicc/wg/taxonomy.html
43
http://www.loc.gov/lexico/servlet/lexico/
44
http://www.archives.gov/federal_register/code_of_federal_regulations/thesaurus.html
45
http://feapmo.gov/
46
http://www.km.gov/
47
Other Resources
48
http://www.educause.edu/asp/taxonomy/show_taxonomy_links.asp?TREE1EXPAND1
49
http://databases.unesco.org/thesaurus/
50
http://www.naa.gov.au/recordkeeping/control/functions_thesaur/contents.html
51
http://www.taxonomystrategies.com/html/bibliography.htm
52
Summary
  • Why taxonomies?
  • Why metadata?

53
Shiyali Ramamrita Ranganathan
54
Ranganathan's Five Laws of Library Science
  • Books are for use (They don't belong on the
    shelf)
  • Books are for all; every reader his book (Every
    reader is unique)
  • Every book its reader (Every book is unique)
  • Save the time of the reader (Make libraries easy
    to use)
  • A library is a growing organism (Libraries are
    constantly changing to meet changing patron
    needs)

55
Thank you
  • Michael Huff, Information Resource Officer,
    U.S. Department of State, huffmp_at_state.gov