New Approaches to Structuring Data and Metadata in Statistical Systems: Implications for Usability and Functionality

Description:

New Approaches to Structuring Data and Metadata in Statistical Systems: Implications for Usability and Functionality. Simon Musgrave, University of Essex

Slides: 23
Transcript and Presenter's Notes


1
New Approaches to Structuring Data and Metadata
in Statistical Systems: Implications for Usability
and Functionality
  • Simon Musgrave, University of Essex
  • RSS/ASC, January 2004

2
User Scenarios
  • We begin by painting three potential user
    scenarios
  • Information Analyst Workspace
  • Policy Maker Workspace
  • Market Research Client Page

3
Information Analyst Workspace
  • We would like an active workspace that
    dynamically brings together all pertinent
    information for alerts and review
  • Workpage sorts, merges and describes multiple
    heterogeneous information sources
  • e.g. monitoring local public health issues
    links to:
  • latest Hospital Episode Statistics data
  • Health Survey for England data
  • NHS Direct statistics
  • local surveys
  • key events
  • previous reports
  • contextual information

4
(No Transcript)
5
Policy Maker Workspace
  • Latest performance measures for hospital trusts
    released.
  • Policy maker wants to understand the variability,
    comparisons with previous years and other
    regions, breakdown of component parts etc.
  • Ideally the system will treat the number as a
    signpost to these lower levels of data, so that
    it can be:
  • Traced to the underlying tables
  • Displayed with measures of uncertainty
  • Ranked next to comparative areas
  • Expanded (if permitted) to detailed
    administrative data
  • Link to content management system via metadata
    etc.
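
A minimal sketch of the "signpost" idea on this slide, assuming a hypothetical indicator that is a proportion estimated per trust. The trust names and counts are invented for illustration, and the normal-approximation interval is only one possible measure of uncertainty; the presentation does not prescribe one.

```python
import math
from dataclasses import dataclass


@dataclass
class TrustMeasure:
    """A headline indicator kept together with the counts behind it (hypothetical)."""
    trust: str
    successes: int  # e.g. patients seen within a target time
    total: int      # all patients in scope

    @property
    def rate(self) -> float:
        return self.successes / self.total

    def interval(self, z: float = 1.96) -> tuple[float, float]:
        """Normal-approximation interval for the proportion (95% when z = 1.96)."""
        se = math.sqrt(self.rate * (1 - self.rate) / self.total)
        return (max(0.0, self.rate - z * se), min(1.0, self.rate + z * se))


def ranked(measures: list[TrustMeasure]) -> list[TrustMeasure]:
    """Rank comparative areas by the indicator, highest rate first."""
    return sorted(measures, key=lambda m: m.rate, reverse=True)


# Invented figures, for illustration only.
measures = [
    TrustMeasure("Trust A", 450, 500),
    TrustMeasure("Trust B", 380, 400),
    TrustMeasure("Trust C", 700, 1000),
]

for position, m in enumerate(ranked(measures), start=1):
    low, high = m.interval()
    print(f"{position}. {m.trust}: {m.rate:.1%} ({low:.1%} to {high:.1%})")
```

Keeping the counts alongside the published rate is what lets the single number act as an entry point to its components rather than a dead end.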

6
(No Transcript)
7
Market Research Client Page
  • Dedicated page for client
  • Typically links to reports, surveys, analyses
  • Ideally these are pages that contain active links
    to all company performance and available
    competitor information
  • Easy new analyses
  • Background information
  • Real-time market information

8
example
9
User Levels
  • Regardless of usage, we also have to accommodate
    different user competencies and expectations
  • Expert professional analysts
  • Clerical
  • Executive
  • Press
  • Customers' customers
  • Ignorant
  • Workspace should be tailored to usability
    criteria of end user

10
Usability
  • Learnability: How easy is it for users to
    accomplish basic tasks the first time they
    encounter the design?
  • Efficiency: Once users have learned the design,
    how quickly can they perform tasks?
  • Memorability: When users return to the design
    after a period of not using it, how easily can
    they reestablish proficiency?
  • Errors: How many errors do users make, how severe
    are these errors, and how easily can they recover
    from the errors?
  • Satisfaction: How pleasant is it to use the
    design?
  • Nielsen (2003)

11
Are the statistical systems?
[Diagram: Usefulness and Usability, marked with question marks]
12
Entry Points
  • Finding
  • Browsing (tree, registry, file system)
  • Searching (Google, keywords, metadata, thesaurus)
  • Linking
  • Shallow
  • Deep
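
The "searching" entry point can be sketched as keyword matching over dataset metadata, widened by a thesaurus. The dataset titles are taken from the earlier public health example; the keywords and thesaurus terms are hypothetical, and a real system would rely on a maintained thesaurus and a proper search index.

```python
# Hypothetical thesaurus: each term maps to related terms that are searched as well.
THESAURUS = {
    "illness": {"morbidity", "disease"},
    "hospital": {"inpatient", "nhs trust"},
}

# Hypothetical catalogue: dataset title -> descriptive keywords (metadata).
CATALOGUE = {
    "Hospital Episode Statistics": {"inpatient", "admissions", "diagnosis"},
    "Health Survey for England": {"morbidity", "lifestyle", "survey"},
    "NHS Direct statistics": {"calls", "symptoms"},
}


def expand(term: str) -> set[str]:
    """Return the lower-cased search term plus any related thesaurus terms."""
    term = term.lower()
    return {term} | THESAURUS.get(term, set())


def search(term: str) -> list[str]:
    """Return titles whose keywords overlap the expanded term set, sorted by title."""
    wanted = expand(term)
    return sorted(title for title, keywords in CATALOGUE.items() if keywords & wanted)


print(search("illness"))
print(search("Hospital"))
```

Without the thesaurus, a search for "illness" would find nothing here, because no dataset is described with that exact word.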

13
Functionality
  • Given the growing demand for all types of data,
  • from advanced statistical systems
  • to easy access to performance measurements
  • from all types of users
  • How can we build systems that
  • Handle a variety of data types
  • Indicators
  • Tables
  • Counts
  • Surveys
  • Avoid disclosure risks (real or theoretical)
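
One common way to reduce disclosure risk in tables of counts is to suppress small cells. The sketch below assumes a hypothetical threshold of 5 and covers primary suppression only; it does not attempt the secondary suppression needed when row or column totals are also published, and the slide itself does not name a method.

```python
from typing import Optional

SUPPRESSION_THRESHOLD = 5  # assumed value; real thresholds are set by policy


def suppress_small_cells(
    table: dict[str, int], threshold: int = SUPPRESSION_THRESHOLD
) -> dict[str, Optional[int]]:
    """Replace non-zero counts below the threshold with None (suppressed).

    Zero cells are left visible here; whether zeros are safe to publish
    is itself a policy decision.
    """
    return {
        cell: (None if 0 < count < threshold else count)
        for cell, count in table.items()
    }


def render(table: dict[str, Optional[int]]) -> str:
    """Format the table for display, showing suppressed cells as '*'."""
    return "\n".join(
        f"{cell}: {'*' if count is None else count}" for cell, count in table.items()
    )


# Invented counts, for illustration only.
counts = {"Area 1": 3, "Area 2": 0, "Area 3": 12}
print(render(suppress_small_cells(counts)))
```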

14
And link seamlessly with both e-GIF and a
potential data spine
  • All of these broad use cases demand joined-up
    data
  • We would all love to do data linkage
  • How do we model and build systems that provide
    for interoperability and at what level?
  • All of this demands statistical metadata, which
    is ...

15
Definitions
  • Statistical Metadata is anything that you need to
    know to make proper and correct use of the real
    data in terms of
  • capturing,
  • reading,
  • processing,
  • interpreting,
  • analysing and
  • presenting the information
  • Thus, metadata includes (but is not limited to)
  • population definitions, sample designs,
  • file descriptions and database schemas,
  • codebooks and classification structures,
  • fieldwork reports and notes,
  • processing details, checks, transformation,
    weighting
  • conceptual motivations,
  • table designs and layouts
  • (Westlake 2003)
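
As an illustration of this definition, several of the listed kinds of metadata can be held as fields of one record attached to a dataset. The class name, field names and example values below are hypothetical and do not follow any particular standard.

```python
from dataclasses import dataclass, field


@dataclass
class StatisticalMetadata:
    """What a user needs to know to read and interpret a dataset (illustrative)."""
    population_definition: str
    sample_design: str
    file_description: str
    # variable name -> code -> label, i.e. a very small codebook
    codebook: dict[str, dict[int, str]] = field(default_factory=dict)
    fieldwork_notes: list[str] = field(default_factory=list)
    # checks, transformations and weighting applied, in order
    processing_steps: list[str] = field(default_factory=list)

    def label(self, variable: str, code: int) -> str:
        """Interpret a coded value using the codebook; fall back to the raw code."""
        return self.codebook.get(variable, {}).get(code, str(code))


# Invented example values.
meta = StatisticalMetadata(
    population_definition="Adults aged 16 and over in private households",
    sample_design="Stratified random sample",
    file_description="One row per respondent, comma-separated",
    codebook={"sex": {1: "Male", 2: "Female"}},
    processing_steps=["range checks", "non-response weighting"],
)

print(meta.label("sex", 2))
```

Without the codebook the stored value 2 cannot be interpreted, which is the sense in which metadata is needed to make proper use of the data.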

16
Or
  • Statistical metadata are relevant in the areas of:
  • definition of statistical concepts
  • modelling of data and processes
  • storage structures and transfer protocols
  • standards to ensure a uniform and co-ordinated
    approach
  • information about availability, location,
    meaning, quality and use of data. (Kent and
    Schuerhoff 1997)

17
Alternative Views
  • Typically our understanding of data and metadata
    systems reflects our own priorities and goals,
    which may have a creation, storage or usage bias
  • Within the recent EC Metanet project Grossman has
    defined the United Metadata Architecture for
    Statistics (UMAS), which seeks to "define a
    framework to understand communalities and
    differences of Data / Metadata Models from a
    statistical point of view, irrespective of the
    terminology and goals of the specific models."
  • He suggests four views
  • Conceptual Category View (Conceptual model)
  • Statistical View (Role of the category within the
    statistical ontology)
  • Data Management View (Access and Manipulation of
    Category Instance Data)
  • Administration View (Management and bookkeeping
    of the structures)

18
Model Elements
  • Concepts: what it is we are describing, and so a
    link to non-statistical systems, vital for our
    integrated workspace
  • Semantics: understanding the meaning of both
    concepts and elements within the data model
  • Methods: what we can do with the data
  • Structure: how the underlying data is organised

19
Simplified microdata model
[Diagram. Labels: production method; obtained through;
structural relationships; carries; refers to; statistical
population; dataset; descriptive and technical info; based on;
defined by; contains; statistical unit; numeric information;
variables]
(Grossman 2003)
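
The layout of this diagram is not recoverable from the transcript, so the following is only one plausible reading of its labels, written as Python dataclasses: a dataset refers to a statistical population, is obtained through a production method, and contains variables that carry numeric information about statistical units. The direction of every relationship, and all class and field names, are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class StatisticalUnit:
    unit_id: str


@dataclass
class StatisticalPopulation:
    definition: str
    units: list[StatisticalUnit] = field(default_factory=list)


@dataclass
class Variable:
    name: str
    description: str  # descriptive and technical info
    # unit_id -> numeric information carried for that unit
    values: dict[str, float] = field(default_factory=dict)


@dataclass
class Dataset:
    title: str
    population: StatisticalPopulation  # assumed "refers to"
    production_method: str             # assumed "obtained through"
    variables: list[Variable] = field(default_factory=list)  # assumed "contains"

    def record(self, unit_id: str) -> dict[str, float]:
        """Collect every variable's value for one statistical unit."""
        return {v.name: v.values[unit_id] for v in self.variables if unit_id in v.values}


# Invented example values.
population = StatisticalPopulation(
    "Adults in private households",
    [StatisticalUnit("u1"), StatisticalUnit("u2")],
)
survey = Dataset(
    title="Example survey",
    population=population,
    production_method="face-to-face interview",
    variables=[
        Variable("age", "Age in years", {"u1": 34.0, "u2": 58.0}),
        Variable("income", "Weekly income", {"u1": 410.0}),
    ],
)

print(survey.record("u1"))
```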
20
Levels of interoperability
  • Descriptive information (e-GMS)
  • File exchange (data dictionary)
  • Dataset exchange (archive standards)
  • Information exchange between systems (data
    warehouse)
  • Application accessibility (Web services)

21
Some standards
  • The Common Warehouse Metamodel (CWM) from OMG: a
    model and syntax for the exchange of metadata
    for data warehousing and business intelligence
  • ISO 11179: a universal standard for describing
    data elements in a metadata repository
  • SPSS MR Data Model: an interface layer
  • GESMES and SDMX: a metadata model for the
    exchange of multidimensional data and
    time-series
  • IQML, AskXML and Triple-S: metadata for the
    exchange of questionnaire and survey data
  • The Data Documentation Initiative (DDI): a
    general metadata standard for statistical data
    (micro as well as aggregated)

22
Challenge
  • Understand the scope of our ambitions
  • Are we building a simple interoperable
    environment within one organisation?
  • Are we seeking to link our information into a
    wider data web?
  • The technology (e.g. web services) offers massive
    potential, which is moving beyond our ability to
    organise ourselves to exploit it
  • Can we make systems that work, that are useful
    and highly usable?