Working with Data Managers - PowerPoint PPT Presentation

Learn more at: http://www.internet2.edu


1
Working with Data Managers
  • Renee Woodten Frost
  • Internet2 Middleware Initiative
  • University of Michigan

2
Topics
  • Vignette
  • Data Role in Directory Implementation
  • Data Policy Issues
  • Key Data Needs
  • Identifiers
  • Directory data
  • eduPerson schema
  • Strategies and Recommendations

3
Vignette
  • Sam is taking a class in genetics at Alpha U
    and needs to do some research for a paper. At
    lunch, he goes online to access a restricted
    EBSCO database AU shares with Beta U. A window
    pops up in the browser asking if it's okay for AU
    to give EBSCO information about his status ---
    only students from subscribing institutions can
    access the database. He clicks ok, knowing that
    only his status is passed, not his name or
    contact information. The browser then loads the
    restricted website.
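The attribute release in this vignette can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the `Person` fields, the `RELEASE_POLICY` table, the service name "ebsco-database", and the example email address are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    email: str
    affiliation: str  # e.g. "student"


# Hypothetical policy: the attributes each service provider may receive.
RELEASE_POLICY = {"ebsco-database": {"affiliation"}}


def release_attributes(person, service, user_consented):
    """Return only the attributes the policy allows, and only with consent."""
    if not user_consented:
        return {}
    allowed = RELEASE_POLICY.get(service, set())
    record = {
        "name": person.name,
        "email": person.email,
        "affiliation": person.affiliation,
    }
    return {key: value for key, value in record.items() if key in allowed}


sam = Person(name="Sam", email="sam@alpha-u.example", affiliation="student")
released = release_attributes(sam, "ebsco-database", user_consented=True)
print(released)  # {'affiliation': 'student'}
```

The service learns that Sam is a student but never sees his name or contact information, and nothing is released if he declines.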

4
Vignette Illustrates
  • Privacy trust
  • Sam controls personal information flow
  • Administrative and security services integration
  • Inter-campus access
  • University vouches for and acts on behalf of Sam

5
Demands on IT Revealed
  • One stop for university services integrated with
    course management systems
  • Expensive library databases shared with other
    schools by joint agreement
  • Browser or desktop preferences follow you
  • Submission and/or maintenance of information
    online
  • Privacy protection

6
Important questions, Important data
  • Are the people using these services who they
    claim to be?
  • Are they a member of our campus community?
  • Have they been given permission?
  • Is their privacy being protected?

7
Pause for Some Terminology
  • Identity: the set of attributes about you.
  • Attributes: specific information stored about
    you.
  • Authentication: the process used to prove your
    identity. Often a login process.
  • Authorization: the process of determining if policy
    permits an intended action to proceed.
  • Directories: where an identity's basic
    characteristics are stored.

8
Enterprise Directory
  • Anti-stovepipe architecture that can provide
    authentication, attribute, group services to
    applications.
  • Adds value by improving cost/benefit of online
    services and by improving security.
  • A new and visible flow of administrative data.

9
Definitions: Enterprise Directory Services
  • Enterprise Directory services - where your
    electronic identifiers are reconciled and basic
    characteristics are kept
  • Very quick lookup function
  • Machine address, voice mail box, email box
    location, address, campus identifiers

10
Enterprise Directory
  • Determine application-driven requirements for
    authentication, attribute, and group services and
    then design these four stages to meet the
    requirements
  • Data Sources
  • Metadirectory Processes
  • Directory Services
  • Applications

11
UoM Core Middleware Stages
  • Data sources
  • Metadirectory processes
  • Directories
  • Applications

12
Nature of Directory Work
  • Technology
  • Establish campus-wide services name space,
    authentication
  • Build an enterprise directory service
  • Populate the directory from source systems
  • Enable applications to use the directory
  • Policies and Politics
  • Clarify relationships between individuals and
    institution
  • Determine who manages, who can update and who can
    see common data
  • Structure information access and use rules
    between departments and central administrative
    units
  • Reconcile business rules and practices

13
Data Policy Issues
  • Cross organizational data sharing
  • Enabling a centralized repository
  • Identifying authoritative sources
  • Building trust
  • Privacy constraints: FERPA, HIPAA
  • New procedures
  • Security
  • Auditability
  • Accountability

14
Stage 1: Analyze Data Sources
  • Common Identifiers on campus
  • Identify systems of record and data owners
  • Determine data and data access needed
  • Determine frequency of the feed
  • Provide Standard Data Collection Model
  • Define database load procedure and produce audit
    log

15
Definitions: Identifiers
  • Identifiers: your electronic identification
  • Multiple names and corresponding information in
    multiple places
  • Single unique identifier for each authorized user
  • Names and information in other systems can be
    cross-linked to it
  • Admin systems, library systems, building systems
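One way to picture the cross-linking described on this slide is a small registry that maps each system-specific identifier onto a single registry ID per person. The sketch below is only an illustration: the `PersonRegistry` class, the system names, and the sample local IDs are hypothetical, and a random UUID stands in for whatever unique identifier a campus actually issues.

```python
import uuid


class PersonRegistry:
    """Maps system-specific identifiers onto one registry ID per person."""

    def __init__(self):
        self._links = {}  # (system, local_id) -> registry_id

    def register(self):
        """Issue a new unique registry ID (a random UUID in this sketch)."""
        return str(uuid.uuid4())

    def link(self, registry_id, system, local_id):
        """Cross-link a local identifier to a registry ID, refusing conflicts."""
        key = (system, local_id)
        existing = self._links.get(key)
        if existing is not None and existing != registry_id:
            raise ValueError(f"{system}:{local_id} is already linked to another person")
        self._links[key] = registry_id

    def resolve(self, system, local_id):
        """Return the registry ID for a local identifier, or None if unknown."""
        return self._links.get((system, local_id))


registry = PersonRegistry()
sam_id = registry.register()
registry.link(sam_id, "registrar", "S1234567")
registry.link(sam_id, "library", "LIB-0042")
```

Both the registrar and library identifiers now resolve to the same person, which is what lets information held in separate systems be joined.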

16
Definitions: Authentication
  • Authentication maps the physical you to an
    electronic identifier
  • Password authentication is most common
  • Security need should drive authentication method
  • Distance learning and inter-campus applications

17
Major campus identifiers
  • UUID
  • Student and/or emplid
  • Person registry ID
  • Account login ID
  • Enterprise-LAN ID
  • Student ID card
  • Net ID
  • Email address
  • Library/departmental ID
  • Publicly visible ID (and pseudo-SSN)
  • Pseudonymous ID

18
General Identifier Characteristics
  • Uniqueness (within a given context)
  • Dumb vs intelligent (i.e. whether subfields have
    meaning)
  • Readability (machine vs human vs device)
  • Affordance (centrally versus locally provided)
  • Resolver approach (how an identifier is mapped to
    associated object)
  • Metadata (both associated with the assignment and
    resolution of an identifier)
  • Persistence (permanence of relationship between
    identifier and specific object)

19
General Identifier Characteristics
  • Granularity (the degree to which identifier
    denotes a collection or component)
  • Format (check digits)
  • Versions (can defining characteristics of
    identifier change over time)
  • Capacity (size limitations imposed on the domain
    or object range)
  • Extensibility (the capability to intelligently
    extend one identifier to be the basis for
    another identifier).

20
Important Characteristics
  • Semantics and syntax - what it names and how it
    names it
  • Domain - who issues and over what space is
    identifier unique
  • Revocation - can the subject ever be given a
    different value for the identifier
  • Reassignment - can the identifier ever be given
    to another subject
  • Opacity - is the real world subject easily
    deduced from the identifier - privacy and use
    issues

21
Identifier Mapping Process
  • Map campus identifiers against a canonical set of
    functional needs
  • For each identifier, establish its key
    characteristics, including revocation,
    reassignment, privileges, and opacity
  • Shine a light on some of the shadowy
    underpinnings of middleware
  • A key first step towards the loftier middleware
    goals
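The mapping exercise amounts to filling in a small record for each campus identifier. The sketch below shows one possible shape for such a record; the `IdentifierProfile` fields follow the characteristics named on the preceding slides, but the sample values are purely illustrative assumptions, since each campus has to determine its own answers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentifierProfile:
    name: str
    issuer: str         # domain: who issues the identifier
    revocable: bool     # can the subject ever be given a different value?
    reassignable: bool  # can the value ever be given to another subject?
    opaque: bool        # is the subject hard to deduce from the value?


# Illustrative values only, not findings about any real campus.
profiles = [
    IdentifierProfile("Net ID", "central IT", revocable=True, reassignable=False, opaque=False),
    IdentifierProfile("Email address", "central IT", revocable=True, reassignable=True, opaque=False),
    IdentifierProfile("Person registry ID", "registry", revocable=False, reassignable=False, opaque=True),
]


def suitable_as_persistent_key(profile):
    """A persistent join key should never change and never be reused."""
    return not profile.revocable and not profile.reassignable


persistent = [p.name for p in profiles if suitable_as_persistent_key(p)]
print(persistent)  # ['Person registry ID']
```

Recording the answers in a structured form makes it straightforward to ask questions such as which identifiers are safe to use as long-lived keys.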

22
Identifier Mapping Template
  • Model Identifier Mapping and examples
  • http://middleware.internet2.edu/earlyadopters/identifier-mappings/

23
Stage 1: Analyze Data Sources
  • Common Identifiers on campus
  • Identify systems of record and data
    owners/managers
  • Determine data and data access needed
  • Determine frequency of the feed/updates
  • Provide Standard Data Collection Model
  • Define database load procedure and produce audit
    log

24
Cross Organizational Data Sharing
  • Information gathering across silos
  • What are the systems of record? The
    authoritative source of the data?
  • Who are the owners/stewards/managers?
  • Centralized vs Distributed
  • Environment
  • Cooperative vs Competitive
  • Uncovering skeletons
  • Normalizing the data

25
Systems of Record
  • Data (e.g., names, addresses) exist in multiple
    systems: which is authoritative?
  • An individual can have several roles: which is
    primary?
  • Student and alum
  • Student and staff/teaching assistant
  • How is maintenance, especially purge process,
    handled?

26
Data Stewards/Managers
  • Registrar
  • Human Resources
  • Alumni Records
  • Library Records
  • Schools and Colleges
  • Telecommunications
  • Potentially, many others

27
Requires Education and Communication with Data
Stewards/Managers
  • Need to understand data as a resource
  • Need to understand the concept of authoritative
    data and be willing to collaborate
  • Need to understand the value of data sharing and
    appropriate access
  • Need to be reassured that proper security/privacy
    practices are being adhered to

28
Institutional Environment Impact
  • Public vs. Private Institutions
  • Institutional Vision vs. Local Control
  • Change Readiness
  • Strategic vs. Tactical Planning
  • Role of IT
  • Policy and Legal Constraints
  • Resource Determination/Allocation

29
Institutional Environment: Organizational
Culture/Structure
  • Competitive or collaborative
  • Challenges ownership
  • Can feel disenfranchised
  • Anticipate clear needs and keep everyone on the
    same page: educate and communicate
  • Willingness to change
  • Technical infrastructure
  • Formally or informally, organizational structure
    may need to change too

30
Institutional Environment: Policy and Legal
Constraints
  • Ownership of Data
  • Is data stewardship well-defined?
  • Is it centralized or distributed?
  • Access to Data
  • Formally or loosely governed?
  • Access authority centralized or distributed?
  • Data Administration
  • Centrally managed or distributed?
  • FERPA and HIPAA compliant?

31
Data Administration
  • Definition: the development and application of
    formal rules and methods to the management of an
    institution's data resource
  • Management of any resource: establish policy and
    procedures and monitor compliance

32
University of Michigan: Data Resource Management
Policy
  • Institutional data resource is a University asset
  • Data resource will be safeguarded/protected
  • Data will be shared based on institutional
    policies
  • Data will be managed as an institutional resource
  • Institutional data will be identified and defined
  • Databases will be developed based on functional
    needs
  • Information quality will be actively managed

33
University of Michigan Data Resource Guidelines
  • Defines data management roles
  • Introduces concept of Institutional Database
  • Provides guidelines for collection and
    maintenance; validation and correction;
    manipulation, modification, and reporting;
    security; access; data availability and
    integration; and documentation (includes data
    definitions and level of security)

34
University of Michigan: Data Administration
  • Philosophy: the value of data as an
    institutional resource is increased through
    widespread and appropriate use; the value is
    diminished through misuse, misinterpretation, or
    unnecessary restriction.
  • University owns the data, stewardship is
    identified and maintained

35
Without Data Administration and/or High-Level
Exec Sponsorship
  • the burden of data manager and data source
    identification and negotiation often falls to IT
    leadership
  • requires lead time, energy, communication and
    negotiation skills, and continual education and
    communication

36
Approach
  • Dependent on institutional environment
  • Dependent on drivers
  • Dependent on project methods (often related to
    environment)
  • Campus strategic project
  • Application requirement
  • Stealth

37
Primary Tasks to be Completed
  • Select attributes/data for inclusion
  • Negotiate for access to data
  • Determine data access policy
  • Develop familiarity with semantics of desired
    data elements
  • Develop familiarity with business processes that
    maintain them
  • Define database load procedure, with standard
    feeds, and produce audit log
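The last task, a load procedure that consumes a standard feed and produces an audit log, can be sketched as follows. This is a minimal sketch under stated assumptions: the feed is CSV with an `id` column, the directory is an in-memory dictionary, and the log entry fields are invented for the example. A real metadirectory process would write to a directory server and durable log storage.

```python
import csv
import io
from datetime import datetime, timezone


def load_feed(feed_text, directory, source, audit_log):
    """Apply a CSV feed keyed by 'id' to the directory, logging each change."""
    for row in csv.DictReader(io.StringIO(feed_text)):
        person_id = row.pop("id")
        entry = directory.setdefault(person_id, {})
        for attribute, new_value in row.items():
            old_value = entry.get(attribute)
            if old_value != new_value:
                entry[attribute] = new_value
                audit_log.append({
                    "when": datetime.now(timezone.utc).isoformat(),
                    "source": source,
                    "id": person_id,
                    "attribute": attribute,
                    "old": old_value,
                    "new": new_value,
                })


directory = {}
audit_log = []
feed = "id,name,affiliation\n1001,Sam Lee,student\n"
load_feed(feed, directory, "registrar", audit_log)
load_feed(feed, directory, "registrar", audit_log)  # unchanged data adds no entries
print(len(audit_log))  # 2
```

Logging only actual changes, with the old value, new value, and source, gives data stewards a record they can review when questions about accountability come up.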

38
What Data is Needed?
  • The object classes/schema and source data to
    populate directories are determined by the
    applications to be directory enabled.
  • Common initial or early applications include
    white pages and email routing, which require:
  • identifiers
  • directory information (name, addresses, phone
    numbers, email addresses, etc.) - found in standard
    directory schemas such as inetOrgPerson
  • eduPerson attributes

39
Good Practices for Attributes
  • Use standard schemas: inetOrgPerson, eduPerson,
    localPerson
  • Never repurpose fields defined in standards
    (RFC-defined). Add new attributes instead - adding
    attributes is easier than often thought

40
eduPerson
  • A directory object class intended to support
    inter-institutional applications
  • Fills gaps in traditional directory schema
  • For existing attributes, states good practices
    where known
  • Specifies several new attributes and controlled
    vocabulary to use as values
  • Provides suggestions on how to assign values, but
    leaves it to the institution to choose
  • Latest version released with NMI components in
    October, 2002

41
Upper Class Attributes Issues
  • eduPerson inherits attributes from Person,
    inetOrgPerson
  • Some of those attributes need conventions about
    controlled vocabulary (e.g. telephones)
  • Some of those attributes need ambiguity resolved
    via a consistent interpretation (e.g. email
    address)
  • Some of the attributes need standards around
    indexing and search (e.g. compound surnames)
  • Many of those attributes need access control and
    privacy decisions (e.g. JPEG photo, email
    address, etc.)

42
eduPerson Attributes
  • eduPersonAffiliation
  • eduPersonEntitlement
  • eduPersonNickname
  • eduPersonOrgDN
  • eduPersonOrgUnitDN
  • eduPersonPrimaryAffiliation
  • eduPersonPrimaryOrgUnitDN
  • eduPersonPrincipalName

43
eduPersonAffiliation
  • Multi-valued list of relationships an individual
    has with institution
  • Controlled vocabulary includes faculty, staff,
    student, alum, member, affiliate, employee
  • Applications that use it: Shibboleth, digital
    libraries, Directory of Directories for Higher
    Education
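Because the attribute is multi-valued and drawn from a controlled vocabulary, feeds can be checked before they reach the directory. The sketch below uses the vocabulary exactly as listed on this slide; the function name and the normalization rules (trimming whitespace, lowercasing) are assumptions for illustration.

```python
# Controlled vocabulary as listed on this slide.
AFFILIATION_VOCABULARY = {
    "faculty", "staff", "student", "alum", "member", "affiliate", "employee",
}


def validate_affiliations(values):
    """Return normalized, de-duplicated values; reject anything off-vocabulary."""
    normalized = {value.strip().lower() for value in values}
    unknown = normalized - AFFILIATION_VOCABULARY
    if unknown:
        raise ValueError(f"not in controlled vocabulary: {sorted(unknown)}")
    return sorted(normalized)


print(validate_affiliations(["Student", "staff ", "student"]))  # ['staff', 'student']
```

Rejecting unknown values at load time keeps local variants from leaking into an attribute that other institutions are expected to interpret.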

44
eduPersonPrimaryAffiliation
  • Single-valued attribute that would be the status
    put on a name badge at a conference
  • Controlled vocabulary includes faculty, staff,
    student, alum, member, affiliate
  • Determined by institutional business rules
  • Applications that use it: white pages, restricted
    access sites
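A business rule for choosing the single primary value can be as simple as an ordered preference list. The ordering in `PRIMARY_PRECEDENCE` below is a made-up example, not a recommendation; as the slide notes, each institution decides its own rules.

```python
# Hypothetical precedence; real ordering is an institutional business decision.
PRIMARY_PRECEDENCE = ["faculty", "staff", "student", "alum", "affiliate", "member"]


def primary_affiliation(affiliations):
    """Pick the single primary value from a person's list of affiliations."""
    for candidate in PRIMARY_PRECEDENCE:
        if candidate in affiliations:
            return candidate
    return None


print(primary_affiliation(["student", "staff", "member"]))  # staff
```

A teaching assistant who is both student and staff would be shown as staff under this ordering; a campus that prefers the student role would simply reorder the list.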

45
Strategies
  • Executive Dictate (overt or stealth)
  • Data Administration
  • Fully functioning unit or philosophy itself
  • Data managers committee
  • Education/communication/negotiation
  • Data administration concepts
  • Vignettes/scenarios (relevant to data manager)
  • Institutional drivers (external, internal, apps)
  • Case studies from other universities
  • NMI/Internet2 materials

46
Key Planning Recommendations
  • Understand the institutional environment,
    including data policies and business rules, and
    the value of the enterprise directory to your
    institution
  • Build in time to collect and map/resolve
    identifiers
  • Allow considerable time upfront to work
    with/educate data stewards, possibly developing
    policy
  • Think standards
  • Be prepared for political wounds from the
    possible reduction of duchies in data and
    policies
  • Give priority to both education and communication
    plans (continual and consistent)

47
Strategies You Used?
  • Discussion
  • Questions

48
More Information
  • Middleware
  • http://middleware.internet2.edu
  • http://www.nmi-edit.org
  • My contact information
  • Rwfrost_at_internet2.edu