Collection of general data mining briefings

Provided by: chrisc8

1
Building Trustworthy Semantic Webs
Lecture 5: XML and XML Security
Dr. Bhavani Thuraisingham
September 2006

2
Objective of the Unit
  • This unit will provide an overview of XML and
    then discuss some security issues

3
Outline of the Unit
  • XML Elements
  • XML Attributes
  • XML DTD
  • XML Schema
  • XML Namespaces
  • Federations
  • Policy/Credential
  • Access Control
  • Third Party Publication
  • XML Databases
  • Inference Control

4
What is XML all about?
  • XML is needed due to the limitations of HTML and
    complexities of SGML
  • It is an extensible markup language specified by
    the W3C (World Wide Web Consortium)
  • Designed to make the interchange of structured
    documents over the Internet easier
  • The key to XML used to be Document Type Definitions
    (DTDs)
  • A DTD defines the role of each element of text in a
    formal model
  • XML schemas have now become critical for specifying
    document structure
  • XML schemas are also XML documents

5
XML Elements
XML statement: John Smith is a Professor in Texas.
This can be expressed as follows:
  <Professor>
    <name> John Smith </name>
    <state> Texas </state>
  </Professor>
6
XML Elements
Now suppose this data can be read by anyone. Then
we can augment the XML statement with an additional
element called access, as follows:
  <Professor>
    <name> John Smith </name>
    <state> Texas </state>
    <access> All, Read </access>
  </Professor>
7
XML Elements
If only HR can update this XML statement, then we
have the following:
  <Professor>
    <name> John Smith </name>
    <state> Texas </state>
    <access> HR department, Write </access>
  </Professor>
8
XML Elements
We may not wish for everyone to know that John
Smith is a professor, but we can give out the
information that this professor is in Texas.
This can be expressed as:
  <Professor>
    <name> John Smith, Govt-official, Read </name>
    <state> Texas, All, Read </state>
    <access> HR department, Write </access>
  </Professor>
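
The element-level annotations above can be read mechanically. The following is a minimal Python sketch, not the lecture's implementation. It assumes each data element's text has the layout "value, subject, privilege" as on this slide; the function name readable_fields and the subject "Student" are hypothetical.

```python
import xml.etree.ElementTree as ET

# The annotated statement from the slide, written as well-formed XML.
DOC = """
<Professor>
  <name> John Smith, Govt-official, Read </name>
  <state> Texas, All, Read </state>
  <access> HR department, Write </access>
</Professor>
"""

def readable_fields(xml_text, subject):
    """Return {tag: value} for the child elements that `subject` may read.

    Assumes each data element's text is "value, subject, privilege".
    """
    root = ET.fromstring(xml_text.strip())
    view = {}
    for child in root:
        if child.tag == "access":
            continue  # document-level update rule, not a data field
        parts = [p.strip() for p in (child.text or "").split(",")]
        if len(parts) != 3:
            continue  # skip anything that does not follow the assumed layout
        value, who, privilege = parts
        if privilege == "Read" and who in ("All", subject):
            view[child.tag] = value
    return view

print(readable_fields(DOC, "Govt-official"))
print(readable_fields(DOC, "Student"))
```

A government official sees both the name and the state, while any other subject sees only the state.
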
9
XML Attributes
Suppose we want to specify access based on
attribute values. One way to specify such access
is given below:
  <Professor
    Name = "John Smith", Access = All, Read
    Salary = "60K", Access = Administrator, Read, Write
    Department = "Security", Access = All, Read
  </Professor>
Here we assume that everyone can read the Name (John
Smith) and the Department (Security), but only the
administrator can read and write the Salary
attribute.
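
The attribute-level rules above can be sketched in Python as follows. This is an illustration under assumptions: the element is rewritten as well-formed XML, and the access rules are kept in a separate table (ATTRIBUTE_ACL, a hypothetical name) rather than inline as on the slide.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the slide's element (an assumption of this sketch).
ELEMENT = '<Professor Name="John Smith" Salary="60K" Department="Security"/>'

# attribute -> {subject: privileges}; mirrors the rules stated on the slide.
ATTRIBUTE_ACL = {
    "Name": {"All": {"Read"}},
    "Salary": {"Administrator": {"Read", "Write"}},
    "Department": {"All": {"Read"}},
}

def visible_attributes(xml_text, subject):
    """Return the attributes of the element that `subject` may read."""
    elem = ET.fromstring(xml_text)
    visible = {}
    for attr, value in elem.attrib.items():
        rules = ATTRIBUTE_ACL.get(attr, {})
        granted = rules.get("All", set()) | rules.get(subject, set())
        if "Read" in granted:
            visible[attr] = value
    return visible

print(visible_attributes(ELEMENT, "Student"))
print(visible_attributes(ELEMENT, "Administrator"))
```

Attributes with no entry in the table are hidden by default, which is a design choice of this sketch rather than something the slide states.
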
10
XML DTD
DTDs essentially specify the structure of XML
documents. Consider the following DTD for
Professor with elements name and state. This
will be specified as:
  <!ELEMENT Professor (name, state)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT state (#PCDATA)>
  <!ELEMENT access (#PCDATA)>
11
XML Schema
While DTDs were the early attempts to specify
structure for XML documents, XML schemas are a far
more elegant way to specify structure. Unlike
DTDs, XML schemas use XML syntax for the
specification itself. Consider the following
example:
  <ComplexType name = "ProfessorType">
    <Sequence>
      <element name = "name" type = "string"/>
      <element name = "state" type = "string"/>
      <element name = "access" type = "string"/>
    </Sequence>
  </ComplexType>
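
To illustrate what a sequence declaration constrains, here is a minimal Python sketch that checks whether an element's children appear in the declared order. It is a hand-written check, not a real DTD or XML Schema validator; PROFESSOR_TYPE and matches_sequence are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Child element names in the order declared by ProfessorType above.
PROFESSOR_TYPE = ["name", "state", "access"]

def matches_sequence(xml_text, expected_tags):
    """True if the root's children are exactly `expected_tags`, in order."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == expected_tags

VALID = "<Professor><name>John Smith</name><state>Texas</state><access>All, Read</access></Professor>"
INVALID = "<Professor><state>Texas</state><name>John Smith</name></Professor>"

print(matches_sequence(VALID, PROFESSOR_TYPE))
print(matches_sequence(INVALID, PROFESSOR_TYPE))
```
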
12
XML Namespaces
Namespaces are used for DISAMBIGUATION.
  <CountryX: Academic-Institution
    Xmlns: CountryX = "http://www.CountryX.edu/Institution DTD"
    Xmlns: USA = "http://www.USA.edu/Institution DTD"
    Xmlns: UK = "http://www.UK.edu/Institution DTD">
    <USA: Title = College
     USA: Name = "University of Texas at Dallas"
     USA: State = Texas>
    <UK: Title = University
     UK: Name = "Cambridge University"
     UK: State = Cambs>
  </CountryX: Academic-Institution>
13
XML Namespaces
  <CountryX: Academic-Institution
    <Access> Government-official, Read </Access>
    Xmlns: CountryX = "http://www.CountryX.edu/Institution DTD"
    Xmlns: USA = "http://www.USA.edu/Institution DTD"
    Xmlns: UK = "http://www.UK.edu/Institution DTD">
    <USA: Title = College
     USA: Name = "University of Texas at Dallas"
     USA: State = Texas>
    <UK: Title = University
     UK: Name = "Cambridge University"
     UK: State = Cambs>
  </CountryX: Academic-Institution>
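
The disambiguation role of namespaces can be seen with Python's standard XML parser. The sketch below assumes a well-formed variant of the slide's example in which Title is a child element, and it assumes the namespace URIs shown; neither is taken verbatim from the lecture.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the slide's example (element layout is assumed).
DOC = """<CountryX:Academic-Institution
    xmlns:CountryX="http://www.CountryX.edu/Institution"
    xmlns:USA="http://www.USA.edu/Institution"
    xmlns:UK="http://www.UK.edu/Institution">
  <USA:Title>College</USA:Title>
  <UK:Title>University</UK:Title>
</CountryX:Academic-Institution>"""

# Prefix-to-URI map used when searching.
NS = {
    "USA": "http://www.USA.edu/Institution",
    "UK": "http://www.UK.edu/Institution",
}

root = ET.fromstring(DOC)
usa_title = root.find("USA:Title", NS)
uk_title = root.find("UK:Title", NS)

# Both elements are called Title, but their qualified names differ.
print(usa_title.tag, usa_title.text)
print(uk_title.tag, uk_title.text)
```

The parser expands each prefix to its URI, so the two Title elements never collide even though their local names are identical.
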
14
Federations/Distribution
Site 1 document:
  <Professor-name>
    <ID> 111 </ID>
    <name> John Smith </name>
    <state> Texas </state>
  </Professor-name>
Site 2 document:
  <Professor-salary>
    <ID> 111 </ID>
    <salary> 60K </salary>
  </Professor-salary>
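
In a federation, the two site documents can be combined on the shared ID. The Python sketch below shows one way this might look; merge_by_id is a hypothetical helper. It also hints at why distribution matters for security: name and salary are only linked once the documents are joined.

```python
import xml.etree.ElementTree as ET

SITE1 = "<Professor-name><ID>111</ID><name>John Smith</name><state>Texas</state></Professor-name>"
SITE2 = "<Professor-salary><ID>111</ID><salary>60K</salary></Professor-salary>"

def merge_by_id(doc_a, doc_b):
    """Join two site documents when their ID elements match, else None."""
    a, b = ET.fromstring(doc_a), ET.fromstring(doc_b)
    if a.findtext("ID").strip() != b.findtext("ID").strip():
        return None
    merged = {child.tag: child.text.strip() for child in a}
    merged.update({child.tag: child.text.strip() for child in b})
    return merged

print(merge_by_id(SITE1, SITE2))
```
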
15
XML Query
  • XML-QL, XQuery, etc. are query languages for XML
  • XPath is used for query specification
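
As a small illustration of path-based querying, the Python sketch below uses the limited XPath subset supported by the standard library's ElementTree. The Faculty wrapper and the second professor record are hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the first record comes from the slides.
DOC = """<Faculty>
  <Professor><name>John Smith</name><state>Texas</state></Professor>
  <Professor><name>Alice Brown</name><state>Ohio</state></Professor>
</Faculty>"""

root = ET.fromstring(DOC)

# Select the professors whose <state> child has the text 'Texas'.
texans = [p.findtext("name") for p in root.findall("./Professor[state='Texas']")]
print(texans)
```
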

16
Presentations of XML Documents
  • XSLT

17
Credentials in XML
  <Professor credID = "9" subID = "16" CIssuer = "2">
    <name> Alice Brown </name>
    <university> University of X </university>
    <department> CS </department>
    <research-group> Security </research-group>
  </Professor>
  <Secretary credID = "12" subID = "4" CIssuer = "2">
    <name> John James </name>
    <university> University of X </university>
    <department> CS </department>
    <level> Senior </level>
  </Secretary>
18
Policies in XML
  <?xml version = "1.0" encoding = "utf-8"?>
  <Policy-base>
    <policy-spec cred-expr = "//Professor[department = 'CS']"
        target = "annual_report.xml"
        path = "//Patent[@Dept = 'CS']//node()" priv = "VIEW"/>
    <policy-spec cred-expr = "//Professor[department = 'CS']"
        target = "annual_report.xml"
        path = "//Patent[@Dept = 'EE']/Short-descr/node() and
                //Patent[@Dept = 'EE']/authors" priv = "VIEW"/>
    <policy-spec cred-expr = - - - -
    <policy-spec cred-expr = - - - -
  </Policy-base>
Explanation: CS professors are entitled to access
all the patents of their department. They are
entitled to see only the short descriptions and
authors of patents of the EE department.
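
The effect of these two policy specifications can be sketched in Python. This is an illustration under assumptions, not the system described in the lecture: the report structure and its contents are hypothetical, the credential expression is reduced to a tag and department check, and the paths select child elements only, because ElementTree supports just a subset of XPath.

```python
import xml.etree.ElementTree as ET

# A credential like the Professor credential shown earlier.
CREDENTIAL = ET.fromstring(
    '<Professor credID="9" subID="16" CIssuer="2">'
    "<name>Alice Brown</name><department>CS</department></Professor>"
)

# Hypothetical stand-in for annual_report.xml.
REPORT = ET.fromstring(
    "<Report>"
    '<Patent Dept="CS"><Short-descr>cs-short</Short-descr><Details>cs-full</Details></Patent>'
    '<Patent Dept="EE"><Short-descr>ee-short</Short-descr><authors>ee-authors</authors>'
    "<Details>ee-full</Details></Patent>"
    "</Report>"
)

# (credential tag, department, path into the report); privilege is VIEW.
POLICIES = [
    ("Professor", "CS", ".//Patent[@Dept='CS']/*"),
    ("Professor", "CS", ".//Patent[@Dept='EE']/Short-descr"),
    ("Professor", "CS", ".//Patent[@Dept='EE']/authors"),
]

def viewable_text(credential, report):
    """Collect the text of every report node some policy lets the subject view."""
    allowed = []
    for tag, dept, path in POLICIES:
        if credential.tag == tag and credential.findtext("department") == dept:
            allowed.extend(node.text for node in report.findall(path))
    return allowed

print(viewable_text(CREDENTIAL, REPORT))
```

The CS professor sees everything under the CS patent but only the short description and authors of the EE patent; a subject whose credential matches no policy sees nothing.
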
19
Access Control Strategy
  • Subjects request access to XML documents under
    two modes: browsing and authoring
  • With browsing access, a subject can read/navigate
    documents
  • Authoring access is needed to modify, delete, or
    append to documents
  • Access control module checks the policy base and
    applies policy specs
  • Views of the document are created based on
    credentials and policy specs
  • In case of conflict, least access privilege rule
    is enforced
  • Works for Push/Pull modes
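
The least access privilege rule can be sketched in Python as picking the weakest of the conflicting grants. The ranking below, in which authoring is treated as stronger than browsing, is an assumption of this sketch, as are the names PRIVILEGE_RANK and resolve.

```python
# Assumed ordering: no access < browsing < authoring.
PRIVILEGE_RANK = {"none": 0, "browsing": 1, "authoring": 2}

def resolve(privileges):
    """Return the least privileged of the grants that apply to a request."""
    if not privileges:
        return "none"  # no applicable policy spec means no access
    return min(privileges, key=PRIVILEGE_RANK.__getitem__)

print(resolve(["authoring", "browsing"]))
print(resolve([]))
```
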

20
System Architecture for Access Control
[Figure: architecture diagram. Components: User, X-Access, X-Admin, Admin Tools, Credential base, Policy base, XML Documents. Flows between User and X-Access: Pull/Query, Push/result.]
21
Third-Party Architecture
  • The Owner is the producer of information. It
    specifies access control policies
  • The Publisher is responsible for managing (a
    portion of) the Owner's information and answering
    subject queries
  • Goal: support an untrusted Publisher by letting
    subjects check the authenticity and completeness
    of replies

[Figure: third-party publication diagram. Components: Owner (XML Source, policy base, Credential base), Publisher (SE-XML), User/Subject. Flows: credentials, Query, Reply document.]
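
One common building block for authenticity checking with an untrusted publisher is a Merkle-style hash over the document tree. The Python sketch below is an assumption about how such a check could look, not the lecture's mechanism. In a full scheme the Owner would sign the root hash and the subject would verify that signature; the signing step is omitted here.

```python
import hashlib
import xml.etree.ElementTree as ET

def _feed(h, text):
    """Add a length-prefixed string so adjacent fields cannot be confused."""
    data = text.encode("utf-8")
    h.update(len(data).to_bytes(4, "big") + data)

def merkle_hash(elem):
    """Hash an element from its tag, text, attributes and children's hashes."""
    h = hashlib.sha256()
    _feed(h, elem.tag)
    _feed(h, (elem.text or "").strip())
    for key in sorted(elem.attrib):
        _feed(h, key)
        _feed(h, elem.attrib[key])
    for child in elem:
        h.update(merkle_hash(child))
    return h.digest()

ORIGINAL = "<Professor><name>John Smith</name><state>Texas</state></Professor>"
TAMPERED = "<Professor><name>John Smith</name><state>Utah</state></Professor>"

owner_hash = merkle_hash(ET.fromstring(ORIGINAL))
print(owner_hash == merkle_hash(ET.fromstring(ORIGINAL)))
print(owner_hash == merkle_hash(ET.fromstring(TAMPERED)))
```

Any change to a tag, text value, or attribute anywhere in the tree changes the root hash, so a reply altered by the Publisher no longer matches the value the Owner vouched for.
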
22
XML Databases
  • Data is presented as XML documents
  • Query language: XML-QL
  • Query optimization
  • Managing transactions on XML documents
  • Metadata management: XML schemas/DTDs
  • Access methods and index strategies
  • XML security and integrity management

23
Inference/Privacy Control
[Figure: layered architecture. Interface to the Semantic Web; Inference Engine/Rules Processor (technology by UTD); Policies, Ontologies, Rules; XML Database; XML Documents, Web Pages, Databases.]
24
Example Policies
  • Temporal Access Control
      • After 1/1/05, only doctors have access to
        medical records
  • Role-based Access Control
      • Manager has access to salary information
      • Project leader has access to project budgets,
        but does not have access to salary information
      • What happens if the manager is also the project
        leader?
  • Positive and Negative Authorizations
      • John has write access to EMP
      • John does not have read access to DEPT
      • John does not have write access to the Salary
        attribute in EMP
      • How are conflicts resolved?
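
The role conflict raised above can be made concrete with a short Python sketch. It contrasts two familiar resolution strategies, denials take precedence and permissions take precedence. The slides pose the question without choosing a strategy, so the default below and the names GRANTS, DENIALS, and can_access are assumptions.

```python
# Positive and negative authorizations from the role-based example.
GRANTS = {("manager", "salary"), ("project_leader", "budget")}
DENIALS = {("project_leader", "salary")}

def can_access(roles, resource, denials_win=True):
    """Decide access for a subject holding several roles.

    When a grant and a denial both apply, `denials_win` picks the strategy.
    """
    granted = any((role, resource) in GRANTS for role in roles)
    denied = any((role, resource) in DENIALS for role in roles)
    if granted and denied:
        return not denials_win
    return granted

both = {"manager", "project_leader"}
print(can_access(both, "salary"))
print(can_access(both, "salary", denials_win=False))
```

With denials taking precedence, a manager who is also a project leader loses access to salary information; with permissions taking precedence, the manager role prevails.
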

25
Privacy Policies
  • Privacy constraints processing
  • Simple constraint: an attribute of a document is
    private
  • Content-based constraint: if a document contains
    information about X, then it is private
  • Association-based constraint: two or more
    documents taken together are private, while
    individually each document is public
  • Release constraint: after X is released, Y becomes
    private
  • Augment a database system with a privacy
    controller for constraint processing
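
An association-based constraint can be sketched as a check a privacy controller might run before each release. This is a minimal Python illustration; the document names and the helper may_release are hypothetical.

```python
# Each set lists documents that are private when held together.
ASSOCIATION_PRIVATE = [{"names.xml", "salaries.xml"}]

def may_release(requested, already_released):
    """Allow a release unless it would complete a private combination."""
    combined = set(already_released) | {requested}
    return not any(group <= combined for group in ASSOCIATION_PRIVATE)

print(may_release("names.xml", []))
print(may_release("salaries.xml", ["names.xml"]))
```

Because the decision depends on what has already been released, the same check also captures the flavor of a release constraint: once names.xml is out, salaries.xml becomes private for that subject.
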

26
Summary and Directions
  • XML is widely used
  • Securing XML documents is a challenge
  • How can we specify the policies discussed in this
    unit in XML?
  • How can query modification be carried out for XML
    documents?
  • Design access control for XML databases