Dr. Bhavani Thuraisingham The University of Texas at Dallas (UTD) November 2013 - PowerPoint PPT Presentation


Transcript and Presenter's Notes

1
Dr. Bhavani Thuraisingham
The University of Texas at Dallas (UTD)
November 2013
  • Assured Cloud Computing for Assured Information
    Sharing

2
Team Members
  • Sponsor: Air Force Office of Scientific Research
  • The University of Texas at Dallas
  • Dr. Murat Kantarcioglu, Dr. Latifur Khan, Dr.
    Kevin Hamlen, Dr. Zhiqiang Lin, Dr. Kamil Sarac
  • Sub-contractors
  • Prof. Elisa Bertino (Purdue)
  • Ms. Anita Miller, Dr. Bob Johnson (North Texas
    Fusion Center)
  • Collaborators
  • The late Dr. Steve Barker, King's College,
    University of London (EOARD)
  • Dr. Barbara Carminati, Dr. Elena Ferrari,
    University of Insubria (EOARD)

3
Outline
  • Objectives
  • Assured Information Sharing
  • Layered Framework
  • Our Research
  • Education
  • Acknowledgement
  • Research Funded by Air Force Office of Scientific
    Research
  • Education funded by the National Science
    Foundation

4
Objectives
  • Cloud computing is an example of computing in
    which dynamically scalable and often virtualized
    resources are provided as a service over the
    Internet. Users need not have knowledge of,
    expertise in, or control over the technology
    infrastructure in the "cloud" that supports them.
  • Our research on cloud computing is based on
    Hadoop, MapReduce, and Xen
  • Apache Hadoop is a Java software framework that
    supports data intensive distributed applications
    under a free license. It enables applications to
    work with thousands of nodes and petabytes of
    data. Hadoop was inspired by Google's MapReduce
    and Google File System (GFS) papers.
  • Xen is a virtual machine monitor developed at the
    University of Cambridge, England
  • Our goal is to build a secure cloud
    infrastructure for assured information sharing
    applications

5
Information Operations Across Infospheres
Assured Information Sharing
  • Objectives
  • Develop a Framework for Secure and Timely Data
    Sharing across Infospheres
  • Investigate Access Control and Usage Control
    policies for Secure Data Sharing
  • Develop innovative techniques for extracting
    information from trustworthy, semi-trustworthy
    and untrustworthy partners
  • Budget (FY06-8): AFOSR 300K, State Match 150K

[Figure: components holding the data/policy for Agency A, Agency B, and Agency C each publish data/policy to the shared data/policy for the coalition.]

  • Scientific/Technical Approach
  • Conduct experiments as to how much information is
    lost as a result of enforcing security policies
    in the case of trustworthy partners
  • Develop more sophisticated policies based on
    role-based and usage control based access
    control models
  • Develop techniques based on game theoretical
    strategies to handle partners who are
    semi-trustworthy
  • Develop data mining techniques to carry out
    defensive and offensive information operations
  • Accomplishments
  • Developed an experimental system for
    determining information loss due to security
    policy enforcement
  • Developed a strategy for applying game theory to
    semi-trustworthy partners, with simulation results
  • Developed data mining techniques for conducting
    defensive operations for untrustworthy partners
  • Challenges
  • Handling dynamically changing trust levels;
    scalability

6
Architecture 2005-2008
[Figure: components holding the data/policy for Agency A, Agency B, and Agency C each export data/policy to the shared data/policy for the coalition. Partners are classed as trustworthy, semi-trustworthy, or untrustworthy.]

7
Our Approach
  • Trustworthy partners: integrate the Medicaid
    claims data and mine the data; next, enforce
    policies and determine how much information has
    been lost. Prototype system; application of
    Semantic Web technologies
  • Apply game theory and probing to extract
    information from semi-trustworthy partners
  • Conduct active defence and determine the actions
    of an untrustworthy partner
  • Defend ourselves from our partners using data
    mining techniques
  • Conduct active defence: find out what our
    partners are doing by monitoring them, so that we
    can defend ourselves in dynamic situations
  • Trust for peer-to-peer networks (infrastructure
    security)

8
Policy Enforcement Prototype
Dr. Mamoun Awad (postdoc) and students
[Figure: policy enforcement prototype for the coalition]

9
Game Theory for Assured Information Sharing
  • Game theory studies such interactions through
    mathematical representations of gain
  • Each party is considered a player
  • The information they gain from each other is
    considered a payoff
  • The scenario is treated as a finite repeated game
  • Information is exchanged in discrete chunks each
    round
  • The situation terminates at a finite yet
    unforeseeable point in the future
  • The actions within the game are to either lie or
    tell the truth
  • Our goal: all players conclude that telling the
    truth is the best option
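
A minimal sketch of why truth-telling can become the best option in such a repeated game. Everything in it is an illustrative assumption rather than the authors' model: the payoff values, the continuation probability `delta` (standing in for the unforeseeable end of the game), and the grim-trigger response in which a deceived partner lies in every later round.

```python
# Hypothetical one-round payoffs (amount of information gained).
TEMPTATION = 5.0   # I lie while my partner tells the truth
REWARD = 3.0       # both players tell the truth
PUNISHMENT = 1.0   # both players lie


def value_of_truth(delta: float) -> float:
    """Expected total payoff when both players always tell the truth.

    delta (0 <= delta < 1) is the probability that another round follows,
    which models a game that ends at an unforeseeable point.
    """
    return REWARD / (1.0 - delta)


def value_of_lying(delta: float) -> float:
    """Expected total payoff of lying once against a partner who then
    lies in every later round (an assumed grim-trigger response)."""
    return TEMPTATION + delta * PUNISHMENT / (1.0 - delta)


def truth_is_best(delta: float) -> bool:
    """True when always telling the truth pays at least as much as lying."""
    return value_of_truth(delta) >= value_of_lying(delta)


if __name__ == "__main__":
    for delta in (0.2, 0.5, 0.9):
        print(delta, truth_is_best(delta))
```

With these assumed payoffs, lying wins only when the game is likely to end soon. Once further rounds are likely enough, the one-time gain from lying is outweighed by the information lost afterwards.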

10
Incentive Issues in Assured Information Sharing
DoD MURI Project 2008 - 2013, AFOSR
  • Motivation
  • Misaligned incentives could be a significant
    problem in Information Security
  • Software bugs vs. software companies' incentives
  • Incentive issues in information sharing have been
    explored to some extent
  • Incentive issues in file sharing p2p networks
  • Assured information sharing creates new
    challenges
  • Security considerations vs. utility
  • Technical Approach
  • Verify that the other participants do not lie
    about their data
  • If the data is revealed as it is
  • Trust but verify (our initial results: DKE 08
    paper)
  • If the data is not revealed (e.g., SMC techniques
    are used)
  • Non-cooperative computing
  • Mechanism design
  • SMC with rational adversaries

11
Layered Framework for Assured Cloud Computing
12
Secure Query Processing with Hadoop/MapReduce
  • We have studied clouds based on Hadoop
  • Query rewriting and optimization techniques
    designed and implemented for two types of data
  • (i) Relational data: secure query processing with
    Hive
  • (ii) RDF data: secure query processing with
    SPARQL
  • Demonstrated with XACML policies
  • Joint demonstration with King's College and
    University of Insubria
  • First demo (2011): each party submits their data
    and policies
  • Our cloud will manage the data and policies
  • Second demo (2012): multiple clouds

13
Fine-grained Access Control with Hive
System Architecture
  • Table/view definition and loading
  • Users can create tables as well as load data into
    tables. Further, they can also upload XACML
    policies for the table they are creating.
  • Users can also create XACML policies for
    tables/views.
  • Users can define views only if they have
    permissions for all tables specified in the query
    used to create the view. They can also either
    specify or create XACML policies for the views
    they are defining.
  • CollaborateCom 2010
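
The view-definition rule above can be sketched as a simple check. This is only an illustration: the dictionary of per-user table permissions stands in for the XACML policies the system actually evaluates, and the function, user, and table names are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical permission store: user -> tables that user may access.
Permissions = Dict[str, Set[str]]


def can_define_view(user: str, source_tables: Iterable[str],
                    permissions: Permissions) -> bool:
    """Allow a view only if the user has permission on every table
    referenced by the query that defines the view."""
    granted = permissions.get(user, set())
    return all(table in granted for table in source_tables)


if __name__ == "__main__":
    perms = {"alice": {"claims", "providers"}, "bob": {"claims"}}
    print(can_define_view("alice", ["claims", "providers"], perms))
    print(can_define_view("bob", ["claims", "providers"], perms))
```

In this sketch a user with no recorded permissions is denied by default, which seems the safer choice for a policy check.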

14
SPARQL Query Optimizer for Secure RDF Data
Processing
  • Build an efficient storage mechanism using
    Hadoop for large amounts of data (e.g., a billion
    triples)
  • Build an efficient query mechanism for data
    stored in Hadoop
  • Integrate with Jena
  • Developed a query optimizer and query rewriting
    techniques for RDF data with XACML policies,
    implemented on top of Jena
  • IEEE Transactions on Knowledge and Data
    Engineering, 2011

[Figure: a web interface sends new data and queries to the server backend and receives answers.]

15
Demonstration Concept of Operation
[Figure: Agency 1, Agency 2, ..., Agency n connect through the user interface layer. Relational data is handled by fine-grained access control with Hive; RDF data is handled by the SPARQL query optimizer for secure RDF data processing.]

16
RDF-Based Policy Engine
[Figure: interface to the Semantic Web (technology by UT Dallas); inference engine/rules processor (e.g., Pellet); policies, ontologies, and rules in RDF; Jena RDF engine; RDF documents.]

17
RDF-based Policy Engine on the Cloud
  • Determine how access is granted to a resource as
    well as how a document is shared
  • Users specify policies, e.g., access control,
    redaction, and release policies
  • Parse a high-level policy to a low-level
    representation
  • Support Graph operations and visualization.
    Policy executed as graph operations
  • Execute policies as SPARQL queries over large RDF
    graphs on Hadoop
  • Support for policies over Traditional data and
    its provenance
  • IFIP Data and Applications Security, 2010, ACM
    SACMAT 2011

A testbed for evaluating different policy sets
over different data representations. It also
supports provenance as a directed graph and
viewing policy outcomes graphically.

18
Integration with Assured Information Sharing
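[Figure: Agency 1, Agency 2, ..., Agency n submit SPARQL queries and RDF data and policies through the user interface layer. Below it are the policy translation and transformation layer, the RDF data preprocessor, the MapReduce framework for query processing, and Hadoop HDFS; the result is returned to the agencies.]
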
19
Architecture
20
Key Feature 1: Policy Reciprocity
  • Agency 1 wishes to share its resources if Agency
    2 also shares its resources with it
  • Use our combined policies
  • Allow agents to define policies based on
    reciprocity and mutual interest amongst
    cooperating agencies
  • SPARQL query:
    SELECT B
    FROM NAMED uri1 FROM NAMED uri2
    WHERE P
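
The reciprocity rule can be illustrated with a small sketch. It does not reproduce the combined-policy SPARQL query from the slide; the set of sharing offers, the agency identifiers, and the function name are hypothetical stand-ins.

```python
from typing import Set, Tuple

# Hypothetical sharing declarations: (owner, requester) means the owner
# is willing to share its resources with the requester.
Offers = Set[Tuple[str, str]]


def reciprocal_access(owner: str, requester: str, offers: Offers) -> bool:
    """Grant access only when both agencies share with each other."""
    return (owner, requester) in offers and (requester, owner) in offers


if __name__ == "__main__":
    offers = {("agency1", "agency2"), ("agency2", "agency1"),
              ("agency1", "agency3")}
    print(reciprocal_access("agency1", "agency2", offers))
    print(reciprocal_access("agency1", "agency3", offers))
```

In the described system the two sides of the check would presumably come from the two named graphs (`uri1` and `uri2`), one per agency; that mapping is an assumption.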

21
Key Feature 2: Develop and Scale Policies
  • Agency 1 wishes to extend its existing policies
    with support for constructing policies at a finer
    granularity.
  • The Policy engine
  • Policy interface that should be implemented by
    all policies
  • Add newer types of policies as needed
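
One way to picture a common policy interface is the sketch below. The class names, the `permits` method, and the request fields are hypothetical and not taken from the policy engine itself. The point is only that the engine depends on the interface, so a finer-grained policy type can be added without changing the engine.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Policy(ABC):
    """Hypothetical common interface that every policy type implements."""

    @abstractmethod
    def permits(self, request: Dict[str, str]) -> bool:
        """Return True if this policy allows the request."""


class RolePolicy(Policy):
    """Coarse-grained policy: allow a request by the requester's role."""

    def __init__(self, allowed_roles: List[str]) -> None:
        self.allowed_roles = set(allowed_roles)

    def permits(self, request: Dict[str, str]) -> bool:
        return request.get("role") in self.allowed_roles


class ResourcePolicy(Policy):
    """Finer-grained policy added later: restrict one named resource."""

    def __init__(self, resource: str, allowed_users: List[str]) -> None:
        self.resource = resource
        self.allowed_users = set(allowed_users)

    def permits(self, request: Dict[str, str]) -> bool:
        if request.get("resource") != self.resource:
            return True  # this policy does not apply to other resources
        return request.get("user") in self.allowed_users


def evaluate(policies: List[Policy], request: Dict[str, str]) -> bool:
    """The engine sees only the interface: every policy must permit."""
    return all(policy.permits(request) for policy in policies)


if __name__ == "__main__":
    policies = [RolePolicy(["analyst"]), ResourcePolicy("R2", ["alice"])]
    print(evaluate(policies, {"role": "analyst", "user": "bob", "resource": "R2"}))
```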

22
Key Feature 3: Justification of Resources
  • Agency 1 asks Agency 2 for a justification of
    resource R2
  • Policy engine
  • Allows agents to define policies over provenance
  • Agency 2 can provide the provenance to Agency 1
  • But protect it by using access control or
    redaction policies

23
Key Feature 4: Development Testbed
  • Policy framework provides three configurations
  • A standalone version for development and
    testing
  • A version backed by a relational database
  • A cloud-based version that achieves high
    availability and scalability while maintaining
    low setup and operation costs

24
Secure Storage and Query Processing in a Hybrid
Cloud
  • The use of hybrid clouds is an emerging trend in
    cloud computing
  • Ability to exploit public resources for high
    throughput
  • Yet, better able to control costs and data
    privacy
  • Several key challenges
  • Data design: how to store data in a hybrid cloud?
  • Solution must account for the data representation
    used (unencrypted/encrypted), public cloud
    monetary costs, and query workload characteristics
  • Query processing: how to execute a query over a
    hybrid cloud?
  • Solution must provide query rewrite rules that
    ensure the correctness of a generated query plan
    over the hybrid cloud

25
Hypervisor integrity and forensics in the Cloud
[Figure: layered stack. Applications run on guest OSes (Linux, Solaris, XP, MacOS) above the virtualization layer (Xen, vSphere) and the hypervisor, on top of the hardware layer. Labels: integrity, forensics, cloud integrity and forensics.]

  • Secure control flow of hypervisor code
  • Integrity via in-lined reference monitor
  • Forensics data extraction in the cloud
  • Multiple VMs
  • De-mapping (isolating) each VM's memory from
    physical memory

26
Cloud-based Malware Detection
Dr. Mehedy

27
Cloud-based Malware Detection
  • ACM Transactions on Management Information
    Systems
  • Binary feature extraction involves enumerating
    binary n-grams from the binaries and selecting
    the best n-grams based on information gain
  • For a training set of 3,500 executables, the
    number of distinct 6-grams can exceed 200
    million
  • On a single machine, this may take hours,
    depending on available computing resources; this
    is not acceptable for training from a stream of
    binaries
  • We use the cloud to overcome this bottleneck
  • A cloud MapReduce framework is used to extract
    and select features from each chunk
  • A 10-node cloud cluster is 10 times faster than a
    single node
  • Very effective in a dynamic framework, where
    malware characteristics change rapidly
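
The feature-selection step can be sketched in a single process that imitates a map phase (enumerate n-grams per binary) and a reduce phase (count and score them). This is an illustration under stated assumptions, not the Hadoop implementation from the talk; the function names and the toy byte strings are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple


def ngrams(data: bytes, n: int) -> Set[bytes]:
    """Map step: the distinct byte n-grams present in one executable."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def entropy(pos: int, neg: int) -> float:
    """Entropy in bits of a set with pos malicious and neg benign samples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            result -= p * math.log2(p)
    return result


def information_gain(samples: List[Tuple[bytes, bool]], n: int) -> Dict[bytes, float]:
    """Reduce step: count per-class presence of each n-gram, then score it."""
    total_pos = sum(1 for _, malicious in samples if malicious)
    total_neg = len(samples) - total_pos
    present_pos: Counter = Counter()
    present_neg: Counter = Counter()
    for data, malicious in samples:
        (present_pos if malicious else present_neg).update(ngrams(data, n))
    base = entropy(total_pos, total_neg)
    gains: Dict[bytes, float] = {}
    for gram in set(present_pos) | set(present_neg):
        with_pos, with_neg = present_pos[gram], present_neg[gram]
        without_pos, without_neg = total_pos - with_pos, total_neg - with_neg
        # Weighted entropy after splitting on "n-gram present or absent".
        remainder = ((with_pos + with_neg) * entropy(with_pos, with_neg)
                     + (without_pos + without_neg) * entropy(without_pos, without_neg)
                     ) / len(samples)
        gains[gram] = base - remainder
    return gains


def select_best(samples: List[Tuple[bytes, bool]], n: int, k: int) -> List[bytes]:
    """Keep the k n-grams with the highest information gain."""
    gains = information_gain(samples, n)
    return sorted(gains, key=lambda g: (-gains[g], g))[:k]


if __name__ == "__main__":
    toy = [(b"\x90\x90\xcc", True), (b"\x90\x90\xaa", True),
           (b"\x01\x02\x03", False), (b"\x01\x02\xaa", False)]
    print(select_best(toy, n=2, k=2))
```

In a MapReduce setting, `ngrams` would run per binary on the mappers and the counting in `information_gain` would be split across reducers by n-gram key. This is what makes the hundreds of millions of distinct 6-grams tractable.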

28
Identity Management Considerations in a Cloud
  • Trust model that handles
  • (i) various trust relationships, (ii) access
    control policies based on roles and attributes,
    (iii) real-time provisioning, (iv) authorization,
    and (v) auditing and accountability.
  • Several technologies have to be examined to
    develop the trust model
  • Service-oriented technologies, standards such as
    SAML and XACML, and identity management
    technologies such as OpenID.
  • Does one size fit all?
  • Can we develop a trust model that will be
    applicable to all types of clouds, such as private
    clouds, public clouds, and hybrid clouds?
  • Identity architecture has to be integrated into
    the cloud architecture.

29
Education
  • NSF Capacity Building Grant on Assured Cloud
    Computing
  • Introduce cloud computing into several cyber
    security courses
  • Completed courses
  • Data and Applications Security
  • Data Storage
  • Digital Forensics
  • Secure Web Services
  • Computer and Information Security
  • Capstone Course
  • One course that covers all aspects of assured
    cloud computing
  • Week-long course to be given at Texas Southern
    University

30
Directions
  • Secure VMM (Virtual Machine Monitor) and VNM
    (Virtual Network Monitor)
  • Exploring XEN VMM and examining security issues
  • Developing automated techniques for VMM
    introspection
  • Examine VMM issues
  • Integrate Secure Storage Algorithms into Hadoop
  • Identity Management