
Title: SecureFiles

  • VLDB August 23 - 28, 2008
  • Database Storage Development
  • Oracle Corporation

A revolutionary technology for unstructured
(file) data storage, specifically engineered to
provide filesystem-like performance together with
advanced filesystem and database features, all
within the database server (released in 2007)
  • Preface
  • Introduction
  • Performance Proof Points
  • SecureFiles Showcase: NIF, Lawrence Livermore National Laboratory, USA
  • Architecture
  • Advanced Features

Enterprise Data Growth
  • Yearly Data Growth (IDC, Gartner)
  • Structured: 15-20%
  • Unstructured: 50-200%
  • Drivers for unstructured data growth
  • Increased digitization of content
  • Healthcare, Finance, Insurance, Banking
  • Compliance
  • Web 2.0
  • Scientific/Research Community
  • Storage, network and processor bandwidth
  • By 2010, enterprise data volumes are expected to
    reach multi petabytes ingested on hundreds of

Challenges for Near Future
  • As data volumes and ingestion rates step up,
    requirement arises for
  • Maximum storage throughput and scalability
  • Highest degree of robustness through atomicity,
    consistency, durability, security and
  • Scalable query ability of metadata and
  • Efficient storage utilization space and power
  • Effective storage lifecycle management

Why not Databases for Unstructured Data?
  • Current solution
  • Filesystems
  • preferred choice for unstructured data storage
  • Low performance and scalability of RDBMS is a
    major reason
  • RDBMS preferred choice for relational data
    accompanying files
  • Fragmented solution, however, is not a long-term

Consolidation Without Compromises
  • Lack of consolidation compromises security,
    robustness, and management
  • Disjoint security and auditing models
  • Differences in transaction semantics
  • Integrity and Consistency not guaranteed
  • Backup and recovery are fragmented
  • Storage management is complicated
  • Separate interfaces and protocols
  • Two data storage managers for one application is
    one too many as data volumes explode
  • Data Storage Integration Precursor to
    Information Integration
  • Need for consolidated industry-strength
    semi-structured data management solution that
    does not compromise on challenges

  • Jim Gray: "For less than 1 MB, DB faster than
    filesystem. Most things are less than 1 MB. DB
    should work to make this 10 MB. Filesystem should
    borrow ideas from DB." (FAST 2005)
  • David DeWitt, Objects and Databases in 2006: "We
    envision large enterprises reaping the benefits
    of families of products that offer ... A Fully
    Integrated Solution ... scalably and robustly" (VLDB)
  • Michael Stonebraker: "There have been some
    extensions over the years ... time has come for a
    complete rewrite" (VLDB 2007)

SecureFiles Consolidated Secure Management of
  • SecureFiles is a new Oracle 11g database server
    feature designed to break the performance barrier
    that has been keeping file data out of databases
  • Delivers comparable performance with respect to
    traditional filesystems for all file sizes
    without compromising on throughput and
  • Maximizes throughput to match underlying device,
    single instance multi-core systems as well
  • Scales from terabytes to petabytes on all storage
  • Enables consolidation of file data with
    associated relational data
  • Single platform of storage
  • Single security model
  • Single view and management of data
  • Extends security, reliability, and scalability of
    database to file data.
  • Is a cluster filesystem
  • Provides high scalability and availability using
    commodity hardware
  • Leverages and extends Real Application Clusters
    (RAC) cache fusion technology

  • Implemented as a parallel, integrated filesystem
    stack, designed from the ground up for the next
    10-15 years
  • Layered Filesystem: extensible transform
  • Dynamic Write Buffering: deferred write requests
    within transaction boundaries; full utilization
    of I/O bandwidth
  • Space Management: scalable in-memory management
    of free-space metadata within SMPs and across
    clusters, maintaining ACID properties;
    self-adaptive best-fit on-disk data layout
    optimizing I/O requests
  • Inode and I/O Management: fast, scalable access
    to file metadata for sequential as well as random
    access; scales with concurrency as well as across
    clusters; parallel, pipelined, asynchronous I/Os,
    intelligent read-ahead based on access patterns,
    overlap of network and storage bandwidths

The Best of Filesystems and Databases
  • SecureFiles have all the leading-edge file system
  • Options for Deduplication, Encryption,
    Compression, Snapshots
  • SecureFiles have advanced database features not
    in file systems
  • Transactions, Read Consistency, Various
    Durability Options
  • Readable Standby, Consistent Incremental Backup,
    Point in Time Recovery
  • Unlimited Temporal Data Access using Oracle
    Flashback Archive
  • Sliding Inserts Using Delta Updates Inherent
    Support for XML operations
  • Text, Functional and XML Indexes
  • Search across meta-data and file content
  • Partitioning and ILM
  • Leading the architectural confluence of databases
    and filesystems
  • Having the best of both worlds removes the need
    to compromise
  • Visions Fulfilled in the Domain of

SecureFiles vs NFS
Experiment Setup
  • Clients: 2 hyperthreaded Intel Xeon 2.8 GHz, 6GB
  • Server: 2 hyperthreaded Intel Xeon 3.2 GHz, 6GB
    RAM, RHEL, 2Gb Fibre Channel SAN host adapter
  • OCI client for the database, NFSv3 client for
    filesystem, TCP/IP
  • Server machine running Oracle 11g database server
    and NFSv3 server
  • Two 2 TB RAID 5 storage arrays
  • Managed by Ext3
  • Managed by Oracle ASM

  • DICOM application consisting of digital
    diagnostic images and patient information.
  • Images stored on an Ext3 fileserver accessed
    through NFSv3 vs. images stored as SecureFiles
    within the database
  • Patient information is stored in OracleDB in both
  • Test images range from 10KB to 100MB, with total
    data size from 1GB up to 100GB
  • Filesystem_like_logging is used for SecureFiles,
    comparable to filesystems with metadata journaling.

Single Threaded Read Performance
Multi Threaded Read Performance
Single Threaded Write Performance
Multi Threaded Write Performance
Single DB Instance Scalability
Scalability Document Archiving Workload
Scalability Image and Video Storage
Secure Files on RAC
  • 4 node RAC, Xeon 3.4 GHz, 2 CPUs, 6GB RAM, 3 EMC
    CX700 connected through 2 switches
  • DICOM application dataset reused
  • SecureFiles Filesystem_like_logging, NoCache

Breaking The Performance Barriers
SecureFiles Performance Summary
  • High performance: meets 100% of data storage and
    access requirements
  • 462MB/s Ingest, 776MB/s Reads
  • Meets or Beats NFS/Ext3 performance on same h/w
  • Solution for the future
  • YouTube - 65,000 uploads a day, 100MB maximum
    video size, 6.5TB of uploads a day
  • SecureFiles on 4-node RAC (in-house test setup):
    30TB of possible inserts a day, 4x the peak
    YouTube requirement
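
The daily-volume figures on this slide can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: the function names are made up for this example, and decimal units (1 TB = 1,000,000 MB) are an assumption.

```python
# Back-of-the-envelope check of the daily volumes quoted on the slide.
# Assumption: decimal units, 1 TB = 1,000,000 MB.
MB_PER_TB = 1_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def worst_case_daily_tb(uploads_per_day: int, max_size_mb: int) -> float:
    """Daily volume if every upload had the maximum allowed size."""
    return uploads_per_day * max_size_mb / MB_PER_TB


def sustained_daily_tb(rate_mb_per_s: float) -> float:
    """Daily volume ingested at a constant rate."""
    return rate_mb_per_s * SECONDS_PER_DAY / MB_PER_TB


def rate_for_daily_tb(daily_tb: float) -> float:
    """Constant rate (MB/s) needed to ingest a given daily volume."""
    return daily_tb * MB_PER_TB / SECONDS_PER_DAY


youtube_tb = worst_case_daily_tb(65_000, 100)   # 6.5 TB/day
peak_ingest_tb = sustained_daily_tb(462)        # about 39.9 TB/day
rate_for_30_tb = rate_for_daily_tb(30)          # about 347 MB/s
```

Under these assumptions, 30TB a day corresponds to roughly 347MB/s held around the clock, which sits below the 462MB/s ingest figure and is a little over four times the 6.5TB worst case.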

Early Beta (external) User Feedback
  • Major telecommunication company
  • fingerprinting application for govt agencies
  • up to 7x speedup with SecureFiles
  • Major digital video company
  • Digital asset management
  • Mostly photos and videos
  • up to 3x speedup for large image loading
  • Major research institution
  • The tests showed a clear performance advantage
    for storing LOB data in SecureFiles, by a factor
    of up to 5.45

National Ignition Facility
  • World's largest and most energetic laser
  • Experiments to harness the potential of fusion as
    a future source of safe and usable energy
  • When fully operational, 192 laser beams will
    generate 500 trillion watts, pulse energy 1.8 MJ,
    pulse length 20 billionths of a second

Content Management Requirements
  • Optics make NIF work: laser glass slabs,
    crystals, lenses, precision optical components
  • High resolution cameras
  • Generate multiple images every 6 seconds for 6
  • Needs to be processed within 6 seconds
  • For detection of defects on optical surfaces (1/5
    human hair size)
  • Throughput requirement more than 40 MB/sec
  • More than 300 TB of storage by 2010
  • 30 yrs of retention (on tape)

On Production With SecureFiles
  • Performance
  • Able to push the server and storage to the
  • Provides more than 100MB/s on their in-house
  • Storage Overhead
  • Mitigated with compression
  • Manageability
  • Consolidated store for images and metadata
  • Better governance
  • Archival
  • Unlimited history
  • Partitioning
  • Backup and Recovery of all data

  • Unification of storage of unstructured and
    structured data without compromises A

Special Note
  • We continue to innovate
  • Lots of challenges in the field of data storage
  • We collaborate extensively with our users in the
    scientific community for new ideas, requirements
    and feedback
  • We welcome you to join us
  • We welcome you to collaborate with us

Layered Architecture
  • Write Gather Cache enables large disk I/O and
    contiguous space allocation.
  • Layering allows multiple data transformations to
    be applied.
  • Features like delta updates are pluggable based
    on application requirements.
  • Delta updates allow non-length-preserving
    updates to be performed efficiently.
  • SF Compression provides significant space savings.
  • SF Encryption extends DB security to file data.
  • Deduplication eliminates multiple copies of
    identical data.

Delta Updates
  • Enables non-length-preserving updates on Oracle
    SecureFile objects
  • Provides special APIs to the user to specify the
    object, the list of delta, their lengths and
  • I/O size proportional to the length of the update
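
A minimal sketch of the idea behind a non-length-preserving update, assuming the object is stored as a list of chunks. `ChunkedFile` and `apply_delta` are hypothetical names for illustration, not the SecureFiles API; the point is only that an insert or delete rewrites the chunks it touches rather than shifting everything that follows.

```python
class ChunkedFile:
    """Toy object stored as a list of byte chunks (illustrative only)."""

    def __init__(self, data: bytes, chunk_size: int = 8):
        self.chunks = [data[i:i + chunk_size]
                       for i in range(0, len(data), chunk_size)] or [b""]
        self.bytes_written = 0  # bytes rewritten by deltas so far

    def _locate(self, offset: int):
        """Return (chunk index, offset within that chunk)."""
        pos = 0
        for idx, chunk in enumerate(self.chunks):
            if offset <= pos + len(chunk):
                return idx, offset - pos
            pos += len(chunk)
        raise IndexError("offset past end of file")

    def apply_delta(self, offset: int, delete_len: int, insert: bytes = b""):
        """Delete delete_len bytes at offset, then insert new bytes there."""
        first, within = self._locate(offset)
        last = first
        buf = self.chunks[first]
        # Pull in following chunks only if the deletion spans them.
        while within + delete_len > len(buf) and last + 1 < len(self.chunks):
            last += 1
            buf += self.chunks[last]
        new_chunk = buf[:within] + insert + buf[within + delete_len:]
        self.chunks[first:last + 1] = [new_chunk]
        self.bytes_written += len(new_chunk)

    def read(self) -> bytes:
        return b"".join(self.chunks)


f = ChunkedFile(b"abcdefghijklmnopqrstuvwx", chunk_size=8)
untouched_tail = f.chunks[2]
f.apply_delta(10, 0, b"XYZ")   # insert 3 bytes in the middle chunk
```

Only the middle chunk is rewritten here; the first and last chunks are left as they were, so the work done depends on the size of the change rather than the size of the object.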

Write Gather Cache
  • Subset of database buffer cache private to Oracle
  • User specified parameter governs buffering of
    data during write operations before flushing to
    an underlying storage layer
  • Maintained on a per transaction basis
  • Optimizes on-disk layout of SecureFiles data
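
The buffering behaviour described above can be sketched as follows. This is a simplified illustration, not Oracle's implementation: `WriteGatherCache`, `flush_threshold` and `sink` are hypothetical names, with `flush_threshold` standing in for the user-specified parameter and `sink` for the underlying storage layer.

```python
class WriteGatherCache:
    """Per-transaction buffer that turns many small writes into few large ones."""

    def __init__(self, flush_threshold: int, sink):
        self.flush_threshold = flush_threshold  # assumed user-set limit, in bytes
        self.sink = sink                        # callable taking one bytes object
        self._pending = []
        self._pending_bytes = 0

    def write(self, data: bytes) -> None:
        """Buffer a write; flush once enough data has been gathered."""
        self._pending.append(data)
        self._pending_bytes += len(data)
        if self._pending_bytes >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        """Issue everything gathered so far as one contiguous write."""
        if self._pending:
            self.sink(b"".join(self._pending))
            self._pending.clear()
            self._pending_bytes = 0

    def commit(self) -> None:
        """Nothing stays buffered past the end of the transaction."""
        self.flush()


ios = []                                  # each entry models one disk write
cache = WriteGatherCache(flush_threshold=10, sink=ios.append)
for piece in (b"aaaa", b"bbbb", b"cccc", b"dd"):
    cache.write(piece)
cache.commit()
```

Four small writes reach the storage layer as two larger ones, which is what makes large I/O and contiguous allocation possible.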

Securefile Deduplication
  • Eliminates duplicates of identical file data.
    Results in efficient space usage.
  • File copy is an efficient non-space consuming
  • More duplicates, the higher the space savings.
  • A secure hash is evaluated over the file contents
    and stored in a per-segment index
  • Prefix-matched to identify potential duplicates,
    followed by byte-by-byte comparison to eliminate
    false positives from potential hash collisions.
  • Updates result in break-away from the source and
  • Content management, email and data archive
    applications can greatly leverage deduplication.
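
The deduplication flow described on this slide (hash, prefix match, full comparison, break-away on update) can be sketched as below. The class, the SHA-256 hash and the 8-byte prefix are assumptions chosen for the example; the slide does not say which hash or prefix length SecureFiles uses.

```python
import hashlib


class DedupStore:
    """Toy content store that keeps one copy of identical data."""

    PREFIX_LEN = 8  # assumed prefix length for the index key

    def __init__(self):
        self._blobs = {}    # blob id -> [data, reference count]
        self._index = {}    # hash prefix -> list of blob ids
        self._files = {}    # file name -> blob id
        self._next_id = 0

    def _prefix(self, data: bytes) -> bytes:
        return hashlib.sha256(data).digest()[:self.PREFIX_LEN]

    def put(self, name: str, data: bytes) -> None:
        """Store or update a file; an update breaks away from shared data."""
        if name in self._files:
            self._release(name)
        prefix = self._prefix(data)
        for blob_id in self._index.get(prefix, []):
            # A prefix match is only a candidate: compare the full contents.
            if self._blobs[blob_id][0] == data:
                self._blobs[blob_id][1] += 1
                self._files[name] = blob_id
                return
        blob_id = self._next_id
        self._next_id += 1
        self._blobs[blob_id] = [data, 1]
        self._index.setdefault(prefix, []).append(blob_id)
        self._files[name] = blob_id

    def _release(self, name: str) -> None:
        blob_id = self._files.pop(name)
        entry = self._blobs[blob_id]
        entry[1] -= 1
        if entry[1] == 0:
            self._index[self._prefix(entry[0])].remove(blob_id)
            del self._blobs[blob_id]

    def get(self, name: str) -> bytes:
        return self._blobs[self._files[name]][0]

    def stored_bytes(self) -> int:
        return sum(len(data) for data, _ in self._blobs.values())


store = DedupStore()
store.put("a.dcm", b"scan-data" * 100)
store.put("b.dcm", b"scan-data" * 100)      # duplicate: no extra space used
shared_size = store.stored_bytes()
store.put("b.dcm", b"edited-scan" * 100)    # update breaks away from a.dcm
```

After the update, `a.dcm` still returns its original contents and the store holds two distinct blobs.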

Compression and Encryption
  • Results in reduced I/O and significant space
  • Transparent to end users
  • Efficient random access with partial
  • Random updates involve updating only specific
    portion of the data.
  • Layering enables features like encryption to
    encrypt less data when compressed.
  • Encryption using 3DES168 and AES with 128-, 192-
    and 256-bit key sizes.
  • Encryption leverages existing database security
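
One way to get random access into compressed data is to compress it in independent units, so that a read decompresses only the units it touches. The sketch below illustrates that idea with Python's `zlib`; the unit size and function names are assumptions for this example and do not describe the SecureFiles on-disk format. Encryption is left out, but the same layering explains the slide's point: a layer placed after compression sees fewer bytes.

```python
import zlib

UNIT_SIZE = 4096  # assumed size of one independently compressed unit


def compress_units(data: bytes, unit_size: int = UNIT_SIZE) -> list:
    """Compress each fixed-size unit of the input on its own."""
    return [zlib.compress(data[i:i + unit_size])
            for i in range(0, len(data), unit_size)]


def read_range(units: list, offset: int, length: int,
               unit_size: int = UNIT_SIZE) -> bytes:
    """Read a byte range, decompressing only the units that cover it."""
    if length <= 0 or not units:
        return b""
    first = offset // unit_size
    last = min((offset + length - 1) // unit_size, len(units) - 1)
    out = bytearray()
    for idx in range(first, last + 1):
        out += zlib.decompress(units[idx])
    start = offset - first * unit_size
    return bytes(out[start:start + length])


data = bytes(range(256)) * 1000          # 256,000 bytes of sample data
units = compress_units(data)
piece = read_range(units, 4090, 20)      # spans two units only
compressed_size = sum(len(u) for u in units)
```

Here a 20-byte read decompresses two units out of 63, and a later layer such as encryption would process `compressed_size` bytes rather than the full 256,000.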

Inode and I/O Management
  • Responsible for maintaining persistent,
    transactional disk structures that map file data
    to physical storage space.
  • Metadata is either a simple array of chunk
    entries for small files or grows into a B-tree
    for large files.
  • Enables efficient random access to an arbitrary
    offset within the data.
  • Highly scalable disk layout to map TB sized
    objects efficiently
  • Inode metadata is transactionally managed and
    recoverable across process, instance and media
  • In-place updates for small changes. Large changes
    are versioned at appropriate levels of
  • Supports intelligent pre-fetching based on access
    patterns and asynchronous writes within
  • Reduces read/write latency by overlapping network
    and storage throughput
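
For the small-file case, the "simple array of chunk entries" can be pictured as a sorted list that maps a logical offset to a physical location by binary search. The sketch below uses hypothetical names (`ChunkEntry`, `ChunkMap`) and does not model the B-tree used for large files.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkEntry:
    logical_offset: int   # where this chunk starts within the file
    length: int           # chunk length in bytes
    physical_block: int   # assumed identifier of the on-disk location


class ChunkMap:
    """Sorted array of chunk entries with offset lookup by binary search."""

    def __init__(self):
        self._entries = []
        self._starts = []   # logical offsets, kept parallel to _entries

    @property
    def size(self) -> int:
        if not self._entries:
            return 0
        last = self._entries[-1]
        return last.logical_offset + last.length

    def append(self, length: int, physical_block: int) -> None:
        """Add a chunk at the current end of the file."""
        offset = self.size
        self._entries.append(ChunkEntry(offset, length, physical_block))
        self._starts.append(offset)

    def locate(self, offset: int):
        """Return (entry, offset within that chunk) for a logical offset."""
        if not 0 <= offset < self.size:
            raise IndexError("offset outside file")
        idx = bisect_right(self._starts, offset) - 1
        entry = self._entries[idx]
        return entry, offset - entry.logical_offset


inode = ChunkMap()
inode.append(100, physical_block=7)
inode.append(50, physical_block=3)
inode.append(200, physical_block=9)
entry, within = inode.locate(120)
```

The lookup cost grows with the logarithm of the number of chunks, which is the property a B-tree preserves once the array no longer fits in one block.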

Efficient Space Management
  • Supports variable-sized chunks up to 64M.
  • Allocation based on a best-fit approach.
  • In-memory dispenser: primary high-performance
    allocation provider.
  • CFS unit: pool of committed free space blocks.
  • UFS unit: pool of de-allocated, uncommitted free
    space blocks.
  • Freed space is not reused until the retention
    time allows, so that consistent reads (CR) still
    succeed.
  • Space reclamation is a background process.
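
A rough sketch of best-fit allocation with two free-space pools follows. It is a simplification under stated assumptions: `SpaceManager` and its methods are hypothetical names, extents are `(start, length)` tuples, and the uncommitted pool is modeled simply as extents waiting out a retention window before a reclaim step makes them reusable.

```python
class SpaceManager:
    """Toy best-fit allocator with committed and uncommitted free pools."""

    def __init__(self, free_extents, retention: int):
        self.committed_free = list(free_extents)   # reusable (start, length)
        self.uncommitted_free = []                 # (freed_at, start, length)
        self.retention = retention

    def allocate(self, length: int) -> int:
        """Best fit: take the smallest committed extent that is large enough."""
        candidates = [e for e in self.committed_free if e[1] >= length]
        if not candidates:
            raise RuntimeError("no free extent large enough")
        start, size = min(candidates, key=lambda e: e[1])
        self.committed_free.remove((start, size))
        if size > length:
            # Return the unused tail of the extent to the pool.
            self.committed_free.append((start + length, size - length))
        return start

    def free(self, start: int, length: int, now: int) -> None:
        """Freed space is parked; old readers may still need its contents."""
        self.uncommitted_free.append((now, start, length))

    def reclaim(self, now: int) -> None:
        """Background step: move extents past retention to the committed pool."""
        still_retained = []
        for freed_at, start, length in self.uncommitted_free:
            if now - freed_at >= self.retention:
                self.committed_free.append((start, length))
            else:
                still_retained.append((freed_at, start, length))
        self.uncommitted_free = still_retained


sm = SpaceManager([(0, 100), (200, 30), (400, 50)], retention=10)
first = sm.allocate(25)          # best fit is the 30-byte extent at 200
sm.free(first, 25, now=0)
second = sm.allocate(28)         # freed space is not yet reusable
```

Best fit keeps large extents intact for large requests, and the retention delay is what lets older read-consistent views keep working after a delete.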

Database Semantics in SecureFiles
  • Atomicity
  • SF is a transaction data store.
  • Ability to rollback and recover from transaction
  • Copy on write semantics for large size
  • Undo generation for metadata and small data
  • Read Consistency
  • Multi version read consistency for relational
  • Retention of old versions of SecureFile objects
    up to a user specified amount of time. Read
    requests on SecureFile objects succeed as long as
    versions as of the query time are retained
  • Data Durability
  • Relational data and SecureFile metadata are
    always logged to achieve durability across
    instance, database and media failures
  • Filesystem-like-logging semantics, as in
    filesystems, continue to achieve data durability
    across transaction and instance failures
  • User data can be logged conditionally based on
    user settings, thus allowing recovery from media
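
The read-consistency rule above (a read succeeds as long as the version as of the query time is still retained) can be illustrated with a small versioned store. This is a sketch with hypothetical names and integer timestamps, not a description of Oracle's undo or retention machinery.

```python
class VersionedObject:
    """Toy multi-version object with a retention window."""

    def __init__(self, retention: int):
        self.retention = retention
        self._versions = []   # (commit_time, data), oldest first

    def write(self, data: bytes, now: int) -> None:
        """Copy-on-write: each update adds a version instead of overwriting."""
        self._versions.append((now, data))
        self._purge(now)

    def _purge(self, now: int) -> None:
        # A version may be dropped once it has been superseded for longer
        # than the retention window.
        keep_from = 0
        for i in range(len(self._versions) - 1):
            superseded_at = self._versions[i + 1][0]
            if now - superseded_at > self.retention:
                keep_from = i + 1
        self._versions = self._versions[keep_from:]

    def read_as_of(self, query_time: int) -> bytes:
        """Return the version that was current at query_time, if retained."""
        result = None
        for commit_time, data in self._versions:
            if commit_time <= query_time:
                result = data
        if result is None:
            raise LookupError("version as of query time is not retained")
        return result


obj = VersionedObject(retention=10)
obj.write(b"v1", now=0)
obj.write(b"v2", now=5)
early = obj.read_as_of(3)     # sees v1 while it is still retained
obj.write(b"v3", now=20)      # v1 has now aged out of retention
```

After the third write, a read as of time 3 fails, while a read as of time 6 still returns the second version.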

Advanced Features Inherited from the RDBMS
  • Temporal Filesystem Features Using Flashback
  • Flashback framework allows users to query,
    retrieve and recreate relational data consistent
    as of any point in time in the past.
  • Flashback Archive enables users to retrieve and
    recreate data as of several years before.
  • SecureFiles data retrieval at any point in time
    is guaranteed as long as the accompanying
    relational data can be retrieved.
  • SecureFiles with Flashback Archive provide a
    tamper-proof filesystem behavior to applications
    that have many practical uses in the area of data
  • Data Retrieval in Standby Systems
  • Oracle 11g provides the capability to query and
    retrieve database objects from physical standby
    database systems using Active Data Guard
  • Being first-class database objects, SecureFiles
    support query-ability of both unstructured and
    relational content on standby database systems if
    data manipulation operations on SecureFile
    objects are logged in the database Redo logs.

Advanced Features Inherited from the RDBMS
  • Secure Incremental Backup and Point-in-Time
  • Being first class database objects, secure
    encrypted backup of the database system ensures
    encrypted backup of SecureFile objects as well as
    accompanying relational data.
  • Oracle provides the capability to perform point
    in time recovery of the database.
  • Point in time recovery can be performed on
    SecureFile objects if users choose to log
    manipulation operations on SecureFile object data
    with the full LOGGING option.
  • Clustered Filesystem Features Using RAC
  • Allows shared access to the entire underlying disk
    subsystem staging the database and provides
    opportunities for maximizing scalability of
    execution of database operations.
  • SecureFiles inherit the capabilities provided by
    Real Application Clusters. The design of the
    space management component in SecureFiles is
    tuned to provide scalability in throughput
    proportionally with the number of active database

Advanced Features Inherited from the RDBMS
  • Information Lifecycle Management Using
  • Partitioning achieves effective lifecycle
    management of data.
  • SecureFiles makes use of similar partitioning
    techniques to achieve lifecycle management of
    SecureFile objects.
  • Partitioning of base tables containing the
    relational and SecureFile metadata results in
    partitioning of the underlying SecureFiles segments.
  • Storage Support on Flash-based Devices
  • SecureFiles architecture provides a variant of a
    log-structured filesystem.
  • The space management framework in SecureFiles
    assists the architecture to adapt to storage on
    flash devices.
  • With optimal wear-leveling, semi-structured data
    management becomes highly feasible on flash-based
    storage devices

More Experiments
WAN Performance (NFS vs SecureFiles)
SecureFiles Compression Data Reduction
  • Calgary dataset: standard compression dataset
  • 3-9x reduction in size
  • Additional 20% reduction in size from Compress

SecureFiles Compression Read CPU Impact
  • Compression makes Encrypted SF consume less CPU
    for Reads
  • Reading Compressed consumes 2-3x more CPU

SecureFiles Compression Write CPU Impact
  • Compress High can consume 2x more CPU than
    Compress medium

SecureFiles LZO Compression
  • LZO Write is 3x faster than ZLIB
  • LZO Read is 2x faster than ZLIB
  • ZLIB gives additional 15% compression
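
The trade-off between compression ratio and CPU cost can be measured in a few lines. The sketch below uses Python's `zlib` at two levels as stand-ins; mapping "medium" and "high" to zlib levels 6 and 9 is an assumption made for this illustration, LZO is not covered because it is not in the standard library, and the numbers it prints will not match the slide's measurements.

```python
import time
import zlib


def measure(data: bytes, level: int):
    """Return (compression ratio, CPU seconds) for one zlib level."""
    start = time.process_time()
    compressed = zlib.compress(data, level)
    cpu_seconds = time.process_time() - start
    return len(data) / len(compressed), cpu_seconds


# Repetitive sample text; real files will compress very differently.
sample = b"SecureFiles stores unstructured data inside the database. " * 20000

results = {}
for name, level in (("medium (assumed zlib level 6)", 6),
                    ("high (assumed zlib level 9)", 9)):
    ratio, cpu = measure(sample, level)
    results[name] = (ratio, cpu)
    print(f"{name}: {ratio:.1f}x smaller, {cpu:.4f} s CPU")
```

Running the same measurement on representative data is the practical way to decide whether a higher level is worth its extra CPU time.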