The magic is in the glue: XQuery + Cloud. Daniela Florescu, Oracle

Learn more at: http://isg.ics.uci.edu
1
The magic is in the glue: XQuery + Cloud
Daniela Florescu, Oracle
2
My personal history
  • PhD in object-oriented query processing/optimization
  • Loved the database theory and practice
    (relational, object-oriented, semi-structured)
  • Got really interested in it, and thought it was
    important
  • ... then I joined Oracle.

3
after 4 years in Oracle
  • Applications are the really important issue
  • How to develop, deploy, maintain, evolve,
    customize
  • Databases are a side effect
  • Customers are educated to think they need them
  • DBs are only useful as part of a general
    application architecture
  • Customer is the king
  • If they don't make $, you don't either
  • Customers are in pain building apps right now

4
Agenda
  • Current pain in building apps
  • What can XQuery do for customers?
  • What can the Cloud do for customers?
  • How do we put them together?
  • How does XQuery + Cloud solve the problem?
  • Some open research problems

5
Imagine I am a customer and I need to build a new
app.
  • How much does it cost?
  • Cost of developing the app (salaries)
  • Cost of deploying the app
  • Hardware, software licenses, maintenance
  • Loss of income because of mis-provisioning
  • Do I have to pay up front?
  • Is the cost proportional to the income?

6
Other questions
  • How fast can I deliver the app?
  • Quicker to market than my competitors?
  • How good is the application?
  • More customers for the app => more income
  • Acceptable operational characteristics?
  • Can I adapt if something changes?
  • Operational characteristics
  • Functionality
  • Can I customize the same app for a different
    vertical / different set of customers?
  • Is there a risk in the technology?

7
Customers' concerns
  • Cost
  • Time to market
  • Flexibility
  • Customizability
  • Sustainability
  • Risk
  • Often a tradeoff

8
Different classes of customers
  • Enterprise (e.g. Bank of America)
      • Cost
      • Sustainability
      • Risk
      • Customizability
      • Flexibility
      • Time to market
  • Government agency (e.g. DoD)
      • Sustainability
      • Cost
      • Time to market (?)
      • Flexibility (?)
      • Customizability
      • Risk
  • Consumer (e.g. Craigslist)
      • Time to market
      • Cost
      • Flexibility
      • Customizability

9
Typical enterprise app stack
Communication (XML, REST, WS)
Oracle IBM SAP Microsoft
Application logic (Java, C)
Database (SQL)
10
Cost ? !
  • Cost of developing the app
  • Cost of deploying the app
  • (hardware, software licenses, maintenance)
  • Loss of income because of mis-provisioning
  • Do I have to pay up front?
  • Is the cost proportional to the income?

Communication (XML, REST, WS)
Application logic (Java, C)
Database (SQL)
11
Time to market? Years!
  • How fast can I deliver the app?

Communication (XML, REST, WS)
Application logic (Java, C)
Database (SQL)
12
Flexibility? Customizability? Hardly any!
  • Can I adapt if something changes?
  • Operational characteristics
  • Functionality
  • Can I customise it to a different vertical?

Oracle experience: for every $1M of Oracle app
licenses, customers pay $2M to customize it. (SAP
experience is even worse :-)

Communication (XML, REST, WS)
Application logic (Java, C)
Database (SQL)
13
Two major evil points
  • Multi-layer infrastructure
  • Schemas as a prerequisite
  • New apps
  • Even the Oracle apps!
  • New platforms
  • Salesforce, GoogleApps, Facebook

Communication
Application Logic (schema-less)
put / get
Persistent (key, value) store (schema-less)
XQuery: a possible solution.
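
A minimal sketch of the schema-less put/get store in the diagram above, assuming an in-memory dictionary as the persistence layer. The class name `KeyValueStore` and the sample records are hypothetical; the point is only that no schema is declared before data of different shapes is stored.

```python
import json


class KeyValueStore:
    """Hypothetical schema-less store: values are opaque serialized blobs."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # No schema check: any JSON-serializable shape is accepted.
        self._data[key] = json.dumps(value)

    def get(self, key, default=None):
        raw = self._data.get(key)
        return default if raw is None else json.loads(raw)


store = KeyValueStore()
store.put("customer/1", {"name": "Ann"})
store.put("customer/2", {"name": "Bob", "tags": ["vip"], "address": {"city": "Oslo"}})
```

The two records have different shapes, and the store accepts both without any prior declaration.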
14
Another evil point
  • Lack of cost elasticity
  • Cost should be proportional to income
  • Lack of elasticity in performance
  • Response time should be independent of the number of clients

The Cloud is the beginning of a solution.
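
The cost argument can be made concrete with a small arithmetic sketch. All numbers below (demand per period, capacity, unit cost, income per request) are invented for illustration and do not come from the talk.

```python
def fixed_provisioning(demand, capacity, unit_cost, income_per_request):
    """Return (cost, lost_income) when capacity is bought up front."""
    cost = capacity * unit_cost * len(demand)  # paid whether used or not
    lost = sum(max(0, d - capacity) for d in demand) * income_per_request
    return cost, lost


def pay_as_you_go(demand, unit_cost):
    """Cost when resources are rented per request actually served."""
    return sum(demand) * unit_cost


demand = [100, 1000, 100]  # hypothetical requests per period
fixed_cost, lost_income = fixed_provisioning(demand, capacity=500, unit_cost=2, income_per_request=5)
elastic_cost = pay_as_you_go(demand, unit_cost=2)
```

Under these assumed numbers the fixed setup pays for idle capacity in quiet periods and still loses income at the peak, while the elastic cost follows demand.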
15
Agenda
  • Current pain in building apps
  • What can XQuery do for customers?
  • What can the Cloud do for customers?
  • How do we put them together?
  • How does XQuery + Cloud solve the problem?
  • Some open research problems

16
Why XML ?
  • Covers the whole spectrum from structured data to
    textual information
  • Schema independent
  • Platform independent
  • Continuity with the basic Internet infrastructure
    (URI, HTML, HTTP)

17
What is XQuery ?
  • A programming language for XML processing
  • Functional in style
  • Turing complete
  • Contains
  • Navigation
  • Declarative query and aggregation (FLWOR)
  • Search (full text)
  • Declarative updates
  • Transforms
  • Scripting
  • Streaming and windowing
  • Error handling and second order expressions
  • Packaging (modules)
  • Has limitations (discussed later)

18
History and status
  • Standard of the W3C
  • Good and bad
  • 10 years old
  • 40 existing implementations
  • Implemented in major databases
  • Best implementations in open source
  • If you have XML data, it is hard to avoid.

19
Navigation
  • fn:doc("catalog.xml")/items/item
  • fn:doc("catalog.xml")/items//item
  • fn:doc("catalog.xml")/items//*
  • fn:doc("catalog.xml")/items/@item
  • fn:doc("parts.xml")/parts/part[partno = $i/partno]
  • $x/items/item
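
For readers without an XQuery processor at hand, the difference between child steps (`/`), descendant steps (`//`), attribute access (`@`) and predicates (`[...]`) can be sketched with Python's standard `xml.etree.ElementTree`. The sample document and its `id` attribute are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; the nesting is chosen to make / and // differ.
items = ET.fromstring(
    "<items>"
    "<item id='1'><partno>p1</partno></item>"
    "<group><item id='2'><partno>p2</partno></item></group>"
    "</items>"
)

children = items.findall("item")       # like /items/item: direct children only
descendants = items.findall(".//item")  # like /items//item: any depth
ids = [i.get("id") for i in descendants]  # attribute access, like @id
# Predicate, like item[partno = "p2"]
matching = [i for i in descendants if i.findtext("partno") == "p2"]
```

Only the first `item` is a direct child of `items`, so the child step finds one element while the descendant step finds both.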

20
FLWOR
  • for $i in fn:doc("catalog.xml")/items/item,
  • $p in fn:doc("parts.xml")/parts/part[partno = $i/partno],
  • $s in fn:doc("suppliers.xml")/suppliers/supplier[suppno = $i/suppno]
  • order by $p/description, $s/suppname
  • return $s
  • Group by, having, outer joins, etc.

21
Creation of new information
  • <descriptive-catalog>{
  • for $i in fn:doc("catalog.xml")/items/item,
  • $p in fn:doc("parts.xml")/parts/part[partno = $i/partno],
  • $s in fn:doc("suppliers.xml")/suppliers/supplier[suppno = $i/suppno]
  • order by $p/description, $s/suppname
  • return
  • <item>{ $p/description, $s/suppname, $i/price }
  • </item>
  • }</descriptive-catalog>
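
The three-way join, ordering and result construction in the query above can be sketched in Python over plain dictionaries. The sample rows are invented, and the function name `descriptive_catalog` is hypothetical; this mirrors the shape of the FLWOR expression, not any particular XQuery engine.

```python
from operator import itemgetter

# Hypothetical data standing in for catalog.xml, parts.xml and suppliers.xml.
items = [
    {"partno": "p1", "suppno": "s2", "price": 10},
    {"partno": "p2", "suppno": "s1", "price": 5},
]
parts = [
    {"partno": "p1", "description": "widget"},
    {"partno": "p2", "description": "bolt"},
]
suppliers = [
    {"suppno": "s1", "suppname": "Acme"},
    {"suppno": "s2", "suppname": "Zenith"},
]


def descriptive_catalog(items, parts, suppliers):
    rows = [
        (p["description"], s["suppname"], i["price"])  # the "return" clause
        for i in items                                  # for $i
        for p in parts if p["partno"] == i["partno"]    # $p with predicate
        for s in suppliers if s["suppno"] == i["suppno"]  # $s with predicate
    ]
    return sorted(rows, key=itemgetter(0, 1))           # order by description, suppname


catalog = descriptive_catalog(items, parts, suppliers)
```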

22
Textual search
  • $doc ftcontains ( ( "mustang" ftand (("great",
    "excellent") any word occurs at least 2 times) )
    window 11 words ftand ftnot "rust" ) same
    paragraph

23
Declarative updates
  • for $p in /inventory/part
  • let $deltap := $changes/part[partno eq $p/partno]
  • return
  • replace value of node $p/quantity with
    $p/quantity + $deltap/quantity
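
A rough Python analogue of the update above, assuming inventory and changes are dictionaries keyed by part number. It separates collecting the updates from applying them, in the spirit of a pending update list; the function names are hypothetical.

```python
def pending_updates(inventory, changes):
    """Collect (partno, new_quantity) pairs without modifying `inventory`."""
    return [
        (partno, inventory[partno] + delta)
        for partno, delta in changes.items()
        if partno in inventory
    ]


def apply_updates(inventory, updates):
    """Apply all collected updates in one step."""
    for partno, quantity in updates:
        inventory[partno] = quantity


inventory = {"p1": 10, "p2": 3}
changes = {"p1": 5, "p3": 1}  # p3 is not in the inventory and is ignored
updates = pending_updates(inventory, changes)
snapshot_before = dict(inventory)
apply_updates(inventory, updates)
```

Because every new quantity is computed against the unmodified inventory, the order in which parts are visited does not affect the result.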

24
Transforms
  • let $oldx := /a/b/x
  • return
  • copy $newx := $oldx
  • modify
  • (rename node $newx as "newx",
  • replace value of node $newx with $newx * 2)
  • return ($oldx, $newx)
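
The copy/modify/return pattern can be imitated in Python with a deep copy, which shows why the original node is left untouched. The sample element `<x>21</x>` is an assumption for illustration.

```python
import copy
import xml.etree.ElementTree as ET


def transform(oldx):
    newx = copy.deepcopy(oldx)           # copy $newx := $oldx
    newx.tag = "newx"                    # rename node $newx as "newx"
    newx.text = str(int(newx.text) * 2)  # replace value of node $newx with $newx * 2
    return oldx, newx                    # return ($oldx, $newx)


oldx, newx = transform(ET.fromstring("<x>21</x>"))
old_xml = ET.tostring(oldx, encoding="unicode")
new_xml = ET.tostring(newx, encoding="unicode")
```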

25
Streams and windowing
  • for sliding window $w in (2, 4, 6, 8, 10, 12, 14)
  • start at $s when fn:true()
  • only end at $e when $e - $s eq 2
  • return <window>{ $w }</window>
  • Result of the above query:
  • <window>2 4 6</window>
  • <window>4 6 8</window>
  • <window>6 8 10</window>
  • <window>8 10 12</window>
  • <window>10 12 14</window>
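
The same windows can be reproduced with a short Python sketch. It assumes, as in the query, that a window opens at every position and is kept only if an end position exists two items later; the helper name `sliding_windows` is hypothetical.

```python
def sliding_windows(seq, span):
    """Open a window at every index s and close it at index s + span.

    Windows that never reach an end position are dropped, which is the
    effect of `only end` in the query.
    """
    return [
        seq[s:s + span + 1]
        for s in range(len(seq))
        if s + span < len(seq)
    ]


windows = sliding_windows([2, 4, 6, 8, 10, 12, 14], span=2)
```

The last two start positions (12 and 14) produce no window because no end position satisfies the condition.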

26
Scripting expressions
  • block {
  • declare $a as xs:integer := 0;
  • declare $b as xs:integer := 1;
  • declare $c as xs:integer := $a + $b;
  • declare $fibseq as xs:integer* := ($a, $b);
  • while ($c lt 100) {
  • set $fibseq := ($fibseq, $c);
  • set $a := $b;
  • set $b := $c;
  • set $c := $a + $b; };
  • $fibseq; }
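
For comparison, a line-by-line Python rendering of the same loop; it follows the variable names on the slide, and the function name `fib_below` is an addition for this sketch.

```python
def fib_below(limit):
    a = 0
    b = 1
    c = a + b
    fibseq = [a, b]
    while c < limit:
        fibseq.append(c)  # set $fibseq := ($fibseq, $c)
        a = b
        b = c
        c = a + b
    return fibseq


result = fib_below(100)
```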

27
Where can it be used in today's architectures?
  • Databases
  • Middle tiers
  • Information dispatch
  • Transformation
  • Data integration
  • Browsers (see XQIB demo, WWW09 paper)
  • Mobile devices (XQuery on iPhone anyone ?)

28
XQuery's real potential
XML
XML
  • Standalone programming language for information
    intensive applications
  • Can build extremely rich applications

Application Logic (XQuery)
XML
29
Why XQuery ?
  • Cost
  • Time to market
  • Flexibility
  • Customizability
  • Sustainability
  • Risk
  • Because of XML
  • Schema independent
  • Continuity with basic Internet infrastructure
  • Continuity: structured data <--> textual
    information
  • XQuery's own advantages
  • Declarative
  • Single layer code
  • Open source friendly
  • Extra Goodies
  • Opportunity to rethink ACID transactions
  • Unique opportunities for introspection
  • Code and data migration

30
Declarativity
  • Small number of lines of code
  • Development cost
  • Time to market
  • Number of bugs
  • Easy to optimize automatically
  • Easy to parallelize automatically
  • Especially important in the cloud
  • Easier to achieve elasticity in performance
  • Easier to generate automatically
  • Important for smart UIs aimed at non-developers

31
Declarativity, negative side
  • Fewer developers capable of writing such
    code
  • Easy to write, harder to read
  • Tools harder to make (e.g. debuggers)
  • Performance can be unstable
  • Despite that, in the history of CS we evolve in
    the direction of declarativity
  • Assembly, C, C++, Java, Haskell
  • Cobol, SQL

32
Rethink transactions and data consistency
  • XQuery is silent as far as ACID transactions go
  • On purpose !
  • Are ACID transactions really needed ?
  • Are they really enforced in Web apps ?
  • No.
  • Open research field
  • Interaction of programming languages with new
    transactional models and new data consistency
    models

33
SIGMOD'08
  • Data consistency is something to optimize, not an
    absolute requirement
  • Data consistency models (Tanenbaum)
  • Shared-Disk (Naïve approach)
  • No concurrency control at all
  • Eventual Consistency (Basic Protocol)
  • Updates become visible any time and will persist
  • No lost update on page level
  • Atomicity
  • All or no updates of a transaction become visible
  • Monotonic reads, Read your writes, Monotonic
    writes, ...
  • Strong Consistency
  • database-style consistency (ACID) via OCC
  • Data consistency a la carte
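
To illustrate the strongest level in the list, here is a minimal sketch of optimistic concurrency control (OCC) on a versioned store: a write commits only if the version it read is still current, so concurrent writers cannot silently overwrite each other. The class `VersionedStore` and its methods are hypothetical and are not taken from the SIGMOD'08 work.

```python
class VersionedStore:
    """Hypothetical store keeping a version counter per key."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._data.get(key, (0, None))

    def commit(self, key, read_version, new_value):
        current_version, _ = self._data.get(key, (0, None))
        if current_version != read_version:
            return False  # validation failed: someone else committed first
        self._data[key] = (current_version + 1, new_value)
        return True


store = VersionedStore()
v_alice, _ = store.read("page")
v_bob, _ = store.read("page")
alice_ok = store.commit("page", v_alice, "alice's edit")
bob_ok = store.commit("page", v_bob, "bob's edit")  # stale version, rejected
v_retry, _ = store.read("page")
bob_retry_ok = store.commit("page", v_retry, "bob's edit")
```

The rejected writer has to re-read and retry, which is the price paid for consistency at this level; the weaker levels in the list drop some of these checks.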

34
Introspection opportunities
  • Closed world
  • Everything is (or will be) XML
  • Data, schemas, code, PULs, metadata, configs,
    runtime information
  • Unique opportunity to
  • introspect at runtime all of them
  • reason about them
  • change them dynamically (not only data, but
    schemas, code and configuration)
  • Open research field
  • Consequences on programming

35
Why NOT XQuery
  • XML is complicated
  • XML Schema is hard/impossible to understand
  • XQuery is complicated
  • XQuery is incomplete (maybe a research opportunity?)
  • Missing a standard persistent data model
  • Missing DDL functionality (indexes, integrity
    constraints)
  • Missing basic functionalities (e.g. eval,
    function overloading)
  • Missing basic data modeling functionality (n:m
    relationships)
  • XQuery lacks a standard environment (e.g. J2EE)
    (maybe a research opportunity?)
  • No tools (debuggers, profilers) (maybe a research
    opportunity?)
  • Performance is not clear yet (certainly a research
    opportunity!)
  • There are few XQuery developers (a teaching
    opportunity?)

36
Agenda
  • Current pain in building apps
  • What can XQuery do for customers?
  • What can the Cloud do for customers?
  • How do we put them together?
  • How does XQuery + Cloud solve the problem?
  • Some open research problems

37
What is Cloud Computing ?
  • The rental-car paradigm for computing
  • Commoditization of (certain aspects of)
    computing
  • CPU, storage, and network
  • Goal 1: Reduction of cost
  • Principle: fine-grained renting of resources
  • Pay as you go (elasticity of cost)
  • Goal 2: Simplification of management
  • Potentially infinite/unbreakable computing
    resources
  • Potentially no administration
  • Goal 3: Elasticity of performance
  • Same response time independently of workload
  • Note: does not work yet for DBs or apps

38
Case study: Amazon AWS
  • EC2: scalable virtual private servers using Xen
  • S3: WS-based storage for applications
  • SQS: hosted message queue for web applications
  • SimpleDB: the core functionality of a database
  • Hadoop-based functionality
  • Similar providers: IBM Blue Cloud, Microsoft
    Azure, (Google App Engine)

39
The limits of the (Amazon) Cloud
  • Cloud computing: a great starting point
  • Unfortunately, only a fraction of the stack

Customization, Training, ...
Application
Application Server
DBMS
Hardware
40
Making use of the Cloud
  • Solution 1 (conservative)
  • Take an existing application (Java + SQL, etc.) and
    try to make it run on the cloud (e.g. make Oracle
    run on AWS)
  • Solution 2 (reactionary)
  • Create a fresh new infrastructure, specially
    designed for Web apps' requirements, to be
    deployed in the cloud

Benefit
Risk
41
Solution 1 (conservative)
  • take a traditional DBMS (e.g., Oracle, MySQL,
    ...)
  • install it on an EC2 instance
  • use S3 or EBS as a persistent store
  • Advantages
  • traditional databases are available
  • proven to work well; many tools
  • people trained and confident with them
  • Disadvantages
  • traditional DBMS solve the wrong problem anyway
    (e.g. focus on consistency)
  • traditional DBMS make the wrong assumptions (DB
    optimizers fail on virtualized hardware)

42
Solution 2 (reactionary)
  • Rethink the whole system architecture
  • do NOT use a traditional DBMS and app server
  • create new breed of application server (with DB)
  • run application server on n EC2 instances
  • use S3 + distributed consistency protocols
  • Advantages and Disadvantages
  • requires new breed of (immature) systems + tools
  • solves the right problem and gets it right
  • Examples
  • GoogleApps (Python in the cloud)
  • Sausalito (www.28msec.com) (XQuery in the cloud)

43
Agenda
  • Current pain in building apps
  • What can XQuery do for customers?
  • What can the Cloud do for customers?
  • How do we put them together?
  • How does XQuery + Cloud solve the problem?
  • Some open research problems

44
XQuery + AWS Cloud
  • Cookbook
  • Take an existing XQuery processor
  • Partition the XML data on S3
  • Map REST calls to XQuery programs
  • Run the XQuery programs on EC2
  • Use SQS for (asynchronous) updates
  • Voilà.
  • The magic is in the glue (XQuery processor + AWS)
  • Application server + Web server + database
  • Integrated XQuery-based application stack for
    Web-based apps
  • Fully SOA enabled
  • All pre-configured and lean (ZERO admin)
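
The cookbook can be sketched as glue code with in-memory stand-ins: a list of dictionaries plays the role of partitioned S3 storage, a routing table maps REST calls to handler functions (where the real system would run XQuery programs), and a `queue.Queue` plays the role of SQS for asynchronous updates. Every name here (`ROUTES`, `handle`, `drain_updates`, the `/items` path) is hypothetical, and no AWS API is used.

```python
import hashlib
import queue

NUM_PARTITIONS = 4
partitions = [dict() for _ in range(NUM_PARTITIONS)]  # stand-in for data partitioned on S3
update_queue = queue.Queue()                          # stand-in for SQS


def partition_for(key):
    """Pick a partition from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return partitions[digest[0] % NUM_PARTITIONS]


def get_item(key):
    """Read handler: in the real stack this would be an XQuery program."""
    return partition_for(key).get(key)


def enqueue_update(key, xml_text):
    """Write handler: the update is queued, not applied immediately."""
    update_queue.put((key, xml_text))
    return "accepted"


# Map REST calls (method, path) to programs.
ROUTES = {("GET", "/items"): get_item, ("PUT", "/items"): enqueue_update}


def handle(method, path, *args):
    return ROUTES[(method, path)](*args)


def drain_updates():
    """Worker step: apply queued updates to storage; returns how many were applied."""
    applied = 0
    while not update_queue.empty():
        key, xml_text = update_queue.get()
        partition_for(key)[key] = xml_text
        applied += 1
    return applied


put_status = handle("PUT", "/items", "item-1", "<item><price>5</price></item>")
before = handle("GET", "/items", "item-1")  # update not applied yet
applied = drain_updates()
after = handle("GET", "/items", "item-1")
```

A read issued before the worker drains the queue does not yet see the write, which is the consistency trade-off that asynchronous updates introduce.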

45
XQuery in the Cloud (connected)
46
Customers' concerns
  • Cost
  • Time to market
  • Flexibility
  • Customizability
  • Sustainability

47
XQuery in the Cloud (no Server)
48
XQuery in the Cloud (offline)
49
  • Demo at www.28msec.com !
  • Look at www.programmableweb.com for use cases
    (consumer and enterprise mashups)

50
Competitors: Internet
  • Web 2.0 Development Frameworks
  • E.g., Ruby on Rails, PHP / LAMP, ...
  • Deployment in the cloud still problematic
  • Google AppEngine, Facebook Apps
  • Proprietary programming model (Python-based)
  • Limited functionality
  • Vendor lock-in, privacy issues
  • Oracle on AWS, do-it-yourself on AWS
  • limited functionality and/or scalability

51
Competitors: Enterprise
  • Salesforce AppExchange
  • proprietary programming model
  • Limited applications domain (CRM)
  • Microsoft Azure
  • .Net programming model
  • manual configuration needed
  • (recent offering, market adoption unclear)
  • Virtualization Companies (e.g., VMWare)
  • No offerings / expertise for data management
  • Oracle (Grid, RAC)
  • limited scalability, cost prohibitive

52
Web 2.0 Support vs. Cloud Support
[Chart: deployment (traditional vs. cloud) plotted against development
model (standard vs. proprietary). Entries: AWS; Google App Engine,
Facebook; XQuery + AWS; Salesforce, Workday; Azure; VMWare Cloud, Citrix;
Oracle; Ruby on Rails.]
53
Agenda
  • Current pain in building apps
  • What can XQuery do for customers?
  • What can the Cloud do for customers?
  • How do we put them together?
  • How does XQuery + Cloud solve the problem?
  • Some open research problems

54
Versions and variations
  • Human mind does not like agreements
  • We like our differences (for a good reason)
  • Different ways to see
  • Data
  • Schemas
  • Code
  • Current stack is imposing agreement
  • unlike our own nature
  • We have to come up with solutions that allow,
    welcome and exploit variations
  • Darwinian, evolutionary approach to data, schema
    and code mutations

55
Versions and variations
  • Research problems
  • What is a (data, schema, code) variation ?
  • What does it mean to run an app in the presence
    of variations ?
  • How do you store (index, etc) variations ?
  • How do you re-integrate them back into mainstream
    app (e.g. community voting ?)
  • What is the correct lifecycle for data, schema,
    code that allows and maximally exploits
    variations ?
  • Note: I have an easier time thinking of a solution
    if the app is in XML/XQuery rather than in
    Java + SQL (or even Python)

56
Conclusion
  • XQuery in the cloud: a serious alternative for
    some (large and large) customers
  • Nothing equivalent in the competition
  • How solid (standard, tested) this is
  • Richness of applications
  • Potential for optimization and parallelization
  • Ease of porting to the cloud

57
My advice
  • Keep your eye on the apps, not the DB
  • Keep the customer in mind
  • Rethink the entire stack
  • Don't be afraid to shake up existing ideas
    about how applications are supposed to work
  • Thank you!