FAST Corporate Presentation - PowerPoint PPT Presentation

About This Presentation
Title:

FAST Corporate Presentation

Description:

The Web Search Experience. 'You are viewing 5 random jobs out of 2461 jobs in total...' Contextual Search. Contextual Relevance. Contextual Navigation. 'Best ...


Transcript and Presenter's Notes

Title: FAST Corporate Presentation


1
Why Search Engines Are Used Increasingly to
Offload Queries from Databases
Bjørn Olstad
CTO, FAST Search & Transfer
Adjunct Prof., The Norwegian University of Science & Technology
Email: bjorn.olstad_at_fast.no
Cell: +47 48011157
2
The Typo Problem...
3
Talent Offloading ....
4
The Web Search Experience
5
The RDBMS Experience
High input barrier
"You are viewing 5 random jobs out of 2461 jobs in total..."
6
CareerBuilder: Use scenario, part 1
30956 jobs
7
CareerBuilder: Use scenario, part 2
1084 jobs
8
CareerBuilder: Use scenario, part 3
30 jobs
9
CareerBuilder: Use scenario, part 4
5 jobs
30956 → 5 targeted jobs in 3 steps
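The four-part scenario narrows a large result set by successive refinements. A minimal sketch of that kind of drill-down follows; the job records, field names, and refinement values are hypothetical and are not taken from CareerBuilder or from FAST ESP.

```python
from collections import Counter

# Hypothetical job records; field names are illustrative only.
JOBS = [
    {"category": "IT", "state": "GA", "type": "Full-time"},
    {"category": "IT", "state": "GA", "type": "Contract"},
    {"category": "IT", "state": "NY", "type": "Full-time"},
    {"category": "Sales", "state": "GA", "type": "Full-time"},
    {"category": "Sales", "state": "NY", "type": "Contract"},
    {"category": "IT", "state": "GA", "type": "Full-time"},
]


def facet_counts(jobs, field):
    """Count how many jobs fall under each value of a field."""
    return Counter(job[field] for job in jobs)


def refine(jobs, field, value):
    """Keep only the jobs whose field equals the chosen value."""
    return [job for job in jobs if job[field] == value]


# Three refinement steps, recording the result size after each one.
steps = [("category", "IT"), ("state", "GA"), ("type", "Full-time")]
sizes = [len(JOBS)]
current = JOBS
for field, value in steps:
    current = refine(current, field, value)
    sizes.append(len(current))

print(sizes)
```

Showing `facet_counts` for the remaining fields after each step is what lets the user pick the next refinement without knowing the schema in advance.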
10
Challenger Shuttle Launch
Fax to NASA from contractor with O-ring concern
11
Presentation Matters
12
IYP: A Disruptive Change
Taylor or Gibson guitar? Good local offers?
Compare offerings. Phone / Directions.
BTW, I'm using my iPAQ.
What is the phone number to Will's Barber shop?
Product & Services
Blogs
Company web site
13
ISVs: A Disruptive Change
Siebel 2000
Siebel 2005
my CRM Application
my CRM Application
Information Access Layer
3rd party content
Search is a strategic enabler
Search is a tactical afterthought
14
Revisit the Assumptions
By year
  • 1999
  • 2000: 3B
  • 2001: 6B
  • 2002: 12B
  • 2003: 24B
  • 80% Unstructured

Milestones
  • Cave paintings, bone tools: 40,000 BCE
  • Writing: 3500 BCE
  • 0 C.E.
  • Paper: 105
  • Printing: 1450
  • Electricity, telephone: 1870
  • Transistor: 1947
  • Computing: 1950
  • Internet (DARPA): late 1960s
  • The Web: 1993

15
Extreme Capabilities?
  • Feeding/streaming, transaction, retrieval or
    analytics centric?
  • Content size: M, L, VL, VVVL or V^n L (n → ∞)?
  • Schema centric, Semi-structured XML, Text,
    Agnostic?
  • Fuzzy Value vs. Binary Completeness?
  • Discovery primitives?
  • User interaction part of design target?

16
Query Latency: RDBMS vs ESP
Test Data
  • Structured data
  • 5 million records
  • 13 fields per record
  • Structured queries
  • 22 SQL queries (representative of ERP workloads)

17
Queries Per Second: RDBMS vs ESP
QPS
  • Identical HW: single node, 2 CPUs, 4 GB RAM, 3 SCSI
    disks
  • Identical data: auction data from eBay, 3.6 million
    docs
  • Identical queries: 200 queries defined by Oracle

18
Disruptive Change
Queries that fit The Model vs. queries that don't fit
The Model
Alternative I
  • Star, snowflake schemas
  • Cubes / datamarts → Incremental fixes to
    painful shortcomings → Adds complexity

Alternative II
  • Schema agnostic
  • Scalable ad-hoc querying
  • BLOBs → Contextual Insight
  • Real-time fusion of disparate data models
  • Massive fault tolerant scalability

19
Extreme Capabilities: ESP Design Targets
Powering Search Derivative Applications (SDAs)
Game Changer driven by Extreme Retrieval and
on-the-fly Analytics
20
Database Query Offloading: Example AutoTrader.com
RDBMS
  • HW-cost 320K (32 CPUs on 4 Sun servers)
  • 90% sub-second query response; average 12 s for
    the rest
  • Relevance Sorting
  • 5 FTE to maintain

ESP
  • HW-cost 90K
  • 100% sub-second query response
  • Flexible relevance and discovery
  • 0.5 FTE to maintain

Car Dealers - Product Supply
21
Content Scalability: RDBMS vs ESP
Examples of ESP deployments
  • Compliance case
  • 50B documents @ 80 KB average
  • → 4 PB (around 100 web indexes)
  • Storage
  • Intelligent content addressable storage
  • XML metadata and full content
  • EMC Centera: N × 256 TB (N = 1..400)
  • Webmining: WebFountain
  • 60,000 : 1 in query capacity (ESP : DB)
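The compliance-case sizing can be checked with a line of arithmetic. The sketch below assumes decimal units (1 KB = 10^3 bytes, 1 PB = 10^15 bytes). The count of 256 TB storage units is purely illustrative, since the slide does not say that the compliance case ran on that storage.

```python
import math

# Figures from the slide; decimal units are an assumption.
DOCS = 50 * 10**9             # 50B documents
AVG_DOC_BYTES = 80 * 10**3    # 80 KB average document size
UNIT_TB = 256                 # capacity of one storage unit (N = 1)

total_bytes = DOCS * AVG_DOC_BYTES
total_pb = total_bytes / 10**15
total_tb = total_bytes / 10**12

# Illustrative only: units of 256 TB needed to hold the raw content.
units_needed = math.ceil(total_tb / UNIT_TB)

print(f"{total_pb:.0f} PB raw content, {units_needed} units of {UNIT_TB} TB")
```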

22
Intelligent Storage: Storage and Search Unite
Discover
Simple
Scalable
Secure
23
Contextual Search


Any new suspicious financial transaction patterns?
Where is the email from Peter about ROI analysis?
FIND
EXPLORE
Contextual Relevance
Contextual Navigation
  • Best of Web: Recommender / Authority
  • Best of Enterprise: Linguistic / Statistic
  • Contextual fact discovery
  • On-the-fly meta-data analysis

24
Turning around the Pyramid: HBZ.de, Leading
German Library Service Center
From Librarians To Researchers
Single Field Search
Querying
FAST ESP
WWW (HTML, XML, WML, JavaScript)
SQL LIB
DB (five databases)
STRUCTURED
25
ESP @ SCOPUS
  • >200M articles / 180M citations
  • 180 TB capacity / 14,000 journals

David Goodman standing up and declaring in
public that Scopus is the best-designed database
he's ever seen
26
Relevance Drives Revenue
Search Reduces Clicks to Purchase and Browsing
and Drives Revenue
  • Reduced # of clicks to buy content from > 4 to
    < 2
  • 50% reduction in ringtone browsing
  • 100% increase in search
  • 20% increase in ringtone revenue

[Charts covering Week 1 to Week 10, each marked "Launched search". Labels: Search, page views per sale, Clicks to Purchase, Revenue, Browsing.]
27
Business Analytics: Processing of real-time
streams
Example: Norwegian Customs, Foreign Exchange
Transaction Monitoring
SECURITY ACCESS MODULE
ACL Monitor
User Monitor
Real-time Registration
Queries
Message Queue
Results
Database connector
Alerts
Transaction Log
Data
Validation
Firewall
Firewall
28
Technology Maturity: RDBMS vs ESP
29
Business Intelligence: ESP vs. RDBMS Technology
OBSERVATION: The Enterprise Search Platform
(ESP), a relatively new concept, integrating
advanced technologies typically associated with
search engines, database tools, and analytical
systems, is fast becoming able to solve modern
business intelligence problems (using both
structured and unstructured data) in a way that
is fundamentally different from, and ultimately
superior to, that of other currently available
analytical or database software.
PREDICTION: Enterprise Search Platform and
search centric application technology represents
a true paradigm shift in the way data will be
stored, analyzed and reported on in the future.
Resulting realignments in the marketplace may be
both rapid and tumultuous.
- Chief strategist, leading BI vendor
30
If your only tool is a hammer ....
... every problem looks like a nail
31
UIMA Architecture
32
Text → Structure
<Category>FINANCIAL</Category>
<Author>George Stein</Author>
BC-dynegy-enron-offer-update5
Dynegy May Offer at Least $8 Bln to Acquire Enron (Update5)
By George Stein
SOURCE: c.2001 Bloomberg News
BODY
<Company>Dynegy Inc</Company>
<Person>Roger Hamilton</Person>
<Company>John Hancock Advisers Inc.</Company>
<PersonPositionCompany>
  <OFFLEN OFFSET="3576" LENGTH="63" />
  <Person>Roger Hamilton</Person>
  <Position>money manager</Position>
  <Company>John Hancock Advisers Inc.</Company>
</PersonPositionCompany>
... "Dynegy has to act fast," said Roger
Hamilton, a money manager with John Hancock
Advisers Inc., which sold its Enron shares in
recent weeks. "If Enron can't get financing and
its bonds go to junk, they lose counterparties
and their marvelous business vanishes."
Moody's Investors Service lowered its rating on
Enron's bonds to "Baa2" and Standard & Poor's
cut the debt to "BBB." in the past two weeks.
Fact
<Company>Enron Corp</Company>
<Company>Moody's Investors Service</Company>
<CreditRating>
  <OFFLEN OFFSET="3814" LENGTH="61" />
  <Company_Source>Moody's Investors Service</Company_Source>
  <Company_Rated>Enron Corp</Company_Rated>
  <Trend>downgraded</Trend>
  <Rank_New>Baa2</Rank_New>
  <__Type>bonds</__Type>
</CreditRating>
Event
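Once text has been annotated this way, an extracted event can be treated like a structured record. A minimal sketch, assuming the CreditRating annotation is available as well-formed XML with the element names shown on the slide; the function name `event_to_record` and the flat dictionary layout are illustrative choices, not part of any FAST or UIMA API.

```python
import xml.etree.ElementTree as ET

# The CreditRating event from the slide, written as well-formed XML.
FRAGMENT = """
<CreditRating>
  <OFFLEN OFFSET="3814" LENGTH="61" />
  <Company_Source>Moody's Investors Service</Company_Source>
  <Company_Rated>Enron Corp</Company_Rated>
  <Trend>downgraded</Trend>
  <Rank_New>Baa2</Rank_New>
  <__Type>bonds</__Type>
</CreditRating>
"""


def event_to_record(xml_text):
    """Flatten one annotated event into a dict (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    span = root.find("OFFLEN")
    record = {
        "event": root.tag,
        "offset": int(span.get("OFFSET")),   # position in the source text
        "length": int(span.get("LENGTH")),   # length of the annotated span
    }
    for child in root:
        if child.tag != "OFFLEN":
            record[child.tag] = (child.text or "").strip()
    return record


record = event_to_record(FRAGMENT)
print(record)
```

Keeping the offset and length alongside the fields preserves the link back to the sentence the event came from.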
33
The BI hammer Approach
Document Vector
Antibiotics, Peptidyl, Eubacteria, RNA, Mg, ...
SVD Analysis
( ?1, ?2, ..., ?n )
?1, ?2, ..., ?n, Structured attributes
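The approach on this slide reduces a document's term vector with SVD and then stores the reduced coordinates next to structured attributes. A minimal sketch using NumPy follows. The term counts, the choice of k = 2, and the structured attribute (a publication year) are all hypothetical, and the function names are illustrative.

```python
import numpy as np

# Columns correspond to the terms on the slide; counts are hypothetical.
TERMS = ["antibiotics", "peptidyl", "eubacteria", "rna", "mg"]
X = np.array([
    [2, 1, 0, 1, 0],
    [1, 2, 1, 0, 0],
    [0, 0, 1, 2, 1],
    [0, 1, 0, 1, 2],
], dtype=float)  # rows = documents


def reduce_documents(matrix, k):
    """Project each document onto the top-k singular directions."""
    u, s, _vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def with_structured(coords, structured):
    """Append structured attributes to the reduced coordinates."""
    return np.hstack([coords, structured])


coords = reduce_documents(X, k=2)
years = np.array([[2001.0], [2002.0], [2002.0], [2003.0]])  # hypothetical
rows = with_structured(coords, years)
print(rows.shape)
```

Each row of `rows` is the kind of fixed-width record a relational table can hold, which is the point of the slide: the text has been forced into the shape the tool expects.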
34
Contextual Refinement: ETL and Semantic
understanding unite
Direct access to RDBMSs for info from some Telcos
ESP lookup
Logic for cleansing
Ordered hits (by quality)
XML feed from other Telcos
Cleansed data to ESP
XML
Flat files (CSV or fixed) from the laggards
Ambiguous data (close hits or unidentified)
Clean data
Error database for manual inspection,
correction, storage/learning
Master database for persistent storage
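The flow above can be sketched as a lookup that returns hits ordered by quality, followed by routing logic that sends confident matches to the master database and everything else to the error database. In this sketch `difflib.SequenceMatcher` stands in for the ESP lookup, and the master entries, threshold values, and function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical master entries; a real deployment would query an index.
MASTER = ["Hansen Bakery AS", "Nordic Fish Export", "Oslo Taxi Central"]


def lookup(name, master=MASTER):
    """Return (score, entry) pairs ordered by match quality, best first."""
    scored = [
        (SequenceMatcher(None, name.lower(), entry.lower()).ratio(), entry)
        for entry in master
    ]
    return sorted(scored, reverse=True)


def cleanse(records, accept=0.9, margin=0.1):
    """Route each raw record to clean data or to the error database.

    A record is accepted only if its best hit scores at least `accept`
    and beats the runner-up by at least `margin`. Both thresholds are
    assumptions chosen for illustration.
    """
    clean, errors = [], []
    for raw in records:
        hits = lookup(raw)
        best_score, best_entry = hits[0]
        runner_up = hits[1][0] if len(hits) > 1 else 0.0
        if best_score >= accept and best_score - runner_up >= margin:
            clean.append((raw, best_entry))
        else:
            errors.append((raw, hits))  # kept for manual inspection
    return clean, errors


clean, errors = cleanse(["Hansen Bakery AS", "Hansen Bakery A/S", "Unknown Co"])
print(clean)
print([raw for raw, _hits in errors])
```

Storing the full ordered hit list with each rejected record gives the manual reviewer the close candidates, and corrections made there can be fed back into the master data.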
35
Contextual Insight: Query-time fact analysis @
sub-document level
... entry probe carried to Saturn's moon Titan as
part of the ...
Intent
Concepts
36
Contextual Navigation: ThisIsTravel
37
Revisit the Assumptions
Scalable Search
By year
  • 1999
  • 2000: 3B
  • 2001: 6B
  • 2002: 12B
  • 2003: 24B
  • 80% Unstructured

Milestones
  • Cave paintings, bone tools: 40,000 BCE
  • Writing: 3500 BCE
  • 0 C.E.
  • Paper: 105
  • Printing: 1450
  • Electricity, telephone: 1870
  • Transistor: 1947
  • Computing: 1950
  • Internet (DARPA): late 1960s
  • The Web: 1993