An introduction to Apache Gora

Description:

A short introduction to Apache Gora: what is it, and how does it work? How can it provide data store abstraction and persistence for big data?

Slides: 12
Provided by: semtechs

Transcript and Presenter's Notes

1
Apache Gora
  • What is it?
  • Gora and Nutch
  • Supports
  • Data Access
  • APIs

www.semtech-solutions.co.nz info_at_semtech-solutions.co.nz
2
Apache Gora: What is it?
  • Provides for Big Data
      • In-memory data model
      • Persistence
      • Data store abstraction
  • Supports persisting to
      • Column stores
      • Key/value stores
      • Document stores
      • RDBMSs
  • Supports use of Hadoop

3
Apache Gora: What is it?
  • Released under the Apache License 2.0
  • Written in Java
  • Offers a persistence framework
  • Designed for big data applications
  • Used by Nutch 2.x for web crawl data storage
  • Used for
      • Persistence
      • Indexing
      • Analytics

4
Apache Gora and Nutch
  • Nutch 2.x now uses Gora
  • Abstracted storage
  • Data store independence
  • Handles mapping of objects to persistent storage
  • Can use various NoSQL solutions

5
Apache Gora: Supports
  • Gora supports the following
      • Apache Accumulo
      • Apache Cassandra
      • Apache HBase
      • Amazon DynamoDB
      • Pig
      • Hive
      • Cascading
      • MapReduce

6
Apache Gora: Data Access
  • Java API for data access
  • Independent of location
  • Core Gora APIs
      • Store
      • Persistency
      • Query
      • MapReduce

7
Apache Gora: Store API
  • Java API org.apache.gora.store.
  • DataStore handles object persistence
  • DataStore methods process objects
      • Persist
      • Fetch
      • Query
      • Delete
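
The four operations listed above (persist, fetch, query, delete) can be pictured with a small, self-contained sketch. Everything in it is hypothetical: `SketchStore`, `InMemorySketchStore`, and their method names are illustrative stand-ins written in plain Java, not the interface found in org.apache.gora.store, and the in-memory sorted map is only an assumed backend. The point is that calling code depends on the interface alone, so the backend could be swapped without changing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Demo entry point. Every type in this file is hypothetical and is NOT Gora's API.
class StoreSketchDemo {
    public static void main(String[] args) {
        SketchStore<String, String> store = new InMemorySketchStore<>();
        store.put("com.example/a", "page A");                               // persist
        store.put("com.example/b", "page B");
        System.out.println(store.get("com.example/a"));                     // fetch
        System.out.println(store.query("com.example/a", "com.example/b"));  // query
        System.out.println(store.delete("com.example/a"));                  // delete
    }
}

// The four operations named on the slide, expressed as a tiny interface.
interface SketchStore<K, V> {
    void put(K key, V value);              // persist an object under a key
    V get(K key);                          // fetch by key, or null if absent
    boolean delete(K key);                 // true if something was removed
    List<V> query(K startKey, K endKey);   // values whose keys fall in an inclusive range
}

// One assumed backend: a sorted in-memory map. Requires startKey <= endKey in query().
class InMemorySketchStore<K extends Comparable<K>, V> implements SketchStore<K, V> {
    private final TreeMap<K, V> rows = new TreeMap<>();

    public void put(K key, V value) { rows.put(key, value); }

    public V get(K key) { return rows.get(key); }

    public boolean delete(K key) { return rows.remove(key) != null; }

    public List<V> query(K startKey, K endKey) {
        return new ArrayList<>(rows.subMap(startKey, true, endKey, true).values());
    }
}
```

A second implementation of `SketchStore` backed by a different store would leave `StoreSketchDemo` unchanged, which is the kind of data store independence the slides describe.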

8
Apache Gora: Persistency API
  • Java API org.apache.gora.persistency.
  • Core classes
      • BeanFactory
          • Construct keys
      • Persistent
          • Persist objects
      • State
          • State managed through StateManager
          • NEW, CLEAN (UNMODIFIED)
          • DIRTY (MODIFIED), DELETED
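
The four states named above lend themselves to a short illustration of dirty-state tracking. This is a minimal sketch under stated assumptions: `SketchState`, `TrackedBean`, and the transition rules (a saved object becomes CLEAN, a changed CLEAN object becomes DIRTY, a deleted object rejects further changes) are hypothetical and are not taken from org.apache.gora.persistency.

```java
// Hypothetical sketch of dirty-state tracking; the transitions are assumptions, not Gora's code.
class StateSketchDemo {
    public static void main(String[] args) {
        TrackedBean bean = new TrackedBean();
        System.out.println(bean.state());   // state of a freshly created object
        bean.markSaved();
        bean.setTitle("Apache Gora");
        System.out.println(bean.state());   // state after modifying a saved object
    }
}

// The four states listed on the slide.
enum SketchState { NEW, CLEAN, DIRTY, DELETED }

class TrackedBean {
    private SketchState state = SketchState.NEW;
    private String title;

    SketchState state() { return state; }

    String title() { return title; }

    // Changing a field marks a CLEAN object as DIRTY; a NEW object stays NEW.
    void setTitle(String title) {
        requireNotDeleted();
        this.title = title;
        if (state == SketchState.CLEAN) {
            state = SketchState.DIRTY;
        }
    }

    // Intended to be called after a successful write to the store.
    void markSaved() {
        requireNotDeleted();
        state = SketchState.CLEAN;
    }

    void markDeleted() { state = SketchState.DELETED; }

    // A store could use this to skip writes for unchanged objects.
    boolean needsWrite() {
        return state == SketchState.NEW || state == SketchState.DIRTY;
    }

    private void requireNotDeleted() {
        if (state == SketchState.DELETED) {
            throw new IllegalStateException("object was deleted");
        }
    }
}
```

Tracking state this way lets a persistence layer write only new or modified objects rather than rewriting everything on each flush.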

9
Apache Gora: Query API
  • Java API org.apache.gora.query.
  • Core classes
      • Query
          • Constructed via DataStore
      • PartitionQuery
          • Divides the results of a Query into partitions
          • Runs queries on data nodes
          • Generates Hadoop InputSplits
      • Result
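
The idea of dividing a query's results into partitions can be shown in isolation. The sketch below is an assumption-laden illustration in plain Java: the `partition` helper is hypothetical and simply cuts a sorted key list into contiguous chunks. It does not reproduce how PartitionQuery or Hadoop InputSplits are actually built; it only shows why partitioning allows each chunk to be processed independently, for example by a separate worker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of splitting a key range into independent partitions.
class PartitionSketchDemo {
    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
        // Each chunk could be handed to a different worker.
        for (List<String> chunk : partition(keys, 2)) {
            System.out.println(chunk);
        }
    }

    // Cuts sortedKeys into at most `parts` contiguous chunks of near-equal size.
    static <K> List<List<K>> partition(List<K> sortedKeys, int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        List<List<K>> chunks = new ArrayList<>();
        int size = (sortedKeys.size() + parts - 1) / parts;   // ceiling division
        for (int from = 0; from < sortedKeys.size(); from += size) {
            int to = Math.min(from + size, sortedKeys.size());
            chunks.add(new ArrayList<>(sortedKeys.subList(from, to)));
        }
        return chunks;
    }
}
```

Keeping each chunk contiguous in key order means a chunk can be described by just a start key and an end key, which suits range scans on sorted stores.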

10
Apache Gora: MapReduce API
  • Java API org.apache.gora.mapreduce.
  • GoraMapper
  • GoraReducer
  • ALL Record Counter
  • Reader
  • Writer
  • Hadoop / Avro
  • Serialise
  • De-serialise
  • Persistent

11
Contact Us
  • Feel free to contact us at
      • www.semtech-solutions.co.nz
      • info_at_semtech-solutions.co.nz
  • We offer IT project consultancy
  • We are happy to hear about your problems
  • You pay only for the hours you need to solve your problems