An introduction to Apache Gora

A short introduction to Apache Gora: what is it and how does it work?
How can it provide data store abstraction and persistency for big data?

Published in: Technology

  1. Apache Gora
     ● What is it?
     ● Gora – Nutch
     ● Supports
     ● Data Access
     ● APIs
     www.semtech-solutions.co.nz
     info@semtech-solutions.co.nz
  2. Apache Gora – What is it?
     ● Provides for Big Data
       – In memory data model
       – Persistence
       – Data store abstraction
     ● Supports persisting to
       – Column stores
       – Key/value stores
       – Document stores
       – RDBMSs
     ● Supports use of Hadoop
  3. Apache Gora – What is it?
     ● Released via Apache 2 license
     ● Written in Java
     ● Offers a persistence framework
     ● Designed for big data applications
     ● Used by Nutch 2.x for web crawl data storage
     ● Used for
       – Persistence
       – Indexing
       – Analytics
  4. Apache Gora – Nutch
     ● Nutch 2.x now uses Gora
       – Abstracted storage
       – Data store independence
       – Handles object to persistent mappings
       – Can use various NoSQL solutions
  5. Apache Gora – Supports
     ● Gora supports the following
       – Apache Accumulo
       – Apache Cassandra
       – Apache HBase
       – Amazon DynamoDB
       – Pig
       – Hive
       – Cascading
       – MapReduce
  6. Apache Gora – Data Access
     ● Java API for data access
       – Independent of location
     ● Core Gora APIs
       – Store
       – Persistency
       – Query
       – MapReduce
  7. Apache Gora – Store API
     ● Java API – org.apache.gora.store.*
       – DataStore handles object persistence
       – DataStore methods process objects
         ● Persist
         ● Fetch
         ● Query
         ● Delete
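The four DataStore operations listed above, together with the idea of data store abstraction, can be sketched in plain Java. This is a minimal illustration only: `SimpleStore`, `InMemoryStore`, and their method names are hypothetical and are not the `org.apache.gora.store.DataStore` interface. The point is that client code depends on the interface, so the backend behind it can be swapped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class StoreSketch {
    // Hypothetical interface for illustration; NOT Gora's DataStore.
    interface SimpleStore<K extends Comparable<K>, T> {
        void put(K key, T value);             // persist
        T get(K key);                         // fetch (null if absent)
        List<T> query(K startKey, K endKey);  // query an inclusive key range
        boolean delete(K key);                // delete (true if a row was removed)
    }

    // One possible backend: a sorted in-memory map. Another implementation of
    // the same interface could wrap a column store or key/value store instead.
    static class InMemoryStore<K extends Comparable<K>, T> implements SimpleStore<K, T> {
        private final TreeMap<K, T> rows = new TreeMap<>();

        @Override public void put(K key, T value) { rows.put(key, value); }

        @Override public T get(K key) { return rows.get(key); }

        @Override public List<T> query(K startKey, K endKey) {
            // Copy the values so callers do not hold a live view of the map.
            return new ArrayList<>(rows.subMap(startKey, true, endKey, true).values());
        }

        @Override public boolean delete(K key) { return rows.remove(key) != null; }
    }

    public static void main(String[] args) {
        SimpleStore<String, String> store = new InMemoryStore<>();
        store.put("page/a", "first page");
        store.put("page/b", "second page");
        store.put("page/c", "third page");

        System.out.println(store.get("page/b"));
        System.out.println(store.query("page/a", "page/b"));
        System.out.println(store.delete("page/a"));
    }
}
```

In this sketch only the line that constructs `InMemoryStore` names a concrete backend; the rest of `main` would not change if a different implementation were supplied.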
  8. Apache Gora – Persistency API
     ● Java API – org.apache.gora.persistency.*
       – Core classes
         ● BeanFactory – Construct keys
         ● Persistent – Persist objects
         ● State – State managed through StateManager
           – NEW, CLEAN (UNMODIFIED)
           – DIRTY (MODIFIED), DELETED
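The four states named on this slide suggest a small state machine per object. The following is a minimal sketch of that idea, assuming an object starts as NEW, becomes CLEAN once saved, and becomes DIRTY when modified after a save. Only the state names come from the slide; `TrackedPage` and its methods are hypothetical and are not Gora's `Persistent` or `StateManager` types.

```java
class StateSketch {
    // State names as listed on the slide.
    enum State { NEW, CLEAN, DIRTY, DELETED }

    // Hypothetical bean that tracks its own persistence state.
    static class TrackedPage {
        private String title;
        private State state = State.NEW;

        String getTitle() { return title; }
        State getState() { return state; }

        void setTitle(String title) {
            this.title = title;
            // Only a previously saved (CLEAN) object becomes DIRTY;
            // a NEW object stays NEW until its first save.
            if (state == State.CLEAN) {
                state = State.DIRTY;
            }
        }

        // Called by a (hypothetical) store after a successful write.
        void markSaved() {
            if (state != State.DELETED) {
                state = State.CLEAN;
            }
        }

        void markDeleted() { state = State.DELETED; }

        // A store can skip objects that have nothing to write.
        boolean needsWrite() {
            return state == State.NEW || state == State.DIRTY;
        }
    }

    public static void main(String[] args) {
        TrackedPage page = new TrackedPage();
        page.setTitle("draft");
        System.out.println(page.getState());
        page.markSaved();
        System.out.println(page.getState());
        page.setTitle("edited");
        System.out.println(page.getState());
        page.markDeleted();
        System.out.println(page.getState());
    }
}
```

Tracking state this way lets a persistence layer write only the objects that are new or modified, which is one plausible reason a framework would expose these states.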
  9. Apache Gora – Query API
     ● Java API – org.apache.gora.query.*
       – Core classes
         ● Query
           – Constructed via DataStore
         ● PartitionQuery
           – Divide results of Query into partitions
           – Run queries on data nodes
           – Generate Hadoop InputSplits
         ● Result
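The idea behind dividing one query into partitions can be illustrated with a small sketch. It assumes, purely for illustration, that rows are addressed by a numeric index and that the range is split into near-equal pieces; a real data store would more likely partition along its own physical layout. `PartitionSketch`, `Range`, and `partition` are hypothetical names, not part of `org.apache.gora.query`.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionSketch {
    // Hypothetical value type: a half-open range [start, end) of row indexes.
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    // Split [start, end) into at most `parts` contiguous, non-overlapping ranges.
    // Any remainder is spread over the first ranges, one extra row each.
    static List<Range> partition(long start, long end, int parts) {
        if (parts <= 0 || end < start) {
            throw new IllegalArgumentException("need parts > 0 and end >= start");
        }
        List<Range> out = new ArrayList<>();
        long total = end - start;
        long base = total / parts;
        long extra = total % parts;
        long cursor = start;
        for (int i = 0; i < parts; i++) {
            long size = base + (i < extra ? 1 : 0);
            if (size == 0) {
                break; // fewer rows than partitions: stop instead of emitting empty ranges
            }
            out.add(new Range(cursor, cursor + size));
            cursor += size;
        }
        return out;
    }

    public static void main(String[] args) {
        // Each range could be handed to a separate worker, in the same way
        // that each Hadoop InputSplit is handed to a separate map task.
        for (Range r : partition(0, 10, 3)) {
            System.out.println(r);
        }
    }
}
```

Because the ranges are contiguous and non-overlapping, processing each one independently and combining the results covers every row exactly once.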
  10. Apache Gora – MapReduce API
     ● Java API – org.apache.gora.mapreduce.*
       – GoraMapper
       – GoraReducer
       – ALL Record Counter
       – Reader
       – Writer
       – Hadoop / Avro
         ● Serialise
         ● De-serialise
         ● Persistent
  11. Contact Us
     ● Feel free to contact us at
       – www.semtech-solutions.co.nz
       – info@semtech-solutions.co.nz
     ● We offer IT project consultancy
     ● We are happy to hear about your problems
     ● You can just pay for those hours that you need to solve your problems