SlideShare a Scribd company logo
1 of 8
Google Percolator
● What is it ?
● What is it used for ?
● Percolator Vs MapReduce
● Architecture
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Percolator – What is it ?
● Incremental updates to Big Data
● Developed by Google
● Based on Google File System ( GFS )
● Provides transactions and locking
● Faster than comparable Map Reduce
● Developed by Google due to MapReduce limitations
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Percolator – What is it used for ?
● Iterative updates
● No need to batch process
● Update as data received
● Data in multi petabyte range
● Strong consistency needed
● Improved latency ( 100 x )
● Reduced document age ( 50 % )
● Random access to big data repository
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Percolator Vs MapReduce
Map Reduce
● Batch Processing
● No transactions
● Latency A
● Run time scales with data
● Code in C++
● Open source
● Uses HDFS
Percolator
– Iterative
– Transactions
– Latency 100 x A
– Incremental updates
– Code in Java ( mainly )
– Google owned
– Uses GFS
www.semtech-solutions.co.nz info@semtech-solutions.co.nz
Percolator – Architecture
● Applications are a sequence of observers
● An observer is called via a notification
● A notification is triggered when table data changes
● Application calls TabletServer via RPC
● TabletServer calls GFS ChunkServer
Percolator – Architecture
● Applications
– Series of observers
● Observer
– Completes task
– Updates table
● Next Observer called
– Via notification
● Percolator worker
– Scans for changes
– Sends notifications
Percolator – Architecture
Actual worker diagram including time stamping and locking
via Chubby lock server
Contact Us
● Feel free to contact us at
– www.semtech-solutions.co.nz
– info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You can just pay for those hours that you need
● To solve your problems

More Related Content

Viewers also liked

An Introduction to Soft Computing
An Introduction to Soft ComputingAn Introduction to Soft Computing
An Introduction to Soft Computing
Tameem Ahmad
 
virtualization and hypervisors
virtualization and hypervisorsvirtualization and hypervisors
virtualization and hypervisors
Gaurav Suri
 
Virtualization Techniques & Cloud Compting
Virtualization Techniques & Cloud ComptingVirtualization Techniques & Cloud Compting
Virtualization Techniques & Cloud Compting
Ahmed Mekkawy
 
Soft computing (ANN and Fuzzy Logic) : Dr. Purnima Pandit
Soft computing (ANN and Fuzzy Logic)  : Dr. Purnima PanditSoft computing (ANN and Fuzzy Logic)  : Dr. Purnima Pandit
Soft computing (ANN and Fuzzy Logic) : Dr. Purnima Pandit
Purnima Pandit
 
Virtualization presentation
Virtualization presentationVirtualization presentation
Virtualization presentation
Mangesh Gunjal
 

Viewers also liked (17)

Memory virtualization
Memory virtualizationMemory virtualization
Memory virtualization
 
No sql databases
No sql databasesNo sql databases
No sql databases
 
Storage Virtualization
Storage VirtualizationStorage Virtualization
Storage Virtualization
 
5. IO virtualization
5. IO virtualization5. IO virtualization
5. IO virtualization
 
4. Memory virtualization and management
4. Memory virtualization and management4. Memory virtualization and management
4. Memory virtualization and management
 
An Introduction to Soft Computing
An Introduction to Soft ComputingAn Introduction to Soft Computing
An Introduction to Soft Computing
 
VMware Esx Short Presentation
VMware Esx Short PresentationVMware Esx Short Presentation
VMware Esx Short Presentation
 
virtualization and hypervisors
virtualization and hypervisorsvirtualization and hypervisors
virtualization and hypervisors
 
Column base plates_prof_thomas_murray
Column base plates_prof_thomas_murrayColumn base plates_prof_thomas_murray
Column base plates_prof_thomas_murray
 
Virtualization Techniques & Cloud Compting
Virtualization Techniques & Cloud ComptingVirtualization Techniques & Cloud Compting
Virtualization Techniques & Cloud Compting
 
Basics of Soft Computing
Basics of Soft  Computing Basics of Soft  Computing
Basics of Soft Computing
 
Soft computing (ANN and Fuzzy Logic) : Dr. Purnima Pandit
Soft computing (ANN and Fuzzy Logic)  : Dr. Purnima PanditSoft computing (ANN and Fuzzy Logic)  : Dr. Purnima Pandit
Soft computing (ANN and Fuzzy Logic) : Dr. Purnima Pandit
 
Virtualization basics
Virtualization basics Virtualization basics
Virtualization basics
 
Virtualization and cloud Computing
Virtualization and cloud ComputingVirtualization and cloud Computing
Virtualization and cloud Computing
 
Soft computing
Soft computingSoft computing
Soft computing
 
Extraction processes
Extraction processes Extraction processes
Extraction processes
 
Virtualization presentation
Virtualization presentationVirtualization presentation
Virtualization presentation
 

More from Mike Frampton

An introduction to Apache Mesos
An introduction to Apache MesosAn introduction to Apache Mesos
An introduction to Apache Mesos
Mike Frampton
 
An introduction to Pentaho
An introduction to PentahoAn introduction to Pentaho
An introduction to Pentaho
Mike Frampton
 

More from Mike Frampton (20)

Apache Airavata
Apache AiravataApache Airavata
Apache Airavata
 
Apache MADlib AI/ML
Apache MADlib AI/MLApache MADlib AI/ML
Apache MADlib AI/ML
 
Apache MXNet AI
Apache MXNet AIApache MXNet AI
Apache MXNet AI
 
Apache Gobblin
Apache GobblinApache Gobblin
Apache Gobblin
 
Apache Singa AI
Apache Singa AIApache Singa AI
Apache Singa AI
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
OrientDB
OrientDBOrientDB
OrientDB
 
Prometheus
PrometheusPrometheus
Prometheus
 
Apache Tephra
Apache TephraApache Tephra
Apache Tephra
 
Apache Kudu
Apache KuduApache Kudu
Apache Kudu
 
Apache Bahir
Apache BahirApache Bahir
Apache Bahir
 
Apache Arrow
Apache ArrowApache Arrow
Apache Arrow
 
JanusGraph DB
JanusGraph DBJanusGraph DB
JanusGraph DB
 
Apache Ignite
Apache IgniteApache Ignite
Apache Ignite
 
Apache Samza
Apache SamzaApache Samza
Apache Samza
 
Apache Flink
Apache FlinkApache Flink
Apache Flink
 
Apache Edgent
Apache EdgentApache Edgent
Apache Edgent
 
Apache CouchDB
Apache CouchDBApache CouchDB
Apache CouchDB
 
An introduction to Apache Mesos
An introduction to Apache MesosAn introduction to Apache Mesos
An introduction to Apache Mesos
 
An introduction to Pentaho
An introduction to PentahoAn introduction to Pentaho
An introduction to Pentaho
 

Recently uploaded

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

An Introduction to Google Percolator

  • 1. Google Percolator ● What is it ? ● What is it used for ? ● Percolator Vs MapReduce ● Architecture www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 2. Percolator – What is it ? ● Incremental updates to Big Data ● Developed by Google ● Based on Google File System ( GFS ) ● Provides transactions and locking ● Faster than comparable Map Reduce ● Developed by Google due to MapReduce limitations www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 3. Percolator – What is it used for ? ● Iterative updates ● No need to batch process ● Update as data received ● Data in multi petabyte range ● Strong consistency needed ● Improved latency ( 100 x ) ● Reduced document age ( 50 % ) ● Random access to big data repository www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 4. Percolator Vs MapReduce Map Reduce ● Batch Processing ● No transactions ● Latency A ● Run time scales with data ● Code in C++ ● Open source ● Uses HDFS Percolator – Iterative – Transactions – Latency 100 x A – Incremental updates – Code in Java ( mainly ) – Google owned – Uses GFS www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  • 5. Percolator – Architecture ● Applications are a sequence of observers ● An observer is called via a notification ● A notification is triggered when table data changes ● Application calls TabletServer via RPC ● TabletServer calls GFS ChunkServer
  • 6. Percolator – Architecture ● Applications – Series of observers ● Observer – Completes task – Updates table ● Next Observer called – Via notification ● Percolator worker – Scans for changes – Sends notifications
  • 7. Percolator – Architecture Actual worker diagram including time stamping and locking via Chubby lock server
  • 8. Contact Us ● Feel free to contact us at – www.semtech-solutions.co.nz – info@semtech-solutions.co.nz ● We offer IT project consultancy ● We are happy to hear about your problems ● You can just pay for those hours that you need ● To solve your problems