SlideShare a Scribd company logo
1 of 22
Coprocessors – Uses, Abuses
Solutions
• 26 // SEPTEMBER // 2016
COPYRIGHT 2016 BLOOMBERG
FINANCE L.P. ALL RIGHTS RESERVED.
Esther Kundin
(With guest appearance by Clay Baenziger)
Coprocessors
What is a coprocessor?
– Custom jar loaded into HBase daemon process
– Endpoint – like a stored procedure
– Observer – like a trigger
Observers
– Region Observer
• preGet
• postGet
• prePut
• postPut
– WAL Observer
– Master Observer
• runs in HBase master
• Create, Delete, Modify table
Why use a coprocesor?
– Simple filter or aggregation run on your data
– Reduces amount of data being sent to the client
– NOT for complex data analysis
– Ex: Apache Phoenix (“We put the SQL back in
NoSQL”)
PORT – A sample use case
Post-Get example
RegionServer
postGet
Key Col1 Col2 Col3 Col4 Col5
Key1Abc 1 4 5
Key1Def 2 2 2
Key1Xyz 10 11 12
Key1 Abc-
col1
Def-
col2
Abc-
col3
Abc-
col4
Xyz-col5
Key1 1 2 4 5 12
Table Representation:
Coprocessor Result:
Problems and
Solutions
Coprocessors crash regionservers
– Exceptions (other than IOExceptions) in the
coprocessor bring down the RegionServer
– In other cases, the coprocessor silently unloads
Solution – catch all exceptions
public final void prePut(...)
throws IOException {
try {
prePutImpl(…);
}
catch(IOException ex) {
// Allow IOExceptions to propagate
// They won't cause an unload
throw ex;
}
catch(Throwable ex) {
// Wrap other exceptions as IOException
LOG.error("prePut: caught ", ex);
throw new IOException(ex);
}
}
Coprocessors can hog memory
– Memory is shared with RegionServer memory and
coprocessor memory
– Memory hogging slows RegionServer Performance
Solutions - defensive Java code
– Profile all coprocessor code for memory usage
• Use a generic profiler with a driver for your
coprocessor
– Use common Java tricks for limiting memory usage
• Use primitive types and underlying arrays where
possible
• Use immutable objects
• StringBuilder vs String concatenation
Problems with deployment
– Manual Deployment
• disable table
• assign new coprocessor
• enable table
– Rollout of non-backward-compatible coprocessor
difficult
Solutions
– HBASE-7639 – online schema update is enabled,
perhaps it will work
– Hard-code jar path in hbase-site.xml
• Used by Apache Phoenix
• Not the best approach for user-defined coprocessors
Logging and metrics tips
– Update log4j.properties file with a separate log
parameter for coprocessors
– Use MDC context to pass parameters to all parts of
the coprocessor
(http://www.slf4j.org/api/org/slf4j/MDC.html)
– Create an extra column in a Result to pass back an
object populated with metrics
Unsolved issues
– Bad request can bring down the whole cluster
– Missing jar will bring down the RegionServer
ERROR
org.apache.hadoop.hbase.coprocessor.CoprocessorHos
t: The coprocessor
fooCoprocessor threw
java.io.FileNotFoundException: File does not
exist:
/path/to/corprocessor.jar
java.io.FileNotFoundException: File does not
exist: /path/to/corprocessor.jar
(Preventing)
Abuses
Load Failures
– Affects all region servers – one at a time
– Affect some operations and not others (e.g. scan works, not get)
– HTable descriptors contain coprocessor class:
Clean-up can be messy HBASE-14190 - Assign system tables ahead of user
region assignment
– Set table property:
hbase.coprocessor.abortonerror to false
2016-09-24 02:32:07,366 ERROR
org.apache.hadoop.hbase.regionserver.RegionCoproce
ssorHost: Failed to load coprocessor
net.clayb.hbase.coprocessor.RegionObserver
java.io.FileNotFoundException: File does not
exist: hdfs://Test/user/foo/clayCoprocessor.jar
(Region server stays alive only table stays disabled)
Handler Failure
– RPC starvation is simple and non-obvious failure:
public class RegionObserverInfinity extends
BaseRegionObserver {
public void preGetOp(…) throws IOException {
for(;;){ LOG.trace(“Off I go…”); }
}
– Use jstack to see what is up in a region server:
clay@hbase-regionserver:~$ sudo jstack 3990
[…]
net.clayb.RegionObserverInfinity.preGetOp(…) @bci=12,
line=28 (Compiled frame; information may be imprecise)
Coprocessor Whitelisting
– Coprocessors are key to HBase operation:
• security.access.AccessController
• security.token.TokenProvider
• security.access.SecureBulkLoadEndpoint
• security.access.AccessController
• MultiRowMutationEndpoint
– hbase.coprocessor.user.enabled – disables all user
coprocessors (e.g. Apache Phoenix)
– HBASE-16700 – “Allow for coprocessor whitelisting” or
abuse HBASE-15686
Recap
– Coprocessors are dangerous:
Coprocessors are an advanced feature of HBase and
are intended to be used by system developers only. –
HBase Book
– Write defensive code!
– Needed from the community
• Story for coprocessor deployment
• Process isolation
• JMX metrics
Thank you!

More Related Content

What's hot

Overview of Cascading 3.0 on Apache Flink
Overview of Cascading 3.0 on Apache Flink Overview of Cascading 3.0 on Apache Flink
Overview of Cascading 3.0 on Apache Flink Cascading
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsdatamantra
 
Capture the Streams of Database Changes
Capture the Streams of Database ChangesCapture the Streams of Database Changes
Capture the Streams of Database Changesconfluent
 
ProxySQL - High Performance and HA Proxy for MySQL
ProxySQL - High Performance and HA Proxy for MySQLProxySQL - High Performance and HA Proxy for MySQL
ProxySQL - High Performance and HA Proxy for MySQLRené Cannaò
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesHBaseCon
 
SAP ASE Migration Lessons Learned
SAP ASE Migration Lessons LearnedSAP ASE Migration Lessons Learned
SAP ASE Migration Lessons LearnedAliter Consulting
 
Deployment topologies for high availability (ha)
Deployment topologies for high availability (ha)Deployment topologies for high availability (ha)
Deployment topologies for high availability (ha)Deepak Mane
 
Flink history, roadmap and vision
Flink history, roadmap and visionFlink history, roadmap and vision
Flink history, roadmap and visionStephan Ewen
 
Solution for events logging with akka streams and kafka
Solution for events logging with akka streams and kafkaSolution for events logging with akka streams and kafka
Solution for events logging with akka streams and kafkaAnatoly Sementsov
 
How Pulsar Stores Your Data - Pulsar Summit NA 2021
How Pulsar Stores Your Data - Pulsar Summit NA 2021How Pulsar Stores Your Data - Pulsar Summit NA 2021
How Pulsar Stores Your Data - Pulsar Summit NA 2021StreamNative
 
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Till Rohrmann
 
New features in Rails 6 - Nihad Abbasov (RUS) | Ruby Meditation 26
New features in Rails 6 -  Nihad Abbasov (RUS) | Ruby Meditation 26New features in Rails 6 -  Nihad Abbasov (RUS) | Ruby Meditation 26
New features in Rails 6 - Nihad Abbasov (RUS) | Ruby Meditation 26Ruby Meditation
 
Hbasecon2019 hbck2 (1)
Hbasecon2019 hbck2 (1)Hbasecon2019 hbck2 (1)
Hbasecon2019 hbck2 (1)wchevreuil
 
Oslo Vancouver Project Update
Oslo Vancouver Project UpdateOslo Vancouver Project Update
Oslo Vancouver Project UpdateBen Nemec
 
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward
 
Apache Airflow in Production
Apache Airflow in ProductionApache Airflow in Production
Apache Airflow in ProductionRobert Sanders
 
Archivematica Technical Training Diagnostics Guide (September 2018)
Archivematica Technical Training Diagnostics Guide (September 2018)Archivematica Technical Training Diagnostics Guide (September 2018)
Archivematica Technical Training Diagnostics Guide (September 2018)Artefactual Systems - Archivematica
 
Juggling with Bits and Bytes - How Apache Flink operates on binary data
Juggling with Bits and Bytes - How Apache Flink operates on binary dataJuggling with Bits and Bytes - How Apache Flink operates on binary data
Juggling with Bits and Bytes - How Apache Flink operates on binary dataFabian Hueske
 
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...Michael Stack
 

What's hot (20)

Overview of Cascading 3.0 on Apache Flink
Overview of Cascading 3.0 on Apache Flink Overview of Cascading 3.0 on Apache Flink
Overview of Cascading 3.0 on Apache Flink
 
Optimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloadsOptimizing S3 Write-heavy Spark workloads
Optimizing S3 Write-heavy Spark workloads
 
Capture the Streams of Database Changes
Capture the Streams of Database ChangesCapture the Streams of Database Changes
Capture the Streams of Database Changes
 
ProxySQL - High Performance and HA Proxy for MySQL
ProxySQL - High Performance and HA Proxy for MySQLProxySQL - High Performance and HA Proxy for MySQL
ProxySQL - High Performance and HA Proxy for MySQL
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
 
SAP ASE Migration Lessons Learned
SAP ASE Migration Lessons LearnedSAP ASE Migration Lessons Learned
SAP ASE Migration Lessons Learned
 
Deployment topologies for high availability (ha)
Deployment topologies for high availability (ha)Deployment topologies for high availability (ha)
Deployment topologies for high availability (ha)
 
Flink history, roadmap and vision
Flink history, roadmap and visionFlink history, roadmap and vision
Flink history, roadmap and vision
 
Solution for events logging with akka streams and kafka
Solution for events logging with akka streams and kafkaSolution for events logging with akka streams and kafka
Solution for events logging with akka streams and kafka
 
How Pulsar Stores Your Data - Pulsar Summit NA 2021
How Pulsar Stores Your Data - Pulsar Summit NA 2021How Pulsar Stores Your Data - Pulsar Summit NA 2021
How Pulsar Stores Your Data - Pulsar Summit NA 2021
 
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
Dynamic Scaling: How Apache Flink Adapts to Changing Workloads (at FlinkForwa...
 
New features in Rails 6 - Nihad Abbasov (RUS) | Ruby Meditation 26
New features in Rails 6 -  Nihad Abbasov (RUS) | Ruby Meditation 26New features in Rails 6 -  Nihad Abbasov (RUS) | Ruby Meditation 26
New features in Rails 6 - Nihad Abbasov (RUS) | Ruby Meditation 26
 
Hbasecon2019 hbck2 (1)
Hbasecon2019 hbck2 (1)Hbasecon2019 hbck2 (1)
Hbasecon2019 hbck2 (1)
 
Oslo Vancouver Project Update
Oslo Vancouver Project UpdateOslo Vancouver Project Update
Oslo Vancouver Project Update
 
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud"
 
Apache Airflow in Production
Apache Airflow in ProductionApache Airflow in Production
Apache Airflow in Production
 
Archivematica Technical Training Diagnostics Guide (September 2018)
Archivematica Technical Training Diagnostics Guide (September 2018)Archivematica Technical Training Diagnostics Guide (September 2018)
Archivematica Technical Training Diagnostics Guide (September 2018)
 
Juggling with Bits and Bytes - How Apache Flink operates on binary data
Juggling with Bits and Bytes - How Apache Flink operates on binary dataJuggling with Bits and Bytes - How Apache Flink operates on binary data
Juggling with Bits and Bytes - How Apache Flink operates on binary data
 
Gobblin on-aws
Gobblin on-awsGobblin on-aws
Gobblin on-aws
 
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
 

Similar to Coprocessors – Uses, Abuses and Solutions

Apache Performance Tuning: Scaling Up
Apache Performance Tuning: Scaling UpApache Performance Tuning: Scaling Up
Apache Performance Tuning: Scaling UpSander Temme
 
(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP Performance(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP PerformanceBIOVIA
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkRahul Jain
 
WE18_Performance_Up.ppt
WE18_Performance_Up.pptWE18_Performance_Up.ppt
WE18_Performance_Up.pptwebhostingguy
 
Spy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platformSpy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platformRedge Technologies
 
(ATS4-PLAT01) Core Architecture Changes in AEP 9.0 and their Impact on Admini...
(ATS4-PLAT01) Core Architecture Changes in AEP 9.0 and their Impact on Admini...(ATS4-PLAT01) Core Architecture Changes in AEP 9.0 and their Impact on Admini...
(ATS4-PLAT01) Core Architecture Changes in AEP 9.0 and their Impact on Admini...BIOVIA
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...Cloudera, Inc.
 
Logs aggregation and analysis
Logs aggregation and analysisLogs aggregation and analysis
Logs aggregation and analysisDivante
 
What no one tells you about writing a streaming app
What no one tells you about writing a streaming appWhat no one tells you about writing a streaming app
What no one tells you about writing a streaming apphadooparchbook
 
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...Spark Summit
 
Optimizing LAMPhp Applications
Optimizing LAMPhp ApplicationsOptimizing LAMPhp Applications
Optimizing LAMPhp ApplicationsPiyush Goel
 
Streaming in Practice - Putting Apache Kafka in Production
Streaming in Practice - Putting Apache Kafka in ProductionStreaming in Practice - Putting Apache Kafka in Production
Streaming in Practice - Putting Apache Kafka in Productionconfluent
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutSander Temme
 
From content to search: speed-dating Apache Solr (ApacheCON 2018)
From content to search: speed-dating Apache Solr (ApacheCON 2018)From content to search: speed-dating Apache Solr (ApacheCON 2018)
From content to search: speed-dating Apache Solr (ApacheCON 2018)Alexandre Rafalovitch
 
Unified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache SamzaUnified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache SamzaDataWorks Summit
 
Tips and Tricks for Operating Apache Kafka
Tips and Tricks for Operating Apache KafkaTips and Tricks for Operating Apache Kafka
Tips and Tricks for Operating Apache KafkaAll Things Open
 
PLNOG 18 - Paweł Małachowski - Spy hard czyli regexpem po pakietach
PLNOG 18 - Paweł Małachowski - Spy hard czyli regexpem po pakietachPLNOG 18 - Paweł Małachowski - Spy hard czyli regexpem po pakietach
PLNOG 18 - Paweł Małachowski - Spy hard czyli regexpem po pakietachPROIDEA
 
Jesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewJesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewNagios
 

Similar to Coprocessors – Uses, Abuses and Solutions (20)

Apache Performance Tuning: Scaling Up
Apache Performance Tuning: Scaling UpApache Performance Tuning: Scaling Up
Apache Performance Tuning: Scaling Up
 
(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP Performance(ATS6-PLAT06) Maximizing AEP Performance
(ATS6-PLAT06) Maximizing AEP Performance
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache Spark
 
WE18_Performance_Up.ppt
WE18_Performance_Up.pptWE18_Performance_Up.ppt
WE18_Performance_Up.ppt
 
Spy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platformSpy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platform
 
Performance_Up.ppt
Performance_Up.pptPerformance_Up.ppt
Performance_Up.ppt
 
(ATS4-PLAT01) Core Architecture Changes in AEP 9.0 and their Impact on Admini...
(ATS4-PLAT01) Core Architecture Changes in AEP 9.0 and their Impact on Admini...(ATS4-PLAT01) Core Architecture Changes in AEP 9.0 and their Impact on Admini...
(ATS4-PLAT01) Core Architecture Changes in AEP 9.0 and their Impact on Admini...
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
 
Logs aggregation and analysis
Logs aggregation and analysisLogs aggregation and analysis
Logs aggregation and analysis
 
What no one tells you about writing a streaming app
What no one tells you about writing a streaming appWhat no one tells you about writing a streaming app
What no one tells you about writing a streaming app
 
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
What No One Tells You About Writing a Streaming App: Spark Summit East talk b...
 
Optimizing LAMPhp Applications
Optimizing LAMPhp ApplicationsOptimizing LAMPhp Applications
Optimizing LAMPhp Applications
 
Streaming in Practice - Putting Apache Kafka in Production
Streaming in Practice - Putting Apache Kafka in ProductionStreaming in Practice - Putting Apache Kafka in Production
Streaming in Practice - Putting Apache Kafka in Production
 
Apache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling OutApache Performance Tuning: Scaling Out
Apache Performance Tuning: Scaling Out
 
From content to search: speed-dating Apache Solr (ApacheCON 2018)
From content to search: speed-dating Apache Solr (ApacheCON 2018)From content to search: speed-dating Apache Solr (ApacheCON 2018)
From content to search: speed-dating Apache Solr (ApacheCON 2018)
 
Unified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache SamzaUnified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache Samza
 
Tips and Tricks for Operating Apache Kafka
Tips and Tricks for Operating Apache KafkaTips and Tricks for Operating Apache Kafka
Tips and Tricks for Operating Apache Kafka
 
Logstash
LogstashLogstash
Logstash
 
PLNOG 18 - Paweł Małachowski - Spy hard czyli regexpem po pakietach
PLNOG 18 - Paweł Małachowski - Spy hard czyli regexpem po pakietachPLNOG 18 - Paweł Małachowski - Spy hard czyli regexpem po pakietach
PLNOG 18 - Paweł Małachowski - Spy hard czyli regexpem po pakietach
 
Jesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture OverviewJesse Olson - Nagios Log Server Architecture Overview
Jesse Olson - Nagios Log Server Architecture Overview
 

Recently uploaded

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 

Recently uploaded (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 

Coprocessors – Uses, Abuses and Solutions

  • 1. Coprocessors – Uses, Abuses Solutions • 26 // SEPTEMBER // 2016 COPYRIGHT 2016 BLOOMBERG FINANCE L.P. ALL RIGHTS RESERVED. Esther Kundin (With guest appearance by Clay Baenziger)
  • 3. What is a coprocessor? – Custom jar loaded into HBase daemon process – Endpoint – like a stored procedure – Observer – like a trigger
  • 4. Observers – Region Observer • preGet • postGet • prePut • postPut – WAL Observer – Master Observer • runs in HBase master • Create, Delete, Modify table
  • 5. Why use a coprocesor? – Simple filter or aggregation run on your data – Reduces amount of data being sent to the client – NOT for complex data analysis – Ex: Apache Phoenix (“We put the SQL back in NoSQL”)
  • 6. PORT – A sample use case
  • 7. Post-Get example RegionServer postGet Key Col1 Col2 Col3 Col4 Col5 Key1Abc 1 4 5 Key1Def 2 2 2 Key1Xyz 10 11 12 Key1 Abc- col1 Def- col2 Abc- col3 Abc- col4 Xyz-col5 Key1 1 2 4 5 12 Table Representation: Coprocessor Result:
  • 9. Coprocessors crash regionservers – Exceptions (other than IOExceptions) in the coprocessor bring down the RegionServer – In other cases, the coprocessor silently unloads
  • 10. Solution – catch all exceptions public final void prePut(...) throws IOException { try { prePutImpl(…); } catch(IOException ex) { // Allow IOExceptions to propagate // They won't cause an unload throw ex; } catch(Throwable ex) { // Wrap other exceptions as IOException LOG.error("prePut: caught ", ex); throw new IOException(ex); } }
  • 11. Coprocessors can hog memory – Memory is shared with RegionServer memory and coprocessor memory – Memory hogging slows RegionServer Performance
  • 12. Solutions - defensive Java code – Profile all coprocessor code for memory usage • Use a generic profiler with a driver for your coprocessor – Use common Java tricks for limiting memory usage • Use primitive types and underlying arrays where possible • Use immutable objects • StringBuilder vs String concatenation
  • 13. Problems with deployment – Manual Deployment • disable table • assign new coprocessor • enable table – Rollout of non-backward-compatible coprocessor difficult
  • 14. Solutions – HBASE-7639 – online schema update is enabled, perhaps it will work – Hard-code jar path in hbase-site.xml • Used by Apache Phoenix • Not the best approach for user-defined coprocessors
  • 15. Logging and metrics tips – Update log4j.properties file with a separate log parameter for coprocessors – Use MDC context to pass parameters to all parts of the coprocessor (http://www.slf4j.org/api/org/slf4j/MDC.html) – Create an extra column in a Result to pass back an object populated with metrics
  • 16. Unsolved issues – Bad request can bring down the whole cluster – Missing jar will bring down the RegionServer ERROR org.apache.hadoop.hbase.coprocessor.CoprocessorHos t: The coprocessor fooCoprocessor threw java.io.FileNotFoundException: File does not exist: /path/to/corprocessor.jar java.io.FileNotFoundException: File does not exist: /path/to/corprocessor.jar
  • 18. Load Failures – Affects all region servers – one at a time – Affect some operations and not others (e.g. scan works, not get) – HTable descriptors contain coprocessor class: Clean-up can be messy HBASE-14190 - Assign system tables ahead of user region assignment – Set table property: hbase.coprocessor.abortonerror to false 2016-09-24 02:32:07,366 ERROR org.apache.hadoop.hbase.regionserver.RegionCoproce ssorHost: Failed to load coprocessor net.clayb.hbase.coprocessor.RegionObserver java.io.FileNotFoundException: File does not exist: hdfs://Test/user/foo/clayCoprocessor.jar (Region server stays alive only table stays disabled)
  • 19. Handler Failure – RPC starvation is simple and non-obvious failure: public class RegionObserverInfinity extends BaseRegionObserver { public void preGetOp(…) throws IOException { for(;;){ LOG.trace(“Off I go…”); } } – Use jstack to see what is up in a region server: clay@hbase-regionserver:~$ sudo jstack 3990 […] net.clayb.RegionObserverInfinity.preGetOp(…) @bci=12, line=28 (Compiled frame; information may be imprecise)
  • 20. Coprocessor Whitelisting – Coprocessors are key to HBase operation: • security.access.AccessController • security.token.TokenProvider • security.access.SecureBulkLoadEndpoint • security.access.AccessController • MultiRowMutationEndpoint – hbase.coprocessor.user.enabled – disables all user coprocessors (e.g. Apache Phoenix) – HBASE-16700 – “Allow for coprocessor whitelisting” or abuse HBASE-15686
  • 21. Recap – Coprocessors are dangerous: Coprocessors are an advanced feature of HBase and are intended to be used by system developers only. – HBase Book – Write defensive code! – Needed from the community • Story for coprocessor deployment • Process isolation • JMX metrics

Editor's Notes

  1. - HBase is a good choice for doing a random access over large data
  2. HBASE-15686 provides hbase.coprocessor.classloader.included.classes to exclude loading coprocessors based on Java class name (for class isolation – not operator sanity)