SlideShare a Scribd company logo
1 of 14
Download to read offline
What Is Apache Tephra ?
● Provides transactions for HBase and Phoenix
● Apache incubating project
● Uses HBase's native data versioning to
● Provide multi-versioned concurrency control (MVCC)
● For transactional reads and writes
● Provides snapshot isolation of concurrent transactions
● Open source / Apache 2.0 license
Tephra Architecture
● Tephra has three main components
● Transaction Server
– Maintains global view of transaction state
– Assigns new transaction IDs
– Performs conflict detection
● Transaction Client
– Coordinates start, commit
– And rollback of transactions
Tephra Architecture
● Tephra has three main components
● TransactionProcessor Coprocessor
– Applies filtering to the data read
● (based on a given transaction's state)
– Cleans up any data from old
● (no longer visible) transactions
● Multiple transaction server instances can run concurrently
– Allows for automatic failover
– One server instance is actively serving requests
– Configured by ZooKeeper
Tephra Phoenix
● Tephra is an incubating Apache project
● Phoenix uses Tephra for transaction support
● So this functionality is in a beta stage
● It gives cross row and cross table transaction support
● And full ACID semantics
● Remember that Phoenix uses Hbase as it's backing store
● Next slides show configuration
Phoenix Architecture ( Reminder )
Tephra Phoenix Config
● Add the following config
● To your client side hbase-site.xml file
● To enable transactions
<property>
<name>phoenix.transactions.enabled</name>
<value>true</value>
</property>
Tephra Phoenix Config
● Add the following config
● To your server side hbase-site.xml file
● To configure the transaction manager
<property>
<name>data.tx.snapshot.dir</name>
<value>/tmp/tephra/snapshots</value>
</property>
Tephra Phoenix Config
● Add the following config
● To your server side hbase-site.xml file
● To set the transaction timeout
<property>
<name>data.tx.timeout</name>
<value>60</value>
</property>
● Then you can start Tephra on Phoenix
./bin/tephra
Tephra Requirements
Component
Java
HDFS
Hbase
ZooKeeper
Source
Apache Hadoop
CDH or HDP
MapR
Apache
CDH or HDP
MapR
Apache
CDH or HDP
MapR
Version
1.7.xx / 1.8.xx
2.0.2-alpha - 2.7.x
(CDH) 5.0.0 - 5.12.0 /(HDP) 2.0 – 2.6
4.1 - 5.1 (with MapR-FS)
0.96.x, 0.98.x, 1.0.x, 1.1.x, 1.2.x, 1.3.x
(except 1.1.5 and 1.2.2) and 2.0.x
(CDH) 5.0.0 - 5.12.0 /(HDP) 2.0 – 2.6
4.1 - 5.1 (with Apache Hbase)
Version 3.4.3 - 3.4.5
(CDH) 5.0.0 - 5.12.0 /(HDP) 2.0 – 2.6
4.1 - 5.1
Tephra Transaction Server Config
● Add changes to hbase-site.xml
data.tx.bind.port
data.tx.bind.address
data.tx.server.io.threads
data.tx.server.threads
data.tx.timeout
data.tx.long.timeout
data.tx.cleanup.interval
data.tx.snapshot.dir
data.tx.snapshot.interval
data.tx.snapshot.retain
data.tx.metrics.period
15165
0.0.0.0
2
20
30
86400
10
300
10
60
Port to bind to
Server address to listen on
Number of threads for socket IO
Number of handler threads
Timeout for a transaction to complete
Timeout for a long run trans to complete
Frequency to check for timed out trans
HDFS directory used to store snapshots
requency to write new snapshots
No. old transaction snapshots to retain
Frequency for metrics reporting
Tephra Transaction Client Config
● Add changes to hbase-site.xml
data.tx.client.timeout
data.tx.client.provider
data.tx.client.count
data.tx.client.obtain.timeout
data.tx.client.retry.strategy
data.tx.client.retry.attempts
data.tx.client.retry.backoff.initial
data.tx.client.retry.backoff.factor
data.tx.client.retry.backoff.limit
30000
Pool
50
3000
Backoff
2
100
4
30000
Client socket timeout (milliseconds)
Client provider strategy:
"pool" uses a pool of clients
"thread-local" a client per thread
Max number of clients for "pool" provider
Pool provider clients get timeout (ms)
Client retry strategy(Backoff/n-times)
Number of times to retry (n-times)
Initial sleep time (backoff)
Multiplication factor for sleep time
Exit when sleep time reaches this limit
Tephra HBase Coprocessor Configuration
● Tephra requires an HBase coprocessor to be installed
● On all tables where transactional reads and writes
● Will be performed, Add this change
● To hbase-site.xml
<property>
<name>hbase.coprocessor.region.classes</name>
<value>org.apache.tephra.hbase.coprocessor.TransactionProcessor</value>
</property>
● Use Tephra binary to start once configured
./bin/tephra start
Available Books
● See “Big Data Made Easy”
– Apress Jan 2015
●
See “Mastering Apache Spark”
– Packt Oct 2015
●
See “Complete Guide to Open Source Big Data Stack
– “Apress Jan 2018”
● Find the author on Amazon
– www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
●
Connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
Connect
● Feel free to connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
– open-source-systems.blogspot.com/
● I am always interested in
– New technology
– Opportunities
– Technology based issues
– Big data integration

More Related Content

What's hot

Максим Барышиков-«WoT: Geographically distributed cluster of clusters»
Максим Барышиков-«WoT: Geographically distributed cluster of clusters»Максим Барышиков-«WoT: Geographically distributed cluster of clusters»
Максим Барышиков-«WoT: Geographically distributed cluster of clusters»Tanya Denisyuk
 
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon
 
Redis vs Infinispan | DevNation Tech Talk
Redis vs Infinispan | DevNation Tech TalkRedis vs Infinispan | DevNation Tech Talk
Redis vs Infinispan | DevNation Tech TalkRed Hat Developers
 
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...Continuent
 
Monitoring MongoDB’s Engines in the Wild
Monitoring MongoDB’s Engines in the WildMonitoring MongoDB’s Engines in the Wild
Monitoring MongoDB’s Engines in the WildTim Vaillancourt
 
JActor Cluster Platform
JActor Cluster PlatformJActor Cluster Platform
JActor Cluster PlatformBill La Forge
 
Linux HTTPS/TCP/IP Stack for the Fast and Secure Web
Linux HTTPS/TCP/IP Stack for the Fast and Secure WebLinux HTTPS/TCP/IP Stack for the Fast and Secure Web
Linux HTTPS/TCP/IP Stack for the Fast and Secure WebAll Things Open
 
SaltConf14 - Oz Akan, Rackspace - Deploying OpenStack Marconi with SaltStack
SaltConf14 - Oz Akan, Rackspace - Deploying OpenStack Marconi with SaltStackSaltConf14 - Oz Akan, Rackspace - Deploying OpenStack Marconi with SaltStack
SaltConf14 - Oz Akan, Rackspace - Deploying OpenStack Marconi with SaltStackSaltStack
 
Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems confluent
 
OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017HBaseCon
 
GeoDistributed datacenter: the DNS way
GeoDistributed datacenter: the DNS wayGeoDistributed datacenter: the DNS way
GeoDistributed datacenter: the DNS wayMoyd.co LTD
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon
 
He Pi Xii2003
He Pi Xii2003He Pi Xii2003
He Pi Xii2003FNian
 
Introduction to Haproxy
Introduction to HaproxyIntroduction to Haproxy
Introduction to HaproxyShaopeng He
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuRedis Labs
 
Distributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdDistributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdSATOSHI TAGOMORI
 
Demystifying postgres logical replication percona live sc
Demystifying postgres logical replication percona live scDemystifying postgres logical replication percona live sc
Demystifying postgres logical replication percona live scEmanuel Calvo
 

What's hot (19)

Максим Барышиков-«WoT: Geographically distributed cluster of clusters»
Максим Барышиков-«WoT: Geographically distributed cluster of clusters»Максим Барышиков-«WoT: Geographically distributed cluster of clusters»
Максим Барышиков-«WoT: Geographically distributed cluster of clusters»
 
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
HBaseCon2017 Analyzing cryptocurrencies in real time with hBase, Kafka and St...
 
Redis vs Infinispan | DevNation Tech Talk
Redis vs Infinispan | DevNation Tech TalkRedis vs Infinispan | DevNation Tech Talk
Redis vs Infinispan | DevNation Tech Talk
 
Query logging with proxysql
Query logging with proxysqlQuery logging with proxysql
Query logging with proxysql
 
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
 
Monitoring MongoDB’s Engines in the Wild
Monitoring MongoDB’s Engines in the WildMonitoring MongoDB’s Engines in the Wild
Monitoring MongoDB’s Engines in the Wild
 
JActor Cluster Platform
JActor Cluster PlatformJActor Cluster Platform
JActor Cluster Platform
 
Linux HTTPS/TCP/IP Stack for the Fast and Secure Web
Linux HTTPS/TCP/IP Stack for the Fast and Secure WebLinux HTTPS/TCP/IP Stack for the Fast and Secure Web
Linux HTTPS/TCP/IP Stack for the Fast and Secure Web
 
SaltConf14 - Oz Akan, Rackspace - Deploying OpenStack Marconi with SaltStack
SaltConf14 - Oz Akan, Rackspace - Deploying OpenStack Marconi with SaltStackSaltConf14 - Oz Akan, Rackspace - Deploying OpenStack Marconi with SaltStack
SaltConf14 - Oz Akan, Rackspace - Deploying OpenStack Marconi with SaltStack
 
Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems Kafka on ZFS: Better Living Through Filesystems
Kafka on ZFS: Better Living Through Filesystems
 
OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017
 
GeoDistributed datacenter: the DNS way
GeoDistributed datacenter: the DNS wayGeoDistributed datacenter: the DNS way
GeoDistributed datacenter: the DNS way
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in PinterestHBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
HBaseCon2017 Removable singularity: a story of HBase upgrade in Pinterest
 
He Pi Xii2003
He Pi Xii2003He Pi Xii2003
He Pi Xii2003
 
Introduction to Haproxy
Introduction to HaproxyIntroduction to Haproxy
Introduction to Haproxy
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
 
Distributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentdDistributed Stream Processing on Fluentd / #fluentd
Distributed Stream Processing on Fluentd / #fluentd
 
Demystifying postgres logical replication percona live sc
Demystifying postgres logical replication percona live scDemystifying postgres logical replication percona live sc
Demystifying postgres logical replication percona live sc
 

Similar to Apache Tephra

ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...
ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...
ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...Cask Data
 
Splice Machine Overview
Splice Machine OverviewSplice Machine Overview
Splice Machine OverviewKunal Gupta
 
HBaseCon2016-final
HBaseCon2016-finalHBaseCon2016-final
HBaseCon2016-finalMaryann Xue
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesHBaseCon
 
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...varien
 
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...MagentoImagine
 
Trafodion Distributed Transaction Management
Trafodion Distributed Transaction ManagementTrafodion Distributed Transaction Management
Trafodion Distributed Transaction ManagementRohit Jain
 
Nov. 4, 2011 o reilly webcast-hbase- lars george
Nov. 4, 2011 o reilly webcast-hbase- lars georgeNov. 4, 2011 o reilly webcast-hbase- lars george
Nov. 4, 2011 o reilly webcast-hbase- lars georgeO'Reilly Media
 
eHarmony @ Hbase Conference 2016 by vijay vangapandu.
eHarmony @ Hbase Conference 2016 by vijay vangapandu.eHarmony @ Hbase Conference 2016 by vijay vangapandu.
eHarmony @ Hbase Conference 2016 by vijay vangapandu.Vijaykumar Vangapandu
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Fluentd at HKOScon
Fluentd at HKOSconFluentd at HKOScon
Fluentd at HKOSconN Masahiro
 
Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Michael Renner
 
HBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon
 
.NET Conf 2022 - Networking in .NET 7
.NET Conf 2022 - Networking in .NET 7.NET Conf 2022 - Networking in .NET 7
.NET Conf 2022 - Networking in .NET 7Karel Zikmund
 
HBase Coprocessors @ HUG NYC
HBase Coprocessors @ HUG NYCHBase Coprocessors @ HUG NYC
HBase Coprocessors @ HUG NYCmlai
 
Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogJoe Stein
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversAngelo Failla
 

Similar to Apache Tephra (20)

ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...
ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...
ACID Transactions in Apache Phoenix with Apache Tephra™ (incubating), by Poor...
 
Splice Machine Overview
Splice Machine OverviewSplice Machine Overview
Splice Machine Overview
 
HBaseCon2016-final
HBaseCon2016-finalHBaseCon2016-final
HBaseCon2016-final
 
HiveACIDPublic
HiveACIDPublicHiveACIDPublic
HiveACIDPublic
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
 
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
Magento Imagine eCommerce Conference February 2011: Optimizing Magento For Pe...
 
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
Magento's Imagine eCommerce Conference 2011 - Hosting Magento: Performance an...
 
Trafodion Distributed Transaction Management
Trafodion Distributed Transaction ManagementTrafodion Distributed Transaction Management
Trafodion Distributed Transaction Management
 
Nov. 4, 2011 o reilly webcast-hbase- lars george
Nov. 4, 2011 o reilly webcast-hbase- lars georgeNov. 4, 2011 o reilly webcast-hbase- lars george
Nov. 4, 2011 o reilly webcast-hbase- lars george
 
eHarmony @ Hbase Conference 2016 by vijay vangapandu.
eHarmony @ Hbase Conference 2016 by vijay vangapandu.eHarmony @ Hbase Conference 2016 by vijay vangapandu.
eHarmony @ Hbase Conference 2016 by vijay vangapandu.
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Fluentd at HKOScon
Fluentd at HKOSconFluentd at HKOScon
Fluentd at HKOScon
 
Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014Postgres Vienna DB Meetup 2014
Postgres Vienna DB Meetup 2014
 
HBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond PanelHBaseCon 2015: HBase 2.0 and Beyond Panel
HBaseCon 2015: HBase 2.0 and Beyond Panel
 
Running php on nginx
Running php on nginxRunning php on nginx
Running php on nginx
 
.NET Conf 2022 - Networking in .NET 7
.NET Conf 2022 - Networking in .NET 7.NET Conf 2022 - Networking in .NET 7
.NET Conf 2022 - Networking in .NET 7
 
HBase Coprocessors @ HUG NYC
HBase Coprocessors @ HUG NYCHBase Coprocessors @ HUG NYC
HBase Coprocessors @ HUG NYC
 
Presto
PrestoPresto
Presto
 
Streaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit LogStreaming Processing with a Distributed Commit Log
Streaming Processing with a Distributed Commit Log
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp servers
 

More from Mike Frampton (20)

Apache Airavata
Apache AiravataApache Airavata
Apache Airavata
 
Apache MADlib AI/ML
Apache MADlib AI/MLApache MADlib AI/ML
Apache MADlib AI/ML
 
Apache MXNet AI
Apache MXNet AIApache MXNet AI
Apache MXNet AI
 
Apache Gobblin
Apache GobblinApache Gobblin
Apache Gobblin
 
Apache Singa AI
Apache Singa AIApache Singa AI
Apache Singa AI
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
OrientDB
OrientDBOrientDB
OrientDB
 
Prometheus
PrometheusPrometheus
Prometheus
 
Apache Kudu
Apache KuduApache Kudu
Apache Kudu
 
Apache Bahir
Apache BahirApache Bahir
Apache Bahir
 
Apache Arrow
Apache ArrowApache Arrow
Apache Arrow
 
JanusGraph DB
JanusGraph DBJanusGraph DB
JanusGraph DB
 
Apache Ignite
Apache IgniteApache Ignite
Apache Ignite
 
Apache Samza
Apache SamzaApache Samza
Apache Samza
 
Apache Flink
Apache FlinkApache Flink
Apache Flink
 
Apache Edgent
Apache EdgentApache Edgent
Apache Edgent
 
Apache CouchDB
Apache CouchDBApache CouchDB
Apache CouchDB
 
An introduction to Apache Mesos
An introduction to Apache MesosAn introduction to Apache Mesos
An introduction to Apache Mesos
 
An introduction to Pentaho
An introduction to PentahoAn introduction to Pentaho
An introduction to Pentaho
 
An introduction to Apache Thrift
An introduction to Apache ThriftAn introduction to Apache Thrift
An introduction to Apache Thrift
 

Recently uploaded

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 

Recently uploaded (20)

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 

Apache Tephra

  • 1. What Is Apache Tephra ? ● Provides transactions for HBase and Phoenix ● Apache incubating project ● Uses HBase's native data versioning to ● Provide multi-versioned concurrency control (MVCC) ● For transactional reads and writes ● Provides snapshot isolation of concurrent transactions ● Open source / Apache 2.0 license
  • 2. Tephra Architecture ● Tephra has three main components ● Transaction Server – Maintains global view of transaction state – Assigns new transaction IDs – Performs conflict detection ● Transaction Client – Coordinates start, commit – And rollback of transactions
  • 3. Tephra Architecture ● Tephra has three main components ● TransactionProcessor Coprocessor – Applies filtering to the data read ● (based on a given transaction's state) – Cleans up any data from old ● (no longer visible) transactions ● Multiple transaction server instances can run concurrently – Allows for automatic failover – One server instance is actively serving requests – Configured by ZooKeeper
  • 4. Tephra Phoenix ● Tephra is an incubating Apache project ● Phoenix uses Tephra for transaction support ● So this functionality is in a beta stage ● It gives cross row and cross table transaction support ● And full ACID semantics ● Remember that Phoenix uses Hbase as it's backing store ● Next slides show configuration
  • 6. Tephra Phoenix Config ● Add the following config ● To your client side hbase-site.xml file ● To enable transactions <property> <name>phoenix.transactions.enabled</name> <value>true</value> </property>
  • 7. Tephra Phoenix Config ● Add the following config ● To your server side hbase-site.xml file ● To configure the transaction manager <property> <name>data.tx.snapshot.dir</name> <value>/tmp/tephra/snapshots</value> </property>
  • 8. Tephra Phoenix Config ● Add the following config ● To your server side hbase-site.xml file ● To set the transaction timeout <property> <name>data.tx.timeout</name> <value>60</value> </property> ● Then you can start Tephra on Phoenix ./bin/tephra
  • 9. Tephra Requirements Component Java HDFS Hbase ZooKeeper Source Apache Hadoop CDH or HDP MapR Apache CDH or HDP MapR Apache CDH or HDP MapR Version 1.7.xx / 1.8.xx 2.0.2-alpha - 2.7.x (CDH) 5.0.0 - 5.12.0 /(HDP) 2.0 – 2.6 4.1 - 5.1 (with MapR-FS) 0.96.x, 0.98.x, 1.0.x, 1.1.x, 1.2.x, 1.3.x (except 1.1.5 and 1.2.2) and 2.0.x (CDH) 5.0.0 - 5.12.0 /(HDP) 2.0 – 2.6 4.1 - 5.1 (with Apache Hbase) Version 3.4.3 - 3.4.5 (CDH) 5.0.0 - 5.12.0 /(HDP) 2.0 – 2.6 4.1 - 5.1
  • 10. Tephra Transaction Server Config ● Add changes to hbase-site.xml data.tx.bind.port data.tx.bind.address data.tx.server.io.threads data.tx.server.threads data.tx.timeout data.tx.long.timeout data.tx.cleanup.interval data.tx.snapshot.dir data.tx.snapshot.interval data.tx.snapshot.retain data.tx.metrics.period 15165 0.0.0.0 2 20 30 86400 10 300 10 60 Port to bind to Server address to listen on Number of threads for socket IO Number of handler threads Timeout for a transaction to complete Timeout for a long run trans to complete Frequency to check for timed out trans HDFS directory used to store snapshots requency to write new snapshots No. old transaction snapshots to retain Frequency for metrics reporting
  • 11. Tephra Transaction Client Config ● Add changes to hbase-site.xml data.tx.client.timeout data.tx.client.provider data.tx.client.count data.tx.client.obtain.timeout data.tx.client.retry.strategy data.tx.client.retry.attempts data.tx.client.retry.backoff.initial data.tx.client.retry.backoff.factor data.tx.client.retry.backoff.limit 30000 Pool 50 3000 Backoff 2 100 4 30000 Client socket timeout (milliseconds) Client provider strategy: "pool" uses a pool of clients "thread-local" a client per thread Max number of clients for "pool" provider Pool provider clients get timeout (ms) Client retry strategy(Backoff/n-times) Number of times to retry (n-times) Initial sleep time (backoff) Multiplication factor for sleep time Exit when sleep time reaches this limit
  • 12. Tephra HBase Coprocessor Configuration ● Tephra requires an HBase coprocessor to be installed ● On all tables where transactional reads and writes ● Will be performed, Add this change ● To hbase-site.xml <property> <name>hbase.coprocessor.region.classes</name> <value>org.apache.tephra.hbase.coprocessor.TransactionProcessor</value> </property> ● Use Tephra binary to start once configured ./bin/tephra start
  • 13. Available Books ● See “Big Data Made Easy” – Apress Jan 2015 ● See “Mastering Apache Spark” – Packt Oct 2015 ● See “Complete Guide to Open Source Big Data Stack – “Apress Jan 2018” ● Find the author on Amazon – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/ ● Connect on LinkedIn – www.linkedin.com/in/mike-frampton-38563020
  • 14. Connect ● Feel free to connect on LinkedIn – www.linkedin.com/in/mike-frampton-38563020 ● See my open source blog at – open-source-systems.blogspot.com/ ● I am always interested in – New technology – Opportunities – Technology based issues – Big data integration