Speaker: Rao Kakarlamudi, Esgyn
Big Data Applications Meetup, 03/16/2014
Palo Alto, CA
More info here: http://www.meetup.com/BigDataApps/
Link to video: https://www.youtube.com/watch?v=hGSXlRohdCU
About the talk:
Introducing EsgynDB, based on Apache Trafodion, the Big Data database that revolutionizes the way you manage Big Data and Hadoop. With EsgynDB, you can now run your transactional and enterprise operational reporting workloads on Hadoop and avoid lock-in to expensive, proprietary database vendors.
By consolidating your workloads onto the same platform, you can derive business insight faster and cheaper than ever before. You can adopt EsgynDB to enable a Big Data strategy that simplifies and modernizes your operational data management, as illustrated by the following use cases from early adopters:
* Gain real-time views and analytics on security data collected from IT infrastructure, firewalls, and web traffic worldwide
* Monitor a transit fleet to optimize and maintain efficiency in real time, and perform historical reporting for future planning
* Offload historic data from expensive transactional systems to lower costs and differentiate customer experience by enriching transactional data with other data sources
* Transform traditional back office services to deliver service capabilities over the Internet
6. Trafodion Brings:
• Open Source Apache Trafodion (Incubating) project and license
• Hadoop HBase scalability up to petabytes
• Full ANSI SQL support
• ACID Transactions (Atomic, Consistent, Isolated, Durable) across rows, tables, and servers
• Cost-effective scale-out
• Enterprise-ready active-active replication across multiple data centers
• ODBC / JDBC / ADO.Net / Hibernate support
• Proven and hardened database engine with 20+ years of Tandem / Compaq / HP innovation
• Data federation (e.g. Kafka) and schema flexibility
• Optimized for real-time transaction processing, operational reporting, and operational data store (ODS) workloads that demand sub-second response times with high concurrency
(C) Copyright 2015 Esgyn Corporation Esgyn Confidential
8. Why Apache Trafodion?
Ingredients for a world-class relational database
1. Time, Money, and Talent
◦ 20+ years of investment
◦ $300+ million invested
◦ Database developers grew up on a shared-nothing massively parallel architecture with a single system image across clusters
◦ 300+ years of database experience building OLTP and BI engines
ANSI and non-ANSI functionality supported, performance, scalability, concurrency, throughput, stability, high availability, transactional, UDF, SPJ, OLAP, etc.
10. [Architecture diagram: Client Application; Node 1, Node 2, ..., Node n; HBase with filters and coprocessors; HDFS; Ethernet]
3. World-Class Parallel Data Flow Execution Engine
◦ Data Flow pipeline parallel architecture
◦ Intermediate results materialized only for blocking operations like sorts
◦ Data overflow to disk only for large hash joins
◦ Adaptive Segmentation to use only needed resources
◦ Co-located joins & repartitioning when necessary
◦ Uses inner and outer child broadcasts
◦ Parallel secondary index maintenance
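The difference between streaming and blocking operators mentioned in the list above can be sketched with Python generators. This is only a single-threaded illustration of the data flow idea, not Trafodion's engine; the operator names (`scan`, `keep_even`, `sort_desc`) and the `log` list are hypothetical and exist only to show the order in which rows move through the pipeline.

```python
from typing import Iterable, Iterator, List

log: List[str] = []  # records the order in which operators touch rows

def scan(rows: Iterable[int]) -> Iterator[int]:
    """Leaf operator: emits rows one at a time."""
    for row in rows:
        log.append(f"scan {row}")
        yield row

def keep_even(rows: Iterable[int]) -> Iterator[int]:
    """Streaming operator: passes each qualifying row on immediately."""
    for row in rows:
        if row % 2 == 0:
            log.append(f"filter {row}")
            yield row

def sort_desc(rows: Iterable[int]) -> Iterator[int]:
    """Blocking operator: must buffer all input before emitting anything."""
    buffered = sorted(rows, reverse=True)
    log.append("sort materialized")
    yield from buffered

# Streaming pipeline: the first result appears before the scan has finished.
first = next(keep_even(scan([3, 4, 1, 2])))
streaming_log = list(log)

# Adding a sort forces the whole input to be consumed before any output.
log.clear()
sorted_rows = list(sort_desc(keep_even(scan([3, 4, 1, 2]))))
blocking_log = list(log)

print(first, streaming_log)
print(sorted_rows, blocking_log)
```

In the streaming case only the rows needed for the first result are scanned; once the sort is added, every row is scanned and filtered before the first sorted row can be returned.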
[Diagram: multi-fragment plan; Master and ESP processes]
Supports salting of data across region servers
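As a rough illustration of what salting means, the sketch below prefixes each row key with a bucket number derived from a hash of the key, so that sequential keys spread across a fixed number of partitions instead of clustering on one region server. The function name `salt_key`, the bucket count of 4, and the use of CRC32 are assumptions made for this example; this is not Trafodion's actual hashing scheme.

```python
import zlib
from collections import Counter

def salt_key(row_key: str, num_buckets: int) -> str:
    """Return row_key prefixed with a two-digit salt bucket (hypothetical scheme)."""
    bucket = zlib.crc32(row_key.encode("utf-8")) % num_buckets
    return f"{bucket:02d}-{row_key}"

# Sequential order IDs would normally sort next to each other in HBase.
keys = [f"order{n:06d}" for n in range(1000)]
salted = [salt_key(k, num_buckets=4) for k in keys]

# Count how many keys land in each salt bucket.
distribution = Counter(s.split("-", 1)[0] for s in salted)
print(sorted(distribution.items()))
```

Because the salt is computed from the key itself, a reader can recompute the bucket for a point lookup, while range scans have to fan out across all buckets.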
12. Performance
YCSB and Order Entry scale linearly!
[Charts: throughput for Transactional Order Entry and for YCSB (Selects, Updates, 50/50)]
13. Try and Contribute to Apache Trafodion
Download:
◦ trafodion.apache.org
Try Trafodion on AWS:
◦ https://aws.amazon.com/marketplace/pp/B018RBMFG0
Documentation:
◦ trafodion.apache.org
Become a contributor – add a new feature, fix a bug, translate documentation, and more
◦ Discuss your changes on the dev mailing list
◦ Create a JIRA issue
◦ Set up your development environment
◦ Prepare a patch containing your changes
◦ Submit the patch
A database transaction must be atomic, consistent, isolated, and durable. These four properties are described below.
Atomic: A transaction is a logical unit of work; either all of its data modifications are performed, or none of them is.
Consistent: At the end of the transaction, all data must be left in a consistent state.
Isolated: Modifications made by one transaction must be independent of those made by other concurrent transactions; otherwise, the outcome of a transaction may be erroneous.
Durable: Once the transaction has completed, the effects of its modifications must be permanent in the system.
Together, these four properties of a transaction are known as ACID.
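The atomicity and consistency properties can be seen in a small, self-contained sketch. It uses Python's built-in `sqlite3` module as a stand-in database, not Trafodion or EsgynDB; the `accounts` table, the `transfer` helper, and the balances are all hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # debit violates the CHECK; credit is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the second transfer the credit to `bob` has already been applied when the debit fails, yet the rollback removes it, so the database never exposes a half-finished transfer.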