SIMPLIFY YOUR STREAMING DATA
ARCHITECTURE WITH KAFKA & VOLTDB
Maheedhar Gunturu
page© 2014 VoltDB PROPRIETARY
WHO AM I?
• Maheedhar Gunturu – Software & Solutions Architect @VoltDB
mgunturu@voltdb.com
@Vanguard_space
http://chat.voltdb.com/
• Previously:
  • Solutions Architect @ MapR
  • Working on Big Data systems since 2010
• Current Interests include
  • NVRAM
  • GPU co-processors for databases
  • Operationalizing deep-learning-based applications using Fast Data
COMPANY OVERVIEW
FAST: World Record Cloud Benchmark:
YCSB (Yahoo Cloud Serving Benchmark) - 2.4 million tps (transactions per second)
Founded in 2009 by database luminary Mike Stonebraker*
VoltDB in the
Magic Quadrant
“Operational Databases”
Other Stonebraker Companies
• Professor at MIT in the Computer Science and Artificial Intelligence Laboratory (CSAIL)
• Co-Founder of VoltDB
• Creator of the C-Store & H-Store projects
• 2014 Turing Award Winner
• His students include – Mike Olson,
Diane Greene, Daniel Abadi…
WHAT IS VOLTDB?
• An operational database purpose-built to run 100% in-memory at web scale
  • In-Memory
  • Relational, SQL, ACID Compliant
  • Scale-out on commodity hardware
  • Reliability, HA, fault tolerant
  • Integration with downstream systems, e.g. OLAP, Hadoop, DW
Best use cases: operational and transactional workloads
Fast (in motion)
Streaming Analytics:
real time summary and
aggregation
Transaction Processing:
per-event decisions using
context + history.
Big (at rest)
Exploration:
data science, investigation of
large data sets
Reporting:
recommendation matrixes,
search indexes, trend and BI
WHO NEEDS VOLTDB?
Time Critical
Not Time Critical
Unimportant Data
Important Data
Most streaming
systems (exactly
once semantics is
very expensive)
Most anything,
Spark, etc.
Data warehouses,
RDBMS,
Transaction
processing engines
Perishable insights can have exponentially more
value than after-the-fact traditional historical
analytics.
WHAT ARE SOME COMMON USE-CASES?
• Data (messages) stream in from humans or devices
• Internet of Things (IoT)
• Ad-bidding platforms
• Telecommunications – OSS/BSS/NFV
• Fintech – trading and risk assessment platforms
• Consumer facing Online Billing and booking systems
• Massively multi-player games
• At high volume
• 100K ~ 1M messages/sec requires specialized software
• High availability
• Nobody wants to go down these days
Streaming Analytics Transactions Transformations
• Materialized Views
• Capped Tables
• Ranking Indexes
• Stored Procedures
• Java + SQL
• ACID guarantees
• No client-side
transaction control
• Stored Procedures
• Loaders/Importers
• Export Connectors
• Sessionization
• Enrichment
• JDBC connector
VoltDB architecture
Commodity HW HA + ACID Scale-out VM-friendly
How do we do it?
• XDCR – configurable out of the box.
• Geospatial capabilities for location based BI and decision making
The general-purpose RDBMS can’t scale
MODERN OLTP
1. Processing streams requires integrated access to state.
2. Using real time analytics requires a query interface.
3. Reacting to incoming events requires transactions.
State + Query + Transactions = OLTP
Fast
Streaming Analytics
Transaction Processing
Absolute Non-negotiables in VoltDB
• Transactional Consistency
• Extremely high throughput
• Linear Scalability
• Resiliency
• Minimum dev-ops administration
State, Speed, Scale, Stable, Simple
Our Customers are streaming important data
• Typical Deployments
• 100K to 1M transactions/sec
• Commodity Hardware
32GB - 256 GB RAM
16 Cores – 64 Cores
10 GigE ~ 40 GigE
• VoltDB runs on AWS, Azure, GCP, IBM Bluemix
Translytics = Transactions + Analytics
“A single, unified database that supports
transactions and analytics in real-time without
sacrificing transactional integrity, performance,
and scale.” - Mike Gualtieri, Forrester
Transactional
Operational
Analytical
Translytical Database
Traditional Stack For streaming data
Single data layer
How can queuing systems like
Kafka Simplify the architecture?
How do we perceive Kafka?
APACHE KAFKA
• High Throughput
• Low Latency
• Scalable
• Centralized
• Real-time
STATE OF THE ART KAFKA @ LINKEDIN
• Handles 1.4 trillion messages/day for various applications at LinkedIn
• Over 1400 brokers
• Can handle well over a few million messages/sec
• At-least-once delivery of messages
• Strong durability contract with replication
• Rich ecosystem
• Espresso – offload MySQL replication
• Venice – compute derived data
• Nuage – a portal to manage topics and associated metadata
• Gobblin – ingestion framework
• MirrorMaker – replication
CAN KAFKA SCALE?
page© 2015 Forrester Research, Inc. Reproduction Prohibited 20
JUST ENOUGH KAFKA
› Producer
› Broker
› Topics
› Partitions – Random & Semantic
› ISR – Leader & Controller, high watermark
› Zookeeper – offsets
› Consumer
› Consumer Groups
› Commit Log – TTLs, compactions
• Make sure that the producer is set to acks = all
• Set “replica.lag.time.max.ms” to a minimum (match it with the VoltDB timeout)
• Set “replica.lag.max.messages” to a minimum (this parameter is deprecated from 0.9 onward)
• Disable unclean leader election: unclean.leader.election.enable = false
• Use default.replication.factor = 3
• Make sure that consumers read only committed messages: set min.insync.replicas = 2 (applied at the topic level; must be set manually before 0.9)
• “auto.commit.enable” = false
• Disable automatic topic creation in Kafka
• “block.on.buffer.full” = true
• “max.in.flight.requests.per.connection” = 1
• Use a rebalance listener to limit duplicates
• Connection to Zookeeper
• Monitor consumer lag via offsets
• Report consumer counts and errors to a separate topic
SO WHAT ARE THE CAVEATS WITH KAFKA?
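The checklist above can be collected into per-role configuration fragments. The following is a minimal sketch, assuming the property names used in Kafka 0.8/0.9-era documentation (later releases rename or drop some of them, so verify against the version in use). The dictionary names and the `to_properties` helper are hypothetical, and `auto.create.topics.enable` is the broker setting assumed to correspond to "disable automatic topic creation".

```python
# Hypothetical grouping of the settings named in the checklist.
# Property names are assumed from Kafka 0.8/0.9-era documentation;
# check them against the Kafka version you actually run.
PRODUCER = {
    "acks": "all",                                 # wait for all in-sync replicas
    "block.on.buffer.full": "true",                # block rather than fail when the buffer fills
    "max.in.flight.requests.per.connection": "1",  # keep ordering when a send is retried
}

BROKER = {
    "unclean.leader.election.enable": "false",     # never elect an out-of-sync leader
    "default.replication.factor": "3",
    "min.insync.replicas": "2",
    "auto.create.topics.enable": "false",          # topics must be created explicitly
}

CONSUMER = {
    "auto.commit.enable": "false",                 # commit offsets explicitly after processing
}


def to_properties(settings):
    """Render a dict as sorted key=value lines for a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in sorted(settings.items()))


print(to_properties(PRODUCER))
```

Keeping the three roles separate mirrors where each setting is applied: some go on the producer, some on the broker or topic, and some on the consumer.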
Sample Config file
<import>
<configuration type="kafka" format="csv" enabled="true">
<property name="brokers">kafkasvr:9092</property>
<property name="topics">employees</property>
<property name="procedure">EMPLOYEE.insert</property>
</configuration>
<configuration type="kafka" enabled="true">
<property name="brokers">kafkasvr:9092</property>
<property name="topics">employees</property>
<property name="procedure">EMPLOYEE.insert</property>
</configuration>
<configuration type="kafka" enabled="true">
<property name="brokers">kafkasvr:9092</property>
<property name="topics">managers</property>
<property name="procedure">MANAGER.insert</property>
</configuration>
</import>
• Supports multiple data formats such as CSV (default), TSV and JSON (refer to the documentation)
• Supports various types of data sources, such as Kafka and Kinesis
• Supply a list of brokers to pick up offsets
• Supply a topic name which contains the messages
• Supply the name of the stored procedure to invoke per event/message, which then inserts the result into the database
• “fetch.message.max.bytes”: the maximum size of a message fetched from Kafka (default 64 KB)
• “groupid”: the group the consumer belongs to
• “socket.timeout.ms”: the maximum time (in milliseconds) the socket connection waits before timing out
HOW TO CONFIGURE KAFKA-> VOLTDB? (IMPORTER)
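When many topics are imported, the `<import>` section can be generated rather than written by hand. The sketch below is illustrative only: it uses Python's standard `xml.etree.ElementTree` to emit the same element and property names as the sample config, and the `build_import_config` helper is a hypothetical name, not part of VoltDB.

```python
import xml.etree.ElementTree as ET


def build_import_config(broker, topic_to_procedure, fmt="csv"):
    """Build an <import> element with one Kafka <configuration> per topic."""
    root = ET.Element("import")
    for topic, procedure in topic_to_procedure.items():
        conf = ET.SubElement(
            root,
            "configuration",
            {"type": "kafka", "format": fmt, "enabled": "true"},
        )
        # Same three properties as in the sample config file.
        for name, value in (
            ("brokers", broker),
            ("topics", topic),
            ("procedure", procedure),
        ):
            prop = ET.SubElement(conf, "property", {"name": name})
            prop.text = value
    return root


xml_text = ET.tostring(
    build_import_config(
        "kafkasvr:9092",
        {"employees": "EMPLOYEE.insert", "managers": "MANAGER.insert"},
    ),
    encoding="unicode",
)
print(xml_text)
```

Each topic gets exactly one configuration block here, which avoids accidentally importing the same topic twice.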
HOW TO CONFIGURE VOLTDB -> KAFKA?
(EXPORTER)
• According to our customer success team, as of today approximately 15-20% of our customers are using Kafka & VoltDB together.
Examples
• King Games (of Candy Crush fame) – 5 nodes, 384 GB RAM, 32 cores – 300+ topics with more than 400,000 Txns/sec @ 50% CPU utilization.
• MaxCDN (now StackPath – Global CDN) – 11 nodes, 128 GB RAM, 16 cores – a couple of hundred topics with more than 500,000 Txns/sec @ 30% CPU utilization.
• Nimble Storage (InfoSight dashboard & support) – 9 nodes, 128 GB RAM, 64 cores – 50+ topics with more than 200,000 Txns/sec @ 20~30% CPU utilization.
• We highly recommend this architecture if it meets the SLA requirements
IS KAFKA & VOLTDB INTEGRATION IN PRODUCTION?
SO WHAT DOES KAFKA BRING TO AN IN-MEMORY
DATABASE LIKE VOLTDB?
• Centralized infrastructure
• Recreate state
• Resiliency with at-least once delivery
• Impedance mismatch between applications
• Integrations with various applications
• Export and Import capabilities
• Cost optimization for HW
IDEMPOTENCE!
• Idempotence is the property of certain operations in mathematics and computer science that can be applied multiple times without changing the result beyond the initial application.
• At-Least-Once Delivery + Idempotent Operations = Exactly Once Semantics
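The equation above can be illustrated with a small sketch. It assumes each message carries a unique `id` (a hypothetical field) and that the handler records which ids it has already applied; in a database, the check and the update would run inside one transaction. This is an illustration of the idea, not VoltDB's implementation.

```python
def apply_event(balances, seen_ids, event):
    """Apply a credit once, even if the same event is delivered again."""
    if event["id"] in seen_ids:  # duplicate delivery: change nothing
        return
    seen_ids.add(event["id"])
    account = event["account"]
    balances[account] = balances.get(account, 0) + event["amount"]


# At-least-once delivery: event 1 arrives twice.
deliveries = [
    {"id": 1, "account": "a", "amount": 10},
    {"id": 2, "account": "a", "amount": 5},
    {"id": 1, "account": "a", "amount": 10},
]
balances, seen_ids = {}, set()
for event in deliveries:
    apply_event(balances, seen_ids, event)
print(balances)  # {'a': 15}
```

The raw credit is not idempotent, but wrapping it in the duplicate check makes the handler idempotent, so redelivery has the same effect as a single delivery.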
Idempotent
• set x = 5; is the same as set x = 5; set x = 5;
• if (x % 2 == 0) x++; is the same as if (x % 2 == 0) x++; if (x % 2 == 0) x++;
• spill coffee on brown pants
Not Idempotent
• x++; is not the same as x++; x++;
• if (x % 2 == 0) x *= 2; is not the same as if (x % 2 == 0) x *= 2; if (x % 2 == 0) x *= 2;
• eat whole plate of spaghetti
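The operations in this comparison can be checked mechanically: an operation is idempotent on a set of inputs if applying it twice gives the same result as applying it once. A minimal sketch, with hypothetical function names standing in for the statements on the slide:

```python
def is_idempotent_on(op, values):
    """True if op(op(x)) == op(x) for every x in values."""
    return all(op(op(x)) == op(x) for x in values)


def set_five(x):            # set x = 5;
    return 5


def increment(x):           # x++;
    return x + 1


def increment_if_even(x):   # if (x % 2 == 0) x++;
    return x + 1 if x % 2 == 0 else x


def double_if_even(x):      # if (x % 2 == 0) x *= 2;
    return x * 2 if x % 2 == 0 else x


sample = range(-4, 5)
for op in (set_five, increment, increment_if_even, double_if_even):
    print(op.__name__, is_idempotent_on(op, sample))
```

Doubling an even number leaves it even, so the guard in `double_if_even` fires again on the second application; incrementing an even number makes it odd, so the guard in `increment_if_even` does not.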
What interesting problems do we solve?
• Correlation – streaming Join (state management)
• Out of order delivery
• At least once delivery – How to dedup
• Precise Accounting
• Precise Statistics – Event time vs processing time
FAST DATA: APACHE-STYLE
Applications, Message Queues, Data Sources
Ingest
Analyze Decide
Counters
Aggregations
Time series
Statistics
Store results
Query and
recombine
Fast serving
Per-event policy evaluations
Responses (synchronous)
Side-effects (asynchronous)
Export & Pipeline
Kafka / RabbitMQ
Storm, Flume, Sqoop
Storm +
Serving Layer
Spark +
Serving Layer
Cassandra,
HBase
Hadoop, Message queues
FAST DATA: VOLTDB-STYLE
Applications, Message Queues, Data Sources
Ingest
Analyze Decide
Counters
Aggregations
Time series
Statistics
Store results
Query and
recombine
Fast serving
Per-event policy evaluations
Responses (synchronous)
Side-effects (asynchronous)
Export & Pipeline
Kafka / RabbitMQ
VoltDB
SQL, Java for
Analytics
Transactions /
ACID
Hadoop, Message queues
THREE MAJOR DRAWBACKS OF STREAMING
SOLUTIONS
• Streaming solutions lack context
• filter, aggregate and join operations require state.
• need backend databases to support decisions
• good for fast ingestion only.
• Streaming solutions are not architected for real-time decisions
• not ACID (atomicity, consistency, isolation, durability)
• no support for JDBC/ODBC
• Good for algorithmic processing of windowed data
• Streaming solutions lack operational transparency
• good for statically configured topological results
• need back-end databases for storing aggregates/counters
CEP / STREAM PROCESSING VS. VOLTDB
Common characteristics
• Process high speed, streaming
data
• Ingest thousands to millions of
events per second
• Can function as part of a data
pipeline
• Basic event alerting or
enrichment
Stream processing is the right choice when...
• Unstructured audio, video, image, signal
processing data streams
• Micro-second latencies are needed
• Examine stream for temporal pattern detection
VoltDB is the right choice when...
• Realtime streaming analytics - calculation and
serving
• Transactional decisions per event - data
informed
• Ad hoc queries of state
• Common interface across data architecture
stack (SQL)
INTEGRATING DATA SOURCES WITH VOLTDB
BULK LOADERS
• CSV loader
• Kafka loader
• JDBC loader
• Vertica UDx
• Extensible loader API
APPLICATION INTERFACES
• JDBC
• ODBC
• HTTP JSON
• Native client drivers / SDKs
INTEGRATING VOLTDB WITH EXPORT TARGETS
• Local file system export
• JDBC export
• Kafka export
• Elasticsearch export
• HDFS export
• HTTP export
• Extensible API
VOLTDB EXPORT UI
ddl.sql:
CREATE TABLE events (
  EventID INTEGER,
  time TIMESTAMP,
  msg VARCHAR(128));
EXPORT TABLE events;
deployment.xml:
<export enabled="true" target="file">
Application SQL:
INSERT into TABLE values…
ACID PROCESSING
• Sync intra-cluster replication
• Replicated durability
• High availability (configurable)
• Serializable isolation
• Ad-hoc SQL or stored procedures.
• Partitioned & distributed transactions
• Load balanced reads across replicas
MATERIALIZED VIEWS
• Declarative SQL
• Fully transactional
• Supports ad-hoc query
CREATE VIEW registrations_by_zipcode (
zipcode, registered_voters
) AS
SELECT zipcode, count(*) from voters
where registration=1 GROUP BY zipcode;
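To make the "fully transactional" point concrete, here is a toy model of what such a view does: the aggregate is adjusted as part of each write instead of being recomputed at query time. The class and method names are hypothetical, and this is a sketch of the concept rather than VoltDB's implementation.

```python
from collections import Counter


class RegistrationsByZipcode:
    """Toy stand-in for the view: counts rows with registration = 1 per zipcode."""

    def __init__(self):
        self.registered_voters = Counter()

    def on_insert(self, voter):
        if voter["registration"] == 1:
            self.registered_voters[voter["zipcode"]] += 1

    def on_delete(self, voter):
        if voter["registration"] == 1:
            zipcode = voter["zipcode"]
            self.registered_voters[zipcode] -= 1
            if self.registered_voters[zipcode] == 0:
                # A group with no remaining rows disappears from the view.
                del self.registered_voters[zipcode]


view = RegistrationsByZipcode()
voters = [
    {"zipcode": "01821", "registration": 1},
    {"zipcode": "01821", "registration": 0},
    {"zipcode": "02139", "registration": 1},
    {"zipcode": "01821", "registration": 1},
]
for voter in voters:
    view.on_insert(voter)
view.on_delete(voters[2])
print(dict(view.registered_voters))
```

Reading the view is then a lookup on a small, always current table, which is what makes ad-hoc queries against it cheap.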
MV FOR STREAMING AGGREGATION
• Partitioned on cluster
• Immediately up-to-date
• Active/active HA
Global Read: SELECT
sum(count) WHERE sec > 130
and sec < 140;
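The global read above can be pictured as a sum over per-partition counters. In this sketch, each partition keeps a count per second that is updated as events arrive, and the query adds up the matching rows from every partition. The partition count, the routing by Python's built-in `hash`, and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical cluster size

# One per-second counter table per partition, updated as events arrive.
partitions = [defaultdict(int) for _ in range(NUM_PARTITIONS)]


def record_event(key, sec):
    """Route an event to a partition by key and bump that second's count."""
    partitions[hash(key) % NUM_PARTITIONS][sec] += 1


def global_count(lo, hi):
    """Sum the counts over all partitions for lo < sec < hi."""
    return sum(
        count
        for table in partitions
        for sec, count in table.items()
        if lo < sec < hi
    )


# One event per second from second 125 through 145, spread over four keys.
for sec in range(125, 146):
    record_event(f"device-{sec % 4}", sec)

print(global_count(130, 140))
```

The result does not depend on which partition each event landed in, only on the per-second counts, which is why the read can fan out and combine partial sums.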
REAL-TIME ANALYTICS IN VOLTDB
• Counters
• Counting is exceedingly hard at scale
• VoltDB was designed to excel at counters
• Aggregates
• Materialized views maintain slices of fast moving data
and enable fast access
• Group by keys + time functions (day, hour, minute,
second)
• Query views for time-series rollups, e.g. “last 30
minutes”
• Leaderboards
• Leaderboards rank items by size, value or amplitude
• Index optimizations enable fast ranking of records
within large sets
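The leaderboard idea can be sketched with an ordered structure: if scores are kept sorted, finding a rank is a binary search rather than a scan. The `Leaderboard` class below is hypothetical and uses Python's standard `bisect` module. List insertion is still linear, so a real ranking index would use a tree-like structure; this only illustrates the lookup.

```python
import bisect


class Leaderboard:
    """Keeps scores sorted so that rank lookups are binary searches."""

    def __init__(self):
        self._scores = []  # ascending order

    def add(self, score):
        bisect.insort(self._scores, score)

    def rank(self, score):
        """1-based rank of score, where rank 1 is the highest score."""
        higher = len(self._scores) - bisect.bisect_right(self._scores, score)
        return higher + 1

    def top(self, n):
        """The n highest scores, best first."""
        return self._scores[::-1][:n]


board = Leaderboard()
for score in (50, 70, 90, 70):
    board.add(score)
print(board.rank(70), board.top(2))
```

Tied scores share a rank here, because the rank counts only strictly higher scores.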
REVIEW
Application
Event
Sources
VoltDB
Client
Interface
Partition
Replica 1
Partition
Replica 2
Export
Destination
(OLAP,
HTTP)
• SQL + Java transactions
• JSON column values
• HA in-memory processing
• ACID (durable to disk)
• Ranking indexes
• Indexes on functions
• Capped tables
• Mat. views: RT aggregation
• Append only export
• 1-5 ms @ 99% responses
QUESTIONS?
• Developer
• The source code for Kafka – VoltDB connector?
https://github.com/VoltDB/voltdb-kafka-connector
• The Developer Guide to Streaming Data Applications
http://learn.voltdb.com/WPThreeContenders.html
• Architect:
• Download our new ‘recipes’ eBook
https://voltdb.com/ebook/fast-data-recipes
• Email us your questions: askanengineer@voltdb.com
• http://chat.voltdb.com/
careers@voltdb.com
USE CASES
OPENET
Application/Use Case
• Openet enables the world's largest network operators to innovate
service offerings in an increasingly mobile, data-driven society
• Applications include Policy Manager, Evolved Charging,
Convergent Mediation
Why VoltDB?
• Performance and scalability that provides real-time control of
network resource consumption, and real-time interaction between
network systems and their users
• Virtualized platform for elasticity and ease of operations
• Simplified deployments with ACID and built-in availability for risk-averse Telco customers
• Saves $0.5 million/customer installation; unlimited scale in the
cloud
“VoltDB is the logical choice for a cloud-deployable, transactional
database that can flexibly handle high-volume data streams for
service providers to monitor and leverage in real time.”
ASIAINFO
Application/Use Case
• Advanced IT software solutions and services for the
telecommunications industry
• Veris Convergent Context-awareness Center (C3)
• User session management and stream processing
that enables real-time matching and data processing
Why VoltDB?
• High transaction performance with immediate user device
make and model identification
• URL/website matching and real-time campaign triggers
based on VoltDB to enable very rapid processing
• SQL, ACID, data integrity, and disaster recovery
• Reduced TCO: more than 500,000 transactions per second
on commodity servers
EAGLE INVESTMENT SYSTEMS, BNY MELLON
Application/Use Case
• Eagle Investment Systems is a leading provider of financial
services technology and a subsidiary of BNY Mellon.
• VoltDB powers Eagle’s cloud-based software for tracking the
performance of investment portfolios and analyzing
performance risks
Why VoltDB?
• Performance and transparent scalability to meet application
workloads and client SLAs
• High speed data cache for risk calculations with large and
rapidly changing data sets
• Lower TCO
“We deliver the best in class technology to our clients, and when we evaluated VoltDB, we
discovered that it suited our requirements. With its in-memory, high-velocity database,
VoltDB provides us a great foundation to enhance our current and future offerings.”
Marc Firenze, CTO
ERICSSON
Application/Use Case
• Ericsson MediaFirst is an end-to-end cloud-based
platform for the creation, management and delivery of
next generation Pay TV
Why VoltDB?
• Enabled Ericsson to move from batch and manual processing to real-time user session monitoring of tens to hundreds of millions of users
• Ensure user experience across devices
• Attract, retain and monetize new subscribers
• Cloud ready (Azure)
• Agility - quickly develop and deploy TV services across all
end-points at web speeds to respond to changing market
trends and conditions
“VoltDB gives us up-to-the-second operational visibility into the performance of the systems across our
customers’ carrier grade TV networks as well as enable real-time user targeting. The database gives our platform
competitive advantage by letting us analyze device and user data as it comes in from Tier One providers”
Mark Hydar, Head of Engineering, Ericsson MediaFirst
SMART METER MARKET LEADERS PICK VOLTDB*
* > 60 million meters under management
Leader in the Gartner
Magic Quadrant
Announced Utility Customers
• UK Smart Meter
• Shikoku Electric Power
• Hokkaido Electric Power
FLYTXT
Application/Use Case
• Customer Experience, Revenue Management and Data
Monetization Solutions for mobile operators
• Drive campaigns to increase revenue, reduce churn,
enhance loyalty and create new revenue streams
Business Impact
• 1.2% incremental revenue
• 18.9% higher ARPU
• Conversion rates 40%-300% higher
Why VoltDB?
• Performance and scale to drive its real-time analytics
platform to extract actionable intelligence from 4 billion
events/day streaming from more than 200 million mobile
subscribers
• Monetize data faster, more efficiently and at lower cost
“The partnership with VoltDB enhances our platform’s capability
to act upon insights derived from subscriber actions in real time.”
Prateek Kapadia, CTO
AIRPUSH
Application/Use Case
• Managing online mobile advertising
• Manages over 120,000 live applications
Why VoltDB?
• Replaced costly MySQL infrastructure with scalable VoltDB
cluster
• Enabled accurate ad-campaign balance tracking, dramatically
improving “last-dollar” decisions, saving millions in budget
overages
• Eliminated the opportunity cost of placing wrong ads
• Reduced infrastructure cost by 93% (7 servers vs. 100)
“Achieved a previously impossible level of budget
management accuracy”
SIMPLIFYING THE LAMBDA ARCHITECTURE
Use Case
• Content delivery network service provider
• Counting content views
Why VoltDB?
• Real-time analytics + transactions at scale
• Replaced Storm and Cassandra with VoltDB for real-time streaming aggregations with “exactly once” semantics
Bottom line
• Accurate – guaranteed correct results with
VoltDB’s ‘exactly-once’ semantics
• Faster time to market
• 32 TB of data processed with 7 servers
• 1/10th the resources of the alternatives
MOBILE
Use Case
• The Emagine real-time event decision
making platform for Communications
Service Providers (CSPs)
Why VoltDB?
• Real-time analytics + transactions
• Scale - billions of network events per day,
analyzing hundreds of thousands of
transactions simultaneously, and then
intelligently interacting with customers
Bottom line
• 3 ms system response time
• 253% increase in offer purchases
Real-Time Event Decisioning
VOLTDB: A BEAUTIFUL ARCHITECTURE
Inside a Partition: Work Queue, Execution Engine, Table and Index Data
VoltDB Cluster
Server 1: Partition 1, Partition 2, Partition 3
Server 2: Partition 4, Partition 5, Partition 6
Server 3: Partition 7, Partition 8, Partition 9
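The layout above (three servers, nine partitions, each partition with its own work queue and data) can be modelled in a few lines. This is a sketch under stated assumptions: the CRC32-modulo routing, the class, and the function names are hypothetical and are not VoltDB's actual hashing or internals.

```python
import zlib
from collections import deque

SERVERS = 3
PARTITIONS_PER_SERVER = 3
TOTAL = SERVERS * PARTITIONS_PER_SERVER


class Partition:
    """One partition: a work queue drained serially, plus its own table data."""

    def __init__(self, number):
        self.number = number
        self.work_queue = deque()
        self.table = {}

    def run(self):
        # Work items run one at a time, so no locking is needed inside a partition.
        while self.work_queue:
            key, value = self.work_queue.popleft()
            self.table[key] = value


partitions = [Partition(n + 1) for n in range(TOTAL)]


def partition_for(key):
    """Hypothetical routing: stable hash of the key modulo the partition count."""
    return partitions[zlib.crc32(key.encode("utf-8")) % TOTAL]


def server_for(partition):
    """Partitions 1-3 live on server 1, 4-6 on server 2, 7-9 on server 3."""
    return (partition.number - 1) // PARTITIONS_PER_SERVER + 1


target = partition_for("user-42")
target.work_queue.append(("user-42", {"clicks": 1}))
target.run()
print(target.number, server_for(target))
```

Because a key always routes to the same partition, all work for that key is serialized through one queue, which is the intuition behind partitioned transactions.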
WHY VOLTDB?
Faster
Smarter Better
• Superior architecture for fast data/translytics
• In-Memory, Scale-out, ACID,
SQL+JSON
• Rapid data ingestion with transactions
• Data durability and HA
• VoltDB customers realize exceptional
business value

More Related Content

What's hot

Sql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su AzureSql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su AzureMarco Obinu
 
FOSDEM 2015 - NoSQL and SQL the best of both worlds
FOSDEM 2015 - NoSQL and SQL the best of both worldsFOSDEM 2015 - NoSQL and SQL the best of both worlds
FOSDEM 2015 - NoSQL and SQL the best of both worldsAndrew Morgan
 
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...DataStax Academy
 
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...NoSQLmatters
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsJonas Bonér
 
It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ...
It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ...It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ...
It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ...Lucas Jellema
 
Cloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa NeddamCloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa NeddamRomeo Kienzler
 
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca SartoriCCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartoriwalk2talk srl
 
Review Oracle OpenWorld 2015 - Overview, Main themes, Announcements and Future
Review Oracle OpenWorld 2015 - Overview, Main themes, Announcements and FutureReview Oracle OpenWorld 2015 - Overview, Main themes, Announcements and Future
Review Oracle OpenWorld 2015 - Overview, Main themes, Announcements and FutureLucas Jellema
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Lucas Jellema
 
High Performance Drupal with MariaDB
High Performance Drupal with MariaDBHigh Performance Drupal with MariaDB
High Performance Drupal with MariaDBMariaDB Corporation
 
Introducing Venice - Strata NYC 2017
Introducing Venice - Strata NYC 2017Introducing Venice - Strata NYC 2017
Introducing Venice - Strata NYC 2017Felix GV
 
Open Source für den geschäftskritischen Einsatz
Open Source für den geschäftskritischen EinsatzOpen Source für den geschäftskritischen Einsatz
Open Source für den geschäftskritischen EinsatzMariaDB plc
 
Couchbase Sydney meetup #1 Couchbase Architecture and Scalability
Couchbase Sydney meetup #1    Couchbase Architecture and ScalabilityCouchbase Sydney meetup #1    Couchbase Architecture and Scalability
Couchbase Sydney meetup #1 Couchbase Architecture and ScalabilityKarthik Babu Sekar
 
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347Manik Surtani
 
Couchbase Chennai Meetup: Developing with Couchbase- made easy
Couchbase Chennai Meetup:  Developing with Couchbase- made easyCouchbase Chennai Meetup:  Developing with Couchbase- made easy
Couchbase Chennai Meetup: Developing with Couchbase- made easyKarthik Babu Sekar
 

What's hot (20)

Sql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su AzureSql Start! 2020 - SQL Server Lift & Shift su Azure
Sql Start! 2020 - SQL Server Lift & Shift su Azure
 
FOSDEM 2015 - NoSQL and SQL the best of both worlds
FOSDEM 2015 - NoSQL and SQL the best of both worldsFOSDEM 2015 - NoSQL and SQL the best of both worlds
FOSDEM 2015 - NoSQL and SQL the best of both worlds
 
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
C* Summit 2013: Searching for a Needle in a Big Data Haystack by Jason Ruther...
 
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
Glynn Bird – Cloudant – Building applications for success.- NoSQL matters Bar...
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ...
It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ...It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ...
It's a wrap - closing keynote for nlOUG Tech Experience 2017 (16th June, The ...
 
Cloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa NeddamCloudant Overview Bluemix Meetup from Lisa Neddam
Cloudant Overview Bluemix Meetup from Lisa Neddam
 
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca SartoriCCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori
 
Review Oracle OpenWorld 2015 - Overview, Main themes, Announcements and Future
Review Oracle OpenWorld 2015 - Overview, Main themes, Announcements and FutureReview Oracle OpenWorld 2015 - Overview, Main themes, Announcements and Future
Review Oracle OpenWorld 2015 - Overview, Main themes, Announcements and Future
 
Pci multitenancy exalogic at AMIS25
Pci multitenancy exalogic at AMIS25Pci multitenancy exalogic at AMIS25
Pci multitenancy exalogic at AMIS25
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
 
What database
What databaseWhat database
What database
 
High Performance Drupal with MariaDB
High Performance Drupal with MariaDBHigh Performance Drupal with MariaDB
High Performance Drupal with MariaDB
 
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
 
Introducing Venice - Strata NYC 2017
Introducing Venice - Strata NYC 2017Introducing Venice - Strata NYC 2017
Introducing Venice - Strata NYC 2017
 
Open Source für den geschäftskritischen Einsatz
Open Source für den geschäftskritischen EinsatzOpen Source für den geschäftskritischen Einsatz
Open Source für den geschäftskritischen Einsatz
 
Couchbase Sydney meetup #1 Couchbase Architecture and Scalability
Couchbase Sydney meetup #1    Couchbase Architecture and ScalabilityCouchbase Sydney meetup #1    Couchbase Architecture and Scalability
Couchbase Sydney meetup #1 Couchbase Architecture and Scalability
 
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
Infinispan, Data Grids, NoSQL, Cloud Storage and JSR 347
 
Spring Into the Cloud
Spring Into the CloudSpring Into the Cloud
Spring Into the Cloud
 
Couchbase Chennai Meetup: Developing with Couchbase- made easy
Couchbase Chennai Meetup:  Developing with Couchbase- made easyCouchbase Chennai Meetup:  Developing with Couchbase- made easy
Couchbase Chennai Meetup: Developing with Couchbase- made easy
 

Viewers also liked

Containerizing Distributed Pipes
Containerizing Distributed PipesContainerizing Distributed Pipes
Containerizing Distributed Pipesinside-BigData.com
 
Apex & Geode: In-memory streaming, storage & analytics
Apex & Geode: In-memory streaming, storage & analyticsApex & Geode: In-memory streaming, storage & analytics
Apex & Geode: In-memory streaming, storage & analyticsAshish Tadose
 
SimplifyStreamingArchitecture

  • 1. page SIMPLIFY YOUR STREAMING DATA ARCHITECTURE WITH KAFKA & VOLTDB Maheedhar Gunturu
  • 2. © 2014 VoltDB PROPRIETARY WHO AM I? • Maheedhar Gunturu – Software & Solutions Architect @VoltDB mgunturu@voltdb.com @Vanguard_space http://chat.voltdb.com/ • Previously: - Solutions Architect @ MapR - Working on Big Data systems since 2010 • Current interests include: - NVRAM - GPU co-processors for databases - Operationalizing deep-learning-based applications using Fast Data
  • 3. COMPANY OVERVIEW FAST: World Record Cloud Benchmark: YCSB (Yahoo Cloud Serving Benchmark) - 2.4 million tps (transactions per second) Founded in 2009 by database luminary Mike Stonebraker* VoltDB in the Magic Quadrant “Operational Databases” Other Stonebraker Companies * Mike Stonebraker: • Professor at MIT for the Data and Artificial Intelligence lab • Co-Founder of VoltDB • Creator of the C-store & H-store Project • 2014 Turing Award Winner • His students include – Mike Olson, Diane Greene, Daniel Abadi…
  • 4. page© 2014 VoltDB PROPRIETARY WHAT IS VOLTDB? • An operational database purpose-built to run 100% in- memory at web scale  In-Memory  Relational, SQL, ACID Compliant  Scale-out on commodity hardware  Reliability, HA, fault tolerant  Integration with downstream systems i.e. OLAP, Hadoop, DW 4 Best use cases: operational and transactional workloads
  • 5. page© 2014 VoltDB PROPRIETARY 5 Fast (in motion) Streaming Analytics: real time summary and aggregation Transaction Processing: per-event decisions using context + history. Big (at rest) Exploration: data science, investigation of large data sets Reporting: recommendation matrixes, search indexes, trend and BI
  • 6. WHO NEEDS VOLTDB? (2/17/2016) [Quadrant chart. Horizontal axis: Time Critical / Not Time Critical. Vertical axis: Unimportant Data / Important Data. Quadrant labels: Most streaming systems (exactly-once semantics is very expensive); Most anything, Spark, etc.; Data warehouses, RDBMS, Transaction processing engines]
  • 7. page Perishable insights can have exponentially more value than after-the-fact traditional historical analytics.
  • 8. WHAT ARE SOME COMMON USE-CASES? • Data (messages) stream in from humans or devices - Internet of Things (IoT) - Ad-bidding platforms - Telecommunications – OSS/BSS/NFV - Fintech – trading and risk assessment platforms - Consumer-facing online billing and booking systems - Massively multi-player games • At high volume - 100K ~ 1M messages/sec requires specialized software • High availability - Nobody wants to go down these days
  • 9. page© 2014 VoltDB PROPRIETARY 9 Streaming Analytics Transactions Transformations • Materialized Views • Capped Tables • Ranking Indexes • Stored Procedures • Java + SQL • ACID guarantees • No client-side transaction control • Stored Procedures • Loaders/Importers • Export Connectors • Sessionization • Enrichment • JDBC connector VoltDB architecture Commodity HW HA + ACID Scale-out VM-friendly How do we do it? • XDCR – configurable out of the box. • Geospatial capabilities for location based BI and decision making
  • 10. page© 2014 VoltDB PROPRIETARY The general-purpose RDBMS can’t scale
  • 11. page© 2014 VoltDB PROPRIETARY MODERN OLTP 11 1. Processing streams requires integrated access to state. 2. Using real time analytics requires a query interface. 3. Reacting to incoming events requires transactions. State + Query + Transactions = OLTP Fast Streaming Analytics Transaction Processing
  • 12. page© 2014 VoltDB PROPRIETARY 12 Absolute Non-negotiables in VoltDB • Transactional Consistency • Extremely high throughput • Linear Scalability • Resiliency • Minimum dev-ops administration State , Speed , Scale , Stable, Simple
  • 13. page© 2014 VoltDB PROPRIETARY 13 Our Customers are streaming important data • Typical Deployments • 100K to 1M transactions/sec • Commodity Hardware 32GB - 256 GB RAM 16 Cores – 64 Cores 10 GigE ~ 40 GigE • VoltDB runs on AWS, Azure, GCP, IBM Bluemix Transalytics = Transactions + Analytics
  • 14. page “A single, unified database that supports transactions and analytics in real-time without sacrificing transactional integrity, performance, and scale.” - Mike Gualtieri, Forrester Transactional Operational Analytical Translytical Database Traditional Stack For streaming data Single data layer
  • 15. page 15 How can queuing systems like Kafka Simplify the architecture?
  • 16. page 16 How do we perceive Kafka?
  • 17. page 17 APACHE KAFKA • High Throughput • Low Latency • Scalable • Centralized • Real-time
  • 18. STATE OF THE ART: KAFKA @ LINKEDIN
  • 19. CAN KAFKA SCALE? • Handles 1.4 trillion messages/day for various applications at LinkedIn • Over 1400 brokers • Can handle well over a few million messages/sec • At-least-once delivery of messages • Strong durability contract with replication • Rich ecosystem: - Espresso: offload MySQL replication - Venice: compute derived data - Nuage: a portal to manage topics and associated metadata - Gobblin: ingestion framework - MirrorMaker: replication
  • 20. page© 2015 Forrester Research, Inc. Reproduction Prohibited 20 JUST ENOUGH KAFKA › Producer › Broker › Topics › Partitions – Random & Semantic › ISR – Leader & Controller, high watermark › Zookeeper – offsets › Consumer › Consumer Groups › Commit Log – TTLs, compactions
  • 21. SO WHAT ARE THE CAVEATS WITH KAFKA? • Set the producer to acks = all • Set “replica.lag.time.max.ms” to a minimum (match it with the VoltDB timeout) • Set “replica.lag.max.messages” to a minimum (this parameter was removed in Kafka 0.9) • Disable unclean leader election: unclean.leader.election.enable = false • Use default.replication.factor = 3 • Make sure that the consumer is set to read only committed messages • min.insync.replicas = 2 (this is applied at the topic level and had to be set manually before 0.9) • Disable consumer auto-commit: “auto.commit.enable” = false (“enable.auto.commit” in the newer Java consumer) • Disable automatic topic creation in Kafka • “block.on.buffer.full” = true • “max.in.flight.requests.per.connection” = 1 • Use a rebalance listener to limit duplicates • Connection to ZooKeeper • Monitor consumer lag via offsets • Report consumer counts and errors to a separate topic
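The client-side items of the checklist above can be sketched as plain Java `Properties` objects. This is only an illustration, assuming the newer Java client's key names (`bootstrap.servers`, `acks`, `max.in.flight.requests.per.connection`, `group.id`, `enable.auto.commit`); the class name `KafkaSafetySettings` and the group name `voltdb-import` are hypothetical, and the broker address is borrowed from the sample importer configuration on slide 22. Broker- and topic-level settings such as `min.insync.replicas` are configured on the cluster, not in client code, and serializer settings are omitted here.

```java
import java.util.Properties;

class KafkaSafetySettings {

    /** Producer settings: wait for all in-sync replicas and keep one request in flight. */
    static Properties producerProps(String brokers) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", brokers);
        p.setProperty("acks", "all");
        // A single in-flight request avoids reordering when a send is retried.
        p.setProperty("max.in.flight.requests.per.connection", "1");
        return p;
    }

    /** Consumer settings: offsets are committed by the application, not automatically. */
    static Properties consumerProps(String brokers, String groupId) {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", brokers);
        p.setProperty("group.id", groupId);
        p.setProperty("enable.auto.commit", "false");
        return p;
    }

    public static void main(String[] args) {
        // "kafkasvr:9092" and "voltdb-import" are placeholder values.
        System.out.println(producerProps("kafkasvr:9092"));
        System.out.println(consumerProps("kafkasvr:9092", "voltdb-import"));
    }
}
```

With auto-commit off, the application commits an offset only after the corresponding write has succeeded, which is what keeps delivery at-least-once rather than at-most-once.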
  • 22. HOW TO CONFIGURE KAFKA -> VOLTDB? (IMPORTER) Sample config file:
      <import>
        <configuration type="kafka" format="csv" enabled="true">
          <property name="brokers">kafkasvr:9092</property>
          <property name="topics">employees</property>
          <property name="procedure">EMPLOYEE.insert</property>
        </configuration>
        <configuration type="kafka" enabled="true">
          <property name="brokers">kafkasvr:9092</property>
          <property name="topics">employees</property>
          <property name="procedure">EMPLOYEE.insert</property>
        </configuration>
        <configuration type="kafka" enabled="true">
          <property name="brokers">kafkasvr:9092</property>
          <property name="topics">managers</property>
          <property name="procedure">MANAGER.insert</property>
        </configuration>
      </import>
    • Supports multiple data formats such as CSV (default), TSV, JSON, etc. (refer to the documentation)
    • Supports various types of data sources, such as Kafka and Kinesis
    • Supply a list of brokers to pick up offsets
    • Supply a topic name which contains the messages
    • Supply the stored procedure name to invoke per event/message and then insert the result into the db
    • “fetch.message.max.bytes”: maximum size of a message that is fetched from Kafka (default 64 KB)
    • “groupid”: the group the consumer belongs to
    • “socket.timeout.ms”: (milliseconds) the maximum time the socket connection waits before timing out
  • 23. page© 2014 VoltDB PROPRIETARY HOW TO CONFIGURE VOLTDB -> KAFKA? (EXPORTER) 23
  • 24. IS KAFKA & VOLTDB INTEGRATION IN PRODUCTION? • According to our customer success team, as of today approximately 15-20% of our customers are using Kafka & VoltDB together. Examples: • King Games (of Candy Crush fame) – 5 nodes, 384 GB RAM, 32 cores – 300+ topics with more than 400,000 Txns/sec @ 50% CPU utilization. • MaxCDN (now StackPath – global CDN) – 11 nodes, 128 GB RAM, 16 cores – a couple of hundred topics with more than 500,000 Txns/sec @ 30% CPU utilization. • Nimble Storage (InfoSight dashboard & support) – 9 nodes, 128 GB RAM, 64 cores – 50+ topics with more than 200,000 Txns/sec @ 20~30% CPU utilization. • We highly recommend this architecture if it meets the SLA requirements
  • 25. page© 2014 VoltDB PROPRIETARY SO WHAT DOES KAFKA BRING TO AN IN-MEMORY DATABASE LIKE VOLTDB? • Centralized infrastructure • Recreate state • Resiliency with at-least once delivery • Impedance mismatch between applications • Integrations with various applications • Export and Import capabilities • Cost optimization for HW 25
  • 26. page© 2014 VoltDB PROPRIETARY 26
  • 27. IDEMPOTENCE! • The property of certain operations in mathematics and computer science that they can be applied multiple times without changing the result beyond the initial application. • At-Least-Once Delivery + Idempotent Operations = Exactly-Once Semantics
  • 28. Idempotent: • set x = 5; is the same as set x = 5; set x = 5; • if (x % 2 == 0) x++; is the same as if (x % 2 == 0) x++; if (x % 2 == 0) x++; • spill coffee on brown pants. Not idempotent: • x++; is not the same as x++; x++; • if (x % 2 == 0) x *= 2; is not the same as if (x % 2 == 0) x *= 2; if (x % 2 == 0) x *= 2; • eat whole plate of spaghetti
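One way to read "At-Least-Once Delivery + Idempotent Operations = Exactly-Once Semantics" is that a non-idempotent update (adding to a total) becomes idempotent once it is keyed by a unique event ID. The sketch below is a minimal in-memory illustration, not VoltDB's implementation; the class `IdempotentCounter`, its methods, and the event IDs are hypothetical names chosen for the example. In a database, the same effect would typically come from a uniqueness check inside the transaction that applies the event.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class IdempotentCounter {
    private final Set<String> seenEventIds = new HashSet<>();
    private final Map<String, Long> totals = new HashMap<>();

    /** Applies an event at most once; a redelivered event ID is ignored. Returns true if applied. */
    boolean apply(String eventId, String key, long amount) {
        if (!seenEventIds.add(eventId)) {
            return false; // duplicate delivery: the state is left unchanged
        }
        totals.merge(key, amount, Long::sum);
        return true;
    }

    long total(String key) {
        return totals.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        IdempotentCounter counter = new IdempotentCounter();
        counter.apply("evt-1", "acct-A", 10);
        counter.apply("evt-2", "acct-A", 5);
        counter.apply("evt-1", "acct-A", 10); // redelivered by an at-least-once source
        System.out.println(counter.total("acct-A")); // prints 15
    }
}
```

The check and the update must happen atomically for this to hold under concurrency, which is where transactional guarantees matter.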
  • 29. page 29 What interesting problems do we solve? • Correlation – streaming Join (state management) • Out of order delivery • At least once delivery – How to dedup • Precise Accounting • Precise Statistics – Event time vs processing time
  • 31. page© 2014 VoltDB PROPRIETARY FAST DATA: APACHE-STYLE 31 Applications, Message Queues, Data Sources Ingest Analyze Decide Counters Aggregations Time series Statistics Store results Query and recombine Fast serving Per-event policy evaluations Responses (synchronous) Side-effects (asynchronous) Export & Pipeline Kafka / RabbitMQ Storm, Flume, Sqoop Storm + Serving Layer Spark + Serving Layer Cassandra, HBase Hadoop, Message queues
  • 32. page© 2014 VoltDB PROPRIETARY FAST DATA: VOLTDB-STYLE 32 Applications, Message Queues, Data Sources Ingest Analyze Decide Counters Aggregations Time series Statistics Store results Query and recombine Fast serving Per-event policy evaluations Responses (synchronous) Side-effects (asynchronous) Export & Pipeline Kafka / RabbitMQ VoltDB SQL, Java for Analytics Transactions / ACID Hadoop, Message queues
  • 33. page© 2014 VoltDB PROPRIETARY THREE MAJOR DRAWBACKS OF STREAMING SOLUTIONS • Streaming solutions lack context • filter, aggregate and join operations require state. • need backend databases to support decisions • good for fast ingestion only. • Streaming solutions are not architected for real-time decisions • not ACID (atomicity, consistency, isolation, durability) • no support for JDBC/ODBC • Good for algorithmic processing of windowed data • Streaming solutions lack operational transparency • good for statically configured topological results • need back-end databases for storing aggregates/counters 33
  • 34. page© 2014 VoltDB PROPRIETARY CEP / STREAM PROCESSING VS. VOLTDB Common characteristics • Process high speed, streaming data • Ingest thousands to millions of events per second • Can function as part of a data pipeline • Basic event alerting or enrichment 34 Stream processing is right choice when.... • Unstructured audio, video, image, signal processing data streams • Micro-second latencies are needed • Examine stream for temporal pattern detection VoltDB is the right choice when... • Realtime streaming analytics - calculation and serving • Transactional decisions per event - data informed • Ad hoc queries of state • Common interface across data architecture stack (SQL)
  • 35. page© 2014 VoltDB PROPRIETARY INTEGRATING DATA SOURCES WITH VOLTDB • CSV loader • Kafka loader • JDBC loader • Vertica UDx • Extensible loader API • JDBC • ODBC • HTTP JSON • Native client drivers / SDKs BULK LOADERS APPLICATION INTERFACES 35
  • 36. page© 2014 VoltDB PROPRIETARY INTEGRATING VOLTDB WITH EXPORT TARGETS 36 • Local file system export • JDBC export • Kafka export • Elasticsearch export • HDFS export • HTTP export • Extensible API
  • 37. VOLTDB EXPORT UI • ddl.sql: CREATE TABLE events ( EventID INTEGER, time TIMESTAMP, msg VARCHAR(128)); EXPORT TABLE events; • deployment.xml: <export enabled="true" target="file"> • Application SQL: INSERT into TABLE values…
  • 38. page© 2014 VoltDB PROPRIETARY ACID PROCESSING • Sync intra-cluster replication • Replicated durability • High availability (configurable) • Serializable isolation • Ad-hoc SQL or stored procedures. • Partitioned & distributed transactions • Load balanced reads across replicas 38
  • 39. page© 2014 VoltDB PROPRIETARY MATERIALIZED VIEWS • Declarative SQL • Fully transactional • Supports ad-hoc query 39 CREATE VIEW registrations_by_zipcode ( zipcode, registered_voters ) AS SELECT zipcode, count(*) from voters where registration=1 GROUP BY zipcode;
  • 40. page© 2014 VoltDB PROPRIETARY MV FOR STREAMING AGGREGATION • Partitioned on cluster • Immediately up-to-date • Active/active HA 40 Global Read: SELECT sum(count) WHERE sec > 130 and sec < 140;
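The idea behind using a materialized view for streaming aggregation is that the aggregate is updated as part of each insert, so a later read only sums a handful of pre-computed rows. The following is a rough in-memory analogy, not how VoltDB implements views; `PerSecondCounts` and its methods are hypothetical names, and the range query mirrors the `sec > 130 and sec < 140` predicate of the global read above.

```java
import java.util.TreeMap;

class PerSecondCounts {
    // One pre-aggregated row per second, kept current on every insert.
    private final TreeMap<Long, Long> countBySecond = new TreeMap<>();

    /** Called once per incoming event; updates the aggregate immediately. */
    void onInsert(long sec) {
        countBySecond.merge(sec, 1L, Long::sum);
    }

    /** Sums the counts for seconds strictly between lo and hi. */
    long sumBetweenExclusive(long lo, long hi) {
        if (lo >= hi) {
            return 0L; // empty range
        }
        long total = 0L;
        for (long count : countBySecond.subMap(lo, false, hi, false).values()) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        PerSecondCounts view = new PerSecondCounts();
        long[] eventSeconds = {129, 131, 131, 135, 140};
        for (long sec : eventSeconds) {
            view.onInsert(sec);
        }
        System.out.println(view.sumBetweenExclusive(130, 140)); // prints 3
    }
}
```

The read cost depends on the number of seconds in the range rather than on the number of raw events, which is the property the slide relies on.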
  • 41. page© 2014 VoltDB PROPRIETARY REAL-TIME ANALYTICS IN VOLTDB • Counters • Counting is exceedingly hard at scale • VoltDB was designed to excel at counters • Aggregates • Materialized views maintain slices of fast moving data and enable fast access • Group by keys + time functions (day, hour, minute, second) • Query views for time-series rollups, e.g. “last 30 minutes” • Leaderboards • Leaderboards rank items by size, value or amplitude • Index optimizations enable fast ranking of records within large sets 41
  • 42. page© 2014 VoltDB PROPRIETARY REVIEW Application Event Sources VoltDB Client Interface Partition Replica 1 Partition Replica 2 Export Destination (OLAP, HTTP) • SQL + Java transactions • JSON column values • HA in-memory processing • ACID (durable to disk) • Ranking indexes • Indexes on functions • Capped tables • Mat. views: RT aggregation • Append only export • 1-5 ms @ 99% responses 42
  • 43. page© 2014 VoltDB PROPRIETARY QUESTIONS? • Developer • The source code for Kafka – VoltDB connector? https://github.com/VoltDB/voltdb-kafka-connector • The Developer Guide to Streaming Data Applications http://learn.voltdb.com/WPThreeContenders.html • Architect: • Download our new ‘recipes’ eBook https://voltdb.com/ebook/fast-data-recipes • Email us your questions: askanengineer@voltdb.com • http://chat.voltdb.com/ careers@voltdb.com 43
  • 44. page© 2014 VoltDB PROPRIETARY page USE CASES 44
  • 45. page© 2014 VoltDB PROPRIETARY OPENET Application/Use Case • Openet enables the world's largest network operators to innovate service offerings in an increasingly mobile, data-driven society • Applications include Policy Manager, Evolved Charging, Convergent Mediation Why VoltDB? • Performance and scalability that provides real-time control of network resource consumption, and real-time interaction between network systems and their users • Virtualized platform for elasticity and ease of operations • Simplified deployments with ACID and built-in availability for risk averse Telco customers • Saves $0.5 million/customer installation; unlimited scale in the cloud 45 “VoltDB is the logical choice for a cloud-deployable, transactional database that can flexibly handle high-volume data streams for service providers to monitor and leverage in real time.”
  • 46. page© 2014 VoltDB PROPRIETARY ASIAINFO Application/Use Case • Advanced IT software solutions and services for the telecommunications industry • Veris Convergent Context-awareness Center (C3) • User session management and stream processing that enables real-time matching and data processing Why VoltDB? • High transaction performance with immediate user device make and model identification • URL/website matching and real-time campaign triggers based on VoltDB to enable very rapid processing • SQL, ACID, data integrity, and disaster recovery • Reduced TCO: more than 500,000 transactions per second on commodity servers 46
  • 47. EAGLE INVESTMENT SYSTEMS, BNY MELLON Application/Use Case • Eagle Investment Systems is a leading provider of financial services technology and a subsidiary of BNY Mellon. • VoltDB powers Eagle’s cloud-based software for tracking the performance of investment portfolios and analyzing performance risks Why VoltDB? • Performance and transparent scalability to meet application workloads and client SLAs • High-speed data cache for risk calculations with large and rapidly changing data sets • Lower TCO “We deliver the best in class technology to our clients, and when we evaluated VoltDB, we discovered that it suited our requirements. With its in-memory, high-velocity database, VoltDB provides us a great foundation to enhance our current and future offerings.” Marc Firenze, CTO
  • 48. page© 2014 VoltDB PROPRIETARY ERICSSON Application/Use Case • Ericsson MediaFirst is an end-to-end cloud-based platform for the creation, management and delivery of next generation Pay TV Why VoltDB? • Enabled Ericsson to move from batch and manual processing to real-time user session monitoring of 10’s to 100’s of million users • Ensure user experience across devices • Attract, retain and monetize new subscribers • Cloud ready (Azure) • Agility - quickly develop and deploy TV services across all end-points at web speeds to respond to changing market trends and conditions 48 “VoltDB gives us up-to-the-second operational visibility into the performance of the systems across our customers’ carrier grade TV networks as well as enable real-time user targeting. The database gives our platform competitive advantage by letting us analyze device and user data as it comes in from Tier One providers” Mark Hydar, Head of Engineering, Ericsson MediaFirst
  • 49. SMART METER MARKET LEADERS PICK VOLTDB* (* > 60 million meters under management) • Leader in the Gartner Magic Quadrant Announced Utility Customers • UK Smart Meter • Shikoku Electric Power • Hokkaido Electric Power
  • 50. FLYTXT Application/Use Case • Customer Experience, Revenue Management and Data Monetization Solutions for mobile operators • Drive campaigns to increase revenue, reduce churn, enhance loyalty and create new revenue streams Business Impact • 1.2% incremental revenue • 18.9% higher ARPU • Conversion rates 40%-300% higher Why VoltDB? • Performance and scale to drive its real-time analytics platform to extract actionable intelligence from 4 billion events/day streaming from more than 200 million mobile subscribers • Monetize data faster, more efficiently and at lower cost “The partnership with VoltDB enhances our platform’s capability to act upon insights derived from subscriber actions in real time.” Prateek Kapadia, CTO
  • 51. AIRPUSH Application/Use Case • Managing online mobile advertising • Manages over 120,000 live applications Why VoltDB? • Replaced costly MySQL infrastructure with scalable VoltDB cluster • Enabled accurate ad-campaign balance tracking, dramatically improving “last-dollar” decisions, saving millions in budget overages • Eliminated the opportunity cost of placing wrong ads • Reduced infrastructure cost by 93% (7 servers vs. 100) “Achieved a previously impossible level of budget management accuracy”
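The "last-dollar" decision on the Airpush slide comes down to checking a campaign balance and debiting it as one atomic step, so that concurrent ad requests cannot overspend the budget. The sketch below illustrates that idea in plain Python; it is not Airpush's or VoltDB's code. The `CampaignBudget` class and `try_spend` method are hypothetical names, and a lock stands in for the serialized, transactional execution a database would provide.

```python
import threading


class CampaignBudget:
    """Hypothetical in-memory campaign balance, tracked in cents."""

    def __init__(self, balance_cents):
        self._balance_cents = balance_cents
        # The lock stands in for transactional isolation (an assumption
        # made for this sketch, not a description of VoltDB internals).
        self._lock = threading.Lock()

    def try_spend(self, cost_cents):
        """Debit the balance only if it covers the cost; report success."""
        with self._lock:
            if cost_cents > self._balance_cents:
                return False  # serving this ad would overrun the budget
            self._balance_cents -= cost_cents
            return True

    @property
    def balance_cents(self):
        with self._lock:
            return self._balance_cents


budget = CampaignBudget(250)
# Four ad requests at 100 cents each: only the first two fit the budget.
results = [budget.try_spend(100) for _ in range(4)]
print(results, budget.balance_cents)
```

Because the check and the debit happen under the same lock, the balance can never go negative, which is the property the slide credits for avoiding budget overages.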
  • 52. SIMPLIFYING THE LAMBDA ARCHITECTURE Use Case • Content delivery network service provider • Counting content views Why VoltDB? • Real-time analytics + transactions at scale • Replaced Storm and Cassandra with VoltDB for real-time streaming aggregations with “exactly-once” semantics Bottom line • Accurate: guaranteed correct results with VoltDB’s “exactly-once” semantics • Faster time to market • 32 TB of data processed with 7 servers • 1/10th the resources of the alternatives
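For counting content views, an exactly-once effect means a redelivered event must not be counted twice. One common way to get that effect is to record each event identifier and update the counter in the same atomic step; the sketch below shows that technique in plain Python. It is an illustration under stated assumptions, not VoltDB's implementation: `ViewCounter`, `record_view`, and the event IDs are all hypothetical.

```python
from collections import defaultdict


class ViewCounter:
    """Hypothetical idempotent counter: each event ID is applied at most once."""

    def __init__(self):
        self.seen_event_ids = set()
        self.views = defaultdict(int)

    def record_view(self, event_id, content_id):
        """Count the view unless this event was already applied."""
        if event_id in self.seen_event_ids:
            return False  # duplicate delivery, ignore it
        # In a database, these two updates would share one transaction.
        self.seen_event_ids.add(event_id)
        self.views[content_id] += 1
        return True


counter = ViewCounter()
# "e1" is delivered twice, as can happen after a consumer restart.
events = [("e1", "video-a"), ("e2", "video-a"), ("e1", "video-a"), ("e3", "video-b")]
applied = [counter.record_view(event_id, content_id) for event_id, content_id in events]
print(dict(counter.views), applied)
```

The stream delivers at least once, and the deduplication step turns that into an exactly-once count without a separate batch layer to correct the totals later.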
  • 53. MOBILE: Real-Time Event Decisioning Use Case • The Emagine real-time event decision making platform for Communications Service Providers (CSPs) Why VoltDB? • Real-time analytics + transactions • Scale: billions of network events per day, analyzing hundreds of thousands of transactions simultaneously, and then intelligently interacting with customers Bottom line • 3 ms system response time • 253% increase in offer purchases
  • 54. VOLTDB: A BEAUTIFUL ARCHITECTURE [Diagram] VoltDB Cluster • Server 1: Partitions 1-3 • Server 2: Partitions 4-6 • Server 3: Partitions 7-9 Inside a Partition • Work Queue • Execution Engine • Table and Index Data
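The cluster layout on this slide (three servers, nine partitions, each partition with its own work queue, execution engine, and data) can be sketched as a small routing model. The code below is a simplified illustration, not VoltDB's actual routing or hashing: `partition_for`, `server_for`, and the `Partition` class are hypothetical, and CRC32 is used only because it is a stable hash available in the Python standard library.

```python
import zlib
from collections import deque

SERVER_COUNT = 3
PARTITIONS_PER_SERVER = 3
PARTITION_COUNT = SERVER_COUNT * PARTITIONS_PER_SERVER  # 9, as on the slide


class Partition:
    """One partition: a work queue drained serially over its own data."""

    def __init__(self, partition_id):
        self.partition_id = partition_id
        self.work_queue = deque()
        self.table = {}  # stands in for the partition's table and index data

    def submit(self, key, delta):
        self.work_queue.append((key, delta))

    def run(self):
        # The "execution engine": one unit of work at a time, so no
        # locking is needed inside a partition.
        while self.work_queue:
            key, delta = self.work_queue.popleft()
            self.table[key] = self.table.get(key, 0) + delta


def partition_for(key):
    """Map a partitioning key to a partition ID in 1..9 (assumed hashing)."""
    return zlib.crc32(key.encode("utf-8")) % PARTITION_COUNT + 1


def server_for(partition_id):
    """Partitions 1-3 live on server 1, 4-6 on server 2, 7-9 on server 3."""
    return (partition_id - 1) // PARTITIONS_PER_SERVER + 1


cluster = {pid: Partition(pid) for pid in range(1, PARTITION_COUNT + 1)}
for key, delta in [("alice", 1), ("bob", 2), ("alice", 3)]:
    cluster[partition_for(key)].submit(key, delta)
for partition in cluster.values():
    partition.run()
print(cluster[partition_for("alice")].table["alice"])
```

Every operation on a given key lands in the same partition's queue, so work on different keys can run in parallel across partitions while each partition stays single-threaded.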
  • 55. WHY VOLTDB? Faster, Smarter, Better • Superior architecture for fast data/translytics • In-Memory, Scale-out, ACID, SQL+JSON • Rapid data ingestion with transactions • Data durability and HA • VoltDB customers realize exceptional business value