SlideShare a Scribd company logo
Lambda Architecture Platform 
Using SQL 
Sep 13 2014 
HadoopCon 2014 Taiwan 
TAGOMORI Satoshi (@tagomoris)
Taipei
Topics 
About Me & LINE 
Data analytics workloads 
Batch processing 
Stream processing 
Lambda architecture 
Lambda architecture using SQL 
Norikra: Stream processing with SQL 
13:30-14:20 4F
@tagomoris 
Satoshi Tagomori (田籠 聡) 
LINE Corporation Analytics Platform Team
Tokyo
LINE Offices 
Tokyo HQ 
Spain 
Thailand 
Taipei 
USA 
Korea
LINE is born! JUNE 23, 2011
Data Analytics 
Workload 
Part 01
Various Data Analytics Workload 
Reports 
Monthly/Daily reports 
Hourly (or shorter) news 
Real-time metrics 
Automatically updated reports/graphs 
Alerts for abuse of services, overload, ...
Batch Processing 
Hadoop 
MapReduce (or Spark, Tez) & DSLs (Hive, Pig, ...) 
For reports 
MPP Engines 
Cloudera Impala, Apache Drill, Facebook Presto, ... 
For interactive analysis 
For reports of shorter window
Stream Processing 
Apache Storm 
Incubator project 
“Distributed and fault-tolerant realtime computation” 
Norikra 
by tagomoris 
Non-distributed “Stream processing with SQL”
Why Stream Processing? 
Less latency 
Realtime metrics 
Short-term prompt reports 
Less computing power 
10Mbps for batch processing: 100GB/day 
10Mbps for stream processing: 1 Server 
No query schedule management 
Once query registered, it runs forever
Disadvantage of Stream Processing 
Queries must be written before data 
There should be another way to query past data 
Queries cannot be run twice 
All results will be lost when any error occurs 
All data have gone when bugs found 
Disorders of events break results 
Recorded time based queries? Or arrival time based queries?
Part 02 
Lambda Architecture
Lambda Architecture 
“The Lambda-architecture aims to satisfy the needs for a 
robust system that is fault-tolerant, both against hardware 
failures and human mistakes, being able to serve a wide 
range of workloads and use cases, and in which low-latency 
reads and updates are required. The resulting system should 
be linearly scalable, and it should scale out rather than up.” 
http://lambda-architecture.net/
Lambda Architecture: Overview 
new data 
batch layer 
master dataset 
serving layer 
view 
speed layer 
real-time view 
query
Twitter Summingbird 
Lambda architecture library 
Batch mode: Scalding on Hadoop MapReduce 
Realtime mode: Storm 
Word counting by Summingbird (scala): 
def wordCount[P <: Platform[P]] 
(source: Producer[P, String], store: P#Store[String, Long]) = 
source.flatMap { sentence => 
toWords(sentence).map(_ -> 1L) 
}.sumByKey(store) 
https://github.com/twitter/summingbird 
https://blog.twitter.com/2013/streaming-mapreduce-with-summingbird
What Lambda Architecture Provides 
Replayable queries 
Redo queries anytime if results of speed layer are broken 
Accurate results on demand 
Prompt reports in speed layer with arrival time 
Fixed reports in batch layer with recorded time 
... And many more benefits of stream processing
Why All of Us Don’t Use It? 
Storm doesn’t fit well with many uses 
Storm requires computer resources too big to deploy 
Summingbird requires many steps to deploy 
Many directors/analysts don’t write Scala/Java 
Summingbird DSL is not enough easy for non-professional 
people
Lambda Architecture 
Using SQL 
Part 03
Existing Hadoop Platform 
new data 
HDFS hive 
query 
Fluentd 
presto 
query
Norikra 
Schema-less stream processing with SQL 
“Norikra is a open source server software provides "Stream 
Processing" with SQL, written in JRuby, runs on JVM, licensed 
under GPLv2.” 
SELECT 
path, 
COUNT(1, status=200) AS success_count, 
COUNT(1, status=500) AS server_error_count, 
COUNT(*) AS count 
FROM AccessLog.win:time_batch(10 min, 0L) 
WHERE service='myservice' AND path LIKE '/api/%' 
GROUP BY path 
http://norikra.github.io/
Added-on Lambda Architecture Platform 
new data 
presto 
query 
HDFS hive 
query 
norikra 
query
“Pseudo Lambda” Architecture Using SQL 
Lambda architecture platform 
with almost same queries 
SELECT path, 
COUNT(IF(status=200,1,NULL)) AS success_count, 
COUNT(IF(status=500,1,NULL)) AS server_error_count, 
COUNT(*) AS count 
FROM AccessLog 
WHERE service='myservice' AND path LIKE '/api/%' 
AND timestamp >= ‘2014-09-13 10:40:00’ 
AND timestamp < ‘2014-09-13 10:50:00’ 
GROUP BY path 
SELECT path, 
COUNT(1, status=200) AS success_count, 
COUNT(1, status=500) AS server_error_count, 
COUNT(*) AS count 
FROM AccessLog.win:time_batch(10 min, 0L) 
WHERE service='myservice' AND path LIKE '/api/%' 
GROUP BY path
“Pseudo Lambda” Architecture Using SQL 
SQL dialects are easy to learn! 
Standard SQL, Hive, Presto, Impala, Drill, ... 
+ Norikra 
For non-professional people too! 
SQL queries are very easy to write twice!
Use Cases in LINE 
Prompt reports for Ads service 
Short-term prompt reports by Norikra 
Daily fixed reports by Hive 
Summary of application server error log 
Aggregate error log for alerting by Norikra 
Check details with Hive, Presto (or grep!) 
See you later for details!
TMTOWTDI 
“There’s more than one way to do it.” 
- Perl programming language
SHARE 
What I want & What I’m doing! 
- tagomoris
Q & A

More Related Content

What's hot

Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Anton Kirillov
 
Realtime Reporting using Spark Streaming
Realtime Reporting using Spark StreamingRealtime Reporting using Spark Streaming
Realtime Reporting using Spark Streaming
Santosh Sahoo
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Helena Edelson
 
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisNoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
Helena Edelson
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on Spark
Vinoth Chandar
 
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
Spark Summit
 
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
Databricks
 
Kafka Lambda architecture with mirroring
Kafka Lambda architecture with mirroringKafka Lambda architecture with mirroring
Kafka Lambda architecture with mirroringAnant Rustagi
 
Bellevue Big Data meetup: Dive Deep into Spark Streaming
Bellevue Big Data meetup: Dive Deep into Spark StreamingBellevue Big Data meetup: Dive Deep into Spark Streaming
Bellevue Big Data meetup: Dive Deep into Spark Streaming
Santosh Sahoo
 
Kafka spark cassandra webinar feb 16 2016
Kafka spark cassandra   webinar feb 16 2016 Kafka spark cassandra   webinar feb 16 2016
Kafka spark cassandra webinar feb 16 2016
Hiromitsu Komatsu
 
Reactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkReactive dashboard’s using apache spark
Reactive dashboard’s using apache spark
Rahul Kumar
 
Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks
Databricks
 
SSR: Structured Streaming for R and Machine Learning
SSR: Structured Streaming for R and Machine LearningSSR: Structured Streaming for R and Machine Learning
SSR: Structured Streaming for R and Machine Learning
felixcss
 
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
Robert "Chip" Senkbeil
 
Building a unified data pipeline in Apache Spark
Building a unified data pipeline in Apache SparkBuilding a unified data pipeline in Apache Spark
Building a unified data pipeline in Apache SparkDataWorks Summit
 
Lambda architecture
Lambda architectureLambda architecture
Lambda architecture
Szilveszter Molnár
 
Tale of ISUCON and Its Bench Tools
Tale of ISUCON and Its Bench ToolsTale of ISUCON and Its Bench Tools
Tale of ISUCON and Its Bench Tools
SATOSHI TAGOMORI
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
Taro L. Saito
 
To Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT ToTo Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT To
SATOSHI TAGOMORI
 
Stream Processing using Apache Spark and Apache Kafka
Stream Processing using Apache Spark and Apache KafkaStream Processing using Apache Spark and Apache Kafka
Stream Processing using Apache Spark and Apache Kafka
Abhinav Singh
 

What's hot (20)

Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
 
Realtime Reporting using Spark Streaming
Realtime Reporting using Spark StreamingRealtime Reporting using Spark Streaming
Realtime Reporting using Spark Streaming
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
 
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisNoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
 
Hoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on SparkHoodie: How (And Why) We built an analytical datastore on Spark
Hoodie: How (And Why) We built an analytical datastore on Spark
 
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
Spark and Spark Streaming at Netfix-(Kedar Sedekar and Monal Daxini, Netflix)
 
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
Lessons Learned from Managing Thousands of Production Apache Spark Clusters w...
 
Kafka Lambda architecture with mirroring
Kafka Lambda architecture with mirroringKafka Lambda architecture with mirroring
Kafka Lambda architecture with mirroring
 
Bellevue Big Data meetup: Dive Deep into Spark Streaming
Bellevue Big Data meetup: Dive Deep into Spark StreamingBellevue Big Data meetup: Dive Deep into Spark Streaming
Bellevue Big Data meetup: Dive Deep into Spark Streaming
 
Kafka spark cassandra webinar feb 16 2016
Kafka spark cassandra   webinar feb 16 2016 Kafka spark cassandra   webinar feb 16 2016
Kafka spark cassandra webinar feb 16 2016
 
Reactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkReactive dashboard’s using apache spark
Reactive dashboard’s using apache spark
 
Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks
 
SSR: Structured Streaming for R and Machine Learning
SSR: Structured Streaming for R and Machine LearningSSR: Structured Streaming for R and Machine Learning
SSR: Structured Streaming for R and Machine Learning
 
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
 
Building a unified data pipeline in Apache Spark
Building a unified data pipeline in Apache SparkBuilding a unified data pipeline in Apache Spark
Building a unified data pipeline in Apache Spark
 
Lambda architecture
Lambda architectureLambda architecture
Lambda architecture
 
Tale of ISUCON and Its Bench Tools
Tale of ISUCON and Its Bench ToolsTale of ISUCON and Its Bench Tools
Tale of ISUCON and Its Bench Tools
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
 
To Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT ToTo Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT To
 
Stream Processing using Apache Spark and Apache Kafka
Stream Processing using Apache Spark and Apache KafkaStream Processing using Apache Spark and Apache Kafka
Stream Processing using Apache Spark and Apache Kafka
 

Viewers also liked

Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
Trieu Nguyen
 
Invitation for v1.0.0
Invitation for v1.0.0Invitation for v1.0.0
Invitation for v1.0.0
SATOSHI TAGOMORI
 
Norikra: Stream Processing with SQL
Norikra: Stream Processing with SQLNorikra: Stream Processing with SQL
Norikra: Stream Processing with SQL
SATOSHI TAGOMORI
 
Norikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In RubyNorikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In Ruby
SATOSHI TAGOMORI
 
BigQuery, Fluentd and tagomoris #gcpja
BigQuery, Fluentd and tagomoris #gcpjaBigQuery, Fluentd and tagomoris #gcpja
BigQuery, Fluentd and tagomoris #gcpjaSATOSHI TAGOMORI
 
運用とデータ分析の遠くて近い関係、ISUCONを添えて
運用とデータ分析の遠くて近い関係、ISUCONを添えて運用とデータ分析の遠くて近い関係、ISUCONを添えて
運用とデータ分析の遠くて近い関係、ISUCONを添えて
SATOSHI TAGOMORI
 
Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in Action
Guido Schmutz
 
Effectively using Open Source with conda
Effectively using Open Source with condaEffectively using Open Source with conda
Effectively using Open Source with conda
Travis Oliphant
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
Trieu Nguyen
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Nathan Bijnens
 
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...Nathan Bijnens
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013Nathan Bijnens
 
Design Principles for a Modern Data Warehouse
Design Principles for a Modern Data WarehouseDesign Principles for a Modern Data Warehouse
Design Principles for a Modern Data Warehouse
Rob Winters
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache Spark
DataWorks Summit
 
fluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasualfluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasual
SATOSHI TAGOMORI
 
Hadoop and Kerberos
Hadoop and KerberosHadoop and Kerberos
Hadoop and Kerberos
Yuta Imai
 
Kostenlose Social Media Monitoring Tools
Kostenlose Social Media Monitoring ToolsKostenlose Social Media Monitoring Tools
Kostenlose Social Media Monitoring Tools
Kommunikation-zweinull
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
James Serra
 

Viewers also liked (19)

Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
 
Invitation for v1.0.0
Invitation for v1.0.0Invitation for v1.0.0
Invitation for v1.0.0
 
Norikra: Stream Processing with SQL
Norikra: Stream Processing with SQLNorikra: Stream Processing with SQL
Norikra: Stream Processing with SQL
 
Norikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In RubyNorikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In Ruby
 
BigQuery, Fluentd and tagomoris #gcpja
BigQuery, Fluentd and tagomoris #gcpjaBigQuery, Fluentd and tagomoris #gcpja
BigQuery, Fluentd and tagomoris #gcpja
 
運用とデータ分析の遠くて近い関係、ISUCONを添えて
運用とデータ分析の遠くて近い関係、ISUCONを添えて運用とデータ分析の遠くて近い関係、ISUCONを添えて
運用とデータ分析の遠くて近い関係、ISUCONを添えて
 
Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in Action
 
Effectively using Open Source with conda
Effectively using Open Source with condaEffectively using Open Source with conda
Effectively using Open Source with conda
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
Fluentd and WebHDFS
Fluentd and WebHDFSFluentd and WebHDFS
Fluentd and WebHDFS
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013
 
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne...
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013
 
Design Principles for a Modern Data Warehouse
Design Principles for a Modern Data WarehouseDesign Principles for a Modern Data Warehouse
Design Principles for a Modern Data Warehouse
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache Spark
 
fluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasualfluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasual
 
Hadoop and Kerberos
Hadoop and KerberosHadoop and Kerberos
Hadoop and Kerberos
 
Kostenlose Social Media Monitoring Tools
Kostenlose Social Media Monitoring ToolsKostenlose Social Media Monitoring Tools
Kostenlose Social Media Monitoring Tools
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 

Similar to Lambda Architecture Using SQL

Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matter
Paolo Castagna
 
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesIntroducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Amazon Web Services
 
Mashups
MashupsMashups
Mashups
Johan Eltes
 
Data Microservices In The Cloud + 日本語コメント
Data Microservices In The Cloud + 日本語コメントData Microservices In The Cloud + 日本語コメント
Data Microservices In The Cloud + 日本語コメント
Takuya Saeki
 
Spark streaming state of the union
Spark streaming state of the unionSpark streaming state of the union
Spark streaming state of the union
Databricks
 
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
Getting value from IoT, Integration and Data Analytics
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture Patterns
Asis Mohanty
 
Spark ML Pipeline serving
Spark ML Pipeline servingSpark ML Pipeline serving
Spark ML Pipeline serving
Stepan Pushkarev
 
StrongLoop Overview
StrongLoop OverviewStrongLoop Overview
StrongLoop Overview
Shubhra Kar
 
(CMP403) AWS Lambda: Simplifying Big Data Workloads
(CMP403) AWS Lambda: Simplifying Big Data Workloads(CMP403) AWS Lambda: Simplifying Big Data Workloads
(CMP403) AWS Lambda: Simplifying Big Data Workloads
Amazon Web Services
 
Hopsworks Feature Store 2.0 a new paradigm
Hopsworks Feature Store  2.0   a new paradigmHopsworks Feature Store  2.0   a new paradigm
Hopsworks Feature Store 2.0 a new paradigm
Jim Dowling
 
Webinar: Serverless Architectures with AWS Lambda and MongoDB Atlas
Webinar: Serverless Architectures with AWS Lambda and MongoDB AtlasWebinar: Serverless Architectures with AWS Lambda and MongoDB Atlas
Webinar: Serverless Architectures with AWS Lambda and MongoDB Atlas
MongoDB
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's Next
Prateek Maheshwari
 
Serverless Architectural Patterns & Best Practices
Serverless Architectural Patterns & Best PracticesServerless Architectural Patterns & Best Practices
Serverless Architectural Patterns & Best Practices
Daniel Zivkovic
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)
Databricks
 
Unified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache SparkUnified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache Spark
C4Media
 
Serverless Architecture - A Gentle Overview
Serverless Architecture - A Gentle OverviewServerless Architecture - A Gentle Overview
Serverless Architecture - A Gentle Overview
CodeOps Technologies LLP
 
Choisir entre une API RPC, SOAP, REST, GraphQL? 
Et si le problème était ai...
Choisir entre une API  RPC, SOAP, REST, GraphQL?  
Et si le problème était ai...Choisir entre une API  RPC, SOAP, REST, GraphQL?  
Et si le problème était ai...
Choisir entre une API RPC, SOAP, REST, GraphQL? 
Et si le problème était ai...
François-Guillaume Ribreau
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
Guido Schmutz
 
Strata NYC 2015: What's new in Spark Streaming
Strata NYC 2015: What's new in Spark StreamingStrata NYC 2015: What's new in Spark Streaming
Strata NYC 2015: What's new in Spark Streaming
Databricks
 

Similar to Lambda Architecture Using SQL (20)

Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matter
 
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesIntroducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
 
Mashups
MashupsMashups
Mashups
 
Data Microservices In The Cloud + 日本語コメント
Data Microservices In The Cloud + 日本語コメントData Microservices In The Cloud + 日本語コメント
Data Microservices In The Cloud + 日本語コメント
 
Spark streaming state of the union
Spark streaming state of the unionSpark streaming state of the union
Spark streaming state of the union
 
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
Databasecentricapisonthecloudusingplsqlandnodejscon3153oow2016 160922021655
 
Cloud Lambda Architecture Patterns
Cloud Lambda Architecture PatternsCloud Lambda Architecture Patterns
Cloud Lambda Architecture Patterns
 
Spark ML Pipeline serving
Spark ML Pipeline servingSpark ML Pipeline serving
Spark ML Pipeline serving
 
StrongLoop Overview
StrongLoop OverviewStrongLoop Overview
StrongLoop Overview
 
(CMP403) AWS Lambda: Simplifying Big Data Workloads
(CMP403) AWS Lambda: Simplifying Big Data Workloads(CMP403) AWS Lambda: Simplifying Big Data Workloads
(CMP403) AWS Lambda: Simplifying Big Data Workloads
 
Hopsworks Feature Store 2.0 a new paradigm
Hopsworks Feature Store  2.0   a new paradigmHopsworks Feature Store  2.0   a new paradigm
Hopsworks Feature Store 2.0 a new paradigm
 
Webinar: Serverless Architectures with AWS Lambda and MongoDB Atlas
Webinar: Serverless Architectures with AWS Lambda and MongoDB AtlasWebinar: Serverless Architectures with AWS Lambda and MongoDB Atlas
Webinar: Serverless Architectures with AWS Lambda and MongoDB Atlas
 
Apache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's NextApache Samza 1.0 - What's New, What's Next
Apache Samza 1.0 - What's New, What's Next
 
Serverless Architectural Patterns & Best Practices
Serverless Architectural Patterns & Best PracticesServerless Architectural Patterns & Best Practices
Serverless Architectural Patterns & Best Practices
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)
 
Unified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache SparkUnified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache Spark
 
Serverless Architecture - A Gentle Overview
Serverless Architecture - A Gentle OverviewServerless Architecture - A Gentle Overview
Serverless Architecture - A Gentle Overview
 
Choisir entre une API RPC, SOAP, REST, GraphQL? 
Et si le problème était ai...
Choisir entre une API  RPC, SOAP, REST, GraphQL?  
Et si le problème était ai...Choisir entre une API  RPC, SOAP, REST, GraphQL?  
Et si le problème était ai...
Choisir entre une API RPC, SOAP, REST, GraphQL? 
Et si le problème était ai...
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
 
Strata NYC 2015: What's new in Spark Streaming
Strata NYC 2015: What's new in Spark StreamingStrata NYC 2015: What's new in Spark Streaming
Strata NYC 2015: What's new in Spark Streaming
 

More from SATOSHI TAGOMORI

Ractor's speed is not light-speed
Ractor's speed is not light-speedRactor's speed is not light-speed
Ractor's speed is not light-speed
SATOSHI TAGOMORI
 
Good Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsGood Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/Operations
SATOSHI TAGOMORI
 
Maccro Strikes Back
Maccro Strikes BackMaccro Strikes Back
Maccro Strikes Back
SATOSHI TAGOMORI
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of Ruby
SATOSHI TAGOMORI
 
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
SATOSHI TAGOMORI
 
Make Your Ruby Script Confusing
Make Your Ruby Script ConfusingMake Your Ruby Script Confusing
Make Your Ruby Script Confusing
SATOSHI TAGOMORI
 
Hijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubyHijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in Ruby
SATOSHI TAGOMORI
 
Lock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsLock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive Operations
SATOSHI TAGOMORI
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the World
SATOSHI TAGOMORI
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: Bigdam
SATOSHI TAGOMORI
 
Technologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise BusinessTechnologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise Business
SATOSHI TAGOMORI
 
Ruby and Distributed Storage Systems
Ruby and Distributed Storage SystemsRuby and Distributed Storage Systems
Ruby and Distributed Storage Systems
SATOSHI TAGOMORI
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd Season
SATOSHI TAGOMORI
 
Fluentd 101
Fluentd 101Fluentd 101
Fluentd 101
SATOSHI TAGOMORI
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
SATOSHI TAGOMORI
 
How To Write Middleware In Ruby
How To Write Middleware In RubyHow To Write Middleware In Ruby
How To Write Middleware In Ruby
SATOSHI TAGOMORI
 
Modern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldModern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real World
SATOSHI TAGOMORI
 
Open Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud ServiceOpen Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud Service
SATOSHI TAGOMORI
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and Then
SATOSHI TAGOMORI
 
How to Make Norikra Perfect
How to Make Norikra PerfectHow to Make Norikra Perfect
How to Make Norikra Perfect
SATOSHI TAGOMORI
 

More from SATOSHI TAGOMORI (20)

Ractor's speed is not light-speed
Ractor's speed is not light-speedRactor's speed is not light-speed
Ractor's speed is not light-speed
 
Good Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsGood Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/Operations
 
Maccro Strikes Back
Maccro Strikes BackMaccro Strikes Back
Maccro Strikes Back
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of Ruby
 
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
 
Make Your Ruby Script Confusing
Make Your Ruby Script ConfusingMake Your Ruby Script Confusing
Make Your Ruby Script Confusing
 
Hijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubyHijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in Ruby
 
Lock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsLock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive Operations
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the World
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: Bigdam
 
Technologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise BusinessTechnologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise Business
 
Ruby and Distributed Storage Systems
Ruby and Distributed Storage SystemsRuby and Distributed Storage Systems
Ruby and Distributed Storage Systems
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd Season
 
Fluentd 101
Fluentd 101Fluentd 101
Fluentd 101
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
How To Write Middleware In Ruby
How To Write Middleware In RubyHow To Write Middleware In Ruby
How To Write Middleware In Ruby
 
Modern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldModern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real World
 
Open Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud ServiceOpen Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud Service
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and Then
 
How to Make Norikra Perfect
How to Make Norikra PerfectHow to Make Norikra Perfect
How to Make Norikra Perfect
 

Recently uploaded

Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
CatarinaPereira64715
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 

Recently uploaded (20)

Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 

Lambda Architecture Using SQL

  • 1. Lambda Architecture Platform Using SQL Sep 13 2014 HadoopCon 2014 Taiwan TAGOMORI Satoshi (@tagomoris)
  • 3. Topics About Me & LINE Data analytics workloads Batch processing Stream processing Lambda architecture Lambda architecture using SQL Norikra: Stream processing with SQL 13:30-14:20 4F
  • 4. @tagomoris Satoshi Tagomori (田籠 聡) LINE Corporation Analytics Platform Team
  • 6.
  • 7. LINE Offices Tokyo HQ Spain Thailand Taipei USA Korea
  • 8. LINE is born! JUNE 23, 2011
  • 9.
  • 11. Various Data Analytics Workload Reports Monthly/Daily reports Hourly (or shorter) news Real-time metrics Automatically updated reports/graphs Alerts for abuse of services, overload, ...
  • 12.
  • 13. Batch Processing Hadoop MapReduce (or Spark, Tez) & DSLs (Hive, Pig, ...) For reports MPP Engines Cloudera Impala, Apache Drill, Facebook Presto, ... For interactive analysis For reports of shorter window
  • 14. Stream Processing Apache Storm Incubator project “Distributed and fault-tolerant realtime computation” Norikra by tagomoris Non-distributed “Stream processing with SQL”
  • 15. Why Stream Processing? Less latency Realtime metrics Short-term prompt reports Less computing power 10Mbps for batch processing: 100GB/day 10Mbps for stream processing: 1 Server No query schedule management Once query registered, it runs forever
  • 16. Disadvantage of Stream Processing Queries must be written before data There should be another way to query past data Queries cannot be run twice All results will be lost when any error occurs All data have gone when bugs found Disorders of events break results Recorded time based queries? Or arrival time based queries?
  • 17. Part 02 Lambda Architecture
  • 18. Lambda Architecture “The Lambda-architecture aims to satisfy the needs for a robust system that is fault-tolerant, both against hardware failures and human mistakes, being able to serve a wide range of workloads and use cases, and in which low-latency reads and updates are required. The resulting system should be linearly scalable, and it should scale out rather than up.” http://lambda-architecture.net/
  • 19. Lambda Architecture: Overview new data batch layer master dataset serving layer view speed layer real-time view query
  • 20. Twitter Summingbird Lambda architecture library Batch mode: Scalding on Hadoop MapReduce Realtime mode: Storm Word counting by Summingbird (scala): def wordCount[P <: Platform[P]] (source: Producer[P, String], store: P#Store[String, Long]) = source.flatMap { sentence => toWords(sentence).map(_ -> 1L) }.sumByKey(store) https://github.com/twitter/summingbird https://blog.twitter.com/2013/streaming-mapreduce-with-summingbird
  • 21. What Lambda Architecture Provides Replayable queries Redo queries anytime if results of speed layer are broken Accurate results on demand Prompt reports in speed layer with arrival time Fixed reports in batch layer with recorded time ... And many more benefits of stream processing
  • 22. Why All of Us Don’t Use It? Storm doesn’t fit well with many uses Storm requires computer resources too big to deploy Summingbird requires many steps to deploy Many directors/analysts don’t write Scala/Java Summingbird DSL is not enough easy for non-professional people
  • 24. Existing Hadoop Platform new data HDFS hive query Fluentd presto query
  • 25. Norikra Schema-less stream processing with SQL “Norikra is a open source server software provides "Stream Processing" with SQL, written in JRuby, runs on JVM, licensed under GPLv2.” SELECT path, COUNT(1, status=200) AS success_count, COUNT(1, status=500) AS server_error_count, COUNT(*) AS count FROM AccessLog.win:time_batch(10 min, 0L) WHERE service='myservice' AND path LIKE '/api/%' GROUP BY path http://norikra.github.io/
  • 26. Added-on Lambda Architecture Platform new data presto query HDFS hive query norikra query
  • 27. “Pseudo Lambda” Architecture Using SQL Lambda architecture platform with almost same queries SELECT path, COUNT(IF(status=200,1,NULL)) AS success_count, COUNT(IF(status=500,1,NULL)) AS server_error_count, COUNT(*) AS count FROM AccessLog WHERE service='myservice' AND path LIKE '/api/%' AND timestamp >= ‘2014-09-13 10:40:00’ AND timestamp < ‘2014-09-13 10:50:00’ GROUP BY path SELECT path, COUNT(1, status=200) AS success_count, COUNT(1, status=500) AS server_error_count, COUNT(*) AS count FROM AccessLog.win:time_batch(10 min, 0L) WHERE service='myservice' AND path LIKE '/api/%' GROUP BY path
  • 28. “Pseudo Lambda” Architecture Using SQL SQL dialects are easy to learn! Standard SQL, Hive, Presto, Impala, Drill, ... + Norikra For non-professional people too! SQL queries are very easy to write twice!
  • 29. Use Cases in LINE Prompt reports for Ads service Short-term prompt reports by Norikra Daily fixed reports by Hive Summary of application server error log Aggregate error log for alerting by Norikra Check details with Hive, Presto (or grep!) See you later for details!
  • 30. TMTOWTDI “There’s more than one way to do it.” - Perl programming language
  • 31. SHARE What I want & What I’m doing! - tagomoris
  • 32. Q & A