SlideShare a Scribd company logo

Norikra: Stream Processing with SQL

The document discusses stream processing using SQL with Norikra. It begins with an overview of stream processing basics and using SQL for queries on streaming data. It then provides details on Norikra, an open source software that allows for schema-less stream processing with SQL. Key features of Norikra include running on JRuby, supporting complex event types, and allowing SQL queries on streaming data without needing to restart for schema changes. Examples of Norikra queries are also shown.

1 of 61
Download to read offline
Norikra: 
Stream Processing With SQL 
2014/09/13 
HadoopCon 2014 Taiwan 
Satoshi Tagomori (@tagomoris)
Satoshi Tagomori (@tagomoris) 
LINE Corporation 
Analytics Platform Team
THE ONE THING 
WHAT YOU MUST LEAN 
TODAY IS
Norikra
Norikra 
IS NOT 
Norika
Topics 
Basics of stream processing 
Stream processing with SQL 
Norikra overview 
Norikra queries 
Use cases in production

Recommended

Norikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In RubyNorikra: SQL Stream Processing In Ruby
Norikra: SQL Stream Processing In RubySATOSHI TAGOMORI
 
Real-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormReal-Time Streaming with Apache Spark Streaming and Apache Storm
Real-Time Streaming with Apache Spark Streaming and Apache StormDavorin Vukelic
 
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰
HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰HadoopCon 2016  - 用 Jupyter Notebook Hold 住一個上線 Spark  Machine Learning 專案實戰
HadoopCon 2016 - 用 Jupyter Notebook Hold 住一個上線 Spark Machine Learning 專案實戰Wayne Chen
 
Kafka 102: Streams and Tables All the Way Down | Kafka Summit San Francisco 2019
Kafka 102: Streams and Tables All the Way Down | Kafka Summit San Francisco 2019Kafka 102: Streams and Tables All the Way Down | Kafka Summit San Francisco 2019
Kafka 102: Streams and Tables All the Way Down | Kafka Summit San Francisco 2019Michael Noll
 
Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Flink 0.10 @ Bay Area Meetup (October 2015)
Flink 0.10 @ Bay Area Meetup (October 2015)Stephan Ewen
 
Realtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQRealtime Statistics based on Apache Storm and RocketMQ
Realtime Statistics based on Apache Storm and RocketMQXin Wang
 
Spark Streaming with Cassandra
Spark Streaming with CassandraSpark Streaming with Cassandra
Spark Streaming with CassandraJacek Lewandowski
 
Leveraging the Power of Solr with Spark
Leveraging the Power of Solr with SparkLeveraging the Power of Solr with Spark
Leveraging the Power of Solr with SparkQAware GmbH
 

More Related Content

What's hot

Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014Konrad Malawski
 
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...DataWorks Summit
 
PSUG #52 Dataflow and simplified reactive programming with Akka-streams
PSUG #52 Dataflow and simplified reactive programming with Akka-streamsPSUG #52 Dataflow and simplified reactive programming with Akka-streams
PSUG #52 Dataflow and simplified reactive programming with Akka-streamsStephane Manciot
 
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...Lucidworks
 
Introduction to Twitter Storm
Introduction to Twitter StormIntroduction to Twitter Storm
Introduction to Twitter StormUwe Printz
 
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...InfluxData
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationslucenerevolution
 
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraEscape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraPiotr Kolaczkowski
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureP. Taylor Goetz
 
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...Spark Summit
 
Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.Dan Lynn
 
Druid meetup 4th_sql_on_druid
Druid meetup 4th_sql_on_druidDruid meetup 4th_sql_on_druid
Druid meetup 4th_sql_on_druidYousun Jeong
 
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013Sonal Raj
 
Analytics with Cassandra & Spark
Analytics with Cassandra & SparkAnalytics with Cassandra & Spark
Analytics with Cassandra & SparkMatthias Niehoff
 
Monitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDBMonitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDBGeoffrey Anderson
 
Building a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solrBuilding a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solrlucenerevolution
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustAltinity Ltd
 
Real Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark StreamingReal Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark StreamingHari Shreedharan
 

What's hot (20)

Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014Scalding - the not-so-basics @ ScalaDays 2014
Scalding - the not-so-basics @ ScalaDays 2014
 
Spark streaming: Best Practices
Spark streaming: Best PracticesSpark streaming: Best Practices
Spark streaming: Best Practices
 
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
Modus operandi of Spark Streaming - Recipes for Running your Streaming Applic...
 
Scala+data
Scala+dataScala+data
Scala+data
 
PSUG #52 Dataflow and simplified reactive programming with Akka-streams
PSUG #52 Dataflow and simplified reactive programming with Akka-streamsPSUG #52 Dataflow and simplified reactive programming with Akka-streams
PSUG #52 Dataflow and simplified reactive programming with Akka-streams
 
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
Time Series Processing with Solr and Spark: Presented by Josef Adersberger, Q...
 
Introduction to Twitter Storm
Introduction to Twitter StormIntroduction to Twitter Storm
Introduction to Twitter Storm
 
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
RESTful API – How to Consume, Extract, Store and Visualize Data with InfluxDB...
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraEscape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm Architecture
 
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
 
Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.Storm - As deep into real-time data processing as you can get in 30 minutes.
Storm - As deep into real-time data processing as you can get in 30 minutes.
 
Druid meetup 4th_sql_on_druid
Druid meetup 4th_sql_on_druidDruid meetup 4th_sql_on_druid
Druid meetup 4th_sql_on_druid
 
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
Real Time Graph Computations in Storm, Neo4J, Python - PyCon India 2013
 
Analytics with Cassandra & Spark
Analytics with Cassandra & SparkAnalytics with Cassandra & Spark
Analytics with Cassandra & Spark
 
Monitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDBMonitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDB
 
Building a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solrBuilding a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solr
 
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, AdjustShipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
Shipping Data from Postgres to Clickhouse, by Murat Kabilov, Adjust
 
Real Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark StreamingReal Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark Streaming
 

Viewers also liked

Norikra in Action (ver. 2014 spring)
Norikra in Action (ver. 2014 spring)Norikra in Action (ver. 2014 spring)
Norikra in Action (ver. 2014 spring)SATOSHI TAGOMORI
 
fluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasualfluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasualSATOSHI TAGOMORI
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQLSATOSHI TAGOMORI
 
BigQuery, Fluentd and tagomoris #gcpja
BigQuery, Fluentd and tagomoris #gcpjaBigQuery, Fluentd and tagomoris #gcpja
BigQuery, Fluentd and tagomoris #gcpjaSATOSHI TAGOMORI
 
運用とデータ分析の遠くて近い関係、ISUCONを添えて
運用とデータ分析の遠くて近い関係、ISUCONを添えて運用とデータ分析の遠くて近い関係、ISUCONを添えて
運用とデータ分析の遠くて近い関係、ISUCONを添えてSATOSHI TAGOMORI
 
Norikra + Fluentd + Elasticsearch + Kibana リアルタイムストリーミング処理 ログ集計による異常検知
Norikra + Fluentd+ Elasticsearch + Kibana リアルタイムストリーミング処理ログ集計による異常検知Norikra + Fluentd+ Elasticsearch + Kibana リアルタイムストリーミング処理ログ集計による異常検知
Norikra + Fluentd + Elasticsearch + Kibana リアルタイムストリーミング処理 ログ集計による異常検知daisuke-a-matsui
 
Reporting Large Environment Zabbix Database
Reporting Large Environment Zabbix DatabaseReporting Large Environment Zabbix Database
Reporting Large Environment Zabbix DatabaseAlain Ganuchaud
 
How to Make Norikra Perfect
How to Make Norikra PerfectHow to Make Norikra Perfect
How to Make Norikra PerfectSATOSHI TAGOMORI
 
Norikraで作るPHPの例外検知システム YAPC::Asia Tokyo 2015 LT
Norikraで作るPHPの例外検知システム YAPC::Asia Tokyo 2015 LTNorikraで作るPHPの例外検知システム YAPC::Asia Tokyo 2015 LT
Norikraで作るPHPの例外検知システム YAPC::Asia Tokyo 2015 LTMasahiro Nagano
 
Hadoop and Kerberos
Hadoop and KerberosHadoop and Kerberos
Hadoop and KerberosYuta Imai
 

Viewers also liked (14)

Norikra in Action (ver. 2014 spring)
Norikra in Action (ver. 2014 spring)Norikra in Action (ver. 2014 spring)
Norikra in Action (ver. 2014 spring)
 
fluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasualfluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasual
 
Invitation for v1.0.0
Invitation for v1.0.0Invitation for v1.0.0
Invitation for v1.0.0
 
Handling not so big data
Handling not so big dataHandling not so big data
Handling not so big data
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQL
 
BigQuery, Fluentd and tagomoris #gcpja
BigQuery, Fluentd and tagomoris #gcpjaBigQuery, Fluentd and tagomoris #gcpja
BigQuery, Fluentd and tagomoris #gcpja
 
Norikra in action
Norikra in actionNorikra in action
Norikra in action
 
運用とデータ分析の遠くて近い関係、ISUCONを添えて
運用とデータ分析の遠くて近い関係、ISUCONを添えて運用とデータ分析の遠くて近い関係、ISUCONを添えて
運用とデータ分析の遠くて近い関係、ISUCONを添えて
 
Norikra + Fluentd + Elasticsearch + Kibana リアルタイムストリーミング処理 ログ集計による異常検知
Norikra + Fluentd+ Elasticsearch + Kibana リアルタイムストリーミング処理ログ集計による異常検知Norikra + Fluentd+ Elasticsearch + Kibana リアルタイムストリーミング処理ログ集計による異常検知
Norikra + Fluentd + Elasticsearch + Kibana リアルタイムストリーミング処理 ログ集計による異常検知
 
Reporting Large Environment Zabbix Database
Reporting Large Environment Zabbix DatabaseReporting Large Environment Zabbix Database
Reporting Large Environment Zabbix Database
 
Fluentd and WebHDFS
Fluentd and WebHDFSFluentd and WebHDFS
Fluentd and WebHDFS
 
How to Make Norikra Perfect
How to Make Norikra PerfectHow to Make Norikra Perfect
How to Make Norikra Perfect
 
Norikraで作るPHPの例外検知システム YAPC::Asia Tokyo 2015 LT
Norikraで作るPHPの例外検知システム YAPC::Asia Tokyo 2015 LTNorikraで作るPHPの例外検知システム YAPC::Asia Tokyo 2015 LT
Norikraで作るPHPの例外検知システム YAPC::Asia Tokyo 2015 LT
 
Hadoop and Kerberos
Hadoop and KerberosHadoop and Kerberos
Hadoop and Kerberos
 

Similar to Norikra: Stream Processing with SQL

Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemWprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemSages
 
Wprowadzenie do technologi Big Data i Apache Hadoop
Wprowadzenie do technologi Big Data i Apache HadoopWprowadzenie do technologi Big Data i Apache Hadoop
Wprowadzenie do technologi Big Data i Apache HadoopSages
 
Making sense of your data
Making sense of your dataMaking sense of your data
Making sense of your dataGerald Muecke
 
Kick Your Database to the Curb
Kick Your Database to the CurbKick Your Database to the Curb
Kick Your Database to the CurbBill Bejeck
 
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | PrometheusCreating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | PrometheusInfluxData
 
Streaming Operational Data with MariaDB MaxScale
Streaming Operational Data with MariaDB MaxScaleStreaming Operational Data with MariaDB MaxScale
Streaming Operational Data with MariaDB MaxScaleMariaDB plc
 
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"DataStax Academy
 
Intravert Server side processing for Cassandra
Intravert Server side processing for CassandraIntravert Server side processing for Cassandra
Intravert Server side processing for CassandraEdward Capriolo
 
Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19confluent
 
Scalding Big (Ad)ta
Scalding Big (Ad)taScalding Big (Ad)ta
Scalding Big (Ad)tab0ris_1
 
Visual sedimentation - IEEE VIS 2013 Atlanta
Visual sedimentation - IEEE VIS 2013 AtlantaVisual sedimentation - IEEE VIS 2013 Atlanta
Visual sedimentation - IEEE VIS 2013 AtlantaSamuel Huron
 
(NET301) New Capabilities for Amazon Virtual Private Cloud
(NET301) New Capabilities for Amazon Virtual Private Cloud(NET301) New Capabilities for Amazon Virtual Private Cloud
(NET301) New Capabilities for Amazon Virtual Private CloudAmazon Web Services
 
Moving beyond moving bytes
Moving beyond moving bytesMoving beyond moving bytes
Moving beyond moving bytesSuneel Marthi
 
Flink Forward Berlin 2017: Joey Frazee, Suneel Marthi - Moving Beyond Moving ...
Flink Forward Berlin 2017: Joey Frazee, Suneel Marthi - Moving Beyond Moving ...Flink Forward Berlin 2017: Joey Frazee, Suneel Marthi - Moving Beyond Moving ...
Flink Forward Berlin 2017: Joey Frazee, Suneel Marthi - Moving Beyond Moving ...Flink Forward
 
Programming with Python and PostgreSQL
Programming with Python and PostgreSQLProgramming with Python and PostgreSQL
Programming with Python and PostgreSQLPeter Eisentraut
 
Foundations of streaming SQL: stream & table theory
Foundations of streaming SQL: stream & table theoryFoundations of streaming SQL: stream & table theory
Foundations of streaming SQL: stream & table theoryDataWorks Summit
 
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...InfluxData
 
Fotolog: Scaling the World's Largest Photo Blogging Community
Fotolog: Scaling the World's Largest Photo Blogging CommunityFotolog: Scaling the World's Largest Photo Blogging Community
Fotolog: Scaling the World's Largest Photo Blogging Communityfarhan "Frank"​ mashraqi
 

Similar to Norikra: Stream Processing with SQL (20)

Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data EcosystemWprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
Wprowadzenie do technologii Big Data / Intro to Big Data Ecosystem
 
Wprowadzenie do technologi Big Data i Apache Hadoop
Wprowadzenie do technologi Big Data i Apache HadoopWprowadzenie do technologi Big Data i Apache Hadoop
Wprowadzenie do technologi Big Data i Apache Hadoop
 
Making sense of your data
Making sense of your dataMaking sense of your data
Making sense of your data
 
Kick Your Database to the Curb
Kick Your Database to the CurbKick Your Database to the Curb
Kick Your Database to the Curb
 
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | PrometheusCreating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
Creating the PromQL Transpiler for Flux by Julius Volz, Co-Founder | Prometheus
 
Streaming Operational Data with MariaDB MaxScale
Streaming Operational Data with MariaDB MaxScaleStreaming Operational Data with MariaDB MaxScale
Streaming Operational Data with MariaDB MaxScale
 
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
NYC* 2013 - "Advanced Data Processing: Beyond Queries and Slices"
 
Intravert Server side processing for Cassandra
Intravert Server side processing for CassandraIntravert Server side processing for Cassandra
Intravert Server side processing for Cassandra
 
Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19Kick your database_to_the_curb_reston_08_27_19
Kick your database_to_the_curb_reston_08_27_19
 
Scalding Big (Ad)ta
Scalding Big (Ad)taScalding Big (Ad)ta
Scalding Big (Ad)ta
 
Visual sedimentation - IEEE VIS 2013 Atlanta
Visual sedimentation - IEEE VIS 2013 AtlantaVisual sedimentation - IEEE VIS 2013 Atlanta
Visual sedimentation - IEEE VIS 2013 Atlanta
 
(NET301) New Capabilities for Amazon Virtual Private Cloud
(NET301) New Capabilities for Amazon Virtual Private Cloud(NET301) New Capabilities for Amazon Virtual Private Cloud
(NET301) New Capabilities for Amazon Virtual Private Cloud
 
Apache HAWQ Architecture
Apache HAWQ ArchitectureApache HAWQ Architecture
Apache HAWQ Architecture
 
Moving beyond moving bytes
Moving beyond moving bytesMoving beyond moving bytes
Moving beyond moving bytes
 
Flink Forward Berlin 2017: Joey Frazee, Suneel Marthi - Moving Beyond Moving ...
Flink Forward Berlin 2017: Joey Frazee, Suneel Marthi - Moving Beyond Moving ...Flink Forward Berlin 2017: Joey Frazee, Suneel Marthi - Moving Beyond Moving ...
Flink Forward Berlin 2017: Joey Frazee, Suneel Marthi - Moving Beyond Moving ...
 
Xbfs HPDC'2019
Xbfs HPDC'2019Xbfs HPDC'2019
Xbfs HPDC'2019
 
Programming with Python and PostgreSQL
Programming with Python and PostgreSQLProgramming with Python and PostgreSQL
Programming with Python and PostgreSQL
 
Foundations of streaming SQL: stream & table theory
Foundations of streaming SQL: stream & table theoryFoundations of streaming SQL: stream & table theory
Foundations of streaming SQL: stream & table theory
 
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
Meet the Experts: Visualize Your Time-Stamped Data Using the React-Based Gira...
 
Fotolog: Scaling the World's Largest Photo Blogging Community
Fotolog: Scaling the World's Largest Photo Blogging CommunityFotolog: Scaling the World's Largest Photo Blogging Community
Fotolog: Scaling the World's Largest Photo Blogging Community
 

More from SATOSHI TAGOMORI

Ractor's speed is not light-speed
Ractor's speed is not light-speedRactor's speed is not light-speed
Ractor's speed is not light-speedSATOSHI TAGOMORI
 
Good Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsGood Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsSATOSHI TAGOMORI
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of RubySATOSHI TAGOMORI
 
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)SATOSHI TAGOMORI
 
Make Your Ruby Script Confusing
Make Your Ruby Script ConfusingMake Your Ruby Script Confusing
Make Your Ruby Script ConfusingSATOSHI TAGOMORI
 
Hijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubyHijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubySATOSHI TAGOMORI
 
Lock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsLock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsSATOSHI TAGOMORI
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the WorldSATOSHI TAGOMORI
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamSATOSHI TAGOMORI
 
Technologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise BusinessTechnologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise BusinessSATOSHI TAGOMORI
 
Ruby and Distributed Storage Systems
Ruby and Distributed Storage SystemsRuby and Distributed Storage Systems
Ruby and Distributed Storage SystemsSATOSHI TAGOMORI
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd SeasonSATOSHI TAGOMORI
 
To Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT ToTo Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT ToSATOSHI TAGOMORI
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersSATOSHI TAGOMORI
 
How To Write Middleware In Ruby
How To Write Middleware In RubyHow To Write Middleware In Ruby
How To Write Middleware In RubySATOSHI TAGOMORI
 
Modern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldModern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldSATOSHI TAGOMORI
 
Open Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud ServiceOpen Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud ServiceSATOSHI TAGOMORI
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and ThenSATOSHI TAGOMORI
 

More from SATOSHI TAGOMORI (20)

Ractor's speed is not light-speed
Ractor's speed is not light-speedRactor's speed is not light-speed
Ractor's speed is not light-speed
 
Good Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsGood Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/Operations
 
Maccro Strikes Back
Maccro Strikes BackMaccro Strikes Back
Maccro Strikes Back
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of Ruby
 
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
 
Make Your Ruby Script Confusing
Make Your Ruby Script ConfusingMake Your Ruby Script Confusing
Make Your Ruby Script Confusing
 
Hijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubyHijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in Ruby
 
Lock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsLock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive Operations
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the World
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: Bigdam
 
Technologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise BusinessTechnologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise Business
 
Ruby and Distributed Storage Systems
Ruby and Distributed Storage SystemsRuby and Distributed Storage Systems
Ruby and Distributed Storage Systems
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd Season
 
Fluentd 101
Fluentd 101Fluentd 101
Fluentd 101
 
To Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT ToTo Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT To
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
How To Write Middleware In Ruby
How To Write Middleware In RubyHow To Write Middleware In Ruby
How To Write Middleware In Ruby
 
Modern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldModern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real World
 
Open Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud ServiceOpen Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud Service
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and Then
 

Recently uploaded

Building Bridges: Merging RPA Processes, UiPath Apps, and Data Service to bu...
Building Bridges:  Merging RPA Processes, UiPath Apps, and Data Service to bu...Building Bridges:  Merging RPA Processes, UiPath Apps, and Data Service to bu...
Building Bridges: Merging RPA Processes, UiPath Apps, and Data Service to bu...DianaGray10
 
Mind your App Footprint 🐾⚡️🌱 (@FlutterHeroes 2024)
Mind your App Footprint 🐾⚡️🌱 (@FlutterHeroes 2024)Mind your App Footprint 🐾⚡️🌱 (@FlutterHeroes 2024)
Mind your App Footprint 🐾⚡️🌱 (@FlutterHeroes 2024)François
 
Achieving Excellence IESVE for HVAC Simulation.pdf
Achieving Excellence IESVE for HVAC Simulation.pdfAchieving Excellence IESVE for HVAC Simulation.pdf
Achieving Excellence IESVE for HVAC Simulation.pdfIES VE
 
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)Jay Zhao
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxInfosec
 
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31shyamraj55
 
Q4 2023 Quarterly Investor Presentation - FINAL.pdf
Q4 2023 Quarterly Investor Presentation - FINAL.pdfQ4 2023 Quarterly Investor Presentation - FINAL.pdf
Q4 2023 Quarterly Investor Presentation - FINAL.pdfTejal81
 
Artificial Intelligence, Design, and More-than-Human Justice
Artificial Intelligence, Design, and More-than-Human JusticeArtificial Intelligence, Design, and More-than-Human Justice
Artificial Intelligence, Design, and More-than-Human JusticeJosh Gellers
 
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...ShapeBlue
 
AMER Introduction to ThousandEyes Webinar
AMER Introduction to ThousandEyes WebinarAMER Introduction to ThousandEyes Webinar
AMER Introduction to ThousandEyes WebinarThousandEyes
 
Learning About GenAI Engineering with AWS PartyRock [AWS User Group Basel - F...
Learning About GenAI Engineering with AWS PartyRock [AWS User Group Basel - F...Learning About GenAI Engineering with AWS PartyRock [AWS User Group Basel - F...
Learning About GenAI Engineering with AWS PartyRock [AWS User Group Basel - F...Chris Bingham
 
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...ShapeBlue
 
PrismCRM-RealEstate-SalesCRM_byCode5Company
PrismCRM-RealEstate-SalesCRM_byCode5CompanyPrismCRM-RealEstate-SalesCRM_byCode5Company
PrismCRM-RealEstate-SalesCRM_byCode5CompanyMustafa Kuğu
 
AI for Educators - Integrating AI in the Classrooms
AI for Educators - Integrating AI in the ClassroomsAI for Educators - Integrating AI in the Classrooms
AI for Educators - Integrating AI in the ClassroomsPremsankar Chakkingal
 
Boosting Developer Effectiveness with a Java platform team 1.4 - ArnhemJUG
Boosting Developer Effectiveness with a Java platform team 1.4 - ArnhemJUGBoosting Developer Effectiveness with a Java platform team 1.4 - ArnhemJUG
Boosting Developer Effectiveness with a Java platform team 1.4 - ArnhemJUGRick Ossendrijver
 
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxGraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxNeo4j
 
ChatGPT's Code Interpreter: Your secret weapon for SEO automation success - S...
ChatGPT's Code Interpreter: Your secret weapon for SEO automation success - S...ChatGPT's Code Interpreter: Your secret weapon for SEO automation success - S...
ChatGPT's Code Interpreter: Your secret weapon for SEO automation success - S...SearchNorwich
 
KUBRICK Graphs: A journey from in vogue to success-ion
KUBRICK Graphs: A journey from in vogue to success-ionKUBRICK Graphs: A journey from in vogue to success-ion
KUBRICK Graphs: A journey from in vogue to success-ionNeo4j
 
Python For Kids - Sách Lập trình cho trẻ em
Python For Kids - Sách Lập trình cho trẻ emPython For Kids - Sách Lập trình cho trẻ em
Python For Kids - Sách Lập trình cho trẻ emNho Vĩnh
 

Recently uploaded (20)

Building Bridges: Merging RPA Processes, UiPath Apps, and Data Service to bu...
Building Bridges:  Merging RPA Processes, UiPath Apps, and Data Service to bu...Building Bridges:  Merging RPA Processes, UiPath Apps, and Data Service to bu...
Building Bridges: Merging RPA Processes, UiPath Apps, and Data Service to bu...
 
Mind your App Footprint 🐾⚡️🌱 (@FlutterHeroes 2024)
Mind your App Footprint 🐾⚡️🌱 (@FlutterHeroes 2024)Mind your App Footprint 🐾⚡️🌱 (@FlutterHeroes 2024)
Mind your App Footprint 🐾⚡️🌱 (@FlutterHeroes 2024)
 
Achieving Excellence IESVE for HVAC Simulation.pdf
Achieving Excellence IESVE for HVAC Simulation.pdfAchieving Excellence IESVE for HVAC Simulation.pdf
Achieving Excellence IESVE for HVAC Simulation.pdf
 
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
Leonis Insights: The State of AI (7 trends for 2023 and 7 predictions for 2024)
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptx
 
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
Unleash the Solace Pub Sub connector | Banaglore MuleSoft Meetup #31
 
Q4 2023 Quarterly Investor Presentation - FINAL.pdf
Q4 2023 Quarterly Investor Presentation - FINAL.pdfQ4 2023 Quarterly Investor Presentation - FINAL.pdf
Q4 2023 Quarterly Investor Presentation - FINAL.pdf
 
Artificial Intelligence, Design, and More-than-Human Justice
Artificial Intelligence, Design, and More-than-Human JusticeArtificial Intelligence, Design, and More-than-Human Justice
Artificial Intelligence, Design, and More-than-Human Justice
 
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
What’s New in CloudStack 4.19, Abhishek Kumar, Release Manager Apache CloudSt...
 
AMER Introduction to ThousandEyes Webinar
AMER Introduction to ThousandEyes WebinarAMER Introduction to ThousandEyes Webinar
AMER Introduction to ThousandEyes Webinar
 
Learning About GenAI Engineering with AWS PartyRock [AWS User Group Basel - F...
Learning About GenAI Engineering with AWS PartyRock [AWS User Group Basel - F...Learning About GenAI Engineering with AWS PartyRock [AWS User Group Basel - F...
Learning About GenAI Engineering with AWS PartyRock [AWS User Group Basel - F...
 
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
Elevating Cloud Infrastructure with Object Storage, DRS, VM Scheduling, and D...
 
PrismCRM-RealEstate-SalesCRM_byCode5Company
PrismCRM-RealEstate-SalesCRM_byCode5CompanyPrismCRM-RealEstate-SalesCRM_byCode5Company
PrismCRM-RealEstate-SalesCRM_byCode5Company
 
AI for Educators - Integrating AI in the Classrooms
AI for Educators - Integrating AI in the ClassroomsAI for Educators - Integrating AI in the Classrooms
AI for Educators - Integrating AI in the Classrooms
 
Boosting Developer Effectiveness with a Java platform team 1.4 - ArnhemJUG
Boosting Developer Effectiveness with a Java platform team 1.4 - ArnhemJUGBoosting Developer Effectiveness with a Java platform team 1.4 - ArnhemJUG
Boosting Developer Effectiveness with a Java platform team 1.4 - ArnhemJUG
 
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxGraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
 
In sharing we trust. Taking advantage of a diverse consortium to build a tran...
In sharing we trust. Taking advantage of a diverse consortium to build a tran...In sharing we trust. Taking advantage of a diverse consortium to build a tran...
In sharing we trust. Taking advantage of a diverse consortium to build a tran...
 
ChatGPT's Code Interpreter: Your secret weapon for SEO automation success - S...
ChatGPT's Code Interpreter: Your secret weapon for SEO automation success - S...ChatGPT's Code Interpreter: Your secret weapon for SEO automation success - S...
ChatGPT's Code Interpreter: Your secret weapon for SEO automation success - S...
 
KUBRICK Graphs: A journey from in vogue to success-ion
KUBRICK Graphs: A journey from in vogue to success-ionKUBRICK Graphs: A journey from in vogue to success-ion
KUBRICK Graphs: A journey from in vogue to success-ion
 
Python For Kids - Sách Lập trình cho trẻ em
Python For Kids - Sách Lập trình cho trẻ emPython For Kids - Sách Lập trình cho trẻ em
Python For Kids - Sách Lập trình cho trẻ em
 

Norikra: Stream Processing with SQL

  • 1. Norikra: Stream Processing With SQL 2014/09/13 HadoopCon 2014 Taiwan Satoshi Tagomori (@tagomoris)
  • 2. Satoshi Tagomori (@tagomoris) LINE Corporation Analytics Platform Team
  • 3. THE ONE THING WHAT YOU MUST LEAN TODAY IS
  • 5. Norikra IS NOT Norika
  • 6. Topics Basics of stream processing Stream processing with SQL Norikra overview Norikra queries Use cases in production
  • 7. Stream Processing Less latency Less computing power No query schedule management
  • 8. Data Flow And Latency data window query execution Batch Stream incremental query execution
  • 9. Query For Stored Data table v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 At first, all data MUST be stored.
  • 10. Query For Stored Data v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table WHERE v3=’x’ GROUP BY v1,v2 table
  • 11. Query For Stored Data v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table WHERE v3=’x’ GROUP BY v1,v2 table SELECT v4,COUNT(*) FROM table WHERE v1 AND v2 GROUP BY v4
  • 12. Query For Stored Data v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table WHERE v3=’x’ GROUP BY v1,v2 table SELECT v4,COUNT(*) FROM table WHERE v1 AND v2 GROUP BY v4 “All data” means “data that will not be used”.
  • 13. Query For Stream Data v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table.win:xxx WHERE v3=’x’ GROUP BY v1,v2 stream SELECT v4,COUNT(*) FROM table.win:xxx WHERE v1 AND v2 GROUP BY v4 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6
  • 14. Query For Stream Data v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table.win:xxx WHERE v3=’x’ GROUP BY v1,v2 stream SELECT v4,COUNT(*) FROM table.win:xxx WHERE v1 AND v2 GROUP BY v4 v1,v2,v3 v1,v2,v4 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6
  • 15. Query For Stream Data v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table.win:xxx WHERE v3=’x’ GROUP BY v1,v2 stream SELECT v4,COUNT(*) FROM table.win:xxx WHERE v1 AND v2 GROUP BY v4 v1,v2,v3 v1,v2,v3,v4,v5,v6 v1,v2,v4 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6
  • 16. Query For Stream Data v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table.win:xxx WHERE v3=’x’ GROUP BY v1,v2 stream SELECT v4,COUNT(*) FROM table.win:xxx WHERE v1 AND v2 GROUP BY v4 v1,v2,v3 v1,v2,v4 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 All data will be discarded right after insertion. (Bye-bye storage system maintenance!)
  • 17. Incremental Calculation v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table.win:xxx WHERE v3=’x’ GROUP BY v1,v2 stream v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 internal data (memory) v1 v2 COUNT TRUE TRUE 0 TRUE FALSE 1 FALSE TRUE 33 FALSE FALSE 2
  • 18. Incremental Calculation v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table.win:xxx WHERE v3=’x’ GROUP BY v1,v2 stream v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 internal data (memory) v1 v2 COUNT TRUE TRUE 1 TRUE FALSE 1 FALSE TRUE 33 FALSE FALSE 2
  • 19. Incremental Calculation v1,v2,v3,v4,v5,v6 SELECT v1,v2,COUNT(*) FROM table.win:xxx WHERE v3=’x’ GROUP BY v1,v2 stream v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 internal data (memory) v1 v2 COUNT TRUE TRUE 1 TRUE FALSE 1 FALSE TRUE 34 FALSE FALSE 2
  • 20. Incremental Calculation SELECT v1,v2,COUNT(*) FROM table.win:xxx WHERE v3=’x’ GROUP BY v1,v2 stream v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 v1,v2,v3,v4,v5,v6 internal data (memory) v1 v2 COUNT TRUE TRUE 1 TRUE FALSE 2 FALSE TRUE 37 FALSE FALSE 3 memory can store internal data
  • 21. Data Window Target time (or size) range of queries Batch FROM-TO: WHERE dt >= ‘2014-09-13 13:30:00‘ AND dt < ‘2014-09-13 14:20:00’ Stream “Calculate this query every 50 minutes” Extended SQL required SELECT v1,v2,COUNT(*) FROM table.win:xxx WHERE v3=’x’ GROUP BY v1,v2
  • 22. Stream Processing With SQL Esper: Java library to process stream needs to be implemented in Java daemon code With schema for data/query OSS under GPLv2 http://esper.codehaus.org/
  • 23. Esper EPL Select values of height and weight for all events with age larger than 30 SELECT height, weight FROM tbl WHERE age > 30
  • 24. Esper EPL Count records group by height value for events with age larger than 30 SELECT height, COUNT(*) AS c FROM tbl WHERE age > 30 GROUP BY height This query doesn’t ever produce results
  • 25. Esper EPL Count records group by height value for events with age larger than 30 per every 1 hour SELECT height, COUNT(*) AS c FROM tbl.win:time_batch(1 hour) WHERE age > 30 GROUP BY height
  • 26. With/without Schema Schema-full data: strict schema: predefined fields w/ types (or reject) schema on read: try to read known fields (or ignore) Schema-less data: Any field (or ignore), any type (implicit/explicit conversion) fit for services under development: All internet services including us!
  • 27. Stream Processing & Schema Queries first, data second for all stream processing Queries automatically know what fields to query schema-less (mixed) data stream fields subset for query A fields subset for query B query A query B events from API endpoint events from billing service events of service X TO BE
  • 31. Norikra: Schema-less Stream Processing with SQL Server software, runs on JVM Open source software (GPLv2) http://norikra.github.io/ https://github.com/norikra/norikra
  • 32. Norikra: Schema-less event stream: Add/Remove data fields whenever you want SQL: No more restarts to add/remove queries w/ JOINs, w/ SubQueries w/ UDF (in Java/Ruby from rubygem) Truly Complex events: Nested Hash/Array, accessible directly from SQL HTTP RPC w/ JSON or MessagePack (fluentd plugin available!)
  • 33. How To Setup Norikra: Install JRuby download jruby.tar.gz, extract it and export $PATH use rbenv rbenv install jruby-1.7.xx rbenv shell jruby-.. Install Norikra gem install norikra Execute Norikra server norikra start
  • 34. Norikra Interface: Command line: norikra-client norikra-client target open ... norikra-client query add ... tail -f ... | norikra-client event send ... WebUI show status show/add/remove queries HTTP API JSON, MessagePack
  • 35. Norikra Queries: (1) SELECT name, age FROM events target
  • 36. Norikra Queries: (1) {“name”:”tagomoris”, “age”:34, “address”:”Tokyo”, “corp”:”LINE”, “current”:”Taipei”} SELECT name, age FROM events {“name”:”tagomoris”,”age”:34}
  • 37. Norikra Queries: (1) {“name”:”tagomoris”, “address”:”Tokyo”, “corp”:”LINE”, “current”:”Taipei”} without “age” SELECT name, age FROM events nothing
  • 38. Norikra Queries: (2) {“name”:”tagomoris”, “age”:34, “address”:”Tokyo”, “corp”:”LINE”, “current”:”Taipei”} SELECT name, age FROM events WHERE current=”Taipei” {“name”:”tagomoris”,”age”:34}
  • 39. Norikra Queries: (2) {“name”:”hadoop”, “age”:99, “address”:”Somewhere”, “corp”:”ASF”, “current”:”Elsewhere”} SELECT name, age FROM events WHERE current=”Taipei” nothing
  • 40. Norikra Queries: (3) SELECT age, COUNT(*) as cnt FROM events.win:time_batch(5 mins) GROUP BY age
  • 41. Norikra Queries: (3) {“name”:”tagomoris”, “age”:34, “address”:”Tokyo”, “corp”:”LINE”, “current”:”Taipei”} SELECT age, COUNT(*) as cnt FROM events.win:time_batch(5 mins) GROUP BY age every 5 mins {”age”:34,”cnt”:3}, {“age”:33,”cnt”:1}, ...
  • 42. Norikra Queries: (4) {“name”:”tagomoris”, “age”:34, “address”:”Tokyo”, “corp”:”LINE”, “current”:”Taipei”} SELECT age, COUNT(*) as cnt FROM events.win:time_batch(5 mins) GROUP BY age SELECT max(age) as max FROM events.win:time_batch(5 mins) {”age”:34,”cnt”:3}, {“age”:33,”cnt”:1}, ... {“max”:51} every 5 mins
  • 43. Norikra Queries: (5) {“name”:”tagomoris”, “user:{“age”:34, “corp”:”LINE”, “address”:”Tokyo”}, “current”:”Taipei”, “speaker”:true, “attend”:[true,true,false, ...] } SELECT age, COUNT(*) as cnt FROM events.win:time_batch(5 mins) GROUP BY age
  • 44. Norikra Queries: (5) {“name”:”tagomoris”, “user:{“age”:34, “corp”:”LINE”, “address”:”Tokyo”}, “current”:”Taipei”, “speaker”:true, “attend”:[true,true,false, ...] } SELECT user.age, COUNT(*) as cnt FROM events.win:time_batch(5 mins) GROUP BY user.age
  • 45. Norikra Queries: (5) {“name”:”tagomoris”, “user:{“age”:34, “corp”:”LINE”, “address”:”Tokyo”}, “current”:”Taipei”, “speaker”:true, “attend”:[true,true,false, ...] } SELECT user.age, COUNT(*) as cnt FROM events.win:time_batch(5 mins) WHERE current=”Taipei” AND attend.$0 AND attend.$1 GROUP BY user.age
  • 47. Use case 1: External API call reports for partners (LINE) External API call for LINE Business Connect LINE backend sends requests to partner’s API endpoint using users’ messages http://developers.linecorp.com/blog/?p=3386
  • 48. Use case 1: External API call reports for partners (LINE) API error response summaries http://developers.linecorp.com/blog/?p=3386
  • 49. Use case 1: External API call reports for partners (LINE) channel gateway partner’s server logs query results MySQL Mail SELECT channelId AS channel_id, reason, detail, count(*) AS error_count, min(timestamp) AS first_timestamp, max(timestamp) AS last_timestamp FROM api_error_log.win:time_batch(60 sec) GROUP BY channelId,reason,detail HAVING count(*) > 0 http://developers.linecorp.com/blog/?p=3386
  • 50. Use case 2: Prompt reports for Ad service console Prompt reports with Norikra + Fixed reports with Hive app serverapp serverapp server app serverapp serverapp server Fluentd HDFS console service execute hive query (daily) fetch query results (frequently) impression logs
  • 51. Use case 2: Prompt reports for Ad service console Hive query for fixed reports SELECT yyyymmdd, hh, campaign_id, region, lang, COUNT(*) AS click, COUNT(DISTINCT member_id) AS uu FROM ( SELECT yyyymmdd, hh, get_json_object(log, '$.campaign.id') AS campaign_id, get_json_object(log, '$.member.region') AS region, get_json_object(log, '$.member.lang') AS lang, get_json_object(log, '$.member.id') AS member_id FROM applog WHERE service='myservice' AND yyyymmdd='20140913' AND get_json_object(log, '$.type')='click' ) x GROUP BY yyyymmdd, hh, campaign_id, region, lang
  • 52. Use case 2: Prompt reports for Ad service console Norikra query for prompt reports SELECT campaign.id AS campaign_id, member.region AS region, member.lang AS lang, COUNT(*) AS click, COUNT(DISTINCT member.id) AS uu FROM myservice.win:time_batch(1 hours) WHERE type="click" GROUP BY campaign.id, member.region, member.lang
  • 53. Use case 3: Realtime access dashboard on Google Platform Access log visualization Count using Norikra (2-step), Store on Google BigQuery Dashboard on Google Spreadsheet + Apps Script http://qiita.com/kazunori279/items/6329df57635799405547 https://www.youtube.com/watch?v=EZkw5TDcCGw
  • 54. Use case 3: Realtime access dashboard on Google Platform Server Fluentd http://qiita.com/kazunori279/items/6329df57635799405547 https://www.youtube.com/watch?v=EZkw5TDcCGw ngnix access log access logs to BigQuery norikra query results norikra query to aggregate node to aggregate locally
  • 55. Use case 3: Realtime access dashboard on Google Platform Fluentd logs to store http://qiita.com/kazunori279/items/6329df57635799405547 https://www.youtube.com/watch?v=EZkw5TDcCGw ngnix 70 servers, 120,000 requests/sec (or more!) ngngninxix ngngninxix ngngninxix ngngninxix ngngninxix ngngninxix ngngninxix ngngninxix ngnix Google BigQuery Google Spreadsheet + Apps script ... counts per host total count
  • 56. More queries, more simplicity and less latency. Thanks! photo: by my co-workers
  • 57. See also: http://norikra.github.io/ “Stream processing and Norikra” http://www.slideshare.net/tagomoris/stream-processing-and-norikra “Batch processing and Stream processing by SQL” http://www.slideshare.net/tagomoris/hcj2014-sql “Log analysis systems and its designs in LINE Corp 2014 Early” http://www.slideshare.net/tagomoris/log-analysis-system-and-its-designs-in-line- corp-2014-early “Norikra in Action” http://www.slideshare.net/tagomoris/norikra-in-action-ver-2014-spring
  • 58. HA? Distributed? NO! I have some idea, but I have no time to implement it There are no needs for HA/Distributed processing
  • 59. Data flow & API? Use Fluentd!
  • 60. Scalability? 10,000 - 100,000 events/sec on 2CPU 8Core server
  • 61. Storm or Norikra? Simple and fixed workload for huge traffic Use Storm! Complex and fragile workload for non-huge traffic Use Norikra!