SlideShare a Scribd company logo


 “Stream-driven” paradigm
4
レガシーデバイス
(カスタムプロトコル)
IoT デバイス
IP 通信可能な
デバイス
(Windows/Linux)
省電力駆動のデバ
イス (RTOS)
Cloud gateways
Field
gateways
ストリーム処理
(Stream Analytics)
クエリと検索
(Azure Search)
見える化・分析
(Power BI)
ダッシュボード
(Azure App Services)
機械学習
(Machine
Learning API)
機械学習
(Machine
Learning /
Revolution
R Enterprise)
並列データ処理
(Azure Data Lake Analytics)
デバイスへの通知
(Notification Hubs)
DWH
ドキュメント
データベース
時系列データ
イベント
ブローカー/
デバイス管理
Azure Active
Directory 認証基盤
Azure Active
Directory



Partition 1
Partition 2
Partition “n”
Consumer Group C
Callback for prtn. 6
Callback for prtn. 2
Worker “n”
Callback for prtn. 1
Callback “n”
Worker 1
Consumer Group B
Callback for prtn. 6
Callback for prtn. 2
Worker “n”
Callback for prtn. 1
Callback “n”
Worker 1
Consumer Group A
Worker “n”
Callback for prtn. 6
Callback for prtn. 2
Callback for prtn. 1
Callback “n”
Worker 1











Intermediary
Broker
Backpressure
Feedback











Broker
Broker
Broker









Transformation Meaning
map(func)
Return a new DStream by passing each element of the source
DStream through a function func.
flatMap(func)
Similar to map, but each input item can be mapped to 0 or more
output items.
filter(func)
Return a new DStream by selecting only the records of the source
DStream on which func returns true.
repartition(numPartitions)
Changes the level of parallelism in this DStream by creating more or
fewer partitions.
union(otherStream)
Return a new DStream that contains the union of the elements in
the source DStream and otherDStream.
count()
Return a new DStream of single-element RDDs by counting the
number of elements in each RDD of the source DStream.
reduce(func)
Return a new DStream of single-element RDDs by aggregating the
elements in each RDD of the source DStream using a function func
(which takes two arguments and returns one). The function should
be associative so that it can be computed in parallel.
Transformation Meaning
countByValue()
When called on a DStream of elements of type K, return a new DStream of (K,
Long) pairs where the value of each key is its frequency in each RDD of the source
DStream.
reduceByKey(func,
[numTasks])
When called on a DStream of (K, V) pairs, return a new DStream of (K, V) pairs
where the values for each key are aggregated using the given reduce function.
Note: By default, this uses Spark's default number of parallel tasks (2 for local
mode, and in cluster mode the number is determined by the config property
spark.default.parallelism) to do the grouping. You can pass an optional numTasks
argument to set a different number of tasks.
join(otherStream,
[numTasks])
When called on two DStreams of (K, V) and (K, W) pairs, return a new DStream of
(K, (V, W)) pairs with all pairs of elements for each key.
cogroup(otherStream,
[numTasks])
When called on DStream of (K, V) and (K, W) pairs, return a new DStream of (K,
Seq[V], Seq[W]) tuples.
transform(func)
Return a new DStream by applying a RDD-to-RDD function to every RDD of the
source DStream. This can be used to do arbitrary RDD operations on the DStream.
updateStateByKey(func)
Return a new "state" DStream where the state for each key is updated by applying
the given function on the previous state of the key and the new values for the key.
This can be used to maintain arbitrary state data for each key.
Transformation Description
Map
Takes one element and produces one element. A map function that doubles the values of the input
stream
FlatMap
Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences
to words
Filter
Evaluates a boolean function for each element and retains those for which the function returns true. A
filter that filters out zero values:
KeyBy
Logically partitions a stream into disjoint partitions, each partition containing elements of the same
key. Internally, this is implemented with hash partitioning.
Reduce
A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value
and emits the new value.
Fold
A "rolling" fold on a keyed data stream with an initial value. Combines the current element with the
last folded value and emits the new value.
Aggregations
Rolling aggregations on a keyed data stream. The difference between min and minBy is that min
returns the minimun value, whereas minBy returns the element that has the minimum value in this field
(same for max and maxBy).
Window
Windows can be defined on already partitioned KeyedStreams. Windows group the data in each key
according to some characteristic (e.g., the data that arrived within the last 5 seconds).
WindowAll
Windows can be defined on regular DataStreams. Windows group all the stream events according to
some characteristic (e.g., the data that arrived within the last 5 seconds).
Window Apply
Applies a general function to the window as a whole. Below is a function that manually sums the
elements of a window.
Window Reduce Applies a functional reduce function to the window and returns the reduced value.
Window Fold Applies a functional fold function to the window and returns the folded value.
Transformation Description
Aggregations on
windows
Aggregates the contents of a window. The difference between min and minBy is that min
returns the minimun value, whereas minBy returns the element that has the minimum value in
this field (same for max and maxBy).
Union
Union of two or more data streams creating a new stream containing all the elements from all
the streams. Node: If you union a data stream with itself you will get each element twice in the
resulting stream.
Window Join Join two data streams on a given key and a common window.
Window CoGroup Cogroups two data streams on a given key and a common window.
Connect
"Connects" two data streams retaining their types. Connect allowing for shared state between
the two streams.
CoMap, CoFlatMap Similar to map and flatMap on a connected data stream
Split Split the stream into two or more streams according to some criterion.
Select Select one or more streams from a split stream.
Iterate
Creates a "feedback" loop in the flow, by redirecting the output of one operator to some
previous operator. This is especially useful for defining algorithms that continuously update a
model. The following code starts with a stream and applies the iteration body continuously.
Elements that are greater than 0 are sent back to the feedback channel, and the rest of the
elements are forwarded downstream.
Extract Timestamps
Extracts timestamps from records in order to work with windows that use event time
semantics.


出典: S-Store: Streaming Meets Transaction Processing

http://yahoohadoop.tumblr.com/post/135370591481/benchmarking-streaming-computation-engines-at






















 1.4億 TATP tps

 PB オーダー
















出典: No compromises distributed transactions with consistency, availability, and performance
出典: 同上
出典: 同上









 本書に記載した情報は、本書各項目に関する発行日現在の Microsoft の見解を表明するものです。Microsoftは絶えず変化する市場に対応しなければならないため、ここに記載した情報に
対していかなる責務を負うものではなく、提示された情報の信憑性については保証できません。
 本書は情報提供のみを目的としています。 Microsoft は、明示的または暗示的を問わず、本書にいかなる保証も与えるものではありません。
 すべての当該著作権法を遵守することはお客様の責務です。Microsoftの書面による明確な許可なく、本書の如何なる部分についても、転載や検索システムへの格納または挿入を行うこ
とは、どのような形式または手段(電子的、機械的、複写、レコーディング、その他)、および目的であっても禁じられています。これらは著作権保護された権利を制限するものではあ
りません。
 Microsoftは、本書の内容を保護する特許、特許出願書、商標、著作権、またはその他の知的財産権を保有する場合があります。Microsoftから書面によるライセンス契約が明確に供給さ
れる場合を除いて、本書の提供はこれらの特許、商標、著作権、またはその他の知的財産へのライセンスを与えるものではありません。
© 2015 Microsoft Corporation. All rights reserved.
Microsoft, Windows, その他本文中に登場した各製品名は、Microsoft Corporation の米国およびその他の国における登録商標または商標です。
その他、記載されている会社名および製品名は、一般に各社の商標です。

More Related Content

What's hot

Data Stream Outlier Detection Algorithm
Data Stream Outlier Detection Algorithm Data Stream Outlier Detection Algorithm
Data Stream Outlier Detection Algorithm
Hamza Aslam
 
Hadoop & Hive Change the Data Warehousing Game Forever
Hadoop & Hive Change the Data Warehousing Game ForeverHadoop & Hive Change the Data Warehousing Game Forever
Hadoop & Hive Change the Data Warehousing Game Forever
DataWorks Summit
 
Streams processing with Storm
Streams processing with StormStreams processing with Storm
Streams processing with Storm
Mariusz Gil
 
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
DataStax
 
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Spark Summit
 
Introduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduce
Dr Ganesh Iyer
 
Enhancing Spark SQL Optimizer with Reliable Statistics
Enhancing Spark SQL Optimizer with Reliable StatisticsEnhancing Spark SQL Optimizer with Reliable Statistics
Enhancing Spark SQL Optimizer with Reliable Statistics
Jen Aman
 
Map reduce: beyond word count
Map reduce: beyond word countMap reduce: beyond word count
Map reduce: beyond word count
Jeff Patti
 
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Spark Summit
 
k-means algorithm implementation on Hadoop
k-means algorithm implementation on Hadoopk-means algorithm implementation on Hadoop
k-means algorithm implementation on Hadoop
Stratos Gounidellis
 
Cassandra at Finn.io — May 30th 2013
Cassandra at Finn.io — May 30th 2013Cassandra at Finn.io — May 30th 2013
Cassandra at Finn.io — May 30th 2013
DataStax Academy
 
Intro to Akka Streams
Intro to Akka StreamsIntro to Akka Streams
Intro to Akka Streams
Michael Kendra
 
NLP on a Billion Documents: Scalable Machine Learning with Apache Spark
NLP on a Billion Documents: Scalable Machine Learning with Apache SparkNLP on a Billion Documents: Scalable Machine Learning with Apache Spark
NLP on a Billion Documents: Scalable Machine Learning with Apache Spark
Martin Goodson
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
Albert Bifet
 
Cassandra&map reduce
Cassandra&map reduceCassandra&map reduce
Cassandra&map reduce
vlaskinvlad
 
Azure Stream Analytics Project : On-demand real-time analytics
Azure Stream Analytics Project : On-demand real-time analyticsAzure Stream Analytics Project : On-demand real-time analytics
Azure Stream Analytics Project : On-demand real-time analytics
Lamprini Koutsokera
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Flink Forward
 
Supercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Supercharge your Analytics with ClickHouse, v.2. By Vadim TkachenkoSupercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Supercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Altinity Ltd
 
Create a correlation plot from joined tables and lag times
Create a correlation plot from joined tables and lag timesCreate a correlation plot from joined tables and lag times
Create a correlation plot from joined tables and lag times
DougLoqa
 
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Jen Aman
 

What's hot (20)

Data Stream Outlier Detection Algorithm
Data Stream Outlier Detection Algorithm Data Stream Outlier Detection Algorithm
Data Stream Outlier Detection Algorithm
 
Hadoop & Hive Change the Data Warehousing Game Forever
Hadoop & Hive Change the Data Warehousing Game ForeverHadoop & Hive Change the Data Warehousing Game Forever
Hadoop & Hive Change the Data Warehousing Game Forever
 
Streams processing with Storm
Streams processing with StormStreams processing with Storm
Streams processing with Storm
 
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
Building a Fast, Resilient Time Series Store with Cassandra (Alex Petrov, Dat...
 
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
Time-evolving Graph Processing on Commodity Clusters: Spark Summit East talk ...
 
Introduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduce
 
Enhancing Spark SQL Optimizer with Reliable Statistics
Enhancing Spark SQL Optimizer with Reliable StatisticsEnhancing Spark SQL Optimizer with Reliable Statistics
Enhancing Spark SQL Optimizer with Reliable Statistics
 
Map reduce: beyond word count
Map reduce: beyond word countMap reduce: beyond word count
Map reduce: beyond word count
 
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
Horizontally Scalable Relational Databases with Spark: Spark Summit East talk...
 
k-means algorithm implementation on Hadoop
k-means algorithm implementation on Hadoopk-means algorithm implementation on Hadoop
k-means algorithm implementation on Hadoop
 
Cassandra at Finn.io — May 30th 2013
Cassandra at Finn.io — May 30th 2013Cassandra at Finn.io — May 30th 2013
Cassandra at Finn.io — May 30th 2013
 
Intro to Akka Streams
Intro to Akka StreamsIntro to Akka Streams
Intro to Akka Streams
 
NLP on a Billion Documents: Scalable Machine Learning with Apache Spark
NLP on a Billion Documents: Scalable Machine Learning with Apache SparkNLP on a Billion Documents: Scalable Machine Learning with Apache Spark
NLP on a Billion Documents: Scalable Machine Learning with Apache Spark
 
Real Time Big Data Management
Real Time Big Data ManagementReal Time Big Data Management
Real Time Big Data Management
 
Cassandra&map reduce
Cassandra&map reduceCassandra&map reduce
Cassandra&map reduce
 
Azure Stream Analytics Project : On-demand real-time analytics
Azure Stream Analytics Project : On-demand real-time analyticsAzure Stream Analytics Project : On-demand real-time analytics
Azure Stream Analytics Project : On-demand real-time analytics
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
 
Supercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Supercharge your Analytics with ClickHouse, v.2. By Vadim TkachenkoSupercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
Supercharge your Analytics with ClickHouse, v.2. By Vadim Tkachenko
 
Create a correlation plot from joined tables and lag times
Create a correlation plot from joined tables and lag timesCreate a correlation plot from joined tables and lag times
Create a correlation plot from joined tables and lag times
 
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
Massive Simulations In Spark: Distributed Monte Carlo For Global Health Forec...
 

Viewers also liked

ソフトウェア工学からコンピューターサイエンスへ (デブサミ2014)
ソフトウェア工学からコンピューターサイエンスへ (デブサミ2014)ソフトウェア工学からコンピューターサイエンスへ (デブサミ2014)
ソフトウェア工学からコンピューターサイエンスへ (デブサミ2014)Masayoshi Hagiwara
 
Fly weight pattern #dezapatan
Fly weight pattern #dezapatanFly weight pattern #dezapatan
Fly weight pattern #dezapatankuidaoring
 
おれおれブログシステムにServiceWorkerを導入してみた #serviceworker
おれおれブログシステムにServiceWorkerを導入してみた #serviceworkerおれおれブログシステムにServiceWorkerを導入してみた #serviceworker
おれおれブログシステムにServiceWorkerを導入してみた #serviceworker
Toshiaki Maki
 
State of Securing Restful APIs s12gx2015
State of Securing Restful APIs s12gx2015State of Securing Restful APIs s12gx2015
State of Securing Restful APIs s12gx2015
robwinch
 
Strategy パターンと開放/閉鎖原則に見るデザインパターンの有用性
Strategy パターンと開放/閉鎖原則に見るデザインパターンの有用性Strategy パターンと開放/閉鎖原則に見るデザインパターンの有用性
Strategy パターンと開放/閉鎖原則に見るデザインパターンの有用性
tomo_masakura
 
Container sig#1 ansible-container
Container sig#1 ansible-containerContainer sig#1 ansible-container
Container sig#1 ansible-container
Naoya Hashimoto
 
Reactor Design Pattern
Reactor Design PatternReactor Design Pattern
Reactor Design Pattern
liminescence
 
楽しくて病みつきになるゲームジャムのススメ
楽しくて病みつきになるゲームジャムのススメ楽しくて病みつきになるゲームジャムのススメ
楽しくて病みつきになるゲームジャムのススメ
Hiroki Omae
 
GoF のデザインパターンじゃないけど、よくあるパターン
GoF のデザインパターンじゃないけど、よくあるパターンGoF のデザインパターンじゃないけど、よくあるパターン
GoF のデザインパターンじゃないけど、よくあるパターン
Gaprot
 
デザインパターン
デザインパターンデザインパターン
デザインパターン
gaaupp
 
Enabling Cloud Native Security with OAuth2 and Multi-Tenant UAA
Enabling Cloud Native Security with OAuth2 and Multi-Tenant UAA Enabling Cloud Native Security with OAuth2 and Multi-Tenant UAA
Enabling Cloud Native Security with OAuth2 and Multi-Tenant UAA
Will Tran
 
スペクトル分布(白熱電球、蛍光灯、LED照明)
スペクトル分布(白熱電球、蛍光灯、LED照明)スペクトル分布(白熱電球、蛍光灯、LED照明)
Linux Kernel Booting Process (1) - For NLKB
Linux Kernel Booting Process (1) - For NLKBLinux Kernel Booting Process (1) - For NLKB
Linux Kernel Booting Process (1) - For NLKB
shimosawa
 
HTTP2 時代の Web - web over http2
HTTP2 時代の Web - web over http2HTTP2 時代の Web - web over http2
HTTP2 時代の Web - web over http2
Jxck Jxck
 
Servlet 4.0 at GeekOut 2015
Servlet 4.0 at GeekOut 2015Servlet 4.0 at GeekOut 2015
Servlet 4.0 at GeekOut 2015
Edward Burns
 
Cloud Foundy Java Client V 2.0 #cf_tokyo
Cloud Foundy Java Client V 2.0 #cf_tokyoCloud Foundy Java Client V 2.0 #cf_tokyo
Cloud Foundy Java Client V 2.0 #cf_tokyo
Toshiaki Maki
 
Concourse CI Meetup Demo
Concourse CI Meetup DemoConcourse CI Meetup Demo
Concourse CI Meetup Demo
Toshiaki Maki
 
Introduction to Concourse CI #渋谷Java
Introduction to Concourse CI #渋谷JavaIntroduction to Concourse CI #渋谷Java
Introduction to Concourse CI #渋谷Java
Toshiaki Maki
 
FiNC DDD第一回勉強会
FiNC DDD第一回勉強会FiNC DDD第一回勉強会
FiNC DDD第一回勉強会
裕紀 重村
 
XP寺子屋 デザインパターン入門
XP寺子屋 デザインパターン入門XP寺子屋 デザインパターン入門
XP寺子屋 デザインパターン入門takepu
 

Viewers also liked (20)

ソフトウェア工学からコンピューターサイエンスへ (デブサミ2014)
ソフトウェア工学からコンピューターサイエンスへ (デブサミ2014)ソフトウェア工学からコンピューターサイエンスへ (デブサミ2014)
ソフトウェア工学からコンピューターサイエンスへ (デブサミ2014)
 
Fly weight pattern #dezapatan
Fly weight pattern #dezapatanFly weight pattern #dezapatan
Fly weight pattern #dezapatan
 
おれおれブログシステムにServiceWorkerを導入してみた #serviceworker
おれおれブログシステムにServiceWorkerを導入してみた #serviceworkerおれおれブログシステムにServiceWorkerを導入してみた #serviceworker
おれおれブログシステムにServiceWorkerを導入してみた #serviceworker
 
State of Securing Restful APIs s12gx2015
State of Securing Restful APIs s12gx2015State of Securing Restful APIs s12gx2015
State of Securing Restful APIs s12gx2015
 
Strategy パターンと開放/閉鎖原則に見るデザインパターンの有用性
Strategy パターンと開放/閉鎖原則に見るデザインパターンの有用性Strategy パターンと開放/閉鎖原則に見るデザインパターンの有用性
Strategy パターンと開放/閉鎖原則に見るデザインパターンの有用性
 
Container sig#1 ansible-container
Container sig#1 ansible-containerContainer sig#1 ansible-container
Container sig#1 ansible-container
 
Reactor Design Pattern
Reactor Design PatternReactor Design Pattern
Reactor Design Pattern
 
楽しくて病みつきになるゲームジャムのススメ
楽しくて病みつきになるゲームジャムのススメ楽しくて病みつきになるゲームジャムのススメ
楽しくて病みつきになるゲームジャムのススメ
 
GoF のデザインパターンじゃないけど、よくあるパターン
GoF のデザインパターンじゃないけど、よくあるパターンGoF のデザインパターンじゃないけど、よくあるパターン
GoF のデザインパターンじゃないけど、よくあるパターン
 
デザインパターン
デザインパターンデザインパターン
デザインパターン
 
Enabling Cloud Native Security with OAuth2 and Multi-Tenant UAA
Enabling Cloud Native Security with OAuth2 and Multi-Tenant UAA Enabling Cloud Native Security with OAuth2 and Multi-Tenant UAA
Enabling Cloud Native Security with OAuth2 and Multi-Tenant UAA
 
スペクトル分布(白熱電球、蛍光灯、LED照明)
スペクトル分布(白熱電球、蛍光灯、LED照明)スペクトル分布(白熱電球、蛍光灯、LED照明)
スペクトル分布(白熱電球、蛍光灯、LED照明)
 
Linux Kernel Booting Process (1) - For NLKB
Linux Kernel Booting Process (1) - For NLKBLinux Kernel Booting Process (1) - For NLKB
Linux Kernel Booting Process (1) - For NLKB
 
HTTP2 時代の Web - web over http2
HTTP2 時代の Web - web over http2HTTP2 時代の Web - web over http2
HTTP2 時代の Web - web over http2
 
Servlet 4.0 at GeekOut 2015
Servlet 4.0 at GeekOut 2015Servlet 4.0 at GeekOut 2015
Servlet 4.0 at GeekOut 2015
 
Cloud Foundy Java Client V 2.0 #cf_tokyo
Cloud Foundy Java Client V 2.0 #cf_tokyoCloud Foundy Java Client V 2.0 #cf_tokyo
Cloud Foundy Java Client V 2.0 #cf_tokyo
 
Concourse CI Meetup Demo
Concourse CI Meetup DemoConcourse CI Meetup Demo
Concourse CI Meetup Demo
 
Introduction to Concourse CI #渋谷Java
Introduction to Concourse CI #渋谷JavaIntroduction to Concourse CI #渋谷Java
Introduction to Concourse CI #渋谷Java
 
FiNC DDD第一回勉強会
FiNC DDD第一回勉強会FiNC DDD第一回勉強会
FiNC DDD第一回勉強会
 
XP寺子屋 デザインパターン入門
XP寺子屋 デザインパターン入門XP寺子屋 デザインパターン入門
XP寺子屋 デザインパターン入門
 

Similar to 最新のデータベース技術の方向性で思うこと

Apache Spark Overview part2 (20161117)
Apache Spark Overview part2 (20161117)Apache Spark Overview part2 (20161117)
Apache Spark Overview part2 (20161117)
Steve Min
 
Summary of "Amazon's Dynamo" for the 2nd nosql summer reading in Tokyo
Summary of "Amazon's Dynamo" for the 2nd nosql summer reading in TokyoSummary of "Amazon's Dynamo" for the 2nd nosql summer reading in Tokyo
Summary of "Amazon's Dynamo" for the 2nd nosql summer reading in Tokyo
CLOUDIAN KK
 
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormC*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
DataStax
 
Apache Flink Stream Processing
Apache Flink Stream ProcessingApache Flink Stream Processing
Apache Flink Stream Processing
Suneel Marthi
 
Scala+data
Scala+dataScala+data
Scala+data
Samir Bessalah
 
Phily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizard
Phily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizardPhily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizard
Phily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizard
Brian O'Neill
 
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Databricks
 
A Tale of Data Pattern Discovery in Parallel
A Tale of Data Pattern Discovery in ParallelA Tale of Data Pattern Discovery in Parallel
A Tale of Data Pattern Discovery in Parallel
Jenny Liu
 
Strtio Spark Streaming + Siddhi CEP Engine
Strtio Spark Streaming + Siddhi CEP EngineStrtio Spark Streaming + Siddhi CEP Engine
Strtio Spark Streaming + Siddhi CEP Engine
Myung Ho Yun
 
Meet the squirrel @ #CSHUG
Meet the squirrel @ #CSHUGMeet the squirrel @ #CSHUG
Meet the squirrel @ #CSHUG
Márton Balassi
 
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
InfluxData
 
What's new with Apache Spark's Structured Streaming?
What's new with Apache Spark's Structured Streaming?What's new with Apache Spark's Structured Streaming?
What's new with Apache Spark's Structured Streaming?
Miklos Christine
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentation
ateeq ateeq
 
Map reduce
Map reduceMap reduce
Map reduce
xydii
 
Map reduceoriginalpaper mandatoryreading
Map reduceoriginalpaper mandatoryreadingMap reduceoriginalpaper mandatoryreading
Map reduceoriginalpaper mandatoryreading
coolmirza143
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)
Paul Chao
 
Interpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with SawzallInterpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with Sawzall
Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL
 
Big data reactive streams and OSGi - M Rulli
Big data reactive streams and OSGi - M RulliBig data reactive streams and OSGi - M Rulli
Big data reactive streams and OSGi - M Rulli
mfrancis
 
Dynamo cassandra
Dynamo cassandraDynamo cassandra
Dynamo cassandra
Wu Liang
 
Distributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache StormDistributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache Storm
the100rabh
 

Similar to 最新のデータベース技術の方向性で思うこと (20)

Apache Spark Overview part2 (20161117)
Apache Spark Overview part2 (20161117)Apache Spark Overview part2 (20161117)
Apache Spark Overview part2 (20161117)
 
Summary of "Amazon's Dynamo" for the 2nd nosql summer reading in Tokyo
Summary of "Amazon's Dynamo" for the 2nd nosql summer reading in TokyoSummary of "Amazon's Dynamo" for the 2nd nosql summer reading in Tokyo
Summary of "Amazon's Dynamo" for the 2nd nosql summer reading in Tokyo
 
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with StormC*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
C*ollege Credit: CEP Distribtued Processing on Cassandra with Storm
 
Apache Flink Stream Processing
Apache Flink Stream ProcessingApache Flink Stream Processing
Apache Flink Stream Processing
 
Scala+data
Scala+dataScala+data
Scala+data
 
Phily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizard
Phily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizardPhily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizard
Phily JUG : Web Services APIs for Real-time Analytics w/ Storm and DropWizard
 
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
 
A Tale of Data Pattern Discovery in Parallel
A Tale of Data Pattern Discovery in ParallelA Tale of Data Pattern Discovery in Parallel
A Tale of Data Pattern Discovery in Parallel
 
Strtio Spark Streaming + Siddhi CEP Engine
Strtio Spark Streaming + Siddhi CEP EngineStrtio Spark Streaming + Siddhi CEP Engine
Strtio Spark Streaming + Siddhi CEP Engine
 
Meet the squirrel @ #CSHUG
Meet the squirrel @ #CSHUGMeet the squirrel @ #CSHUG
Meet the squirrel @ #CSHUG
 
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
Samantha Wang [InfluxData] | Best Practices on How to Transform Your Data Usi...
 
What's new with Apache Spark's Structured Streaming?
What's new with Apache Spark's Structured Streaming?What's new with Apache Spark's Structured Streaming?
What's new with Apache Spark's Structured Streaming?
 
Map reduce presentation
Map reduce presentationMap reduce presentation
Map reduce presentation
 
Map reduce
Map reduceMap reduce
Map reduce
 
Map reduceoriginalpaper mandatoryreading
Map reduceoriginalpaper mandatoryreadingMap reduceoriginalpaper mandatoryreading
Map reduceoriginalpaper mandatoryreading
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)
 
Interpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with SawzallInterpreting the Data:Parallel Analysis with Sawzall
Interpreting the Data:Parallel Analysis with Sawzall
 
Big data reactive streams and OSGi - M Rulli
Big data reactive streams and OSGi - M RulliBig data reactive streams and OSGi - M Rulli
Big data reactive streams and OSGi - M Rulli
 
Dynamo cassandra
Dynamo cassandraDynamo cassandra
Dynamo cassandra
 
Distributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache StormDistributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache Storm
 

Recently uploaded

INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLESINTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
anfaltahir1010
 
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISDECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
Tier1 app
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
dakas1
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
The Third Creative Media
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid
 
Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
sjcobrien
 
Project Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdfProject Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdf
Karya Keeper
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Paul Brebner
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Julian Hyde
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
Reetu63
 
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
kalichargn70th171
 
UI5con 2024 - Bring Your Own Design System
UI5con 2024 - Bring Your Own Design SystemUI5con 2024 - Bring Your Own Design System
UI5con 2024 - Bring Your Own Design System
Peter Muessig
 
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
kgyxske
 
What’s New in Odoo 17 – A Complete Roadmap
What’s New in Odoo 17 – A Complete RoadmapWhat’s New in Odoo 17 – A Complete Roadmap
What’s New in Odoo 17 – A Complete Roadmap
Envertis Software Solutions
 
How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?
ToXSL Technologies
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
kalichargn70th171
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
Green Software Development
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 
Preparing Non - Technical Founders for Engaging a Tech Agency
Preparing Non - Technical Founders for Engaging  a  Tech AgencyPreparing Non - Technical Founders for Engaging  a  Tech Agency
Preparing Non - Technical Founders for Engaging a Tech Agency
ISH Technologies
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
brainerhub1
 

Recently uploaded (20)

INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLESINTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
 
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISDECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
 
Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
 
Project Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdfProject Management: The Role of Project Dashboards.pdf
Project Management: The Role of Project Dashboards.pdf
 
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
 
ppt on the brain chip neuralink.pptx
ppt  on   the brain  chip neuralink.pptxppt  on   the brain  chip neuralink.pptx
ppt on the brain chip neuralink.pptx
 
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
 
UI5con 2024 - Bring Your Own Design System
UI5con 2024 - Bring Your Own Design SystemUI5con 2024 - Bring Your Own Design System
UI5con 2024 - Bring Your Own Design System
 
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
 
What’s New in Odoo 17 – A Complete Roadmap
What’s New in Odoo 17 – A Complete RoadmapWhat’s New in Odoo 17 – A Complete Roadmap
What’s New in Odoo 17 – A Complete Roadmap
 
How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?How Can Hiring A Mobile App Development Company Help Your Business Grow?
How Can Hiring A Mobile App Development Company Help Your Business Grow?
 
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
A Comprehensive Guide on Implementing Real-World Mobile Testing Strategies fo...
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 
Preparing Non - Technical Founders for Engaging a Tech Agency
Preparing Non - Technical Founders for Engaging  a  Tech AgencyPreparing Non - Technical Founders for Engaging  a  Tech Agency
Preparing Non - Technical Founders for Engaging a Tech Agency
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
 

最新のデータベース技術の方向性で思うこと

  • 1.
  • 2.
  • 4. 4 レガシーデバイス (カスタムプロトコル) IoT デバイス IP 通信可能な デバイス (Windows/Linux) 省電力駆動のデバ イス (RTOS) Cloud gateways Field gateways ストリーム処理 (Stream Analytics) クエリと検索 (Azure Search) 見える化・分析 (Power BI) ダッシュボード (Azure App Services) 機械学習 (Machine Learning API) 機械学習 (Machine Learning / Revolution R Enterprise) 並列データ処理 (Azure Data Lake Analytics) デバイスへの通知 (Notification Hubs) DWH ドキュメント データベース 時系列データ イベント ブローカー/ デバイス管理 Azure Active Directory 認証基盤 Azure Active Directory
  • 5.    Partition 1 Partition 2 Partition “n” Consumer Group C Callback for prtn. 6 Callback for prtn. 2 Worker “n” Callback for prtn. 1 Callback “n” Worker 1 Consumer Group B Callback for prtn. 6 Callback for prtn. 2 Worker “n” Callback for prtn. 1 Callback “n” Worker 1 Consumer Group A Worker “n” Callback for prtn. 6 Callback for prtn. 2 Callback for prtn. 1 Callback “n” Worker 1
  • 13. Transformation Meaning map(func) Return a new DStream by passing each element of the source DStream through a function func. flatMap(func) Similar to map, but each input item can be mapped to 0 or more output items. filter(func) Return a new DStream by selecting only the records of the source DStream on which func returns true. repartition(numPartitions) Changes the level of parallelism in this DStream by creating more or fewer partitions. union(otherStream) Return a new DStream that contains the union of the elements in the source DStream and otherDStream. count() Return a new DStream of single-element RDDs by counting the number of elements in each RDD of the source DStream. reduce(func) Return a new DStream of single-element RDDs by aggregating the elements in each RDD of the source DStream using a function func (which takes two arguments and returns one). The function should be associative so that it can be computed in parallel.
  • 14. Transformation Meaning countByValue() When called on a DStream of elements of type K, return a new DStream of (K, Long) pairs where the value of each key is its frequency in each RDD of the source DStream. reduceByKey(func, [numTasks]) When called on a DStream of (K, V) pairs, return a new DStream of (K, V) pairs where the values for each key are aggregated using the given reduce function. Note: By default, this uses Spark's default number of parallel tasks (2 for local mode, and in cluster mode the number is determined by the config property spark.default.parallelism) to do the grouping. You can pass an optional numTasks argument to set a different number of tasks. join(otherStream, [numTasks]) When called on two DStreams of (K, V) and (K, W) pairs, return a new DStream of (K, (V, W)) pairs with all pairs of elements for each key. cogroup(otherStream, [numTasks]) When called on DStream of (K, V) and (K, W) pairs, return a new DStream of (K, Seq[V], Seq[W]) tuples. transform(func) Return a new DStream by applying a RDD-to-RDD function to every RDD of the source DStream. This can be used to do arbitrary RDD operations on the DStream. updateStateByKey(func) Return a new "state" DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values for the key. This can be used to maintain arbitrary state data for each key.
  • 15.
  • 16. Transformation Description Map Takes one element and produces one element. A map function that doubles the values of the input stream FlatMap Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences to words Filter Evaluates a boolean function for each element and retains those for which the function returns true. A filter that filters out zero values: KeyBy Logically partitions a stream into disjoint partitions, each partition containing elements of the same key. Internally, this is implemented with hash partitioning. Reduce A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and emits the new value. Fold A "rolling" fold on a keyed data stream with an initial value. Combines the current element with the last folded value and emits the new value. Aggregations Rolling aggregations on a keyed data stream. The difference between min and minBy is that min returns the minimun value, whereas minBy returns the element that has the minimum value in this field (same for max and maxBy). Window Windows can be defined on already partitioned KeyedStreams. Windows group the data in each key according to some characteristic (e.g., the data that arrived within the last 5 seconds). WindowAll Windows can be defined on regular DataStreams. Windows group all the stream events according to some characteristic (e.g., the data that arrived within the last 5 seconds). Window Apply Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window. Window Reduce Applies a functional reduce function to the window and returns the reduced value. Window Fold Applies a functional fold function to the window and returns the folded value.
  • 17. Transformation Description Aggregations on windows Aggregates the contents of a window. The difference between min and minBy is that min returns the minimun value, whereas minBy returns the element that has the minimum value in this field (same for max and maxBy). Union Union of two or more data streams creating a new stream containing all the elements from all the streams. Node: If you union a data stream with itself you will get each element twice in the resulting stream. Window Join Join two data streams on a given key and a common window. Window CoGroup Cogroups two data streams on a given key and a common window. Connect "Connects" two data streams retaining their types. Connect allowing for shared state between the two streams. CoMap, CoFlatMap Similar to map and flatMap on a connected data stream Split Split the stream into two or more streams according to some criterion. Select Select one or more streams from a split stream. Iterate Creates a "feedback" loop in the flow, by redirecting the output of one operator to some previous operator. This is especially useful for defining algorithms that continuously update a model. The following code starts with a stream and applies the iteration body continuously. Elements that are greater than 0 are sent back to the feedback channel, and the rest of the elements are forwarded downstream. Extract Timestamps Extracts timestamps from records in order to work with windows that use event time semantics.
  • 18.   出典: S-Store: Streaming Meets Transaction Processing
  • 22.
  • 23.        1.4億 TATP tps   PB オーダー 
  • 26. 出典: No compromises distributed transactions with consistency, availability, and performance
  • 30.  本書に記載した情報は、本書各項目に関する発行日現在の Microsoft の見解を表明するものです。Microsoftは絶えず変化する市場に対応しなければならないため、ここに記載した情報に 対していかなる責務を負うものではなく、提示された情報の信憑性については保証できません。  本書は情報提供のみを目的としています。 Microsoft は、明示的または暗示的を問わず、本書にいかなる保証も与えるものではありません。  すべての当該著作権法を遵守することはお客様の責務です。Microsoftの書面による明確な許可なく、本書の如何なる部分についても、転載や検索システムへの格納または挿入を行うこ とは、どのような形式または手段(電子的、機械的、複写、レコーディング、その他)、および目的であっても禁じられています。これらは著作権保護された権利を制限するものではあ りません。  Microsoftは、本書の内容を保護する特許、特許出願書、商標、著作権、またはその他の知的財産権を保有する場合があります。Microsoftから書面によるライセンス契約が明確に供給さ れる場合を除いて、本書の提供はこれらの特許、商標、著作権、またはその他の知的財産へのライセンスを与えるものではありません。 © 2015 Microsoft Corporation. All rights reserved. Microsoft, Windows, その他本文中に登場した各製品名は、Microsoft Corporation の米国およびその他の国における登録商標または商標です。 その他、記載されている会社名および製品名は、一般に各社の商標です。