Real-Time User Attribute Estimation with Spark Streaming
Yoshiyasu SAEKI
Slides presented at Spark Meetup December 2015 http://connpass.com/event/23159/
1.
Spark Streaming / @laclefyoshi <ysaeki@r.recruit.co.jp>
2.
• Spark Streaming
• Spark Streaming Tips
3.
SAEKI Yoshiyasu
• IT
• Web 4 9 R&D
• Hadoop, Kafka, Storm, Spark, Druid
• RICOH Theta + Google Cardboard
4.
Spark Streaming
http://spark.apache.org/docs/1.5.2/streaming-programming-guide.html
6.
http://www.recruit.jp/company/about/structure.html
7.
• OS etc.
8.
1. Web (JavaScript)
2. fluentd → Kafka
9.
fluentd → Kafka
• fluent-plugin-kafka
• https://github.com/htgc/fluent-plugin-kafka
• output type = kafka_buffered (on file)
• Kafka 0.8.2.2
• 0.9.0
• ACL
11.
Suro
• Netflix
• https://github.com/Netflix/suro
• Input: Kafka Consumer API, Thrift API
• Output:
• HDFS
• AWS S3
• Kafka Producer
• Elasticsearch
• LinkedIn Gobblin
12.
Hadoop
• HDFS
• MLlib
• Streaming linear regression (Classification)
• Streaming k-means (Clustering)
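As a rough illustration of the MLlib streaming k-means mentioned on this slide, the sketch below clusters a keyed stream of feature vectors. It is not the presenter's code: the object name, the comma-separated feature format, the number of clusters (4), and the feature dimension (3) are all assumptions made for the example. Only the `StreamingKMeans` calls themselves are MLlib API.

```scala
import org.apache.spark.mllib.clustering.StreamingKMeans
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.streaming.dstream.DStream

object StreamingKMeansSketch {
  // Hypothetical input format: "0.1,0.7,0.2" (comma-separated doubles).
  def parseFeatures(line: String): Vector =
    Vectors.dense(line.split(',').map(_.trim.toDouble))

  // events: (userId, featureLine) pairs; returns (userId, clusterIndex).
  def clusterUsers(events: DStream[(String, String)]): DStream[(String, Int)] = {
    val features: DStream[(String, Vector)] = events.mapValues(parseFeatures)

    val model = new StreamingKMeans()
      .setK(4)                  // number of clusters: assumption
      .setDecayFactor(1.0)      // 1.0 keeps all history with equal weight
      .setRandomCenters(3, 0.0) // feature dimension 3: assumption

    model.trainOn(features.map(_._2)) // update the centers on every micro-batch
    model.predictOnValues(features)   // assign each user to the nearest center
  }
}
```

The model is updated and queried on the same stream here only to keep the sketch short; a real job might train and predict on separate streams.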
13.
Spark Streaming
14.
Kafka
• Direct Approach (>= Spark 1.3)
• Exactly-once
• Kafka Simple Consumer API (Direct Approach)
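A minimal sketch of the direct approach with the Spark 1.x `spark-streaming-kafka` module (Kafka 0.8 API) could look like the following. The broker address, the topic name `access-log`, the batch interval, and the object name are placeholders chosen for illustration, not values from the talk.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectStreamSketch {
  // The direct approach talks to the brokers itself, so it needs the broker list.
  def kafkaParams(brokers: String): Map[String, String] =
    Map("metadata.broker.list" -> brokers)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("direct-stream-sketch")
    val ssc = new StreamingContext(conf, Seconds(10)) // batch interval: assumption

    // One RDD partition per Kafka partition; no receiver is involved.
    val stream: InputDStream[(String, String)] =
      KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
        ssc, kafkaParams("localhost:9092"), Set("access-log"))

    // Count the messages in each micro-batch (values only, keys ignored).
    stream.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With this approach Spark tracks the offsets itself. End-to-end exactly-once results still depend on the output operation being idempotent or transactional.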
15.
Spark Streaming 1
http://spark.apache.org/docs/1.5.2/streaming-programming-guide.html
[Figure: RDD @ time1, RDD @ time2, RDD @ time3, RDD @ time4]
16.
Spark Streaming 2
http://spark.apache.org/docs/1.5.2/streaming-programming-guide.html
17.
Micro-batch
• 1 micro-batch (Cookie)
18.
Window-based micro-batch
[Figure: 1 micro-batch, 1 micro-batch]
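To make the window-based micro-batch idea concrete, here is a small sketch that counts events per cookie over a sliding window. The 1-minute window, the 10-second slide, the `(cookieId, payload)` event shape, and the object name are assumptions for illustration. In Spark Streaming both durations must be multiples of the batch interval.

```scala
import org.apache.spark.streaming.{Minutes, Seconds}
import org.apache.spark.streaming.dstream.DStream

object WindowSketch {
  // Turn one (cookieId, payload) event into a (cookieId, 1) pair.
  def toCount(event: (String, String)): (String, Long) = (event._1, 1L)

  // Every 10 seconds, emit the number of events seen per cookie
  // during the last minute, i.e. across several micro-batches.
  def eventsPerCookie(events: DStream[(String, String)]): DStream[(String, Long)] =
    events
      .map(toCount)
      .reduceByKeyAndWindow((a: Long, b: Long) => a + b, Minutes(1), Seconds(10))
}
```

A plain micro-batch sees only the events of one batch interval, whereas the windowed version above aggregates over the last six 10-second batches (assuming a 10-second batch interval).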
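19.
Micro-batch
• RDD → HBase

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.io.Text
import org.apache.spark.rdd.PairRDDFunctions

dstream.foreachRDD { rdd =>
  // createHbaseConfiguration() is the presenter's own helper (not shown on the slide)
  val hbaseConf = createHbaseConfiguration()
  val jobConf = new Configuration(hbaseConf)
  jobConf.set("mapreduce.job.output.key.class", classOf[Text].getName)
  jobConf.set("mapreduce.job.output.value.class", classOf[Text].getName)
  jobConf.set("mapreduce.outputformat.class", classOf[TableOutputFormat[Text]].getName)
  // Write each micro-batch to HBase through the Hadoop output format
  new PairRDDFunctions(rdd.map(hbaseConvert)).saveAsNewAPIHadoopDataset(jobConf)
}

// RDD[(String, Map[K,V])] → RDD[(String, Put)]
def hbaseConvert(t: (String, Map[String, String])) = {
  val p = new Put(Bytes.toBytes(t._1)) // row key
  // One column per map entry, all in the "seg" column family
  t._2.toSeq.foreach(m =>
    p.addColumn(Bytes.toBytes("seg"), Bytes.toBytes(m._1), Bytes.toBytes(m._2)))
  (t._1, p)
}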
21.
Spark Streaming:
• DStream, RDD
• Spark, Spark Streaming
http://spark.apache.org/docs/1.5.2/streaming-programming-guide.html
22.
Spark Streaming:
• Fault Tolerance
• Micro-batch
• YARN
• YARN Dynamic Resource Allocation
23.
Spark Streaming:
• RDD → RDD, DStream → DStream
• 1 micro-batch

// RDD → RDD
val input: RDD[String] = sparkContext.makeRDD(Seq("a", "b", "c"))

// DStream → DStream
val queue = scala.collection.mutable.Queue(input)
val dstream: DStream[String] = sparkStreamingContext.queueStream(queue)
24.
Spark Streaming:
• spark-testing-base
• https://github.com/holdenk/spark-testing-base

class JsonElementCountTest extends StreamingSuiteBase {
  test("simple") {
    // Each inner List is the content of one micro-batch
    val input = List(List("aa"), List("bb"))
    val expected = List(List("AA"), List("BB"))
    testOperation[String, String](
      input, converterMethod _, expected, useSet = true)
  }
}
25.
Spark Streaming:
• Window-based micro-batch
• o.a.spark.streaming.util.ManualClock
• private class, Scala
• http://mkuthan.github.io/blog/2015/03/01/spark-unit-testing/
26.
Spark Streaming:
• Scala, Java
• Spark Streaming, Kafka, HBase, Scala
• Java

// api/java/JavaRDD.scala
object JavaRDD {
  implicit def fromRDD[T: ClassTag](rdd: RDD[T]): JavaRDD[T] = new JavaRDD[T](rdd)
  implicit def toRDD[T](rdd: JavaRDD[T]): RDD[T] = rdd.rdd
}
27.
• Spark Streaming
• MLlib
• GraphX