Copyright(C) 2019 KDDI Research, Inc. All Rights Reserved.
Kafka Streams vs. Spark
~ How close can Kafka Streams get to Spark? ~
KDDI Research, Inc.
Connected Network Division
Yuta Morisawa (森澤 雄太)
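Company and self-introduction
◼ KDDI Research, Inc.
⚫ A subsidiary of KDDI
• Research and development in line with the parent company's business policy
• "Creating innovation for the 5G era", "Integrating telecommunications and life design", "Utilizing big data", "Further expanding global business", "Expanding financial business", "Growing as a group", "Sustainability"
⚫ Connected Network Division
• Networks, connected cars, autonomous driving, remote driving, IoT, operations automation, and more
◼ About me
⚫ Big data platforms and stream data, remote driving, edge computing, GPUs
⚫ Did a little English proofreading of the Apache Flink documentation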
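Goal: define the architecture of an IoT data integration platform
[Diagram: cars, security cameras, and sensors → data platform → data → great services]
Service platform built on IoT data: safety, infotainment, agent, car-life support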
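Goal: define the architecture of an IoT data integration platform
[Diagram: cars, security cameras, and sensors → data platform → data → ETL → service logic of great services]
Every great service is preceded by ETL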
Idea
[Diagram: the ETL we developed and a new ETL]
Could Kafka Streams replace the ETL?
But would its performance hold up?
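Performance benchmark
[Diagram: Device1, Device2, Device3 → REST → Proxy → Kafka → Hadoop (information extraction, type conversion, data granularity adjustment, forwarding to the next stage) → Kafka → Sink; a "comparison scope" label marks the part being compared]
・Three servers (Devices) send binary data over HTTP (120 Mbps and up)
・Kafka-REST (Proxy) converts the data into Kafka messages
・The Hadoop cluster performs information extraction, type conversion, and granularity adjustment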
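The three ETL steps named on this slide (information extraction, type conversion, granularity adjustment) can be pictured with a small, library-free Python sketch. It is only an illustration under stated assumptions: the slides do not describe the binary record layout or what "granularity adjustment" means in practice, so the `RECORD` layout, the field names, and the per-second averaging in `coarsen` are all hypothetical. The actual implementations in the talk used Spark Structured Streaming and Kafka Streams, not this code.

```python
import struct
from collections import defaultdict

# Hypothetical binary layout (not from the slides): big-endian
# device id (uint16), timestamp in ms (uint64), reading (float32).
RECORD = struct.Struct(">HQf")


def extract(payload: bytes) -> tuple:
    """Information extraction: pull the raw fields out of the bytes."""
    return RECORD.unpack(payload)


def convert(fields: tuple) -> dict:
    """Type conversion: turn the raw tuple into a typed, named record."""
    device_id, ts_ms, value = fields
    return {"device": f"Device{device_id}", "ts_ms": int(ts_ms), "value": float(value)}


def coarsen(records, bucket_ms=1000):
    """Granularity adjustment (assumed meaning): average readings
    per device within each bucket_ms-wide time bucket."""
    buckets = defaultdict(list)
    for r in records:
        key = (r["device"], r["ts_ms"] // bucket_ms * bucket_ms)
        buckets[key].append(r["value"])
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Three made-up payloads: two from device 1, one from device 2.
payloads = [
    RECORD.pack(1, 1000, 2.0),
    RECORD.pack(1, 1500, 4.0),
    RECORD.pack(2, 1200, 1.0),
]
result = coarsen(convert(extract(p)) for p in payloads)
```

Each record is handled independently until the final bucketing step, which is what makes this kind of ETL easy to spread across partitions.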
Environment
◼ Versions
⚫ Hadoop 2.9.2
⚫ Spark 2.4.3
⚫ Kafka 2.2.0
◼ Allocated resources
⚫ Master memory: 2 GB
⚫ Worker memory: 3 GB
⚫ CPU: 1 core per node
⚫ 5 nodes
◼ Evaluation method
⚫ Processing latency is calculated from the Kafka timestamps of the input and output
◼ Implementation
⚫ Spark
• Structured Streaming
⚫ Kafka Streams
• Scala
• Streams DSL (the Processor API is used only to obtain timestamps)
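The evaluation method above (latency derived from input and output Kafka timestamps) amounts to a per-record subtraction. The following Python sketch shows the arithmetic only. How the talk matched input records to output records, and whether it reported a mean or some other statistic, is not stated in the slides, so the record ids, the key-based matching, and the use of a mean are assumptions made for illustration.

```python
from statistics import mean


def processing_latency_ms(input_ts, output_ts):
    """Per-record latency: output timestamp minus input timestamp,
    for records that appear on both sides (matched by a hypothetical id)."""
    return {k: output_ts[k] - input_ts[k] for k in input_ts if k in output_ts}


# Made-up timestamps in milliseconds, keyed by a made-up record id.
input_ts = {"r1": 1000, "r2": 1010, "r3": 1020}
output_ts = {"r1": 1250, "r2": 1310}  # r3 has not reached the output yet

latencies = processing_latency_ms(input_ts, output_ts)
average = mean(latencies.values())
```

Because both timestamps come from Kafka, a measurement of this kind depends on how the timestamps are assigned at each end, which is worth checking when reproducing the comparison.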
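Latency comparison
[Figure: latency in ms (axis from 0 to 3500) versus number of nodes (1, 3, 5) for Kafka Streams and Spark]
Kafka Streams has lower latency
The gap widens as the number of nodes decreases, which suggests that per-node processing performance is also higher than Spark's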
Kafka Streams vs. Spark
~ How close can Kafka Streams get to Spark? ~
It got so close that it overtook Spark!
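Discussion and conclusions
◼ Discussion
⚫ Spark is a distributed platform: it only shows its strength in large-scale environments with memory-intensive processing
⚫ The ETL in this test was embarrassingly parallel and ran on a small cluster, so Spark's strengths could not be put to use
◼ Conclusions
⚫ Kafka Streams is very useful for ETL (perhaps)
• Because its overhead is small
• Fault tolerance and similar aspects were not evaluated
⚫ Simple and convenient
• It covers things that are tedious to build yourself, such as fault tolerance and scalability
Discussion of the implementation and "that can't be right!" objections are welcome!
Let's talk one on one!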