1
Utilizing Spark Streaming
to analyze real-time
sports data feeds
Demonstration
2
4.5 Trillion
Frames per second
60 Frames
Visible to the human eye
3
Camera Tracking Systems
An array of cameras around the
field capture the players and ball
positions LIVE
4
So what?
• Cool use case and all, but what's the value?
• Real-time streams from robotic manufacturing (Audi, Ford, BMW, Toyota)
• Real-time traffic analysis for Smart Cities / Theme Parks (Denver, Cincinnati,
London, Disney, Universal)
• Real-time mechanical data from devices (Aircraft - Air France, Windmills – GE)
• And before you discount this whole sports thing
• UK tax office collects 1.3B pounds ~2B USD in taxes each year from EPL teams
• Greater than the GDP of the bottom 25% of all countries
• 95 billion dollars wagered annually on NFL and college football
• #1 on Forbes 2000 list by a lot…
5
6
7
What version do you need to solve the problem?
8
Flow
9
Raw vs Encoded
150 Mbps at 4K per camera
Stadiums have 20-30 cameras on average
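As a rough back-of-the-envelope check on the figures above (150 Mbps per 4K camera, 20-30 cameras per stadium), the aggregate encoded video rate can be estimated as below. This is only a sketch: the helper names are illustrative, and the numbers are the ones quoted on the slide, not measurements.

```python
# Back-of-the-envelope estimate of encoded video bandwidth for one stadium.
# The figures come from the slide; the helper names are purely illustrative.

MBPS_PER_CAMERA = 150  # encoded 4K stream, megabits per second


def stadium_bandwidth_gbps(cameras: int, mbps_per_camera: float = MBPS_PER_CAMERA) -> float:
    """Return the aggregate bit rate in gigabits per second."""
    return cameras * mbps_per_camera / 1000


def gigabytes_per_hour(gbps: float) -> float:
    """Convert a sustained bit rate (Gbps) into gigabytes per hour."""
    return gbps / 8 * 3600


for cameras in (20, 30):
    rate = stadium_bandwidth_gbps(cameras)
    print(f"{cameras} cameras: {rate:.1f} Gbps, about {gigabytes_per_hour(rate):.0f} GB per hour")
```

Under these assumptions a stadium produces roughly 3 to 4.5 Gbps of encoded video, which is why the raw feed is reduced to a compact description before it leaves the venue.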
10
From Seen To Described
Gigabytes of video data become KB/MB of description data
Most applications that do the conversion are proprietary,
but the usual suspects are investing in this space
11
Phone home?
Data tends to be JSON or XML
ONVIF standard for security
Messaging vs Web services?
12
Where does it reside?
13
©2015 Talend Inc
14
15
16
Our goal:
aggregate the speed and distance run of each player IN REAL TIME
17
Challenges
• The camera array sends a feed of 25 frames per second
• Each frame captures the x,y,z coordinates of every player
• A live feed of sport data is actually pretty serious Big Data!
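The goal and feed described above (25 frames per second, x,y,z per player) boil down to summing the distance between consecutive positions and dividing by elapsed time. The sketch below shows that arithmetic in plain Python rather than in Spark or Talend. The frame layout (a dict of player id to position in metres) and the function name are assumptions made for illustration, not the format of the actual feed.

```python
import math
from collections import defaultdict

FRAME_RATE = 25                  # frames per second, as stated on the slide
FRAME_INTERVAL = 1 / FRAME_RATE  # seconds between consecutive frames


def aggregate(frames):
    """Accumulate distance run and average speed per player.

    `frames` is an iterable of dicts mapping a player id to an (x, y, z)
    position in metres, one dict per frame. This layout is hypothetical.
    """
    last_pos = {}
    distance = defaultdict(float)
    steps = defaultdict(int)  # number of frame-to-frame moves seen per player

    for frame in frames:
        for player, pos in frame.items():
            if player in last_pos:
                # Straight-line distance covered since the previous frame.
                distance[player] += math.dist(last_pos[player], pos)
                steps[player] += 1
            last_pos[player] = pos

    return {
        player: {
            "distance_m": distance[player],
            "avg_speed_mps": distance[player] / (steps[player] * FRAME_INTERVAL),
        }
        for player in distance
    }


# Three frames of one player moving 0.2 m per frame along x.
sample = [{"p7": (0.0, 0.0, 0.0)}, {"p7": (0.2, 0.0, 0.0)}, {"p7": (0.4, 0.0, 0.0)}]
for player, stats in aggregate(sample).items():
    print(f"{player}: {stats['distance_m']:.2f} m at {stats['avg_speed_mps']:.2f} m/s")
```

In a streaming job the same per-player state (last position, running distance) would be kept between batches instead of in local dictionaries.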
18
Analytics Architecture
Database
Ingestion Process Store Visualize Deliver
ALL designed in Talend – NO coding
19
Kafka Background
Distributed Streaming Platform
• It lets you publish and subscribe to
streams of records. In this respect it
is similar to a message queue or
enterprise messaging system.
• It lets you store streams of records in
a fault-tolerant way.
• It lets you process streams of records
as they occur.
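To make the publish/subscribe and store ideas above concrete, here is a toy in-memory log. It is not Kafka and does not use any Kafka client API. The class and method names are invented for illustration, and it has no replication, so it shows only the append-only log and per-subscriber offsets, not fault tolerance.

```python
from collections import defaultdict


class TinyLog:
    """In-memory stand-in for a single topic: an append-only list of records."""

    def __init__(self):
        self._records = []
        self._offsets = defaultdict(int)  # next offset to read, per subscriber

    def publish(self, record):
        """Append a record and return its offset in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, subscriber):
        """Return records this subscriber has not yet seen, then advance its offset."""
        start = self._offsets[subscriber]
        batch = self._records[start:]
        self._offsets[subscriber] = len(self._records)
        return batch

    def replay(self, from_offset=0):
        """Re-read stored records from an offset; reading does not delete them."""
        return self._records[from_offset:]


log = TinyLog()
log.publish({"player": "p7", "x": 10.0, "y": 4.0, "z": 0.0})
log.publish({"player": "p9", "x": 22.5, "y": 8.0, "z": 0.0})

print(log.poll("speed-job"))   # first subscriber reads both records
print(log.poll("speed-job"))   # nothing new since the last poll
print(log.poll("archive-job")) # a second subscriber reads independently
```

Unlike a classic queue, each subscriber keeps its own position and the records stay in the log, so a new consumer can start from the beginning.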
20
Spark Background
• Fast and general engine for large-scale data processing
• Developed in response to processing limitations with MapReduce
• Up to 10x faster than MapReduce on disk
• Up to 100x faster than MapReduce in memory
• Has a stack of libraries including Spark Streaming & MLlib (machine learning)
• Runs everywhere: on Hadoop or standalone
21
22
Next Step: From Analysis to Prediction
Team stats
Who is the most likely to score
next?
Which team is going to win?
Individual players stats
Which players need a rest/bench?
Which players are being traded?
(bring in historical data)
23
Free Trial: Talend Big Data Sandbox
• A ready-to-run Docker environment
• A step-by-step expert guide: the cookbook
• Real-world scenarios using Spark, Kafka,
MapReduce & NoSQL
• IoT Analytics
• Real-time Recommendation
• Clickstream Analysis
• Weblogs Analysis
• EDW Offload
www.talend.com/BigDataSandbox
Hit the Easy Button for Hadoop, Spark and Machine Learning
24
The NEW Talend Community
• An active community
• 80,000 visitors/week
• 3M total downloads
• Engaged members
• Individual members &
partners
• Active User Groups
• 1,000+ components built by
the community
25
Talend Data Masters Awards
• Share your Talend story &
win $1,500 for your
favorite charity
• Deadline: July 28th
• https://info.talend.com/datamasters2017all.html

Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Search and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 

Integrating Real-Time Video Data Streams with Spark and Kafka

  • 1. Utilizing Spark Streaming for analyzing real-time sport data feeds. Demonstration
  • 2. 4.5 trillion frames per second; 60 frames visible to the human eye
  • 3. Camera Tracking Systems: An array of cameras around the field captures the players' and ball positions LIVE
  • 4. So what? • Cool use case and all, but what's the value? • Real-time streams from robotic manufacturing (Audi, Ford, BMW, Toyota) • Real-time traffic analysis for Smart Cities / Theme Parks (Denver, Cincinnati, London, Disney, Universal) • Real-time mechanical data from devices (Aircraft - Air France, Windmills – GE) • And before you discount this whole sports thing • UK tax office collects 1.3B pounds (~2B USD) in taxes each year from EPL teams • Greater than the GDP of the bottom 25% of all countries • 95 billion dollars wagered annually on NFL and college football • #1 on Forbes 2000 list by a lot…
  • 7. What version do you need to solve the problem?
  • 9. Raw vs Encoded: 150 Mbps at 4K per camera. Stadiums have on average 20-30 cameras
  • 10. From Seen To Described: Gigs of video data to KB/MB of description data. Most applications that convert are proprietary, but we are seeing investment in the space by the usual suspects
  • 11. Phone home? Data tends to be JSON or XML. ONVIF standard for security. Messaging vs web services?
  • 12. Where does it reside?
  • 16. Our goal: aggregate the speed and distance run of each player IN REAL TIME
  • 17. Challenges • The camera array sends a feed of 25 frames per second • Each frame captures the x,y,z coordinates of every player • A live feed of sport data is actually pretty serious Big Data!
  • 18. Analytics Architecture: Database, Ingestion, Process, Store, Visualize, Deliver. ALL designed in Talend – NO coding
  • 19. Kafka Background: Distributed Streaming Platform • It lets you publish and subscribe to streams of records. In this respect it is similar to a message queue or enterprise messaging system. • It lets you store streams of records in a fault-tolerant way. • It lets you process streams of records as they occur.
  • 20. Spark Background • Fast and general engine for large-scale data processing • Developed in response to processing limitations with MapReduce • 10x faster than MapReduce on disk • 100x faster than MapReduce in memory • Has a stack of libraries including Spark Streaming & MLlib (machine learning) • Runs everywhere; on Hadoop or Standalone
  • 22. Next Step: From Analysis to Prediction. Team stats: Who is most likely to score next? Which team is going to win? Individual player stats: Which players need a rest/bench? Which players are being traded (bring in historical data)?
  • 23. Free Trial: Talend Big Data Sandbox • A ready-to-run Docker environment • A step-by-step expert guide: the cookbook • Real-world scenarios using Spark, Kafka, MapReduce & NoSQL • IoT Analytics • Real-time Recommendation • Clickstream Analysis • Weblogs Analysis • EDW Offload www.talend.com/BigDataSandbox Hit the Easy Button for Hadoop, Spark and Machine Learning
  • 24. The NEW Talend Community • An active community • 80,000 visitors/week • 3M total downloads • Engaged members • Individual members & partners • Active User Groups • 1,000+ components built by the community
  • 25. Talend Data Masters Awards • Share your Talend story & win $1,500 for your favorite charity • Deadline: July 28th • https://info.talend.com/datamasters2017all.html
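
The goal on slide 16 (speed and distance run per player) combined with the feed described on slide 17 (25 frames per second, x,y,z per player) can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not the Talend/Spark job shown in the demo: the frame layout (a dict of player id to coordinates), the metre units, and the function name `aggregate_player_stats` are all assumptions made for the example.

```python
import math
from collections import defaultdict

FPS = 25  # frame rate stated on slide 17


def aggregate_player_stats(frames, fps=FPS):
    """Sum distance per player across consecutive frames.

    `frames` is a hypothetical layout: an iterable of dicts mapping
    player id -> (x, y, z), assumed to be in metres and in time order.
    """
    last_pos = {}
    distance = defaultdict(float)
    steps = defaultdict(int)
    for frame in frames:
        for player, pos in frame.items():
            if player in last_pos:
                # Straight-line distance covered since the previous frame
                distance[player] += math.dist(last_pos[player], pos)
                steps[player] += 1
            last_pos[player] = pos
    # Each step lasts 1/fps seconds, so average speed = distance * fps / steps
    return {
        player: {
            "distance_m": dist,
            "avg_speed_mps": dist * fps / steps[player],
        }
        for player, dist in distance.items()
    }


if __name__ == "__main__":
    # One player moving 0.25 m per frame along x for one second (26 frames)
    demo = [{"p1": (0.25 * i, 0.0, 0.0)} for i in range(26)]
    print(aggregate_player_stats(demo))
```

In a streaming setting the same running totals would be kept as per-player state and updated for each micro-batch rather than computed over a complete list.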

Editor's Notes

  1. More often than not, most data people analyze today is volatile: it comes and goes, is analyzed and gone. The idea was that you needed to download Twitter to do anything of value with social analytics, but that’s not true… there’s an API for that. Data analytics is important to every organization, no matter the size, so “big” is different for everyone and that doesn’t… Velocity and variety of the data. Who here is a sports fan? Big fantasy league players here? Big data is an interesting marketing…
  2. The 4.5 trillion frames per second is the FASTEST slow motion camera to date. It is used to capture the moments leading up to, during and after a chemical reaction… not something we’d need for a goal line review, but it certainly exemplifies the big data challenge we are presenting. If you were to manually watch this, it would take you hundreds of thousands of years to process… hope you didn’t have plans.
  3. NFL Zebra: RFIDs in jerseys (force impact, speed, concussion rates). NBA: you’d think they could keep the traveling down to a minimum. Goal line technology.
  4. There is a lot of value in the data that is created behind this… influence it by even a small fraction and we’re talking about millions.
  5. Now we’re going to break this challenge up into two sections, the first will cover all aspects of the image collection and video processing, the second covers the analytics
  6. The first question that needs to be asked when architecting a solution for processing video and image data is: what do I need to solve the problem? A lot of architectural decisions will depend on the answer. Is the challenge to identify that what I am seeing is a car? Do I need to know what color it is? Or what the model is? Or in the case of video, can I tell the difference between one car and another? Perhaps I am just getting a general flow of traffic on a highway, or am I trying to identify the market share of one of my competitors by identifying the ratio of my car brands vs theirs within a given area?
  7. Almost all video and image processing pipelines look like this. We’re capturing the raw video format and then compressing / encoding. Next we process the video to extract relevant metadata and then pass that information further downstream to our analytical process. There are a lot of questions as to where and when to do certain steps and we’ll walk through them in the following slides.
  8. This makes a very strong argument for processing and handling it as locally as possible to work with that high bandwidth. 18.88 Mbps in most urban areas, with even higher speeds available for a premium. The FCC recently found that 39% of rural populations lack target levels of speed: 25 Mbps for downloads and 3 Mbps for uploads. This impacts things like smart farming or smart agriculture. Some HD video cameras output uncompressed video, whereas others compress the video using a lossy compression method such as MPEG or H.264. H.265 (HEVC) is also picking up. HEVC was developed with the goal of providing twice the compression efficiency of the previous standard, H.264/AVC. At an identical level of visual quality, HEVC enables video to be compressed to a file that is about half the size (or half the bit rate) of AVC. When compressed to the same file size or bit rate as AVC, HEVC delivers significantly better visual quality.
  9. NFL stadiums tend to have hundreds to a thousand servers within the stadium devoted to encoding and metadata processing. The usual suspects: Amazon, Google, Microsoft, IBM… just to name a few. While a lot of the camera hardware vendors will provide this processing capability, I did a check and there are some 30+ available APIs out there to handle the video processing. This is likely the most complex and use-case-specific process, and I have yet to find a one-size-fits-all API.
  10. This makes a very strong argument for processing and handling it as locally as possible to work with that high bandwidth. But as discussed, as work continues in codec compression and infrastructure improves upload bandwidth, we might get to the point where this discussion becomes moot. In short, the better we get at lossless compression, the more flexible we can be in this step… Where’s Pied Piper when you need them? So with that in mind I’d like to show you how you could build a process like this. We’re going to take the Google Vision API for a little spin: I am going to gather you up and we’re going to take a picture that I’ll post on Twitter and pull down using Talend to analyze with the Google Vision API. It will spit out some interesting results and hopefully recognize you all as people and see your faces.
  11. So we just covered how to architect something to handle video processing and discussed some of the trade-offs for locality of service, finishing off with a demo highlighting some of the work cloud-based companies like Google are doing to democratize the video and image metadata gathering process.
  12. So now let’s focus on the analytical side. Where we left off from the video processing architecture was that the video data had been converted into a metadata representation. We’re going to want to work with that in a more general analytical setting.
  13. So going back to our conversation earlier about sports analytics and the gobs of money it brings in, we see coaches, analysts, even the average sports viewer looking for insight into their favorite players, looking for ways to optimize their strategy to improve success. In the case we have here, which is focused on data collected from the EPL, players are often running all over the place, and identifying when they are getting tired can be important intel for both teams. When you have players playing well into their 40s you want to make sure one of them isn’t going to break a hip or something… The NFL is doing similar fact-finding with regard to force impact analysis. With so much attention on concussion rates and effects, you bet everyone is making sure they keep their 120 million franchise player safe and healthy.
  14. Here’s just an example of what is in the JSON information we receive, while it’s not the 4.5 trillion frames per second…
  15. Consistent growth: 1,500 members in the new Community.Talend.com. INTERNAL ONLY: 3M total downloads of Talend software to date since the company was founded (includes TOS + evals). In 2016, we had 360,000 total downloads, up 14% since 2015 (total downloads include TOS + evals). Engaged members: Our community members are “strategic partners” in solving data challenges, not just Talend challenges. Talend Advocates: Small-to-medium SIs and VARs are some of the greatest Talend champions in the community. They share their technical expertise, and by sharing their knowledge they get visibility and find new customers. Thought Leaders: We’re about to launch a new Discussion Board about IoT/Smart Cities. By comparison, competitors use their forum for product support only. The health of a community is measured by the engagement, not just growth. User Groups: Not only do we have community members that actively respond to questions on the forum, we also have customers who are creating and managing User Groups around the world (US, UK, Germany, France, Belgium, Switzerland, and India). Our User Groups in Portland, Maine, and Vancouver, Canada were launched by customers, and so were many others. The Community Team is launching one NEW user group per quarter. In 2017, we plan to have new user groups in Chicago, Dallas, Toronto, and Atlanta. Vancouver was launched in Q1. Every day, we have about 400 concurrent online users. Monetization: Both Talend and the Talend partners know how to monetize the community. Talend has been converting open source customers (e.g. Judicial Court of California, Mogo Finance Technology) from Open Studio to the commercial version, Talend Data Integration. And partners who are active in the community are finding new business (some of the most active members are SI partners).
  16. Criteria: creativity and uniqueness of use; scope and complexity of project; business transformation and improvement. Timeline: We are accepting entries until July 28, 2017. Hurry and send your entries now! Winners will be notified in September. Winners will be announced in November. Eligibility requirements: Award winners should be willing to have their story shared publicly on the Talend web site (company logo, video and case study) and promoted on social media and in press announcements.
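
The argument in the notes for processing video as locally as possible rests on simple arithmetic, sketched below using only figures quoted in the deck: 150 Mbps per 4K camera, 20-30 cameras per stadium, 18.88 Mbps in urban areas, and the 3 Mbps rural upload target. The helper names are hypothetical, and treating the 18.88 Mbps figure as directly comparable to a camera feed is an assumption made for illustration, since the notes do not say whether it is an upload or a download speed.

```python
import math

# Figures quoted in the slides and notes
CAMERA_MBPS = 150                 # per 4K camera
CAMERAS_LOW, CAMERAS_HIGH = 20, 30
URBAN_MBPS = 18.88                # urban connection speed quoted in the notes
RURAL_UPLOAD_TARGET_MBPS = 3      # FCC upload target quoted in the notes


def stadium_feed_mbps(cameras, per_camera_mbps=CAMERA_MBPS):
    """Total bit rate if every camera feed left the stadium unprocessed."""
    return cameras * per_camera_mbps


def links_needed(feed_mbps, link_mbps):
    """How many links of a given speed it would take to carry the feed."""
    return math.ceil(feed_mbps / link_mbps)


if __name__ == "__main__":
    for cameras in (CAMERAS_LOW, CAMERAS_HIGH):
        feed = stadium_feed_mbps(cameras)
        print(cameras, "cameras:", feed, "Mbps,",
              links_needed(feed, URBAN_MBPS), "urban links,",
              links_needed(feed, RURAL_UPLOAD_TARGET_MBPS), "rural-target uplinks")
```

Even at the low end the combined feed is 3,000 Mbps, which is why the notes favour reducing video to KB/MB of description data before anything is sent downstream.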