Data to Drive Decision-Making - CaliStream Meetup
Jerome Boulon
Data to Drive Decision-Making
Data to Drive Decision-Making - CaliStream Meetup
1.
Take Control of Your Data: CaliStream.com © CaliStream.com 2015
Data to Drive Decision-Making
2015-03-03
Jerome Boulon
2.
Quick History
Timeline years: 1999, 2008, 2009, 2010, 2012
• Yahoo!: Chukwa, Hadoop Monitoring Solution
• Netflix: Honu, Data Collection Pipeline
• CaliStream: Founder; Honu: Data As a Service
• Monitoring Solution for cable modems/TV network
• Ontology Search, acquired by Microsoft
3.
Agenda
• Agenda
• Decision-Making, the process
• Netflix Recommendation
• Big Data @ Riot Games
• Data Pipeline
• Conclusion
4.
Decision-Making, The Process
• Explicit
• Theory Driven
• Data-driven
• Measurable Outcomes
• Iterative
Prove a hypothesis right (or wrong)
Want result AND explanation
5.
Iterative
[Diagram: cycle of Hypothesis, Gather Data, Analyze Data]
6.
Offline/Online Testing
[Diagram: Offline Testing, Online A/B Testing, Roll-out To Prod; transitions labeled Fail, Success, Success, with Iterations loops]
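The online A/B testing step in the flow above needs some rule for calling a variant a success or a failure. The slides do not say which test was used; the following is a minimal sketch assuming a two-sided two-proportion z-test on conversion counts. The function names and the 5% significance level are assumptions made for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a is the control group, conv_b / n_b is the variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No variation at all (everyone or no one converted): no evidence.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value of the standard normal, written with erfc.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def ab_decision(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Return "success" only if the variant is significantly better."""
    z, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
    if p_value < alpha and z > 0:
        return "success"
    return "fail"
```

A "fail" here sends the idea back into another iteration; only a "success" would move on to roll-out.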
7.
Netflix Recommendation
8.
Netflix Recommendation
Presentation 1: 1/3, 0/3, 0+/3, 1+/3
Presentation 2: 1/3, 1/3, 0+/3, 2+/3
9.
The Data
[Diagram: Offline Testing, Online A/B Testing, Roll-out To Prod; transitions labeled Fail, Success, Success]
• Search
• Time
• Rating
• Impression
• Demographic
• Social
• User Behavior
• Geo Information
10.
The Gaming Space
11.
Use Case: Honu @
• Error code
• AVG Ping latency
• AVG Queue Wait time
• Client Version
• Operating System
• Hardware Profile
• Etc
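The fields listed above can be pictured as one event record per client report. The slides do not show the actual event format, so this is only a sketch: the class name, field names, types, and units (milliseconds, seconds) are assumptions.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ClientEvent:
    # All names and units below are hypothetical, chosen to mirror the slide.
    error_code: int
    avg_ping_latency_ms: float
    avg_queue_wait_s: float
    client_version: str
    operating_system: str
    hardware_profile: str


def to_json(event):
    """Serialize one event to a JSON string with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)
```

For example, `to_json(ClientEvent(0, 42.5, 12.0, "1.2.3", "Windows 7", "mid-range"))` yields a single JSON object that a collector could accept as one event.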
12.
The Data Pipeline
13.
Data Pipeline
• Direct correlation to the success of your Big Data project
• Structured and Multi-Structured Events
• Schema evolution
• Collecting massive amounts of data live
• 60% of BI project resources are consumed here!
• Most “underestimated” and “unsexy” but MOST important phase
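Schema evolution, one of the points above, means that events written under older or newer versions of a schema still have to be readable. The slides do not describe how Honu or CaliStream handles this; the sketch below shows one common approach, assuming dictionary-shaped events, defaults for missing fields, and an "extras" bucket for fields the reader does not know yet. The schema and field names are hypothetical.

```python
# Hypothetical current schema: field name -> default used when an
# older event does not carry that field.
CURRENT_SCHEMA = {
    "event_type": "unknown",
    "timestamp": 0,
    "client_version": "",  # assumed to have been added in a later version
}


def upgrade_event(event, schema=CURRENT_SCHEMA):
    """Return a copy of `event` that conforms to `schema`.

    Missing fields get the schema default; fields the schema does not
    know are preserved under "extras" instead of being dropped.
    """
    upgraded = {name: event.get(name, default) for name, default in schema.items()}
    extras = {key: value for key, value in event.items() if key not in schema}
    if extras:
        upgraded["extras"] = extras
    return upgraded
```

Keeping unknown fields rather than discarding them lets producers add fields before every consumer has been updated.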
14.
HONU
[Diagram: sources (Apps, Sensor Data, Click Stream, Location, …), HONU, S3]
• Built for Netflix scale
• 100+ billion events/day
• Automatic Discovery, Load-balancing, Fail-over
• Schema-less
• …
15.
Collectors
[Diagram: sources (Apps, Sensor Data, Click Stream, Location, …), Collect, Collect, Collect, S3]
• Team
• Event format
• Protocol
• Discovery
• Load balancing
• Fail-Over
• Kinesis/Kafka/Scribe …
• …
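Load balancing and fail-over across collectors, two of the items listed above, can be sketched on the client side. This is not Honu's implementation, which the slides do not show; it assumes a fixed list of collector addresses and an injected `send` callable that raises `ConnectionError` when a collector is unreachable. All names are hypothetical.

```python
class CollectorClient:
    """Round-robin over collectors, skipping the ones that fail."""

    def __init__(self, collectors, send):
        # collectors: list of collector addresses (hypothetical values).
        # send: callable(address, event) raising ConnectionError on failure.
        self._collectors = list(collectors)
        self._send = send
        self._next = 0

    def publish(self, event):
        """Try each collector at most once; return the one that accepted."""
        count = len(self._collectors)
        for attempt in range(count):
            address = self._collectors[(self._next + attempt) % count]
            try:
                self._send(address, event)
            except ConnectionError:
                continue  # fail over to the next collector in the ring
            # Start the next publish at the collector after this one.
            self._next = (self._next + attempt + 1) % count
            return address
        raise ConnectionError("no collector reachable")
```

Injecting `send` keeps the sketch independent of any transport; discovery (how the list of collectors is obtained) is left out entirely.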
16.
[Diagram: sources (Apps, Sensor Data, Click Stream, Location, …), Collect, Collect, Collect, Collectors, S3]
• Hadoop Team
• Hadoop knowledge
• Hive
• Schema evolution
• Upgrades
• Files optimization
• …
17.
[Diagram: sources (Apps, Sensor Data, Click Stream, Location, …), Collect, Collect, Collect, Collectors, S3]
Time to market > 18 months!
18.
CaliStream
Take Control of Your Data
19.
CaliStream: Honu as a Service
[Diagram: sources (Apps, Sensor Data, Click Stream, Location, …), HONU, S3]
Time to market (Hours)
20.
Our Solution: CaliStream
CaliStream provides a SaaS data processing pipeline to easily stream large volumes of events from your applications directly to Hive/Hadoop in a robust, scalable, and cost-effective way, without any prior Hadoop knowledge.
21.
CaliStream, Native Hive integration
[Diagram: Sensor Data, Social, Click Stream, Location, logs, …; CaliStream; Big Data]
22.
CaliStream
Everything in CaliStream is represented as a Hive table, so you can easily analyze your data:
Select […] from […] where […] group by […];
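To make the query shape above concrete, here is a small runnable sketch. It uses Python's built-in SQLite module purely as a stand-in for Hive, so the dialect details may differ, and the table `client_events` with its two columns is a hypothetical example rather than anything from CaliStream.

```python
import sqlite3


def errors_by_os(events):
    """Count events with a non-zero error code per operating system.

    events: iterable of (operating_system, error_code) tuples.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE client_events (operating_system TEXT, error_code INTEGER)"
        )
        conn.executemany("INSERT INTO client_events VALUES (?, ?)", events)
        # Same select / from / where / group by shape as on the slide.
        return conn.execute(
            "SELECT operating_system, COUNT(*) AS errors "
            "FROM client_events "
            "WHERE error_code <> 0 "
            "GROUP BY operating_system "
            "ORDER BY operating_system"
        ).fetchall()
    finally:
        conn.close()
```

Because events land in tables, this kind of aggregation needs no custom processing code, only SQL.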
23.
Java API
[Diagram: CaliStream, Hive Table]
24.
Generic REST API + JSON
[Diagram: CaliStream, Hive Table]
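Sending an event over a generic REST API with a JSON body could look roughly like the sketch below. The slides do not give the endpoint, path layout, or authentication scheme, so the base URL, the `/tables/<table>/events` path, and the function name are all hypothetical; the request is built but deliberately not sent.

```python
import json
import urllib.request


def build_event_request(base_url, table, event):
    """Build (but do not send) a POST request carrying one JSON event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/tables/{table}/events",  # hypothetical path layout
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call against a real endpoint.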
25.
[Architecture diagram: HTTP REST Proxy, Collectors, BI, Analytics, Redshift, CaliStream, Echo Service, Apps, Clusters]
26.
CaliStream.com
Take Control of Your Data
Jerome Boulon
jboulon@caliStream.com
27.
Backup Slides
28.
[Architecture diagram: Beacon Servers, Collectors, Connectors, BI, Analytics, Redshift, Apps, Clusters, Your Company, Your Company, Your S3 Bucket]
* Ownership CaliStream
* Run on your account
29.
IoT Use Cases
30.
IoT Use Case 1
31.
Use Case: Global boat tracking system
• Boat information
• Telemetry data
• etc
32.
IoT Use Case 2
33.
Use case: Small company in Montreal
• Business model: Sensor data acquisition for cities worldwide (Montreal, Toronto, Paris, …)
[Diagram: Database Montreal, Database City 2, Database City n]
34.
Problems
• Data silos: each city is self-contained, managed with its own deployment
• Strict data access and security rules that limit business opportunities
• Deriving value from a larger data set is a tedious, manual task
• Multiple copies of the same data
• Etc
35.
Customer Initial Architecture
[Diagram: Sensors, Sensor Listener Service, Database]
36.
CaliStream Integration (< 1 Week)
[Diagram: Sensors, Sensor Listener Service, CaliStream Client SDK, CaliStream Events, CaliStream, Database (Table-1, Table-2, Table-n), S3 Data Warehouse, Hive]
Unified S3 Data warehouse
Big Data: The Value