a real-time bird tracker for Central Park
Eamon Kavanagh, Insight Data Engineering Fellowship
Summer 2016
Motivation & Main Problems	
•  Birds can be fast and elusive unless you know where to look
•  How do you process real-time location and trending data?
•  How do you properly handle unreliable sensor data?
•  Can you store data in a way to ensure accuracy in batch?
Hooded Warbler Yellow-rumped Warbler
Demo	
eamonkavanagh.com/bird-feed
Pipeline	
{"name": "Catbird", "family": "Thrush", "lat": …}
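A minimal sketch of how one such record might be read on the consuming side. The slide shows only the `name`, `family`, and `lat` fields and elides the rest, so the `Sighting` class, the `parse_sighting` helper, and the sample latitude value are all hypothetical and cover only those three fields.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    """Hypothetical container for the three fields visible on the slide."""
    name: str
    family: str
    lat: float


def parse_sighting(raw: str) -> Sighting:
    """Parse one JSON-encoded sighting; any other fields are ignored."""
    record = json.loads(raw)
    return Sighting(
        name=record["name"],
        family=record["family"],
        lat=float(record["lat"]),
    )


# The latitude below is a made-up placeholder, not data from the talk.
sample = parse_sighting('{"name": "Catbird", "family": "Thrush", "lat": 40.78}')
```

Fields beyond these three would need to be added to the class once the full message schema is known.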
Challenges & Solutions	
•  Managing real-time location and trending data to have
up-to-date queries
•  Properly handling out-of-order real-time data so you have a
sense of computational accuracy
•  Using very new open-source technology (cloned Flink locally
to implement a bug fix before it was officially released)
Challenges & Solutions	
•  Managing real-time location and trending data to have
up-to-date ‘near me’ queries
[Streaming Windows in Apache Flink] Retrieved June 23, 2016 link
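The slide points to Flink's streaming windows as the way to keep trending data current. The sketch below illustrates the general idea of a sliding window in plain Python rather than with the Flink API: each sighting is counted in every window that covers its timestamp. The function name, the `(timestamp, species)` event shape, and the window sizes are assumptions made for illustration, not details from the talk.

```python
from collections import Counter


def sliding_window_counts(events, size_s, slide_s):
    """Count sightings per species in each sliding window.

    events: iterable of (timestamp_seconds, species) pairs.
    Returns {window_start: Counter}, where each window covers
    the half-open interval [window_start, window_start + size_s).
    """
    windows = {}
    for ts, species in events:
        # Latest slide-aligned window start that is <= ts.
        start = (ts // slide_s) * slide_s
        # Walk backwards through every window that still covers ts.
        while start > ts - size_s:
            windows.setdefault(start, Counter())[species] += 1
            start -= slide_s
    return dict(sorted(windows.items()))


# Hypothetical sightings: 10-second windows that advance every 5 seconds.
events = [(0, "Catbird"), (5, "Catbird"), (12, "Hooded Warbler")]
counts = sliding_window_counts(events, size_s=10, slide_s=5)
```

Because the windows overlap, a single sighting contributes to `size_s / slide_s` windows, which is what lets a "trending" view update more often than the window length.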
Challenges & Solutions	
•  Properly handling out-of-order real-time data so you have a
sense of computational accuracy
[Watermarks in Apache Flink] Retrieved June 23, 2016 link
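The slide cites Flink's watermarks as the mechanism for out-of-order data. As a rough illustration of the concept only (plain Python, not the Flink API), the sketch below tracks a watermark equal to the largest event time seen minus an allowed delay, closes a tumbling window once the watermark passes its end, and sets aside events that arrive after their window has closed. The function name, the event shape, and the delay value are assumptions.

```python
from collections import defaultdict


def tumbling_counts_with_watermark(events, window_s, max_delay_s):
    """Count events per tumbling window, tolerating bounded disorder.

    events: (event_time_seconds, species) pairs in arrival order.
    Returns (fired, late): fired is a list of (window_start, count)
    in the order windows closed; late holds events that were dropped
    because their window had already closed.
    """
    open_windows = defaultdict(int)
    fired, late = [], []
    watermark = float("-inf")
    for ts, species in events:
        start = (ts // window_s) * window_s
        if start + window_s <= watermark:
            # The window for this event has already been emitted.
            late.append((ts, species))
        else:
            open_windows[start] += 1
        # Assume no event arrives more than max_delay_s out of order.
        watermark = max(watermark, ts - max_delay_s)
        for s in sorted(open_windows):
            if s + window_s <= watermark:
                fired.append((s, open_windows.pop(s)))
    # End of stream: flush whatever is still open.
    for s in sorted(open_windows):
        fired.append((s, open_windows[s]))
    return fired, late


# Hypothetical arrival order; the sighting at t=8 arrives after t=12.
stream = [(1, "Catbird"), (12, "Catbird"), (8, "Hooded Warbler"),
          (16, "Catbird"), (3, "Yellow-rumped Warbler"), (27, "Catbird")]
fired, late = tumbling_counts_with_watermark(stream, window_s=10, max_delay_s=5)
```

The allowed delay is the trade-off behind the slide's "sense of computational accuracy": a larger delay admits more stragglers into the right window but postpones every result by the same amount.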
About Me	
•  ~2 years of experience as a data scientist in ad tech
•  MSc in Applied Mathematics (University of British Columbia)
•  BSc in Pure Mathematics (McMaster University)
