Real-Time Bird Tracker for Central Park
1. a real-time bird tracker for Central Park
Eamon Kavanagh, Insight Data Engineering Fellowship
Summer 2016
2. Motivation & Main Problems
• Birds can be fast and elusive unless you know where to look
• How do you process real-time location and trending data?
• How do you properly handle unreliable sensor data?
• Can you store data in a way that ensures accurate batch results?
Pictured: Hooded Warbler, Yellow-rumped Warbler
8. Challenges & Solutions
• Managing real-time location and trending data so that
queries stay up to date
• Properly handling out-of-order real-time data so you know
how accurate the computed results are
• Using very new open-source technology (cloned Flink locally
to implement a bug fix before it was officially released)
10. Challenges & Solutions
• Managing real-time location and trending data to have
up-to-date ‘near me’ queries
[Streaming Windows in Apache Flink], retrieved June 23, 2016
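The windowing idea behind trending data can be sketched in plain Python. This is a simplified model of sliding windows, not Flink's API: the function name, the (event_time_seconds, species) tuple layout, and the 60-second window with a 30-second slide are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def sliding_window_counts(sightings, size_s, slide_s):
    """Count sightings per species in every sliding window that covers them.

    sightings: iterable of (event_time_seconds, species) pairs (hypothetical layout).
    Each window covers [start, start + size_s) and starts on a multiple of slide_s.
    Returns a dict mapping window start -> Counter of species counts.
    """
    if size_s <= 0 or slide_s <= 0:
        raise ValueError("size_s and slide_s must be positive")

    windows = defaultdict(Counter)
    for ts, species in sightings:
        # The latest window containing ts starts at the slide boundary at or before ts.
        start = ts - (ts % slide_s)
        # Step back through every earlier window that still covers ts.
        while start > ts - size_s:
            windows[start][species] += 1
            start -= slide_s
    return dict(windows)


# Illustrative data only; these are not measurements from the project.
example_sightings = [
    (5, "Hooded Warbler"),
    (35, "Hooded Warbler"),
    (40, "Yellow-rumped Warbler"),
    (65, "Hooded Warbler"),
]
example_counts = sliding_window_counts(example_sightings, size_s=60, slide_s=30)
for window_start in sorted(example_counts):
    print(window_start, dict(example_counts[window_start]))
```

Because the windows overlap, each sighting is counted in more than one window, so a query can always read the most recent window without waiting for a long non-overlapping interval to close.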
12. Challenges & Solutions
• Properly handling out-of-order real-time data so you have a
sense of computational accuracy
[Watermarks in Apache Flink], retrieved June 23, 2016
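A watermark can be pictured as a moving threshold on event time: anything older than the threshold is treated as late. The sketch below is a simplified plain-Python model of a bounded out-of-orderness watermark, not Flink's API. The function name, the (event_time_seconds, payload) layout, and the 5-second bound are assumptions for illustration.

```python
def split_by_watermark(events, max_out_of_orderness_s):
    """Separate on-time events from late ones using a simple watermark.

    events: iterable of (event_time_seconds, payload) in ARRIVAL order
            (hypothetical layout).
    The watermark trails the largest event time seen so far by
    max_out_of_orderness_s. An event is late if its event time is already
    below the watermark when it arrives.
    Returns (on_time, late) as two lists, each in arrival order.
    """
    max_seen = float("-inf")
    watermark = float("-inf")
    on_time, late = [], []

    for ts, payload in events:
        if ts < watermark:
            # Arrived after the stream had already moved past its event time.
            late.append((ts, payload))
        else:
            on_time.append((ts, payload))
        # Advance the watermark; it never moves backwards.
        max_seen = max(max_seen, ts)
        watermark = max_seen - max_out_of_orderness_s

    return on_time, late


# Illustrative arrival sequence only: the reading at t=14 shows up after t=20.
example_events = [(10, "a"), (12, "b"), (11, "c"), (20, "d"), (14, "e"), (16, "f")]
example_on_time, example_late = split_by_watermark(example_events, max_out_of_orderness_s=5)
print("on time:", example_on_time)
print("late:", example_late)
```

The bound is the trade-off knob: a larger value tolerates more disorder but delays results, while a smaller value produces results sooner and classifies more events as late. Counting the late events gives a rough sense of how accurate the on-time results are.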
13. About Me
• ~2 years of experience as a data scientist in ad tech
• MSc in Applied Mathematics (University of British Columbia)
• BSc in Pure Mathematics (McMaster University)