Closing The Loop for Evaluating Big Data Analysis


This talk was given by Karolina Alexiou at the 11th meeting, on April 7, 2014.

Big data analysis is useless (and much harder to sell) when you can't measure whether the resulting insights are correct. To develop sophisticated data analysis methodologies tailored to your particular use case, you need to be able to figure out what works and what doesn't. It is crucial to gather data independently of your analysis (ground truth), compare it to your results using appropriate metrics, and account for biases. The sheer volume of data means you also need a strategy for slicing and dicing it to isolate the really valuable parts, as well as a keen eye for visualization, so that you can quickly compare methodologies and support the validity of your insights to third parties.



1. Closing the Loop: Evaluating Big Data Analysis (Karolina Alexiou)
2. About
   The speaker
   ● ETH graduate
   ● Joined Teralytics in September 2013
   ● Data Scientist/Software Engineer
   The talk (takeaways)
   ● Point out how evaluation can improve your project
   ● Suggest concrete steps to build an evaluation framework
3. The value of evaluation
   Data analysis can be fun and exploratory, BUT:
   “If you torture the data long enough, it will confess to anything.”
   - Ronald Coase, economist
4. The value of evaluation
   Without feedback on the data analysis results (= closing the loop), I don’t know whether my fancy algorithm is better than a naive one. How to measure?
5. Strategy
   People-driven
   ● Get a 2nd opinion on your methodology
   Data-driven
   ● Get another data source to verify results (ground truth)
   ● Convert ground truth and your output to the same format
   ● Compare against a meaningful metric
   ● Store & visualize results
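
   In Python, that data-driven loop can be sketched in a few lines; every callable below is a hypothetical placeholder, not actual Teralytics code:

       def evaluate(run_analysis, fetch_ground_truth, to_common_format, metric):
           """One evaluation pass: analysis output vs. ground truth -> one score."""
           estimates = run_analysis()                     # our algorithm's output
           truth = fetch_ground_truth()                   # independent data source
           est, ref = to_common_format(estimates, truth)  # same units, same buckets
           return metric(est, ref)                        # a single comparable number
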
6. General evaluation framework
7. General evaluation framework: statistical significance?
8. Teralytics Case Study: Congestion Estimation
   Ongoing project: use of cellular data to estimate traffic/congestion on Swiss roads.
   Our estimations: mean speed on a highway at a given time and location.
9. Ground truth
   ● Complex algorithm with lots of knobs and subproblems
   ● How do we know we’re changing things for the better?
   ● Collect ground truth regarding road traffic in Switzerland -> sensor data available from a 3rd-party site
   ● Write a hackish script to log in to the website and fetch the sensor data that match our highway locations
   ● Instant sense of purpose :)
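
   A minimal sketch of such a fetcher using the requests library; the site URL, endpoints, and response fields are invented placeholders:

       import requests

       def fetch_sensor_speeds(sensor_ids, user, password):
           """Log in once, then pull speed readings for the sensors we matched."""
           session = requests.Session()
           session.post("https://sensors.example.com/login",
                        data={"user": user, "password": password})
           readings = {}
           for sensor_id in sensor_ids:
               resp = session.get("https://sensors.example.com/api/speeds/%s" % sensor_id)
               resp.raise_for_status()
               readings[sensor_id] = resp.json()  # e.g. [{"time": ..., "speed": ...}, ...]
           return readings
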
10. Same format
   Not just a data architecture problem.
   ● Our algorithm’s speed estimations are fancy averages of distance/time_needed_for_distance (journey speed)
   ● Sensor data reports instantaneous speed
   ● Sensors are probably going to report higher speeds systematically (bias)
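
   One standard explanation for that bias, from textbook traffic engineering: a journey speed behaves like the harmonic mean of the spot speeds along the way, which is never higher than their arithmetic mean. A toy illustration with made-up numbers:

       spot_speeds_kmh = [120.0, 80.0, 100.0]  # instantaneous sensor readings (toy data)

       # Time-mean speed: what naively averaging sensor readings gives you.
       time_mean = sum(spot_speeds_kmh) / len(spot_speeds_kmh)                    # 100.0

       # Space-mean (journey-like) speed: the harmonic mean of the spot speeds.
       space_mean = len(spot_speeds_kmh) / sum(1.0 / v for v in spot_speeds_kmh)  # ~97.3

       print(time_mean, space_mean)  # time-mean >= space-mean, hence the sensor bias
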
11. Comparing against a metric
   ● Group data every 3 minutes
   ● Metric: percentage of data where the difference between ground truth and estimation is <7%
   ● Other options:
     ○ linear correlation of the speed time series
     ○ cross-correlation to find the optimal time shift
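
   A pandas sketch of this step; the 3-minute bucketing and the 7% tolerance come from the slide, while the function names and the brute-force shift search are assumptions:

       import pandas as pd

       def fraction_within_tolerance(estimated, truth, tol=0.07):
           """Share of 3-minute buckets where the estimate is within tol of truth.
           Both inputs are speed Series indexed by timestamp."""
           est = estimated.resample("3min").mean()
           ref = truth.resample("3min").mean()
           est, ref = est.align(ref, join="inner")  # keep only buckets both sides have
           rel_diff = (est - ref).abs() / ref
           return float((rel_diff < tol).mean())

       def best_time_shift(est, ref, max_shift=10):
           """Brute-force the bucket shift that maximizes correlation with truth."""
           return max(range(-max_shift, max_shift + 1),
                      key=lambda s: est.shift(s).corr(ref))
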
12. Pitfalls of comparison
   ● Overfitting to the ground truth
   ● Correlation may be statistically insignificant
   Need a proper methodology (training set/testing set) & adequate amounts of ground truth.
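
   A minimal guard against overfitting, assuming the ground truth is time-indexed and the cutoff date is arbitrary: tune the knobs on one window, report results only on the held-out rest:

       import pandas as pd

       def split_ground_truth(truth, cutoff="2014-03-01"):
           """Split time-indexed ground truth into a tuning set and a holdout set."""
           cutoff = pd.Timestamp(cutoff)
           return truth[truth.index < cutoff], truth[truth.index >= cutoff]
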
13. Visualization
   ● Instant feedback on what is working and what is not
   ● Insights:
     ○ on assumptions
     ○ on the quality of data sources
     ○ on the presence of a time shift
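
   A minimal matplotlib sketch of such a view, assuming both inputs are pandas Series of speeds indexed by timestamp (all names are placeholders):

       import matplotlib.pyplot as plt

       def plot_comparison(estimated, truth, location):
           """Overlay our estimate and the sensor ground truth for one location."""
           fig, ax = plt.subplots(figsize=(10, 4))
           estimated.plot(ax=ax, label="our estimate")
           truth.plot(ax=ax, label="sensor ground truth")
           ax.set_title("Mean speed at %s" % location)
           ax.set_ylabel("speed (km/h)")
           ax.legend()
           plt.show()
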
14. Lessons learned
   Ground truth isn’t easy to get:
   ● No API -> web scraping
   ● May be biased
   ● May have to create it yourself
15. Lessons learned
   Use the right tools:
   ● The output of a big data analysis problem is of a more manageable size -> no need to overengineer; Python is a good fit for the job
   ● Need to be able to handle missing data, add constraints, average, interpolate -> use an existing library (pandas) with useful abstractions
   ● Crucial to be able to pinpoint what goes wrong -> interactivity (IPython), logging
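
   For instance, the missing-data handling alluded to above is a few lines in pandas (toy data, not a real sensor feed):

       import pandas as pd

       times = pd.date_range("2014-04-07 08:00", periods=6, freq="3min")
       speeds = pd.Series([98.0, None, 95.0, None, None, 91.0], index=times)

       filled = speeds.interpolate()          # linear fill between known readings
       capped = filled.clip(upper=120.0)      # constraint: cap at the motorway limit
       print(capped.resample("6min").mean())  # average into coarser buckets
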
16. Lessons learned
   Use the right workflow:
   ● Run the whole thing at once for timely feedback
   ● Always visualize -> large CSVs are hard to make sense of (false sense of security)
   ● Iterative development pays off & is sped up by automated evaluation :)
17. Action Points
   Ask questions:
   ● Is there some part of my data analysis where my results are unverified?
   ● Am I using the right tools to evaluate?
   ● Is overengineering getting in the way of quick & timely feedback?
18. Action Points
   Make a plan:
   ● What ground truth can I get or create?
   ● How can I make sure I am comparing apples to apples?
   ● How should I compare my data to the ground truth (metric, comparison method)?
   ● What’s the best visualization to show correlation?
19. Recommended Reading
   ● Excellent abstractions for data cleaning & transformation
   ● Good performance
   ● Portable data formats
   ● Increases productivity
   ● + IPython for easy exploration of the data (more insight, what went wrong, etc.)
   It takes some time to learn to use the full power of pandas, so get your data scientists to learn it asap. :)
20. Recommended Reading
   ● Even new companies have “legacy” code (code that is blocking change)
   ● Acknowledges the imperfection of the real world (even if the design is good, problems may arise)
   ● Acknowledges the value of quick feedback in dev productivity
   ● Case-by-case scenarios to unblock yourself and be able to evaluate your code
21. Recommended Reading
22. Thanks
   I would like to thank my colleagues for making good decisions, in particular:
   ● Valentin, for introducing pandas to Teralytics
   ● Nima, for organizing the collection of ground truth on several projects
   ● Laurent, for insisting on testing & best practices
23. Questions?
   We are hiring :) Looking for Machine Learning/Big Data experts. Experience with pandas is a plus. Just send your CV to
24. Bonus Recommended Reading
   Evaluating the impact of charity organizations is a hard, unsolved problem involving data:
   ● transparency
   ● more motivation to give