Big Data, We Have a Communication Problem

by Daniel Tunkelang

Presented on April 30, 2013 at the TTI/Vanguard Conference on Ginormous Systems
http://www.ttivanguard.com/conference/2013/ginormous.html

It's a cliché that we live in a world of Big Data. But the bottleneck in understanding data is not computational. Rather, the biggest challenge is designing technical solutions that effectively leverage human cognitive ability. Data analysis systems should augment people's capabilities rather than replace them. This argument is as old as computer science itself: in 1962, Doug Engelbart said that the goal of technology is “the enhancement of human intellect by increasing the capability of a human to approach a complex problem situation.” Algorithms extract signal from raw data, but people fill in the gaps, creating models and evaluating analyses.

Empowering people to understand data is not just a surface problem of building better interfaces and visualizations. We need to interact with data not only after performing computational analysis, but throughout the analysis process, so that we can improve our models and algorithms. To do so, we need tools and processes specifically designed to offer people transparency, guidance, and control.

Human-computer information retrieval has been revolutionizing our approach to information seeking -- no modern search engine limits users to black-box relevance ranking and ten blue links. We need to take similar steps in our analysis of big data, making people the center of the analysis process and developing the technical innovations that enable people to fulfill this role.

Published in: Technology, Education


  1. BIG DATA, We have a communication problem. GINORMOUS SYSTEMS, April 30–May 1, 2013, Washington, D.C. Daniel Tunkelang, Head of Query Understanding, LinkedIn
  2. BIG DATA IS EVERYWHERE
  3. BIG DATA POWERS EVERYTHING
  4. DATA SCIENTISTS WORRY ABOUT VOLUME, VELOCITY, VARIETY, …
  5. BUT THE BOTTLENECK ISN’T COMPUTATIONAL. IT’S COGNITIVE.
  6. TOOLS AUGMENT HUMAN INTELLECT. BIG DATA IS A TOOL. Doug Engelbart, inventor of the mouse, hypertext, etc.
  7. NOT EVERYONE SUBSCRIBES TO THIS POINT OF VIEW… Claudia Perlich, Chief Scientist of media6degrees, speaking at TTI/Vanguard 2012 Conference on Understanding Understanding:
  8. SHE HAS A POINT
  9. BUT PREDICTIVE MODELING IS NOT ENOUGH
  10. TRAINING DATA? OBJECTIVE FUNCTION?
  11. WE NEED A PEOPLE-CENTRIC APPROACH TO BIG DATA: INTERPRETABILITY, INTERACTION, INSIGHT
  12. LET’S START WITHINTERPRETABILITY
  13. EXAMPLE: SVM vs. DECISION TREE
  14. DECISION TREES HAVE FLAWS… DISCRETE
  15. BUT THEY COMMUNICATE (if they’re shallow): early splits provide big picture… fat leaves guide feature engineering… or reveal training data problems
  16. ITERATION
  17. INTERPRETABILITY DELIVERS. Key search leader favors rule-based approach for key scoring algorithms. Replaced regression with decision tree in local search model: gained accuracy and insight. Using trees to recognize spam, analyze search abandonment, model / quantify social proof.
  18. GO DEEP vs INTERPRETABILITY: A KEY DATA SCIENCE TRADE-OFF
  19. ON TO INTERACTION
  20. DON’T OVERPAY FOR PRECISION
  21. BE FAST, CHEAP, AND 98% RIGHT http://metamarkets.com/2012/fast-cheap-and-98-right-cardinality-estimation-for-big-data/
  22. ARE PEOPLE THAT IMPATIENT? tolerable wait time for web users; 0.1s increase in latency significantly reduces # of searches, ad revenue. tl;dr: YES
  23. IMPATIENCE IS GOODSPEED MATTERS
  24. INSIGHT
  25. http://blog.takejune.com/archives/52334044.html
  26. BE TRENDY AND NORMALIZE vs
  27. Sept. 11th, Abu Ghraib, Weapons Inspectors. SOLVE FOR INTERESTINGNESS
  28. COMPUTE POTENTIAL INSIGHTS. APPLY HUMAN INTUITION
  29. SUMMARY: Let’s have a conversation with Big Data. INTERPRETABILITY, INTERACTION, INSIGHT
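
Slide 21's "fast, cheap, and 98% right" refers to cardinality estimation: counting distinct items approximately, in a small fixed amount of memory, instead of exactly. The slides do not spell out an algorithm, so the following is only a minimal sketch of one well-known technique for this, HyperLogLog. The class name, the default of 2**12 registers, and the use of SHA-1 as the hash are illustrative assumptions, not details taken from the talk.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative approximate distinct counter (not code from the talk).

    Memory use is fixed at 2**p small integers, no matter how many
    items are added. Typical relative error is about 1.04 / sqrt(2**p).
    """

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p                       # number of registers
        self.registers = [0] * self.m
        # Standard bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Assumption: the first 8 bytes of SHA-1 serve as a 64-bit hash.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                      # top p bits choose a register
        rest = h & ((1 << width) - 1)         # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(100_000):
        hll.add(f"user-{i}")
        hll.add(f"user-{i}")                  # duplicates do not change the count
    print(f"estimated distinct users: {hll.count():.0f}")
```

With 4,096 registers the estimate is typically within a couple of percent of the true count. That is the trade slide 20 argues for: give up a little precision to get answers quickly enough to keep a person in the loop.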
