
Towards Visualization Recommendation Systems


Data visualization is often used as the first step while performing a variety of analytical tasks. With the advent of large, high-dimensional datasets and strong interest in data science, there is a need for tools that can support rapid visual analysis. In this paper we describe our vision for a new class of visualization recommendation systems that can automatically identify and interactively recommend visualizations relevant to an analytical task.


  1. Towards Visualization Recommendation Systems. Aditya Parameswaran, Assistant Professor, University of Illinois (w/ Manasi Vartak, Samuel Madden @ MIT; Tarique Siddiqui, Silu Huang @ Illinois). DSIA Workshop, VIS 2015. http://data-people.cs.illinois.edu
  2. The Dark Ages of Visualization Recommendations: substantial manual effort and tedious trial-and-error. (“Bring out your dead!” courtesy Monty Python)
  3. To the Age of Enlightenment: the Holy Grail. Can we build systems that automatically recommend visualizations highlighting patterns of interest? (“The Holy Grail” courtesy Monty Python)
  4. Why now? Reason 1: Too much data, in both records and attributes. Most of the dataset is unexplored!
  5. Why now? Reason 2: Lack of skills. (Sources: Harvard Business Review, Mashable.com)
  6. Limitations in Current Tools • Big Picture • Analyst Preferences • Specification • Exploration … not ACID
  7. Limitations in Current Tools • Big Picture – poor comprehension of context • Analyst Preferences – limited understanding of user interests • Specification – insufficient means to specify trends of interest • Exploration – inadequate navigation to unexplored areas
  8. Recent Attempts at Vizrec Systems • Tableau Elastic • Voyager • Harvest • Profiler • Our systems – SeeDB [VLDB 14 x 2, VLDB 16] – zenvisage [unpublished] (this conference!) Still early days!
  9. SeeDB: Comparative Tasks. Task: compare staplers (target, query) with other products. Results: visualizations where staplers “differ most” from other products. Issue: many attributes → many, many visualizations! [Bar charts: stapler sales by state (MA 50, CA 10, IL 10, NY 30) vs. other-product sales (MA 30, CA 20, IL 10, NY 40)]
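SeeDB's deviation-based utility can be illustrated with a short sketch: normalize the target subset's aggregate and the reference (rest of the data) into probability distributions and measure their distance. KL divergence is used here for concreteness; the system supports several distance functions, and all function names below are illustrative, not SeeDB's actual API.

```python
from math import log

def normalize(counts):
    """Turn raw aggregate values into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q), with smoothing to avoid log-of-zero."""
    return sum(pi * log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def deviation_utility(target_agg, reference_agg):
    """Utility of a visualization = distance between the target subset's
    aggregate distribution and the reference distribution."""
    return kl_divergence(normalize(target_agg), normalize(reference_agg))

# Stapler vs. other-product sales by state (MA, CA, IL, NY),
# numbers taken from the slide's bar charts.
stapler = [50, 10, 10, 30]
others = [30, 20, 10, 40]
score = deviation_utility(stapler, others)
```

A visualization whose target distribution matches the reference scores near zero; the recommender surfaces the highest-scoring ones.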
  10. Search Tasks. Very early demo! Feedback welcome. (You saw it here first...)
  11. 5 Recommendation Axes • Specification of Intended Task or Insight – e.g., comparative (X vs. Y), search (find X with a desired criteria), outliers (find unusual X) • Data Characteristics – e.g., typical correlations, patterns, trends across attributes, across rows • Semantics or Domain Knowledge • Visual Ease of Understanding • Analyst Preferences. data-people.cs.illinois.edu/papers/dsia.pdf
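One way the five axes might be blended into a single ranking score is a weighted combination. This is a minimal sketch only: the weights, attribute names, and the idea of a pluggable per-task utility are all hypothetical, as the paper does not prescribe a specific combination.

```python
from dataclasses import dataclass

@dataclass
class Viz:
    """A candidate visualization with illustrative per-axis scores in [0, 1]."""
    name: str
    data_interestingness: float  # axis 2: data characteristics
    domain_relevance: float      # axis 3: semantics / domain knowledge
    visual_clarity: float        # axis 4: visual ease of understanding
    user_affinity: float         # axis 5: analyst preferences

def recommend(visualizations, task_utility, weights):
    """Rank candidates by a weighted blend of the five axes.
    task_utility covers axis 1 (intended task or insight)."""
    def score(v):
        return (weights["task"] * task_utility(v)
                + weights["data"] * v.data_interestingness
                + weights["semantics"] * v.domain_relevance
                + weights["perception"] * v.visual_clarity
                + weights["preference"] * v.user_affinity)
    return sorted(visualizations, key=score, reverse=True)

w = {"task": 0.4, "data": 0.3, "semantics": 0.1, "perception": 0.1, "preference": 0.1}
vs = [Viz("sales by state", 0.9, 0.8, 0.7, 0.5),
      Viz("profit by month", 0.2, 0.3, 0.9, 0.4)]
ranked = recommend(vs, lambda v: 0.8 if "state" in v.name else 0.1, w)
```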
  12. Architectural Considerations • Pre-computation • Online computation – Sharing – Parallelism – Pruning – Approximations [VLDB ’15]. data-people.cs.illinois.edu/papers/dsia.pdf
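One of the online-computation techniques, pruning, can be sketched with confidence intervals: refine a sampled utility estimate for each candidate visualization and discard any candidate whose upper bound falls below the running top-k lower bounds, so it can no longer make the final answer. The Hoeffding-style bound and every name below are illustrative, not SeeDB's exact implementation.

```python
import math
import random

def ci_prune_topk(candidates, sample_utility, k, rounds=50, delta=0.05):
    """Return the top-k candidates by estimated utility (values in [0, 1]),
    pruning hopeless candidates early via confidence intervals."""
    stats = {c: [] for c in candidates}
    active = set(candidates)
    for t in range(1, rounds + 1):
        for c in list(active):
            stats[c].append(sample_utility(c))
        # Hoeffding-style half-width for means of bounded samples
        eps = math.sqrt(math.log(2 * len(candidates) * rounds / delta) / (2 * t))
        means = {c: sum(v) / len(v) for c, v in stats.items() if c in active}
        lowers = sorted((m - eps for m in means.values()), reverse=True)
        if len(lowers) > k:
            kth_lower = lowers[k - 1]
            # Drop candidates whose upper bound cannot reach the top-k
            active = {c for c in active if means[c] + eps >= kth_lower}
    return sorted(active, key=lambda c: sum(stats[c]) / len(stats[c]),
                  reverse=True)[:k]

random.seed(0)
base = {"a": 0.9, "b": 0.5, "c": 0.1}  # hypothetical true utilities
top = ci_prune_topk(list(base), lambda c: base[c] + random.uniform(-0.05, 0.05), k=1)
```

Pruned candidates stop consuming samples, which is where the savings come from on large candidate sets.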
  13. A Clarion Call to DSIA Researchers… Visualization Recommendation Systems: are critically important, are timely, and lead to interesting viz, db, ml, hci problems. Let’s move towards the age of enlightenment! (“The Holy Grail” courtesy Monty Python) data-people.cs.illinois.edu/papers/dsia.pdf
  14. Ongoing Projects in Interactive Analytics: minimizing effort & maximizing efficiency. http://data-people.cs.illinois.edu • Data Manipulation [VLDB ’15 x 2] • Data Visualization [VLDB ’14 x 2, VLDB ’15, VLDB ’16] • Data Collaboration [VLDB ’15 x 2, CIDR ’15, TAPP ’15] • Data Processing with datahub [VLDB ’15, HCOMP ’15, KDD ’15] (Recent Papers, Demos; POPULACE)
  15. (image-only slide)
  16. Research Thrust II: Crowds. Minimizing cost and maximizing accuracy in human-powered data management. Data Processing Algorithms: Filter [SIGMOD 12, VLDB 14], Max [SIGMOD 12], Clean [KDD 12, TKDD 13], Categorize [VLDB 11], Search [ICDE 14], Debug [NIPS 12], Count [HCOMP 15]. Data Processing Systems: Deco [CIKM 12, VLDB 12, TR 12, SIGMOD Record 12], DataSift [HCOMP 13, SIGMOD 14], HQuery [CIDR 11]. Auxiliary Plugins (Quality, Pricing): Conf [KDD 13, ICDE 15], Evict [TR 12], Debias [KDD 15], Pricing [VLDB 15], Quality [HCOMP 14]
  17. Human-in-the-loop Data Management. Dual personalities: • Analysts supervising the analysis – how do we help them get the insights they want? • Crowds helping the analysis – how do we best make use of them to process data?
  18. [Architecture diagram: Visualizations → Queries (100s) → Middleware layer with Optimizer (Sharing, Pruning) → DBMS]
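The sharing optimization in the middleware layer can be illustrated by rewriting many per-measure aggregate queries into one combined query that computes the target subset and the reference in a single table scan. The table, column, and predicate below are hypothetical, and real systems would bind the predicate as a parameter rather than splice it into the SQL string.

```python
def combined_query(table, dimension, measures, target_pred):
    """Rewrite one-query-per-measure into a single GROUP BY query that
    aggregates the target subset and the rest of the data side by side."""
    aggs = ", ".join(
        f"SUM(CASE WHEN {target_pred} THEN {m} ELSE 0 END) AS target_{m}, "
        f"SUM(CASE WHEN NOT ({target_pred}) THEN {m} ELSE 0 END) AS ref_{m}"
        for m in measures
    )
    return f"SELECT {dimension}, {aggs} FROM {table} GROUP BY {dimension}"

# Two measures sharing one grouping dimension collapse into one scan:
sql = combined_query("sales", "state", ["revenue", "units"],
                     "product = 'stapler'")
```

Instead of four separate queries (two measures, each for target and reference), the DBMS answers everything with one pass over `sales`.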
  19. [Frontend screenshot: Task Specification, Manual Visualization Builder, Visualization Pane, Recommendation Bar]
  20. User Study. Part I: validate utility metric vs. other metrics – see paper! Part II: study impact of recommendations – H1: SeeDB finds interesting visualizations faster – H2: users prefer tool w/ recommendations
  21. I. SeeDB enables faster analysis • Users view more visualizations with SeeDB • Users bookmark more visualizations with SeeDB • Bookmark rate 3X higher with SeeDB

              # charts         # bookmarks      bookmark rate
      Manual  6.3 +/- 3.8      1.1 +/- 1.45     0.14 +/- 0.16
      SeeDB   10.8 +/- 4.41    3.4 +/- 1.35     0.43 +/- 0.23
  22. II. Users Prefer SeeDB. 100% of users prefer SeeDB over Manual. “. . . quickly deciding what correlations are relevant” and “[analyze] . . . a new dataset quickly”; “. . . great tool for proposing a set of initial queries for a dataset”; “. . . potential downside may be that it made me lazy so I didn’t bother thinking as much about what I really could study or be interested in”
  23. 23. Questions on Part 2?
  24. Overall research agenda … Human-in-the-loop Data Management
  25. (image-only slide)
