Aditya Parameswaran
Assistant Professor
University of Illinois
(w/ Manasi Vartak, Samuel Madden @ MIT;
Tarique Siddiqui, Silu Huang @ Illinois)
http://data-people.cs.illinois.edu
DSIA Workshop, VIS 2015
Towards Visualization
Recommendation Systems
1
“Bring out your dead!” courtesy Monty Python
The Dark Ages of Visualization
Recommendations
Substantial manual effort and tedious trial-and-error
2
To the Age of Enlightenment:
the Holy Grail
Can we build systems that automatically recommend
visualizations highlighting patterns of interest?
3
“The Holy Grail” courtesy Monty Python
Why now?
Reason 1: Too much data: records and attributes
Most of the dataset is unexplored!
4
Why now?
Reason 2: Lack of skills
Harvard Business Review Mashable.com
5
Limitations in Current Tools
• Big Picture
• Analyst Preferences
• Specification
• Exploration
not ACID …
6
Limitations in Current Tools
• Big Picture
– Poor comprehension of context
• Analyst Preferences
– Limited understanding of user interests
• Specification
– Insufficient means to specify trends of interest
• Exploration
– Inadequate navigation to unexplored areas
7
Recent Attempts at Vizrec Systems
• Tableau Elastic
• Voyager
• Harvest
• Profiler
• Our systems
– SeeDB [VLDB 14 x 2, VLDB 16]
– zenvisage [unpublished]
This conference!
8
Still early days!
SeeDB: Comparative Tasks
Task:
Compare staplers (target, query)
with other products
Results:
Visualizations where staplers
“differ most” from other products
Issue: Many attributes → many, many visualizations!
[Figure: bar chart; x-axis states MA, CA, IL, NY; series Stapler sales, Other sales, Stapler prod, Other prod]
9
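The slide does not say how "differ most" is measured. A minimal sketch, assuming each candidate visualization is a (group-by attribute, measure) pair and that the difference is the Euclidean distance between the normalized aggregate distributions of the target and of everything else. The function names, column names, and toy rows below are all hypothetical and are not taken from SeeDB.

```python
from collections import defaultdict
from math import sqrt


def normalized_distribution(rows, group_by, measure):
    """Sum `measure` per `group_by` value, then normalize the sums to total 1."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[measure]
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {group: value / grand_total for group, value in totals.items()}


def deviation(target, reference):
    """Euclidean distance between two distributions over the union of groups."""
    groups = set(target) | set(reference)
    return sqrt(sum((target.get(g, 0.0) - reference.get(g, 0.0)) ** 2 for g in groups))


def rank_views(rows, is_target, dimensions, measures):
    """Score every (dimension, measure) view and sort by deviation, largest first."""
    target_rows = [r for r in rows if is_target(r)]
    other_rows = [r for r in rows if not is_target(r)]
    scored = []
    for d in dimensions:
        for m in measures:
            t = normalized_distribution(target_rows, d, m)
            o = normalized_distribution(other_rows, d, m)
            scored.append((deviation(t, o), d, m))
    return sorted(scored, reverse=True)


# Invented toy data: stapler sales are skewed towards MA, quantities are not.
rows = [
    {"product": "stapler", "state": "MA", "sales": 50, "quantity": 10},
    {"product": "stapler", "state": "CA", "sales": 10, "quantity": 10},
    {"product": "other", "state": "MA", "sales": 30, "quantity": 20},
    {"product": "other", "state": "CA", "sales": 30, "quantity": 20},
]
ranked = rank_views(rows, lambda r: r["product"] == "stapler", ["state"], ["sales", "quantity"])
for score, dimension, measure in ranked:
    print(f"{measure} by {dimension}: deviation {score:.3f}")
```

On this toy data the sales-by-state view ranks first, because stapler sales are distributed differently from other sales, while the quantity-by-state view scores zero. With many attributes the nested loop grows quickly, which is the issue the slide raises.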
: Search Tasks
Very early demo! Feedback welcome.
(you saw it here first...)
10
5 Recommendation Axes
• Specification of Intended Task or Insight
– e.g., comparative (X vs. Y), search (find X with
desired criteria), outliers (find unusual X)
• Data Characteristics
– e.g., typical correlations, patterns, trends across
attributes, across rows
• Semantics or Domain Knowledge
• Visual Ease of Understanding
• Analyst Preferences
11
data-people.cs.illinois.edu/papers/dsia.pdf
Architectural Considerations
• Pre-computation
• Online computation
– Sharing
– Parallelism
– Pruning
– Approximations [VLDB ’15]
12
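The slide lists "Sharing" without defining it. One plausible reading, sketched here as an assumption, is that the many aggregate queries behind the candidate visualizations share a single scan of the data instead of each running its own. The function name `shared_scan`, the column names, and the toy rows are hypothetical.

```python
from collections import defaultdict


def shared_scan(rows, is_target, dimensions, measures):
    """Fill every (dimension, measure) aggregate in one pass over `rows`.

    Each group holds [target_sum, reference_sum], so the target and the
    reference queries for a view are answered by the same scan.
    """
    aggregates = {
        (d, m): defaultdict(lambda: [0.0, 0.0])
        for d in dimensions
        for m in measures
    }
    for row in rows:
        slot = 0 if is_target(row) else 1  # 0 = target subset, 1 = everything else
        for (d, m), groups in aggregates.items():
            groups[row[d]][slot] += row[m]
    return {view: dict(groups) for view, groups in aggregates.items()}


# Invented toy data, for illustration only.
rows = [
    {"product": "stapler", "state": "MA", "sales": 50, "quantity": 10},
    {"product": "stapler", "state": "CA", "sales": 10, "quantity": 10},
    {"product": "other", "state": "MA", "sales": 30, "quantity": 20},
    {"product": "other", "state": "CA", "sales": 30, "quantity": 20},
]
views = shared_scan(rows, lambda r: r["product"] == "stapler", ["state"], ["sales", "quantity"])
print(views[("state", "sales")])
```

The design choice is to trade memory (all running aggregates are held at once) for fewer passes over the data. A real middleware layer would more likely push combined queries down to the DBMS than scan rows itself.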
A Clarion Call to DSIA Researchers…
Visualization Recommendation Systems:
are critically important
are timely
lead to interesting viz, db, ml, hci problems
Let’s move towards the age of enlightenment!
“The Holy Grail” courtesy Monty Python
13
Ongoing Projects in Interactive Analytics
Minimizing effort & maximizing efficiency
http://data-people.cs.illinois.edu
• Data Manipulation [VLDB ’15 x 2]
• Data Visualization [VLDB ’14 x 2, VLDB ’15, VLDB ’16]
• Data Collaboration [VLDB ’15 x 2, CIDR ’15, TAPP ’15]
• Data Processing with [VLDB ’15, HCOMP ’15, KDD ’15]
datahub
14
Recent Papers, Demos
POPULACE
15
Research Thrust II: Crowds
Minimizing cost and maximizing accuracy in
human-powered data management
Data Processing
Algorithms
Auxiliary Plugins:
Quality, Pricing
Data Processing
Systems
Filter [SIGMOD12, VLDB14] Max [SIGMOD12]
Clean [KDD12, TKDD13] Categorize [VLDB11]
Search [ICDE14] Debug [NIPS12] Count [HCOMP15]
Deco [CIKM12, VLDB12, TR12, SIGMOD Record 12]
DataSift [HCOMP13, SIGMOD14] HQuery [CIDR11]
Conf [KDD13, ICDE15] Evict [TR12] Debias [KDD15]
Pricing [VLDB15] Quality [HCOMP14]
16
Human-in-the-loop
Data Management
Dual personalities
• Analysts supervising the analysis
– How do we help them get the insights they want?
• Crowds helping the analysis
– How do we best make use of them to process data?
17
[Figure: architecture diagram with components Visualizations, Queries (100s), Sharing, Pruning, Optimizer, DBMS, Middleware Layer]
18
Task Specification
Manual Visualization Builder
Visualization Pane
Recommendation Bar
User Study
Part I: Validate utility metric vs. other metrics
– See paper!
Part II: Study impact of recommendations
– H1: SeeDB finds interesting visualizations faster
– H2: Users prefer tool w/ recommendations
I. SeeDB enables faster analysis
• Users view more visualizations with SeeDB
• Users bookmark more visualizations with SeeDB
• Bookmark rate 3X higher with SeeDB
Tool      # charts        # bookmarks     bookmark rate
Manual    6.3 +/- 3.8     1.1 +/- 1.45    0.14 +/- 0.16
SeeDB     10.8 +/- 4.41   3.4 +/- 1.35    0.43 +/- 0.23
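As a quick arithmetic check of the "3X" claim, the snippet below divides the two reported bookmark rates. It uses only the means from the table above and ignores the +/- spreads. The variable names are illustrative.

```python
# Means copied from the table above; the +/- spreads are not used here.
manual = {"charts": 6.3, "bookmarks": 1.1, "rate": 0.14}
seedb = {"charts": 10.8, "bookmarks": 3.4, "rate": 0.43}

# Ratio of the reported bookmark rates.
rate_ratio = seedb["rate"] / manual["rate"]
print(f"reported rate ratio: {rate_ratio:.2f}")

# Quotient of the column means, for comparison with the reported rate column.
manual_quotient = manual["bookmarks"] / manual["charts"]
seedb_quotient = seedb["bookmarks"] / seedb["charts"]
print(f"bookmarks / charts: manual {manual_quotient:.2f}, SeeDB {seedb_quotient:.2f}")
```

The ratio of the reported rates is about 3.07, which matches the "3X" on the slide. The quotients of the column means (about 0.17 and 0.31) do not equal the reported rates. That would be consistent with the rate being averaged per participant, but this is an assumption; the paper is the place to confirm it.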
II. Users Prefer SeeDB
100% of users prefer SeeDB over Manual
“. . . quickly deciding what correlations are relevant” and
“[analyze] . . . a new dataset quickly”
“. . . great tool for proposing a set of initial queries for a
dataset”
“. . . potential downside may be that it made me lazy so I
didn’t bother thinking as much about what I really could study
or be interested in”
Questions on Part 2?
Overall research agenda …
Human-in-the-loop
Data Management
24
Towards Visualization Recommendation Systems

  • 1. Aditya Parameswaran Assistant Professor University of Illinois (w/ ManasiVartak, Samuel Madden @ MIT; Tarique Siddiqui, Silu Huang @ Illinois) http://data-people.cs.illinois.edu DSIAWorkshop,VIS 2015 TowardsVisualization Recommendation Systems 1
  • 2. “Bring out your dead!” courtesy Monty Python The Dark Ages ofVisualization Recommendations Substantial manual effort and tedious trial-and-error 2
  • 3. To the Age of Enlightenment: the Holy Grail Can we build systems that automatically recommend visualizations highlighting patterns of interest? 3 “The Holy Grail” courtesy Monty Python
  • 4. Why now? Reason 1: Too much data: records and attributes Most of the dataset is unexplored! 4
  • 5. Why now? Reason 2: Lack of skills Harvard Business Review Mashable.com 5
  • 6. Limitations in CurrentTools • Big Picture • Analyst Preferences • Specification • Exploration not ACID … 6
  • 7. Limitations in CurrentTools • Big Picture – Poor comprehension of context • Analyst Preferences – Limited understanding of user interests • Specification – Insufficient means to specify trends of interest • Exploration – Inadequate navigation to unexplored areas 7
  • 8. RecentAttempts atVizrec Systems • Tableau Elastic • Voyager • Harvest • Profiler • Our systems – SeeDB [VLDB 14 x 2,VLDB 16] – zenvisage [unpublished] This conference! 8 Still early days!
  • 9. SeeDB: Comparative Tasks. Task: Compare staplers (target, query) with other products. Results: Visualizations where staplers “differ most” from other products. Issue: Many attributes → many, many visualizations! [Figure: bar charts by state (MA, CA, IL, NY) comparing stapler sales with other sales, and stapler prod with other prod]
  • 10. zenvisage: Search Tasks. Very early demo! Feedback welcome. (you saw it here first...)
  • 11. 5 Recommendation Axes • Specification of Intended Task or Insight – e.g., comparative (X vs. Y), search (find X with desired criteria), outliers (find unusual X) • Data Characteristics – e.g., typical correlations, patterns, trends across attributes, across rows • Semantics or Domain Knowledge • Visual Ease of Understanding • Analyst Preferences. data-people.cs.illinois.edu/papers/dsia.pdf
  • 12. Architectural Considerations • Pre-computation • Online computation – Sharing – Parallelism – Pruning – Approximations [VLDB ’15]
  • 13. A Clarion Call to DSIA Researchers… Visualization Recommendation Systems: are critically important; are timely; lead to interesting viz, db, ml, hci problems. Let’s move towards the age of enlightenment! “The Holy Grail” courtesy Monty Python
  • 14. Ongoing Projects in Interactive Analytics: Minimizing effort & maximizing efficiency. http://data-people.cs.illinois.edu • Data Manipulation [VLDB ’15 x 2] • Data Visualization [VLDB ’14 x 2, VLDB ’15, VLDB ’16] • Data Collaboration [VLDB ’15 x 2, CIDR ’15, TAPP ’15] • Data Processing with [VLDB ’15, HCOMP ’15, KDD ’15] datahub Recent Papers, Demos POPULACE
  • 15. 15
  • 16. Research Thrust II: Crowds. Minimizing cost and maximizing accuracy in human-powered data management. Data Processing Algorithms Auxiliary Plugins: Quality, Pricing Data Processing Systems Filter [SIGMOD12, VLDB14] Max [SIGMOD12] Clean [KDD12, TKDD13] Categorize [VLDB11] Search [ICDE14] Debug [NIPS12] Count [HCOMP15] Deco [CIKM12, VLDB12, TR12, SIGMOD Record 12] DataSift [HCOMP13, SIGMOD14] HQuery [CIDR11] Conf [KDD13, ICDE15] Evict [TR12] Debias [KDD15] Pricing [VLDB15] Quality [HCOMP14]
  • 17. Human-in-the-loop Data Management: Dual personalities • Analysts supervising the analysis – How do we help them get the insights they want? • Crowds helping the analysis – How do we best make use of them to process data?
  • 20. User Study. Part I: Validate utility metric vs. other metrics – See paper! Part II: Study impact of recommendations – H1: SeeDB finds interesting visualizations faster – H2: Users prefer tool w/ recommendations
  • 21. I. SeeDB enables faster analysis • Users view more visualizations with SeeDB • Users bookmark more visualizations with SeeDB • Bookmark rate 3X higher with SeeDB. Manual: # charts 6.3 +/- 3.8, # bookmarks 1.1 +/- 1.45, bookmark rate 0.14 +/- 0.16; SeeDB: # charts 10.8 +/- 4.41, # bookmarks 3.4 +/- 1.35, bookmark rate 0.43 +/- 0.23
  • 22. II. Users Prefer SeeDB. 100% of users prefer SeeDB over Manual. “. . . quickly deciding what correlations are relevant” and “[analyze] . . . a new dataset quickly”; “. . . great tool for proposing a set of initial queries for a dataset”; “. . . potential downside may be that it made me lazy so I didn’t bother thinking as much about what I really could study or be interested in”
  • 24. Overall research agenda … Human-in-the-loop Data Management
  • 25. 25
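
Slide 9 describes the comparative task only as finding visualizations where the target (staplers) “differs most” from other products. A minimal sketch of that idea follows, assuming a candidate view is a (dimension, measure) pair summarized with SUM, and assuming “differ most” is scored as the Euclidean distance between the two normalized distributions. The distance choice, the function names, and the toy rows are illustrative assumptions, not SeeDB's actual code or data.

```python
from collections import defaultdict
from math import sqrt

# Toy rows invented for illustration; they are not data from the talk.
TOY_ROWS = [
    {"product": "stapler", "state": "MA", "quarter": "Q1", "sales": 50},
    {"product": "stapler", "state": "CA", "quarter": "Q2", "sales": 50},
    {"product": "pen", "state": "MA", "quarter": "Q1", "sales": 10},
    {"product": "pen", "state": "MA", "quarter": "Q2", "sales": 10},
    {"product": "pen", "state": "CA", "quarter": "Q1", "sales": 40},
    {"product": "pen", "state": "CA", "quarter": "Q2", "sales": 40},
]


def normalized_sums(rows, dimension, measure):
    """Sum `measure` per value of `dimension`, then scale the sums to total 1."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return dict(totals)
    return {key: value / grand_total for key, value in totals.items()}


def deviation(target_rows, other_rows, dimension, measure):
    """Euclidean distance between the two distributions (an assumed metric)."""
    p = normalized_sums(target_rows, dimension, measure)
    q = normalized_sums(other_rows, dimension, measure)
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def rank_views(rows, is_target, dimensions, measures, k=3):
    """Score every (dimension, measure) view and return the k largest deviations."""
    target = [r for r in rows if is_target(r)]
    others = [r for r in rows if not is_target(r)]
    scored = [
        (deviation(target, others, d, m), d, m)
        for d in dimensions
        for m in measures
    ]
    scored.sort(key=lambda view: view[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    views = rank_views(
        TOY_ROWS,
        lambda r: r["product"] == "stapler",
        dimensions=["state", "quarter"],
        measures=["sales"],
    )
    for score, dimension, measure in views:
        print(f"SUM({measure}) by {dimension}: deviation {score:.3f}")
```

In the toy rows, staplers and pens split sales identically across quarters but differently across states, so the by-state view ranks first. The nested loop over dimensions and measures also shows the issue raised on the slide: the number of candidate views grows with the number of attributes.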

Editor's Notes

  1. Despite the advent of visualization tools like Tableau, we’re still in the dark ages of visualization recommendations. Current tools are akin to a movie catalog, where you can see the list of available movies, select the ones you want, and see information about them. If you don’t know the movie you want to watch, you’ll have to look at a whole lot of movies before you find what you desire. In other words, current visualization systems involve substantial manual effort and tedious trial-and-error before you get the desired result.
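  2. Let’s move to the age of enlightenment. Much like the Netflix and Amazon recommendations of today, can we build systems that automatically recommend visualizations highlighting patterns of interest?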
  3. Why is this timely? Datasets are increasingly large, with large numbers of records and attributes. As a result, most of the dataset is unexplored, motivating the need for recommendations for the unexplored areas.
  4. The second reason is that everyone wants to be a data scientist (and who are we to argue?), but most don’t really have the skills. We need to build the tools that help them get the insights they need.
  5. So what do current systems lack? I’m a database guy, and for some reason, we love chemistry-based acronyms, so here’s a new one.
  6. Provide a.. Is the dip in February in sales expected? Or is it anomalous? Current tools do not take into account typical browsing patterns. For instance, what if the analyst wants to find all products that took a hit in February? Can we find all attributes on which two products differ? Often users focus on a tiny portion of the dataset, perhaps due to inexperience.
  7. As it turns out, we aren’t the only ones preaching this wisdom. Several recent systems partially address these limitations, including one from Tableau and one appearing at this very conference from Jeff and the UW folks. I’m going to tell you about our systems to give you a flavor of what we’re talking about.
  8. Caters to the user specification of a comparative task. What SeeDB will provide are .. among all the visualizations. The key issue here is that ..
  9. Caters to the user specification of a search task.
  10. In our workshop paper, we identified 5 recommendation axes. .. which is very hard. .. there is a ton of work from the viz community on this.
  11. In building these vizrec systems, there are a number of interesting systems challenges. What should be done online and what offline? Online, how do we maximize sharing and parallelism in evaluating these recommendations? How do we … that we know are not useful? How do we leverage approximations to return results faster, or return approximate results?
  12. In the age of data science
  13. Overall architecture: a middleware layer that sits between the UI and the DBMS. The user task (compare married/unmarried) is broken down into a collection of queries; the optimizer handles these queries using a combination of … optimizations and makes repeated queries to the DBMS.
  14. Note of caution
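
Notes 11 and 13 mention sharing work across the queries that a user task is broken into. One simple form of sharing, sketched here under stated assumptions, is to fetch both the target and the reference aggregates for a candidate view in a single GROUP BY scan instead of issuing two separate queries. The table, the column names, and the helper `shared_view_query` are hypothetical; Python's built-in sqlite3 module stands in for the DBMS, and this is not a description of the SeeDB optimizer itself.

```python
import sqlite3


def shared_view_query(conn, table, dimension, measure, target_predicate):
    """Return (group, target_sum, reference_sum) rows from one scan.

    Identifiers and the predicate are interpolated into the SQL text, so they
    must come from trusted code, never from raw user input.
    """
    sql = (
        f"SELECT {dimension}, "
        f"SUM(CASE WHEN {target_predicate} THEN {measure} ELSE 0 END), "
        f"SUM(CASE WHEN {target_predicate} THEN 0 ELSE {measure} END) "
        f"FROM {table} GROUP BY {dimension} ORDER BY {dimension}"
    )
    return conn.execute(sql).fetchall()


def build_toy_db():
    """Create an in-memory table with invented rows (not data from the talk)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product TEXT, state TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("stapler", "MA", 50.0),
            ("stapler", "CA", 50.0),
            ("pen", "MA", 20.0),
            ("pen", "CA", 80.0),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_toy_db()
    # One query yields both series needed to draw the comparative bar chart.
    for row in shared_view_query(
        conn, "sales", "state", "amount", "product = 'stapler'"
    ):
        print(row)
```

The same pattern extends to several measures in one SELECT list, which trades a wider result for fewer round trips to the DBMS.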