Intent-aware Visualization Recommendation for Tabular Data


Slides that I presented at the WISE 2021 conference

WISE 2021
Intent-aware Visualization Recommendation for Tabular Data
Atsuki Maruta (University of Tsukuba)
Makoto P. Kato (University of Tsukuba / JST, PRESTO)
2. Introduction
Visualization is an effective means of gaining insights
It is difficult to understand the contents of tabular data directly; charts can be used for a quick understanding
Problem
• Effective visualization requires special skills, knowledge, and deep analysis of the data
○ e.g., which tools and which charts to use
3. Examples of Charts
Bar chart [1], line chart [2], pie chart [3], area chart [4], multi-polygon chart [5]
[1] https://public.tableau.com/app/profile/zhishun.zhang/viz/BarChart_43/Sheet1
[2] https://public.tableau.com/app/profile/amadou1986/viz/LineChart_15562961467070/LineChart_1
[3] https://public.tableau.com/app/profile/ravleen.anand/viz/PieChart_71/Sheet1
[4] https://public.tableau.com/app/profile/ellas3717/viz/AreaChart_24/AreaChart
[5] https://public.tableau.com/app/profile/sukumar.roy.chowdhury/viz/Asia-PacificPolygonMap/AsiaPacificPolygonMap
4. Related Work
There are existing studies on visualization recommendation, which can be categorized into two approaches
Rule-based
Experts define visualization rules over statistical features of the tabular data [6][7][8]
- e.g., pie charts are not suitable when the number of rows is large
Machine learning-based (our method!)
A machine learning model is trained on statistical features of the tabular data [9][10][11][12]
- e.g., using the variance of the column values and the correlation between columns as features
[6] Stolte, Chris et al. Polaris: A system for query, analysis, and visualization of multidimensional relational databases. IEEE TVCG. 2002, vol. 8, no. 1, p. 52-65.
[7] Wongsuphasawat, Kanit et al. Voyager: Exploratory analysis via faceted browsing of visualization recommendations. IEEE TVCG. 2015, vol. 22, no. 1, p. 649-658.
[8] Eberhardt A, Milene S. Show me the data! A systematic mapping on open government data visualization. DG.O 2018. 2018, p. 1-10.
[9] Moritz, Dominik et al. Formalizing visualization design knowledge as constraints: Actionable and extensible models in Draco. IEEE TVCG. 2018, vol. 25, no. 1, p. 438-448.
[10] Dibia, Victor; Demiralp, Çağatay. Data2Vis: Automatic generation of data visualizations using sequence-to-sequence recurrent neural networks. IEEE CGA. 2019, vol. 39, no. 5, p. 33-46.
[11] Luo, Y et al. DeepEye: Towards automatic data visualization. IEEE 34th ICDE, 2018, p. 101-112.
[12] Hu, Kevin et al. VizML: A machine learning approach to visualization recommendation. CHI, 2019, p. 1-12.
5. Related Work: Visualization Type Prediction with Statistical Features
There is a machine learning-based method that predicts the visualization type from statistical features of tabular data [12]
It extracts statistical features such as the maximum value and the variance from the tabular data and inputs them into a machine learning model to predict the visualization
Tabular data (input):
Year    Population
2017    100
2018    200
2019    250
Statistical features of the tabular data (extracted; the values shown are for the Year column):
Maximum value : 2019
Mean value : 2018
Variance value : 1
Number of rows : 3
Data type : numeric
・・・
Predicted visualization (output of the machine learning model): Line chart
[12] Hu, Kevin et al. VizML: A machine learning approach to visualization recommendation. CHI, 2019, p. 1-12.
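The feature-extraction step on this slide can be sketched in a few lines. This is only an illustration under assumptions, not the feature set used in [12]: the function name `column_features` and the dictionary keys are hypothetical, and the sample variance (dividing by n - 1) is assumed because it reproduces the value 1 shown for the Year column.

```python
import statistics

def column_features(values):
    """Return a few simple statistical features for one column.

    Hypothetical helper for illustration; real systems use many more features.
    """
    is_numeric = all(isinstance(v, (int, float)) for v in values)
    return {
        "max": max(values),
        "mean": statistics.mean(values),
        # Sample variance (n - 1 in the denominator), an assumption that
        # matches the slide's value of 1 for the Year column.
        "variance": statistics.variance(values),
        "n_rows": len(values),
        "dtype": "numeric" if is_numeric else "other",
    }

# The Year column from the slide's example table.
year = [2017, 2018, 2019]
features = column_features(year)
print(features)
```

A feature dictionary like this, computed per column, would then be turned into a vector and passed to whatever classifier predicts the chart type.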
6. Visualization Intent: the Content of the Data that the User Wants to Visualize
Existing studies do not consider the visualization intent
Visualization intent                             Effective term    Visualization type
Population trends in Italy                       trends            Line chart
Percentage of students in Japanese population    Percentage        Pie chart
A visualization intent contains terms that are effective for predicting the visualization method
A more appropriate visualization can be predicted by considering the visualization intent
  • 7. Purpose: Identifying an appropriate visualization method for tabular data with a visualization intent
    Input: a visualization intent and tabular data
    Visualization intent: Population trends
    Tabular data:
    Year    Population    Young population    Working-age population    Aged population
    2017    300           60                  170                       70
    2018    350           50                  210                       90
    2019    370           50                  220                       100
    Output of the visualization recommendation system: a visualization method, consisting of a visualization type and the visualized columns
    [Chart: "Population trends"; x-axis: Year (2017 to 2019); y-axis: population (0 to 400)]
    Visualized columns:
    Year    Population
    2017    300
    2018    350
    2019    370
    • Anyone can easily visualize the tabular data
    • Necessary data becomes easy to find when each dataset is visualized appropriately
  • 8. Main Ideas of Our Proposed Method 1 / 2
    • Both the visualization intent and the features of each column are effective for predicting the visualization type
      - But not every word or column is necessary
    • The visualization intent indicates which columns should be used for visualization
      - Columns whose headers match the visualization intent should be used for visualization
    • Bidirectional Attention (BiDA) is a model that, given two sets of data, predicts the important parts of each [8]
    Visualization intent: Population trends in Italy ("trends" is an effective word)
    Tabular data (figure annotation: "These columns should be used"):
    Year    population    GDP
    2017    300           2.3
    2018    350           2.4
    2019    370           2.0
    We predict visualization types by identifying the important words and columns with BiDA
    [8] Seo, Minjoon et al. Bidirectional attention flow for machine comprehension. arXiv.org e-Print archive, 2016, 1611.01603. https://arxiv.org/pdf/1611.01603.pdf.
  • 9. Main Ideas of Our Proposed Method 2 / 2
    • The visualization intent indicates which columns should be used for visualization
    • Columns whose headers are strongly related to the visualization intent are used for the visualization
    • BERT can estimate the correspondence between two sentences and has shown high performance in many NLP tasks [9]
    Visualization intent: Population trends in Italy
    Tabular data:
    Year    population    GDP
    2017    300           2.3
    2018    350           2.4
    2019    370           2.0
    There is a strong relationship between the visualization intent and the "Year" and "population" columns
    We use BERT to estimate the correspondence between the visualization intent and each header, and predict the visualized columns
    [9] Devlin, Jacob, et al. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
  • 10. Our Proposed Method
    Predict the visualization type with BiDA and identify the visualized columns with BERT
    Visualization intent: Population trends in Italy
    Tabular data (header row and values):
    Year    Population    GDP
    2017    1.3           2.3
    2018    1.4           2.4
    2019    1.5           2.0
    [Diagram: the intent words and the headers are embedded into a visualization intent vector and a header vector, and statistical features are extracted from the values. BiDA (Intent2Table and Table2Intent) outputs the visualization type (Line); BERT outputs the visualized columns (Year, Population).]
  • 11. Our Proposed Method for Visualization Type
    To predict the visualization type, the visualization intent vector, the header vector, and the statistical features of the columns are input into BiDA
    [Diagram: for the intent "Population trends in Italy" and a table with headers Year, Population, GDP, the embedded intent words, the embedded headers, and the statistical features of the values are the input; BiDA (Intent2Table and Table2Intent) followed by a multilayer perceptron outputs the visualization type (Line).]
  • 12. Our Proposed Method for Visualized Columns
    To predict the visualized columns, the visualization intent and the headers are input into BERT
    [Diagram: for the intent "Population trends in Italy" and a table with headers Year, Population, GDP, the intent and the headers are input into BERT; together with the statistical features, a multilayer perceptron for each header outputs the visualized columns (Year, Population).]
    Due to the limited time, I will skip the description of the visualized-column prediction in the following sections
  • 13. BiDA in Detail
    • BiDA consists of Table2Intent and Intent2Table
    • Table2Intent predicts the important words in the visualization intent, and Intent2Table predicts the important columns in the tabular data
    [Diagram: the input is the visualization intent vector and the header vector (word embeddings of the intent "Population trends in Italy" and of the headers Year, Population, GDP) and the statistical features (statistical embedding of the column values). Table2Intent outputs the attention visualization intent vector; Intent2Table outputs the attention statistical features. Both are input into an MLP, which outputs the visualization type (Line).]
  • 14. Table2Intent in Detail
    Input: the visualization intent vector (word embeddings of "population", "trends", "in", "Italy") and the header vector (word embeddings of Year, Population, GDP)
    ①: Calculate the similarity between the visualization intent words and the header words
    ②: Extract the maximum similarity value for each visualization intent word
    ③: Input the maximum similarity of each word into the softmax function to create the weights
    ④: Calculate the attention visualization intent vector by taking the weighted sum of the visualization intent vectors
    Output: the attention visualization intent vector ("population" and "trends" received high attention)
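The four Table2Intent steps can be traced with a small sketch. Everything concrete here is an assumption made for illustration: the 2-dimensional embeddings are invented, the dot product stands in for whatever similarity function the model actually uses, and the function names are hypothetical.

```python
import math

def dot(u, v):
    """Dot product, used here as a stand-in similarity function."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def table2intent(intent_vecs, header_vecs):
    # Step 1: similarity between every intent word and every header word.
    sim = [[dot(w, h) for h in header_vecs] for w in intent_vecs]
    # Step 2: keep the maximum similarity for each intent word.
    best = [max(row) for row in sim]
    # Step 3: softmax turns the maxima into attention weights.
    weights = softmax(best)
    # Step 4: weighted sum of the intent word vectors.
    dim = len(intent_vecs[0])
    attended = [sum(wt * vec[d] for wt, vec in zip(weights, intent_vecs))
                for d in range(dim)]
    return weights, attended

# Hypothetical 2-dimensional embeddings, invented for illustration only.
intent_words = ["population", "trends", "in", "Italy"]
intent_vecs = [[1.0, 0.0], [0.6, 0.6], [0.0, 0.1], [0.1, 0.0]]
header_vecs = [[0.5, 1.0], [1.0, 0.0], [0.2, 0.3]]  # Year, Population, GDP

weights, attended = table2intent(intent_vecs, header_vecs)
for word, weight in zip(intent_words, weights):
    print(f"{word}: {weight:.3f}")
```

With these toy vectors, "population" and "trends" end up with the largest weights, which mirrors the behaviour described on the slide.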
  • 19-23. Intent2Table in Detail (one slide shown in five animation steps)
      ○ Visualization intent: "Population trends in Italy".
      ○ Tabular data (header: Year, Population, GDP; values: (2017, 1.3, 2.3), (2018, 1.4, 2.4), (2019, 1.5, 2.0)).
      ○ Inputs: word embeddings of the intent words and the header words, plus a statistical embedding (statistical features) computed for each column.
      ○ ①: Calculate the similarity between visualization intent words and header words.
      ○ ②: Extract the highest similarity for each column.
      ○ ③: Input the similarity of each column into the softmax function to create the weights.
      ○ ④: Calculate the attended statistical features by taking the weighted sum of the columns' statistical features.
      ○ In the example, the columns "Year" and "Population" received high attention.
  • 24. Train
      ○ ①: Combine the vectors.
      ○ ②: Input the combined vector into an eight-layer perceptron (ReLU, with a softmax output).
      ○ ③: Output the predicted visualization type (e.g., "Line" for the intent "population trends in Italy" and the headers Year, Population, GDP).
      ○ Loss function: cross-entropy loss.
  • 25. Dataset
      ○ Crawled from Tableau (https://www.tableau.com/), a website for sharing visualizations; each item pairs tabular data with a chart.
      ○ The title of each user's visualization is used as the visualization intent.
      ○ Available data: title, visualization type, tabular data, visualized columns, axis.
      ○ Visualization types: Bar, Line, Pie, Square, Area, Circle, Shape, Multi polygon.
  • 26. Experimental Settings
      ○ Data size: 115,183
      ○ Evaluation metric: F-measure
      ○ Baselines: logistic regression and random forest (with the most effective statistical features in existing studies [7])
      ○ Research questions:
          1. Can the use of visualization intents improve the accuracy of predicting visualization type?
          2. Can BiDA improve the accuracy of predicting visualization type?
      ○ [7] Hu, Kevin et al. VizML: A machine learning approach to visualization recommendation. CHI, 2019, p. 1-12.
  • 27. Experimental Results of Predicting Visualization Type
      ○ [Figure: bar chart of F-measure (axis from 0 to 0.6). Baselines [7]: Logistic regression, Random forest. Without BiDA: Intent, Table, Intent & Table. With BiDA: Intent+BiDA, Table+BiDA, Intent & Table+BiDA. The Intent & Table model is labeled as the proposed model.]
      ○ 1. Visualization intent improved the accuracy of predicting visualization type.
      ○ 2. BiDA improved the accuracy of predicting visualization type.
      ○ [7] Hu, Kevin et al. VizML: A machine learning approach to visualization recommendation. CHI, 2019, p. 1-12.
  • 28. Visualization of Similarity Matrix
      ○ [Figure: heatmap of the similarity matrix between visualization intent words and table headers; color scale from −0.4 to 1.2.]
      ○ Visualization intent words: paper vs computing trend strong growth trend for computing products
      ○ Table headers: row id, order priority, discount, unit price, shipping cost, customer id, customer name, ship mode, customer segment, product category, product sub-category, product container, product name, product base margin, region, state or province, city, postal code, order date, ship date, profit, quantity ordered new, sales, order id
      ○ The figure shows the similarity matrix for a case in which the "Line" chart was correctly predicted.
      ○ The color depth changes significantly from one visualization intent word to another: the similarity is hardly affected by the header but is greatly affected by the visualization intent word.
      ○ BiDA identifies the column "order date" (a visualized column) and the term "trend" as important.
  • 29. Experimental Results of BiDA for Each Visualization Type
      ○ [Figure: bar chart (axis from 0 to 0.7) with one bar per visualization type: Area, Square, Line, Shape, Circle, Bar, Pie, Multi polygon.]
      ○ "Area" and "Line" charts showed low prediction accuracy, probably because of their similarity.
      ○ Visualization types with a unique shape, such as "Multi polygon" and "Pie" charts, showed high prediction accuracy.
  • 30. Most Attended Terms in the Intent for Each Visualization Type 1/2
      ○ The most attended terms in the visualization intent and their attention values for each visualization type:
          Shape: map (1.649), sheet (0.880), top (0.698), time (0.568)
          Square: map (1.735), table (1.230), heat (0.946), sheet (0.618)
          Multi polygon: map (1.594), state (0.888), sheet (0.878), region (0.750)
          Pie: pie (1.962), donut (1.464), sheet (0.874), gender (0.826)
      ○ The "Multi polygon" chart included geographical terms such as "state" and "region".
      ○ Charts with unique shapes achieved high prediction accuracy because they have distinctive visualization intents.
  • 31. Most Attended Terms in the Intent for Each Visualization Type 2/2
      ○ The most attended terms in the visualization intent and their attention values for each visualization type:
          Area: area (0.984), sheet (0.878), chart (0.779), time (0.585)
          Line: trend (1.040), trends (1.001), line (0.996), sheet (0.883)
          Bar: bar (1.449), trend (1.086), change (1.069), area (0.950)
          Circle: bubble (1.530), map (1.490), box (1.340), trend (1.132)
      ○ The "Area" and "Line" charts had low attention values.
      ○ There were no effective words for predicting these visualization types, and they showed low performance.
  • 32. Conclusions
      ○ Purpose: propose a visualization recommender system for tabular data with a visualization intent.
      ○ Proposed method:
          To predict an appropriate visualization type, we proposed a BiDA model that identifies important table columns by the visualization intent, and important parts of the intent by the table headers.
          To identify visualized columns, we used BERT to predict the correspondence between visualization intents and columns in tabular data, and to estimate which columns are the most likely to be used for visualization.
      ○ Dataset: 115,183 items crawled from Tableau, a website that shares visualizations.
      ○ Experimental results:
          Visualization intent improved the accuracy of predicting the visualization type.
          BiDA improved the accuracy of predicting the visualization type.
  • 34. BERT in Detail
      ○ Visualization intent: "Population trends in Italy".
      ○ Tabular data (header: Year, Population, GDP; values: (2017, 1.3, 2.3), (2018, 1.4, 2.4), (2019, 1.5, 2.0)).
      ○ BERT inputs, one per column:
          [CLS]1 Population trends in Italy [SEP] Year [SEP]
          [CLS]2 Population trends in Italy [SEP] Population [SEP]
          [CLS]3 Population trends in Italy [SEP] GDP [SEP]
      ○ ①: Input each pair of the visualization intent and a header into BERT.
      ○ ②: Combine the BERT output vector of each column ([CLS]1 for Year, [CLS]2 for Population, [CLS]3 for GDP) with that column's statistical features (statistical embedding).
  • 35. Train in BERT
      ○ ①: Input the combined vector of each column ([CLS]1, [CLS]2, [CLS]3 for Year, Population, GDP) into a two-layer perceptron (ReLU, with a sigmoid output).
      ○ ②: Estimate whether each column is a visualized column (example outputs on the slide: 0.8, 0.9, 0.3).
      ○ Loss function: binary cross-entropy loss.
  • 36. Experimental Settings for Prediction of Visualized Columns
      ○ Evaluation method: the visualized column identification task was treated as a ranking task.
      ○ Evaluation metric: nDCG@10
      ○ Baselines: Random score, Word similarity, BM25
      ○ Research question: 1. Can BERT improve the accuracy of predicting visualized columns?
  • 37. Experimental Results of Predicting Visualized Columns
      ○ [Figure: bar chart of nDCG@10 (axis from 0 to 0.8). Baselines: Random, Word similarity, BM25. Without BERT: Table. With BERT: BERT, BERT + Table (proposed model).]
      ○ 1. BERT improved the accuracy of predicting visualized columns.
  • 38. Visualization of Attention in BERT Model
      ○ The slide illustrates the strength of attention from [CLS] in the output layer of the BERT model, for successful and failure cases.
      ○ Examples shown:
          Visualization of NBA player: [CLS] nba players performance top players and their performance [SEP] mp [SEP]
          Visualization of popular brands of cars: [CLS] most popular car brands [SEP] brand [SEP]
          Visualization of the total number of students enrolled: [CLS] us school shooting total student ##es enrolled by year [SEP] total enrolled [SEP]
      ○ Successful case: the header words that are relevant to the visualization intent received high attention.
      ○ Failure case: attention was given only to [SEP], probably because the relationship between the visualization intent and the columns is unclear.
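  • 39. Similarity
      ○ The similarity used in bidirectional attention is a trainable distance function.
          Bidirectional attention puts weights on the similarity as well, so the similarity itself is learned.
      ○ Distance function
          The similarity c between a column's vector a and a word vector b is computed as follows:
          $c = w^{\top} [a : b : a \circ b]$
          Here w is a trainable weight, ":" denotes vector concatenation, and $\circ$ denotes the Hadamard product, i.e., the element-wise product of vectors.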
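
The trainable similarity on slide 39 and the four attention steps on slides 15 to 23 could be sketched in NumPy roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the embedding sizes, and the random vectors standing in for word embeddings, statistical features, and the trained weight w are all assumptions made for the example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def similarity_matrix(intent_vecs, header_vecs, w):
    """S[i, j] = w^T [a_j : b_i : a_j * b_i], where a_j is the vector of
    header (column) j and b_i is the vector of intent word i."""
    n, d = intent_vecs.shape
    m, _ = header_vecs.shape
    a = np.broadcast_to(header_vecs[None, :, :], (n, m, d))
    b = np.broadcast_to(intent_vecs[:, None, :], (n, m, d))
    feats = np.concatenate([a, b, a * b], axis=-1)  # shape (n, m, 3d)
    return feats @ w                                # shape (n, m)

def table2intent(S, intent_vecs):
    # Steps 2-4: max over headers, softmax over intent words, weighted sum.
    weights = softmax(S.max(axis=1))
    return weights @ intent_vecs, weights

def intent2table(S, stat_feats):
    # Steps 2-4: max over intent words, softmax over columns, weighted sum.
    weights = softmax(S.max(axis=0))
    return weights @ stat_feats, weights

# Hypothetical inputs: 4 intent words ("population trends in Italy"),
# 3 headers (Year, Population, GDP), 8-dim embeddings, 5 statistical features.
rng = np.random.default_rng(0)
intent_vecs = rng.normal(size=(4, 8))
header_vecs = rng.normal(size=(3, 8))
stat_feats = rng.normal(size=(3, 5))
w = rng.normal(size=24)  # untrained placeholder for the learned weight

S = similarity_matrix(intent_vecs, header_vecs, w)
intent_vec, word_weights = table2intent(S, intent_vecs)
table_vec, column_weights = intent2table(S, stat_feats)
combined = np.concatenate([intent_vec, table_vec])  # input to the classifier
print(S.shape, word_weights.round(3), column_weights.round(3), combined.shape)
```

In a trained model, w would be learned jointly with the perceptron on slide 24, so the attention weights would reflect which intent words and columns help predict the visualization type; with the random placeholder here, only the shapes and the flow of the computation are meaningful.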