Visualization is an important part of understanding and (re)presenting data in the (data) sciences. It rests on principles and tools that Christophe Bontemps (Toulouse School of Economics) will unpack in light of his experience and his reading.
Data Analysis and Statistics in Python using pandas and statsmodels (Wes McKinney)
The document summarizes Wes McKinney's talk on statistical computing using Python. The talk introduces the scientific Python stack, including pandas for data structures and data analysis, and statsmodels for statistical modeling. It discusses the "research-production gap" in current statistical tools and how Python aims to bridge that gap. McKinney asserts that Python is the best solution for both research and production use of statistics and data analysis. He then demonstrates pandas and statsmodels functionality.
This is a presentation I gave on Data Visualization at a General Assembly event in Singapore, on January 22, 2016. The presso provides a brief history of dataviz as well as examples of common chart and visualization formatting mistakes that you should never make.
Introduction to data pre-processing and cleaning (Matteo Manca)
This document discusses data preparation and cleaning. It begins by explaining why data cleaning is important, as raw data is often incomplete, noisy, inconsistent, or not in a format suitable for analysis. The main steps of data cleaning are then outlined, including handling missing values, identifying outliers, resolving inconsistencies, and transforming data. Best practices for data cleaning like using pipelines to document the cleaning process and saving clean data files are also presented. Finally, the document introduces R and RStudio as tools that can be used for data cleaning.
North Raleigh Rotarian Katie Turnbull gave a great presentation at our Friday morning extension meeting about data visualization. Katie is a consultant at research and advisory firm, Gartner, Inc.
Data visualization in data science: exploratory (EDA) and explanatory visualization, Anscombe's quartet, design principles, visual encoding, design engineering and journalism, choosing the right graph, narrative structures, technology and tools.
These slides are for the tutorial on how to use R language for data analysis and Machine Learning tasks.
The workshop was given at OSCON (Austin, TX), 2017
pandas: Powerful data analysis tools for Python (Wes McKinney)
Wes McKinney introduced pandas, a Python data analysis library built on NumPy. Pandas provides data structures and tools for cleaning, manipulating, and working with relational and time-series data. Key features include DataFrame for 2D data, hierarchical indexing, merging and joining data, and grouping and aggregating data. Pandas is used heavily in financial applications and has over 1500 unit tests, ensuring stability and reliability. Future goals include better time series handling and integration with other Python data science packages.
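The features named in this summary (DataFrame, merging and joining, grouping and aggregating, hierarchical indexing) can be illustrated with a minimal sketch. The toy tables and column names below are invented for illustration and do not come from the talk; only standard pandas calls are used.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
trades = pd.DataFrame({
    "ticker": ["AAA", "AAA", "BBB", "BBB"],
    "day": ["mon", "tue", "mon", "tue"],
    "price": [10.0, 12.0, 20.0, 18.0],
})
sectors = pd.DataFrame({
    "ticker": ["AAA", "BBB"],
    "sector": ["tech", "energy"],
})

# Merging and joining: a relational left join on the shared key column.
merged = trades.merge(sectors, on="ticker", how="left")

# Grouping and aggregating: mean price per sector.
mean_by_sector = merged.groupby("sector")["price"].mean()

# Hierarchical indexing: (ticker, day) becomes a two-level row index.
indexed = trades.set_index(["ticker", "day"])
aaa_tue = indexed.loc[("AAA", "tue"), "price"]

print(mean_by_sector)
print(aaa_tue)
```

The same three operations scale from this four-row table to the relational and time-series data the summary mentions; the sketch only shows their shape.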
Visualisation & Storytelling in Data Science & Analytics (Felipe Rego)
The document provides an overview of data visualization and storytelling in data science and analytics. It discusses key concepts like what data visualization is, compelling reasons to visualize data like Anscombe's Quartet, visualization in the context of analytics workflows, components of effective storytelling, considerations for presentation, guidelines for data storytelling, and examples of interesting data visualizations. Throughout the document, the author emphasizes best practices like keeping visualizations clear, addressing the intended audience, and avoiding bias.
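Why Anscombe's Quartet is a compelling reason to visualize data can be shown with a short sketch. It uses the first two of Anscombe's four published datasets and computes their summary statistics with the standard library only; the helper name `pearson` is an illustrative choice, not something taken from the slides.

```python
from math import sqrt
from statistics import mean

# First two datasets of Anscombe's quartet (Anscombe, 1973); they share x.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)

mean_y1, mean_y2 = mean(y1), mean(y2)
corr_1, corr_2 = pearson(x, y1), pearson(x, y2)

# The summaries are nearly identical although one set is roughly linear
# and the other follows a curve, which only a plot makes obvious.
print(round(mean_y1, 2), round(mean_y2, 2))
print(round(corr_1, 3), round(corr_2, 3))
```

Plotting `y1` and `y2` against `x` (with any charting tool) reveals the difference in shape that these matching numbers hide.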
Data visualization is the graphical representation of information and data. It is used to communicate data or information clearly and effectively to readers by leveraging the human mind's receptiveness to visual information. Effective data visualization can improve transparency and communication, answer questions, discover trends, find patterns, see data in context, support calculations, and present or tell a story. Common tools for data visualization include charts, graphs, maps, and diagrams. Specialized roles involved in data visualization include data visualization experts, data analysts, business intelligence consultants, tool-specific consultants, business analysts, and data scientists.
Data Visualisation & Analytics with Tableau (Beginner) - by Maria Koumandraki (Outreach Digital)
This document outlines a 7 step process for creating data visualizations in Tableau. It includes an agenda, descriptions of each step, and demos. The 7 steps are: 1) Connecting to data, 2) Cleaning and preparing data, 3) Creating initial visualizations using Show Me or drag and drop, 4) Editing visualizations, 5) Analyzing data and creating additional visualizations, 6) Creating interactive dashboards, and 7) Sharing visualizations. The presenter leads attendees through examples on air pollution data and life expectancy data to demonstrate the process.
This presentation about Hadoop architecture will help you understand the architecture of Apache Hadoop in detail. In this video, you will learn what Hadoop is, the components of Hadoop, what HDFS is, the HDFS architecture, Hadoop MapReduce, a Hadoop MapReduce example, Hadoop YARN and, finally, a demo on MapReduce. Apache Hadoop offers a versatile, adaptable and reliable distributed computing framework for big data, running on a group of systems that each contribute storage capacity and local computing power. After watching this video, you will also understand the Hadoop Distributed File System and its features, along with its practical implementation.
Below are the topics covered in this Hadoop Architecture presentation:
1. What is Hadoop?
2. Components of Hadoop
3. What is HDFS?
4. HDFS Architecture
5. Hadoop MapReduce
6. Hadoop MapReduce Example
7. Hadoop YARN
8. Demo on MapReduce
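As a rough illustration of the MapReduce model listed in the topics above, here is a single-process word count sketched in plain Python. It is not Hadoop code and not the demo from the presentation; the function names and the input records are hypothetical, and a real job would distribute the map and reduce steps across a cluster.

```python
from collections import defaultdict

def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of records."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(pairs))

# Hypothetical input records, invented for illustration.
docs = ["big data big ideas", "data beats ideas"]
counts = word_count(docs)
print(counts)
```

Because each map call looks at one record and each reduce call looks at one key, both steps can in principle run in parallel, which is the property a framework like Hadoop exploits.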
What are the course objectives?
This course will enable you to:
1. Understand the different components of Hadoop ecosystem such as Hadoop 2.7, Yarn, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro Schema, using Avro with Hive, and Sqoop and Schema evolution
7. Understand Flume, Flume architecture, sources, flume sinks, channels, and flume configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand Resilient Distributed Datasets (RDD) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL, creating, transforming, and querying Data frames
Who should take up this Big Data and Hadoop Certification Training Course?
Big Data career opportunities are on the rise, and Hadoop is quickly becoming a must-know technology for the following professionals:
1. Software Developers and Architects
2. Analytics Professionals
3. Senior IT professionals
4. Testing and Mainframe professionals
5. Data Management Professionals
6. Business Intelligence Professionals
7. Project Managers
8. Aspiring Data Scientists
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
This presentation on Spark Architecture will give you an idea of what Apache Spark is, the essential features of Spark, and the different Spark components. Here, you will learn about Spark Core, Spark SQL, Spark Streaming, Spark MLlib, and GraphX. You will understand how Spark processes an application and runs it on a cluster with the help of its architecture. Finally, you will perform a demo on Apache Spark. So, let's get started with Apache Spark Architecture.
YouTube Video: https://www.youtube.com/watch?v=CF5Ewk0GxiQ
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
Simplilearn’s Apache Spark and Scala certification training is designed to:
1. Advance your expertise in the Big Data Hadoop Ecosystem
2. Help you master essential Apache and Spark skills, such as Spark Streaming, Spark SQL, machine learning programming, GraphX programming and Shell Scripting Spark
3. Help you land a Hadoop developer job requiring Apache Spark expertise by giving you a real-life industry project coupled with 30 demos
What skills will you learn?
By completing this Apache Spark and Scala course you will be able to:
1. Understand the limitations of MapReduce and the role of Spark in overcoming these limitations
2. Understand the fundamentals of the Scala programming language and its features
3. Explain and master the process of installing Spark as a standalone cluster
4. Develop expertise in using Resilient Distributed Datasets (RDD) for creating applications in Spark
5. Master Structured Query Language (SQL) using SparkSQL
6. Gain a thorough understanding of Spark streaming features
7. Master and describe the features of Spark ML programming and GraphX programming
Who should take this Scala course?
1. Professionals aspiring for a career in the field of real-time big data analytics
2. Analytics professionals
3. Research professionals
4. IT developers and testers
5. Data scientists
6. BI and reporting professionals
7. Students who wish to gain a thorough understanding of Apache Spark
Learn more at https://www.simplilearn.com/big-data-and-analytics/apache-spark-scala-certification-training
This document discusses data visualization tools in Python. It introduces Matplotlib as the first and still standard Python visualization tool. It also covers Seaborn which builds on Matplotlib, Bokeh for interactive visualizations, HoloViews as a higher-level wrapper for Bokeh, and Datashader for big data visualization. Additional tools discussed include Folium for maps, and yt for volumetric data visualization. The document concludes that Python is well-suited for data science and visualization with many options available.
Bayesian Network Modeling using Python and R (PyData)
This document discusses Bayesian network modeling using Python and R. It begins with an introduction to Bayesian networks and their applications. It then outlines the main Bayesian network packages available in Python like scikit-learn, BayesPy, Bayes Blocks, and PyMC, and in R like bnlearn and RStan. It covers the basics of Bayes' theorem and how Bayesian networks represent probabilistic relationships between variables as a directed acyclic graph. The talk concludes with discussing algorithms for learning Bayesian networks from data and evaluating model performance.
Data visualization is the graphical representation of information and data using visual elements like charts, graphs, and maps to provide an accessible way to see and understand trends and patterns in data. It allows massive amounts of information to be analyzed and data-driven decisions to be made. Data visualization tells a story by removing noise from data and highlighting useful information. Common types include charts, graphs, maps, and infographics, with tools ranging from simple online options to more complex offline programs. The key is to focus on best practices and developing a personal style when creating visualizations.
The document provides an introduction and overview of an introductory course on visual analytics. It outlines the course objectives, which include fundamental concepts in data visualization and analysis, exposure to visualization work across different domains, and hands-on experience using data visualization tools. The course covers basic principles of data analysis, perception and design. It includes a survey of visualization examples and teaches students to apply these principles to create their own visualizations. The document also provides a weekly plan that includes topics like data processing, visualization design, cognitive science, and a review of best practices.
This document discusses big data and use cases. It begins by reviewing the history and evolution of big data and advanced analytics. It then explains how technologies like Hadoop, stream processing, and in-memory computing support big data solutions. The document presents two use cases - analyzing credit risk by examining customer transaction data to improve credit offers, and detecting fraud by analyzing financial transactions for unusual patterns that could indicate suspicious activity. It describes how these use cases leverage technologies like Oracle R Connector for Hadoop to run analytics and machine learning algorithms on large datasets.
The document discusses various techniques for visualizing data, from basic charts to approaches for big data. It covers common basic chart types like line graphs, bar charts, scatter plots, and pie charts. For big data, it addresses challenges like large data volumes, different data varieties, visualization velocity, and filtering. The document recommends understanding your data and goals to select the best visualizations, and introduces SAS Visual Analytics as a tool that performs automatic charting to help users visualize big data.
Bound Tech is a top institute that provides hands-on Tableau training taught by experienced trainers using real-world scenarios and examples. The training covers fundamental concepts, advanced concepts, and job-oriented skills over 50-60 hours. Students learn how to rapidly analyze data, create dashboards and reports, and share analytics using features of Tableau. The course also provides skills needed for roles like business analyst, data scientist, and Tableau developer.
This document discusses techniques for pre-processing big data to improve the quality of analysis. It covers exploring and cleaning data by handling missing values, reducing noise, and reducing dimensions. Data transformation techniques are also discussed, such as standardizing, aggregating, and joining data. Finally, the document emphasizes that data preparation is a key factor in model quality and generating insights from trusted data.
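Two of the steps named above, handling missing values and standardizing, can be sketched in a few lines. This is a minimal illustration under stated assumptions: mean imputation and z-score scaling are one common choice among several, the function names are hypothetical, and the sample column is invented.

```python
from statistics import mean, pstdev

def impute_with_mean(values):
    """Replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical raw column with one missing value.
raw = [2.0, None, 4.0, 6.0]
filled = impute_with_mean(raw)
scaled = standardize(filled)

print(filled)
print([round(z, 3) for z in scaled])
```

Mean imputation keeps the column mean unchanged but shrinks its variance, so whether it suits a given dataset is a judgment call rather than a default.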
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L... (Md. Main Uddin Rony)
This document discusses various machine learning evaluation metrics for supervised learning models. It covers classification, regression, and ranking metrics. For classification, it describes accuracy, confusion matrix, log-loss, and AUC. For regression, it discusses RMSE and quantiles of errors. For ranking, it explains precision-recall, precision-recall curves, F1 score, and NDCG. The document provides examples and visualizations to illustrate how these metrics are calculated and used to evaluate model performance.
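Several of the metrics named above follow directly from their textbook definitions. The sketch below computes a binary confusion matrix, accuracy, precision, recall, F1, and RMSE in plain Python; the labels and values are invented for illustration and are not the examples used in the slides.

```python
from math import sqrt

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def rmse(actual, predicted):
    """Root mean squared error for a regression model."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical classifier output.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Hypothetical regression output.
error = rmse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])

print(tp, fp, fn, tn)
print(accuracy, precision, recall, f1)
print(round(error, 4))
```

Log-loss, AUC, and NDCG need predicted scores or rankings rather than hard labels, so they are left out of this sketch.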
What are Hadoop Components? Hadoop Ecosystem and Architecture | Edureka (Edureka!)
YouTube Link: https://youtu.be/ll_O9JsjwT4
** Big Data Hadoop Certification Training - https://www.edureka.co/big-data-hadoop-training-certification **
This Edureka PPT on "Hadoop components" will provide you with detailed knowledge about the top Hadoop Components and it will help you understand the different categories of Hadoop Components. This PPT covers the following topics:
What is Hadoop?
Core Components of Hadoop
Hadoop Architecture
Hadoop EcoSystem
Hadoop Components in Data Storage
General Purpose Execution Engines
Hadoop Components in Database Management
Hadoop Components in Data Abstraction
Hadoop Components in Real-time Data Streaming
Hadoop Components in Graph Processing
Hadoop Components in Machine Learning
Hadoop Cluster Management tools
The document introduces data engineering and provides an overview of the topic. It discusses (1) what data engineering is, how it has evolved with big data, and the required skills, (2) the roles of data engineers, data scientists, and data analysts in working with big data, and (3) the structure and schedule of an upcoming meetup on data engineering that will use an agile approach over monthly sprints.
Web architecture today, the need for scalability in relational databases, and an introduction to NoSQL databases and their different types. The presentation video can be viewed at: http://youtu.be/oIpjcqHyx2M
This document provides an introduction to data science and analytics. It discusses why data science jobs are in high demand, what skills are needed for these roles, and common types of analytics including descriptive, predictive, and prescriptive. It also covers topics like machine learning, big data, structured vs unstructured data, and examples of companies that utilize data and analytics like Amazon and Facebook. The document is intended to explain key concepts in data science and why attending a talk on this topic would be beneficial.
Introduction to Data Science: presented by Dr. Sotarat Thammaboosadee, ITM Mahidol and Datalent Team. This presentation is a part of Data Science Clinic no.9 organized by Data Science Thailand, 8 March 2017 at All Season Place, Bangkok, Thailand.
Data visualization is the graphical representation of information and data. It is used to communicate data or information clearly and effectively to readers by leveraging the human mind's receptiveness to visual information. Effective data visualization can improve transparency and communication, answer questions, discover trends, find patterns, see data in context, support calculations, and present or tell a story. Common tools for data visualization include charts, graphs, maps, and diagrams. Specialized roles involved in data visualization include data visualization experts, data analysts, business intelligence consultants, tool-specific consultants, business analysts, and data scientists.
Data Visualisation & Analytics with Tableau (Beginner) - by Maria KoumandrakiOutreach Digital
This document outlines a 7 step process for creating data visualizations in Tableau. It includes an agenda, descriptions of each step, and demos. The 7 steps are: 1) Connecting to data, 2) Cleaning and preparing data, 3) Creating initial visualizations using Show Me or drag and drop, 4) Editing visualizations, 5) Analyzing data and creating additional visualizations, 6) Creating interactive dashboards, and 7) Sharing visualizations. The presenter leads attendees through examples on air pollution data and life expectancy data to demonstrate the process.
This presentation about Hadoop architecture will help you understand the architecture of Apache Hadoop in detail. In this video, you will learn what is Hadoop, components of Hadoop, what is HDFS, HDFS architecture, Hadoop MapReduce, Hadoop MapReduce example, Hadoop YARN and finally, a demo on MapReduce. Apache Hadoop offers a versatile, adaptable and reliable distributed computing big data framework for a group of systems with capacity limit and local computing power. After watching this video, you will also understand the Hadoop Distributed File System and its features along with the practical implementation.
Below are the topics covered in this Hadoop Architecture presentation:
1. What is Hadoop?
2. Components of Hadoop
3. What is HDFS?
4. HDFS Architecture
5. Hadoop MapReduce
6. Hadoop MapReduce Example
7. Hadoop YARN
8. Demo on MapReduce
What are the course objectives?
This course will enable you to:
1. Understand the different components of Hadoop ecosystem such as Hadoop 2.7, Yarn, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro Schema, using Arvo with Hive, and Sqoop and Schema evolution
7. Understand Flume, Flume architecture, sources, flume sinks, channels, and flume configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand resilient distribution datasets (RDD) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL, creating, transforming, and querying Data frames
Who should take up this Big Data and Hadoop Certification Training Course?
Big Data career opportunities are on the rise, and Hadoop is quickly becoming a must-know technology for the following professionals:
1. Software Developers and Architects
2. Analytics Professionals
3. Senior IT professionals
4. Testing and Mainframe professionals
5. Data Management Professionals
6. Business Intelligence Professionals
7. Project Managers
8. Aspiring Data Scientists
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
This presentation on Spark Architecture will give an idea of what is Apache Spark, the essential features in Spark, the different Spark components. Here, you will learn about Spark Core, Spark SQL, Spark Streaming, Spark MLlib, and Graphx. You will understand how Spark processes an application and runs it on a cluster with the help of its architecture. Finally, you will perform a demo on Apache Spark. So, let's get started with Apache Spark Architecture.
YouTube Video: https://www.youtube.com/watch?v=CF5Ewk0GxiQ
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course have been designed to impart an in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
Simplilearn’s Apache Spark and Scala certification training are designed to:
1. Advance your expertise in the Big Data Hadoop Ecosystem
2. Help you master essential Apache and Spark skills, such as Spark Streaming, Spark SQL, machine learning programming, GraphX programming and Shell Scripting Spark
3. Help you land a Hadoop developer job requiring Apache Spark expertise by giving you a real-life industry project coupled with 30 demos
What skills will you learn?
By completing this Apache Spark and Scala course you will be able to:
1. Understand the limitations of MapReduce and the role of Spark in overcoming these limitations
2. Understand the fundamentals of the Scala programming language and its features
3. Explain and master the process of installing Spark as a standalone cluster
4. Develop expertise in using Resilient Distributed Datasets (RDD) for creating applications in Spark
5. Master Structured Query Language (SQL) using SparkSQL
6. Gain a thorough understanding of Spark streaming features
7. Master and describe the features of Spark ML programming and GraphX programming
Who should take this Scala course?
1. Professionals aspiring for a career in the field of real-time big data analytics
2. Analytics professionals
3. Research professionals
4. IT developers and testers
5. Data scientists
6. BI and reporting professionals
7. Students who wish to gain a thorough understanding of Apache Spark
Learn more at https://www.simplilearn.com/big-data-and-analytics/apache-spark-scala-certification-training
This document discusses data visualization tools in Python. It introduces Matplotlib as the first and still standard Python visualization tool. It also covers Seaborn which builds on Matplotlib, Bokeh for interactive visualizations, HoloViews as a higher-level wrapper for Bokeh, and Datashader for big data visualization. Additional tools discussed include Folium for maps, and yt for volumetric data visualization. The document concludes that Python is well-suited for data science and visualization with many options available.
Bayesian Network Modeling using Python and RPyData
This document discusses Bayesian network modeling using Python and R. It begins with an introduction to Bayesian networks and their applications. It then outlines the main Bayesian network packages available in Python like scikit-learn, BayesPy, Bayes Blocks, and PyMC, and in R like bnlearn and RStan. It covers the basics of Bayes' theorem and how Bayesian networks represent probabilistic relationships between variables as a directed acyclic graph. The talk concludes with discussing algorithms for learning Bayesian networks from data and evaluating model performance.
Data visualization is the graphical representation of information and data using visual elements like charts, graphs, and maps to provide an accessible way to see and understand trends and patterns in data. It allows massive amounts of information to be analyzed and data-driven decisions to be made. Data visualization tells a story by removing noise from data and highlighting useful information. Common types include charts, graphs, maps, and infographics, with tools ranging from simple online options to more complex offline programs. The key is to focus on best practices and developing a personal style when creating visualizations.
The document provides an introduction and overview of an introductory course on visual analytics. It outlines the course objectives, which include fundamental concepts in data visualization and analysis, exposure to visualization work across different domains, and hands-on experience using data visualization tools. The course covers basic principles of data analysis, perception and design. It includes a survey of visualization examples and teaches students to apply these principles to create their own visualizations. The document also provides a weekly plan that includes topics like data processing, visualization design, cognitive science, and a review of best practices.
This document discusses big data and use cases. It begins by reviewing the history and evolution of big data and advanced analytics. It then explains how technologies like Hadoop, stream processing, and in-memory computing support big data solutions. The document presents two use cases - analyzing credit risk by examining customer transaction data to improve credit offers, and detecting fraud by analyzing financial transactions for unusual patterns that could indicate suspicious activity. It describes how these use cases leverage technologies like Oracle R Connector for Hadoop to run analytics and machine learning algorithms on large datasets.
The document discusses various techniques for visualizing data, from basic charts to approaches for big data. It covers common basic chart types like line graphs, bar charts, scatter plots, and pie charts. For big data, it addresses challenges like large data volumes, different data varieties, visualization velocity, and filtering. The document recommends understanding your data and goals to select the best visualizations, and introduces SAS Visual Analytics as a tool that performs automatic charting to help users visualize big data.
Bound Tech is a top institute that provides hands-on Tableau training taught by experienced trainers using real-world scenarios and examples. The training covers fundamental concepts, advanced concepts, and job-oriented skills over 50-60 hours. Students learn how to rapidly analyze data, create dashboards and reports, and share analytics using features of Tableau. The course also provides skills needed for roles like business analyst, data scientist, and Tableau developer.
This document discusses techniques for pre-processing big data to improve the quality of analysis. It covers exploring and cleaning data by handling missing values, reducing noise, and reducing dimensions. Data transformation techniques are also discussed, such as standardizing, aggregating, and joining data. Finally, the document emphasizes that data preparation is a key factor in model quality and generating insights from trusted data.
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L... Md. Main Uddin Rony
This document discusses various machine learning evaluation metrics for supervised learning models. It covers classification, regression, and ranking metrics. For classification, it describes accuracy, confusion matrix, log-loss, and AUC. For regression, it discusses RMSE and quantiles of errors. For ranking, it explains precision-recall, precision-recall curves, F1 score, and NDCG. The document provides examples and visualizations to illustrate how these metrics are calculated and used to evaluate model performance.
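A few of the metrics named above can be sketched in plain Python. This is a minimal illustration, not the slide deck's own code: the function names and the toy labels are assumptions made for the example, and only the standard library is used.

```python
import math
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the actual label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 score for one class treated as positive."""
    cm = confusion_matrix(y_true, y_pred)
    tp = cm[(positive, positive)]
    fp = sum(n for (t, p), n in cm.items() if p == positive and t != positive)
    fn = sum(n for (t, p), n in cm.items() if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def rmse(y_true, y_pred):
    """Root mean squared error for regression predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data, invented for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
preds = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(labels, preds))
print(precision_recall_f1(labels, preds))
print(rmse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
```

Log-loss, AUC and NDCG need predicted scores rather than hard labels, so they are left out of this sketch.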
What are Hadoop Components? Hadoop Ecosystem and Architecture | Edureka Edureka!
YouTube Link: https://youtu.be/ll_O9JsjwT4
** Big Data Hadoop Certification Training - https://www.edureka.co/big-data-hadoop-training-certification **
This Edureka PPT on "Hadoop components" will provide you with detailed knowledge about the top Hadoop Components and it will help you understand the different categories of Hadoop Components. This PPT covers the following topics:
What is Hadoop?
Core Components of Hadoop
Hadoop Architecture
Hadoop EcoSystem
Hadoop Components in Data Storage
General Purpose Execution Engines
Hadoop Components in Database Management
Hadoop Components in Data Abstraction
Hadoop Components in Real-time Data Streaming
Hadoop Components in Graph Processing
Hadoop Components in Machine Learning
Hadoop Cluster Management tools
The document introduces data engineering and provides an overview of the topic. It discusses (1) what data engineering is, how it has evolved with big data, and the required skills, (2) the roles of data engineers, data scientists, and data analysts in working with big data, and (3) the structure and schedule of an upcoming meetup on data engineering that will use an agile approach over monthly sprints.
Web architecture today, the need for scalability in relational databases, and an introduction to NoSQL databases and their different types. The presentation video can be viewed at: http://youtu.be/oIpjcqHyx2M
This document provides an introduction to data science and analytics. It discusses why data science jobs are in high demand, what skills are needed for these roles, and common types of analytics including descriptive, predictive, and prescriptive. It also covers topics like machine learning, big data, structured vs unstructured data, and examples of companies that utilize data and analytics like Amazon and Facebook. The document is intended to explain key concepts in data science and why attending a talk on this topic would be beneficial.
Introduction to Data Science: presented by Dr. Sotarat Thammaboosadee, ITM Mahidol and Datalent Team. This presentation is a part of Data Science Clinic no.9 organized by Data Science Thailand, 8 March 2017 at All Season Place, Bangkok, Thailand.
Let's do some thinking about data visualisation thinking Andy Kirk
"Let's do some thinking about data visualisation thinking" talk given by Andy Kirk at the 'Data Visualization Group in the Bay Area' Meetup at the University of San Francisco, on Thursday 23rd October 2014 (http://www.meetup.com/visualizemydata/events/212438912/)
This document introduces infographics and data visualization. It defines infographics as visual representations of information used to support and strengthen information in a sensitive context, while data visualization visually displays measured quantities using points, lines, and other graphical elements. The document provides examples of effective data visualization patterns and preattentive variables that convey information preattentively. It also discusses interactivity and categories of visualization, looking at examples from Descry, Gapminder, and Google Visualization.
Big Data, Data Science, Machine Intelligence and Learning: Demystification, T... Prof. Dr. Diego Kuonen
Keynote presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on March 14, 2017 at Eurostat's international conference "New Techniques and Technologies for Statistics (NTTS) 2017" in Brussels, Belgium.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
Mean, Median, Mode: Measures of Central Tendency Jan Nah
There are three common measures of central tendency: mean, median, and mode. The mean is the average value found by dividing the sum of all values by the total number of values. The median is the middle value when values are arranged from lowest to highest. The mode is the value that occurs most frequently. Each measure provides a single number to represent the central or typical value in a data set.
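As a quick illustration of the three measures, here is a minimal sketch using Python's standard `statistics` module. The `scores` list is hypothetical data invented for the example.

```python
import statistics

# Hypothetical data set, invented for illustration.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of values divided by their count
median = statistics.median(scores)  # middle value; average of the two middle values here
mode = statistics.mode(scores)      # most frequent value

print(mean, median, mode)
```

With an even number of values, as here, the median is the average of the two middle values (3 and 5).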
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION Elvis Muyanja
Today, data science is enabling companies, governments, research centres and other organisations to turn their volumes of big data into valuable and actionable insights. It is important to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. According to the McKinsey Global Institute, the U.S. alone could face a shortage of about 190,000 data scientists and 1.5 million managers and analysts who can understand and make decisions using big data by 2018. In coming years, data scientists will be vital to all sectors —from law and medicine to media and nonprofits. Has the African continent planned to train the next generation of data scientists required on the continent?
This document summarizes key concepts from an introduction to statistics textbook. It covers types of data (quantitative, qualitative, levels of measurement), sampling (population, sample, randomization), experimental design (observational studies, experiments, controlling variables), and potential misuses of statistics (bad samples, misleading graphs, distorted percentages). The goal is to illustrate how common sense is needed to properly interpret data and statistics.
This document discusses probability distributions and key concepts related to discrete random variables including:
- Distinguishing between discrete and continuous random variables
- Constructing a discrete probability distribution from sample data and calculating probabilities
- Finding the mean, variance, and standard deviation of a discrete probability distribution
- Calculating the expected value of a discrete random variable from its possible outcomes and probabilities
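The steps listed above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions: the helper names and the sample (number of heads in two coin flips, observed four times) are invented, not taken from the document.

```python
import math
from collections import Counter


def distribution_from_sample(sample):
    """Build a discrete probability distribution from observed relative frequencies."""
    counts = Counter(sample)
    n = len(sample)
    return {value: count / n for value, count in counts.items()}


def expected_value(dist):
    """Mean of a discrete distribution: sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Variance: sum of (x - mean)^2 * P(x)."""
    mu = expected_value(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())


# Hypothetical sample: heads counted in two coin flips, repeated four times.
dist = distribution_from_sample([0, 1, 1, 2])
assert math.isclose(sum(dist.values()), 1.0)  # probabilities must sum to 1

mu = expected_value(dist)
var = variance(dist)
sd = math.sqrt(var)
print(dist, mu, var, sd)
```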
This document discusses data visualization and its implications for evaluation. It defines data visualization as using qualitative or quantitative data to produce a representative image that can be understood by viewers and supports exploration, examination and communication of the data. The document outlines the history of data visualization including its roots in cartography, statistics and data visualization thinking. It also discusses how data visualization can be used in evaluation processes like understanding, collecting, analyzing and communicating data. Some limitations of data visualization are identified as well, such as data quality, determining causation and whether it supports or misleads. The document concludes with a discussion of visual ethics and contact information for Amy Germuth.
Graphs can effectively visualize data relationships but must be designed carefully.
Bar charts and pie charts are appropriate for discrete categorical data. Bar charts compare frequencies through bar heights (or lengths) proportional to each category's count. Pie charts show category proportions through wedge sizes.
Histograms and line graphs effectively display continuous and some ordinal data. Histograms use bar widths to represent value ranges. Line graphs connect data points to show trends over intervals.
However, graphs can mislead if they distort data scales, use inappropriate or confusing designs, or include non-data elements. Data representation and readability should always be prioritized in graph creation.
Information design is both a technical skill and an art form. To design great visualizations requires a diverse range of skill sets and a keen ability to understand the decisions to be made, the data available, the tools and platforms available for visualization design, and how to apply design best practices to create effective visualizations that communicate clearly. Even the most robust routine health information systems face challenges around how to visualize data in a way that facilitates decision-making by key stakeholders.
Trimmed version of the presentation given in New York on Thursday 16th May. Also essentially the same slide deck presented at the IDA Talks event in London on Wednesday 8th May.
Publish versin host monitoring and outbound load balancing(0915113656) gmolina200
The WAN2 default route is the backup route. We only need to enable route failover monitoring on the primary route (WAN1 default route) to detect if it fails over and use the backup WAN2 route instead. Since the WAN2 route is only used if the primary fails, we don't need additional route monitoring configured on it.
This document provides an overview of statistical inference. It discusses descriptive statistics, which summarize data, and inferential statistics, which are used to generalize from samples to populations. Key concepts covered include estimation, hypothesis testing, parameters, statistics, confidence intervals, significance levels, types of errors. Examples are given of how to calculate confidence intervals for means and proportions and how to perform hypothesis tests using z-tests and t-tests. Steps for conducting hypothesis tests are outlined.
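A confidence interval for a mean and a z-test statistic can be sketched as follows. This is a minimal example, not the document's own worked example: the function names and the sample are assumptions, and the interval uses the normal (z) critical value; for small samples a t critical value would usually be more appropriate.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem


def z_statistic(sample, mu0):
    """Test statistic for H0: population mean equals mu0."""
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - mu0) / sem


# Hypothetical measurements, invented for illustration.
sample = [58, 62, 65, 60, 55, 70, 63, 59, 61, 67]
low, high = mean_confidence_interval(sample)
print(low, high, z_statistic(sample, 60))
```

The null hypothesis is rejected at the chosen significance level when the absolute value of the statistic exceeds the corresponding critical value (about 1.96 for a two-sided test at the 5% level).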
The field-guide-to-data-science 2015 (second edition) By Booz | Allen | Hamilton Arysha Channa
Foreword: Data science touches aspects of our lives on a daily basis. When we visit the doctor, drive our cars, get on an airplane, or shop for services, data science is changing the way we interact with and explore our world.
The document discusses various topics related to distributed systems including network types, communication models, and protocols. It defines networks as interconnected resources including hosts, infrastructure, and network devices. It describes different types of networks such as LANs, WANs, and the Internet. It also covers topics such as network addressing, data encapsulation, communication models including message passing and streaming, and protocols like TCP and UDP.
This document contains tables listing statistics items for measuring traffic in a Base Station Controller (BSC). It includes items for measuring access procedures like paging, channel assignment, and call setup. It also includes items for handover procedures within the BSC and between BSCs. Measurement items cover traffic on the A interface, the Abis interface, and within the BSC itself.
This document discusses the history of chocolate production in Europe. It details how chocolate was first introduced from Central and South America to Spain and other European countries in the 16th century. However, it took until the 18th and 19th centuries for chocolate to become popularized and for chocolate factories to start mass-producing chocolate candy and other products for widespread consumption across Europe.
A cognitive architecture-based modelling approach to understanding biases in ... University of Huddersfield
Title: "A cognitive architecture-based modelling approach to understanding biases in visualisation behaviour". A talk given at the "Dealing with Cognitive Biases in Visualisations" (DECISIVe 2014) workshop at IEEE VIS, Paris, November 2014.
Title: "Sources of bias when working with visualisations". Introduction to the "Dealing with Cognitive Biases in Visualisations" (DECISIVe 2014) workshop at IEEE VIS, Paris, November 2014.
Evidenced based practice In this writing, locate an article pert.docx turveycharlyn
Evidence-based practice
In this writing, locate an article pertaining to the topic below. Choose your article wisely, because you will be incorporating the article into all three of your writing assignments this session. In this writing, please discuss how this (one) article will be beneficial to your assigned topic. (The article should report research conducted in the United States.) Also state what you will be focusing on.
Topic: Preventing Healthcare Associated Infections.
This should be a page. Do not use direct quotes, but paraphrase. Also, cite the article you chose in APA 6th edition format.
Research Design: Observational and Correlational Studies
Video Title: Research Design: Observational and Correlational Studies
Originally Published: 2011
Publishing Company: SAGE Publications, Inc
City: Thousand Oaks, USA
ISBN: 9781483397108
DOI: https://dx.doi.org/10.4135/9781483397108
(c) SAGE Publications, Inc., 2011
NARRATOR: Research Design: Observational and Correlational Studies. Since the moment you were born, you've been exploring the world around you. In a sense, you've been conducting research. You've noticed the ways people interact with each other, the relative sizes of objects, and how the colors of nature change with the seasons. Each of us is an amateur researcher, observing, analyzing, and drawing conclusions about everything we see. In order to conduct a more formal study whose conclusions you can share with others, you need to apply scientific methods to your research.
NARRATOR [continued]: Knowing about scientific research methods will also help you understand, interpret, and be more analytical in your thinking about studies you read about in textbooks, journals, newspapers, or online. To make sure your research is as strong as possible, let's talk about designing your study and interpreting your results.
NARRATOR [continued]: Specifically, we'll focus on some overarching types of research studies, when to use an observational design, along with some advantages and disadvantages, two different types of observational design, those that you conduct in the field and those that you conduct in a laboratory, analyzing data from an observational study, including some statistical methods, when to use a correlational design, along with some advantages and disadvantages, how to design and implement one, and analyzing data from a correlational study.
NARRATOR [continued]: Before we begin to explore research designs, it is important to understand the terms "variable" and "construct." These terms are used interchangeably and are found throughout scientific literature.
NICOLE CAIN: A "construct," which can also be called a "variable," is a topic of interest that varies from person to person. Some examples of constructs that researchers .
Theory building is an important tool in research comprehension. This presentation explains concepts, abstraction, and inductive and deductive research by working through the stages of research.
This document provides an overview of an introductory course on statistical concepts at the University of South Florida. It outlines the course objectives, which are to identify the course structure, recap foundational statistics concepts, and identify the programming structure in SAS. The agenda covers topics like data analytics, probability, statistical inference, distributions, and SAS basics. It also discusses key statistical thinking concepts like variation, inference from data, and the relationship between data, information, knowledge and wisdom. Hypothesis testing and its errors and power are explained. Issues with correlated data are also covered.
The Dark Art: Is Music Recommendation Science a Science mpapish
The document discusses whether music recommendation is more of an art than a science. It outlines how early approaches to music information retrieval (MIR) were more scientific but limitations have been reached. Holistic metrics that measure user trust and satisfaction are suggested to better evaluate recommendations but are not fully objective or standardized. The field may be transitioning from a science to more of a practical challenge dominated by psychological and subjective factors, entering an "Age of the Dark Art". A way forward could involve focusing on unsolved MIR problems or adjourning the workshop discussions.
The document discusses measurement concepts in research design and provides learning outcomes related to measurement. It covers determining what needs to be measured to address research questions, different levels of scale measurement, forming indexes and composite measures, criteria for good measurement, and assessing reliability and validity. It also discusses measuring attitudes using rating scales, ranking techniques, and sorting. Major issues in measurement scale selection and designing questionnaires are presented.
Mba2216 week 07 08 measurement and data collection forms Stephen Ong
This document discusses research design and measurement concepts related to data collection forms. It begins with learning outcomes, which focus on measurement scales, concepts, attitudes, and questionnaire design. It then covers determining what to measure based on research questions, operationalizing concepts, and different levels of measurement scales including nominal, ordinal, interval, and ratio. The document also discusses techniques for measuring attitudes, such as ranking, rating, sorting, and choice. Specific scales are described like Likert scales, semantic differentials, and category scales. Guidelines are provided for selecting a measurement scale based on objectives and properties of the data.
Is it important to explain a theorem? A case study in UML and ALCQI Alexandre Rademaker
The document discusses conceptual modeling from a logical point of view. It outlines the main steps of conceptual modeling as observing the world, determining relevance, choosing terminology, writing axioms, and verifying correctness. It notes that steps 1-2 can use informal notations like UML but are essentially an "art". Step 5 of verification demands significant knowledge of the model. The document also discusses using logic to explain theorems proven from an ontology, providing examples of proofs using tableaux and sequent calculus that the ontology implies a disjunction.
Focus on what you learned that made an impression, what may have s.docx keugene1
Focus on what you learned that made an impression, what may have surprised you, and what you found particularly beneficial and why. Specifically:
What did you find that was really useful, or that challenged your thinking?
What are you still mulling over?
Was there anything that you may take back to your classroom?
Is there anything you would like to have clarified?
ANSWER THE ABOVE QUESTIONS BASED ON THE DOCUMENTS BELOW
Introduction & Goals
This week, we will investigate the distribution of a variable and look at ways to best see the key features of a quantitative variable’s distribution. We will look at visualizations of data, including line plots, frequency tables, stemplots, and histograms. We will hone our ability to describe key features of a distribution from visualizations and use them to compare distributions. We will begin to think about ideas for the Comparative Study by brainstorming in our project groups.
Goals:
Reinforce the idea that data will vary
Explain what the distribution of a variable is
Identify five key features of a distribution: center, spread, shape, clusters & outliers
Identify and create appropriate displays for categorical and quantitative data in one variable, including bar graphs, line plots, frequency tables, and histograms
Analyze distributions using stemplots and histograms
Recognize advantages and limitations of histograms
Begin to explore technology for use in statistics
Begin work on Comparative Study Final Project
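To make the stemplot display named in the goals concrete, here is a minimal Python sketch. The function name and the sample values (estimates, in seconds, of how long a minute lasts) are hypothetical and not taken from the course data; the sketch assumes non-negative whole numbers, with the tens as stems and the units as leaves.

```python
from collections import defaultdict


def stemplot(values):
    """Return the lines of a stem-and-leaf plot for non-negative integers."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)  # tens digit(s) = stem, units digit = leaf
    lines = []
    # Include empty stems so that gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


# Hypothetical estimates in seconds, invented for illustration.
estimates = [48, 52, 55, 58, 60, 61, 61, 63, 67, 72, 85]
for line in stemplot(estimates):
    print(line)
```

Read sideways, the rows show the same shape a histogram with bins of width 10 would show, while keeping every individual value recoverable.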
DOW #2: How Long Is A Minute?
In week 1, we gathered data for this week’s DoW, addressing the question:
“How long is a minute to an adult?”
This week we'll:
In investigations 1 & 2, you will analyze the data with dot plots, frequency tables, stemplots, and histograms.
In Exercise B2, you will post your initial analysis and interpretation to the discussion board by Wednesday, 10 PM EST and create at least three follow-up posts by Friday, 10 PM EST.
In Exercises D2 & E2, you will post your best histogram to the discussion board by Friday, 10 PM EST. Compare the histograms and choose the one you think best represents the distribution by Sunday, 10 PM EST.
Investigation 1: Seeing the Distribution
As we emphasized in Week 1, data varies. This point may seem trivial, but it encapsulates one of the most fundamental concepts of statistics: variability. Statistical analysis is really a study of the patterns we find within this variation in the data. The pattern(s) in the variation is called the distribution of the variable. Much of statistics focuses on ways to represent and describe the distribution of a variable.
Activities A & B in this investigation focus on representing and describing the distribution.
Activity C introduces Excel as a tool for looking at a distribution.
Inv 1, Activity A: Patterns in the Variation
As we emphasized in Week 1, data varies. This point may seem trivial, but it encapsulates one of the most fundamental concepts of statistics: variability. Statistical Analy.
Data Analysis - How to Make Evidence from Data Ryo Onozuka
1) The document discusses data analysis and the process of decision making, including defining problems, gathering relevant data, analyzing data to make evidence, hypothesizing, selecting hypotheses, and informing others of decisions.
2) It summarizes a case study where the problem of slowing growth at P&G is analyzed through gathering accounting data, setting hypotheses about profitability and restructuring, and testing hypotheses through data comparisons.
3) The analysis finds evidence that profitability increased after restructuring, but it is unclear if restructuring directly increased sales. The document concludes by reiterating the problem definition and measurement could be improved by comparing overall profits to local segment profits.
This document provides information on various qualitative data collection methods, with a focus on observation and interview techniques. It discusses three broad categories of data collection: indirect observation, direct observation, and elicitation. For observation, it describes different types of observers and challenges of observation. It emphasizes the importance of practicing observational skills. For interviews, it outlines types of interviews, issues to consider, and describes the in-depth interview process from planning to conducting the interview. Key functions of communication in interviews are also summarized.
Real-life Data Visualization - guest lecture for McGill INSY-442 Mike Deutsch
Guest lecture given to McGill University undergrad class on Business Intelligence & Analytics, April 2014. Narrative: Data Visualization defined; What *good* visualization is; Visualization in business; a final Exercise in visualizing Higher Education Research data.
Running head ONline analytical process1ONline analytical proce.docx toltonkendal
Running head: ONLINE ANALYTICAL PROCESS
Online Analytical Process and Data Cube
Vaishnavi Gunnam
SEC 6050
Wilmington University
Introduction
Online Analytical Processing (OLAP) is among the most powerful technologies used for knowledge discovery in vast database environments. The key part of the OLAP model is the data cube, a multidimensional arrangement of aggregated values that provides a sophisticated model for decision support. OLAP is the foundation for numerous business applications, including sales and market analysis, planning, accounting, and performance evaluation. Unlike statistical databases, which usually store census and economic data, OLAP is primarily used for analyzing business data collected from daily transactions, such as sales data and health care data.
The main purpose of an OLAP system is to enable analysts to construct a mental image of the underlying data by exploring it from different perspectives, at different levels of generalization, and in an interactive manner. OLAP interacts with other components, such as data warehouses and data mining, to assist analysts in making business decisions.
A data cube is a multidimensional structure that allows users to analyze data collected from various sources for different purposes, taking three different factors into account at the same time. The data cube was proposed as a SQL operator to support common OLAP tasks like histograms and subtotals (Wang, Jajodia, & Wijesekera, 2010).
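The idea of a cube holding subtotals along every combination of dimensions can be sketched in Python. This is an illustrative toy, not the SQL operator itself: the function name, the "ALL" placeholder, and the sales rows are assumptions made for the example.

```python
from collections import defaultdict
from itertools import product


def data_cube(rows, dims, measure):
    """Sum `measure` over every combination of kept and rolled-up dimensions."""
    cube = defaultdict(float)
    for row in rows:
        # Each dimension is either kept at its value or rolled up to "ALL".
        for mask in product([False, True], repeat=len(dims)):
            key = tuple(row[d] if keep else "ALL" for d, keep in zip(dims, mask))
            cube[key] += row[measure]
    return dict(cube)


# Hypothetical sales records, invented for illustration.
sales = [
    {"region": "East", "product": "A", "quarter": "Q1", "amount": 100.0},
    {"region": "East", "product": "B", "quarter": "Q1", "amount": 50.0},
    {"region": "West", "product": "A", "quarter": "Q2", "amount": 70.0},
]

cube = data_cube(sales, ["region", "product", "quarter"], "amount")
print(cube[("East", "ALL", "ALL")])  # subtotal for the East region
print(cube[("ALL", "A", "ALL")])     # subtotal for product A
print(cube[("ALL", "ALL", "ALL")])   # grand total
```

Moving from a detailed key such as `("East", "A", "Q1")` to `("East", "ALL", "ALL")` corresponds to aggregating away the product and quarter dimensions. A real OLAP engine would precompute or index these aggregates rather than enumerate them per row.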
Uses of OLAP Data Cube
OLAP data cubes are the most advanced technology used to analyze data in huge data environments. There are many applications and uses of OLAP data cubes; the following are some of the uses of OLAP databases in various fields:
· On-Line Analytical Processing (OLAP) techniques are increasingly used in decision support systems to analyze data. The queries posed to decision support systems are very complex and need different views of the data, so OLAP data cubes are used to provide multidimensional views and help analyze the data for the required results (Blanco et al., 2015).
· On-Line Analytical Processing systems give analysts and managers insight into the performance of the organization by offering various views of the data that reflect its multidimensional nature (Blanco et al., 2015).
· A logically dimensional model is represented by a cube. Tools facilitate updating and maintaining the cube that embodies the multidimensional model, which makes the cube easier to set up, easier to maintain effectively, and as intuitive to use as possible (Blanco et al., 2015).
Operations of Data Cubes
To support OLAP, the data cube should provide the following capabilities.
Roll-up
Roll-u ...
Statistics is not inherently difficult. Those who view it as hard tend to struggle more, while those with a growth mindset find it easier. There are some key lessons that help make statistics more intuitive: 1) It involves philosophical questions about data analysis approaches. 2) It works backwards from data to theories, rather than evaluating theories based on data. 3) Failing to find an effect is different than proving no effect exists. With practice analyzing diverse datasets, statisticians develop expertise in applying these lessons. Regular practice is important for mastering any skill like statistics.
Research
Student’s Name
University Affiliation
This is what my professor sent us yesterday through e-mail.
“Your response should look like this:
1. Introduction: Brief description of the study including the purpose and importance of the research question being asked.
Include your response in paragraph form here.
2. What is the null hypothesis? What is the research hypothesis?
Include your response in paragraph form here.
etc.
This format should help you to address EVERY question asked and it helps me in grading.
You should also be defining the statistical test, its requirements, etc and providing an intext citation for this summary. For example: What is a t test? When should it be used? What are the requirements? Why is the particular test appropriate to answer your research hypothesis? “
Research Hypothesis
Students who take breakfast perform better in class that those who don’t (FRAGMENT SENTENCE)
Research question
Does talking breakfast improve the class performance of a student?
Introduction
This research is meant to bring out the relationship between two variables and state how one affects the other. It has a dependent variable as well as independent variables. It should determine whether or not taking breakfast affects the class performance of a student.
Null hypothesis
Talking breakfast does not improve class performance of a student (SPELLING AND PUNCTUATION)
Sampling method
The method used to come up with the sample was the random sampling method. It is because it is unbiased and each unit of the population has an equal chance to be selected and used for the study. There is the assurance that the population will be equally sampled. (REWORD BECAUSE IT IS AWKAWARD)
Sample size
Determination of the right sample size is important because it makes the result of the research more true and reliable. 100 students were selected randomly for the purpose of the research.
Population of interest
The population that was targeted by this research is the entire student body. But since it is not practical to study every unit of the population, the sample was selected to represent the whole population. (YOU CANNOT START THE SENTENCE WITH “BUT” SINCE IT’S A RESEARCH PAPER) The sample size is big enough to represent the real picture depicted by the population under study.
Data analysis
The data was randomly collected from the students in order to create a sample to be used for the study. The data that was collected was analyzed so that to determine if there exists (REWORD) any relationship between breakfast and class performance.
Statistical analysis
This research is to determine the relationship between variables. Correlation refers to how strong two variables are related (Ashley Crossman). (NEEDS TO BE IN PROPER APA FORMAT) Correlation analysis was the most relevant in this case given the fact that the study is to determine the relation between variables. You need to answer this question: Depending on if you are using corr ...
A presentation on how to do ecological research without having to think too much (because thinking hurts). Presented at the National Centre for Biological Sciences, Bengaluru, India, on 15 July 2015. A version of this was first presented at the YETI conference at the Wildlife Institute of India in August 2012.
This document discusses the importance of establishing a theoretical framework in research. It provides examples of common concepts used in library and information science research like information needs, effectiveness, expectations, and value. A good theoretical framework advances knowledge, provides patterns to interpret data, and links studies together. The document also discusses components to consider when developing a logical structure for a study, including identifying what, who, where, when, and how research questions. Establishing objectives for a study is also reviewed.
The document discusses different approaches to representing ideas and problems when developing software systems, including icons, prototypes, metaphors, and propositions. It uses the example of developing a wearable technology system called "Psyche" to monitor the mental health of patients. The team developing Psyche considers representing the problem using each of the four approaches to help define key objects, events, and qualities to address. They choose to use an icon representation focused on timely response to emerging conditions and problem identification for citizens.
1. Definitions Typologies Good vs bad Tables Principles Before After Visual perception An example What to remember Référ
Data Visualization for
Data Science
Principles in action
Christophe Bontemps
Toulouse School of Economics, INRA
2. MY JOB
3-6. WHY AM I HERE?
From Huff (1993)
7. BEFORE WE START
Let’s do a simple exercise (from Buja et al. (2009))
8-9. THE “VISUAL PERCEPTION” OF A GRAPHIC
(Source: Buja et al. (2009))
10-13. “VISUAL PERCEPTION” AS A STATISTICAL TEST
“The human eye acts as a broad feature detector and general statistical test.” Buja et al. (2009)
Test: H0: {There is "nothing"} = {No relation}
H1: {There is "something"} = {There is some relation (correlation, linearity, heterogeneity, groups, ...)}
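One way to make this test concrete is a permutation check: under H0 ("no relation"), shuffling the y values relative to the x values should give data that look much like the observed data. The sketch below is only an illustration of that idea, not the procedure of Buja et al. (2009); the function names `pearson_r` and `permutation_p_value` and the toy data are assumptions introduced here.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Share of shuffled data sets whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)  # breaks any x-y relation: a draw under H0
        if abs(pearson_r(x, y_shuffled)) >= observed:
            hits += 1
    # The +1 terms count the observed data as one of the permutations.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): y increases with x, plus a small wiggle.
x = list(range(20))
y = [2 * v + (v % 3) for v in x]
p = permutation_p_value(x, y)
print(f"observed r = {pearson_r(x, y):.3f}, permutation p-value = {p:.4f}")
```

A small p-value means that shuffled ("nothing") data rarely look as structured as the observed data, which is what the eye judges when it picks out a pattern in a plot.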
14-18. “VISUAL PERCEPTION” AS A COMPARISON
What do you see here?
Difficult to see the maximum/minimum of each curve...
Idea shared by Gelman (2004) and Munzner (2014)
19-27. WHAT IS DATA VISUALIZATION?
It is a representation, a function of the data
A statistic, too, is a function or a summary of the data
So, it is a sort of statistic
It can be descriptive or inferential
Two- or multi-dimensional
Static or dynamic
Informative or not
Misleading or accurately representing the data
Beautiful or ugly...
28-31. WHAT IS DATA VISUALIZATION?
For Tukey (1977), “The greatest value of a picture is when it forces us to notice what we never expected to see.”
Cleveland (1994) says that “graphical methods and techniques are powerful tools for showing the structure of data. The material is relevant for data analysis, when the analyst wants to study data, and for data communication, when the analyst wants to communicate data to others.”
Bertin (2005) (translated in Bertin (1983)) defines it as a "visual language" and, as such, with a semiology, i.e. with a theory of the functions of signs and symbols.
Tufte (2001): “Graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore and summarize a set of numbers - even a large set - is to look at pictures of those numbers.”
32-35. SO WHAT?
Data visualisation serves different purposes:
Exploratory data analysis
Statistical questioning of data patterns
Visual display of information for communication
Tool for interacting with data
36-39. 2 TYPES OF GRAPHICS: THOSE UNDERSTOOD IMMEDIATELY
FIGURE – Seen on HK-TV
FIGURE – Where do people run in Paris (N. Yau)
Source: http://flowingdata.com/2014/02/05/where-people-run/
FIGURE – Climate forecast uncertainty (S. Planton)
40-42. ... AND THOSE NOT UNDERSTOOD IMMEDIATELY:
FIGURE – (Dynamic) Parallel Coordinates Plot comparing 5 indicators for 3 countries (Sweden, Nigeria and Germany).
Source: http://ncva.itn.liu.se/education-geovisual-analytics/parallel-c
FIGURE – Pagerank Algorithm Reveals World’s All-Time Top Soccer Team (MIT Review, March 2015)
FIGURE – How people spend their days (NYT).
43. “GOOD” OR “BAD” GRAPHICS?
“There are no “good” nor “bad” graphics (...), there are graphics answering legitimate questions and graphics that do not answer question at all”
Bertin (1981)
44-51. FAMOUS EXAMPLES OF “GOOD” VISUALIZATIONS
FIGURE – Charles Minard’s (1869) chart showing the number of men in Napoleon’s 1812 Russian campaign army, their movements, as well as the temperature they encountered on the return path.
FIGURE – London Cholera Map - John Snow (1854)
FIGURE – War Mortality - Florence Nightingale (1855) found that deaths from zymotic diseases (blue) exceeded deaths from wounds and injuries.
Same data with “modern” visualisation tools:
FIGURE – War Mortality - Florence Nightingale (1855) redrawn by Gelman and Unwin (2011).
FIGURE – Visualizing 5 dimensions: Gapminder (Hans Rosling)
52-56. SO WHAT ARE THE RULES?
Can you name some rules for a good (resp. bad) graphic?
Your turn!
Axis and scale (starting at zero!)?
Context?
No multiple scales?
Colors?
57. YOUR TURN: WHAT’S WRONG WITH THIS GRAPHIC?
58. BANANA SALES HAVE INCREASED!
FIGURE – From A. Dix’s example of an interactive bar chart
59. WHAT’S WRONG WITH THIS GRAPHIC?
FIGURE – Government spending "Skyrocketing". Tufte (2001), from Playfair (1786).
60. SCALES ARE MISLEADING!
FIGURE – Government spending "Skyrocketing" (revisited). Tufte (2001), from Playfair (1786).
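To see why the starting point of an axis matters, the sketch below compares the ratio of two bar heights as drawn with the ratio of the underlying values. The sales figures and the function name `drawn_ratio` are hypothetical, chosen only for illustration; they are not taken from the charts shown on these slides.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights of bars b and a when the axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Hypothetical sales figures: a 10% increase.
before, after = 100.0, 110.0

honest = drawn_ratio(before, after)                     # axis starts at zero
truncated = drawn_ratio(before, after, baseline=95.0)   # axis starts at 95

print(f"axis from 0:  second bar is {honest:.2f}x as tall as the first")
print(f"axis from 95: second bar is {truncated:.2f}x as tall as the first")
```

With the axis starting at 95, the same 10% increase is drawn as a bar three times as tall, which is the kind of distortion the "starting at zero" rule is meant to prevent.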
61-62. WHAT’S WRONG WITH THIS GRAPHIC? (HARDER)
FIGURE – Major Cause of Disability - 1975-2010 (J. Schwabish, 2014).
Do you remember a damn thing from this graph?
63. (SMALL) MULTIPLE GRAPHS ARE OFTEN BETTER
FIGURE – Major Cause of Disability - 1975-2010 (J. Schwabish).
Cf. "brushing" (e.g. for parallel coordinates plots)
64. WHAT’S WRONG WITH THIS GRAPHIC? (HARDER)
65. KEEP ALL YOUR AUDIENCE
Normal →
Color-blind →
66. WHICH MEANS THAT FOR 5 % OF MEN:
See also the ggplot option + scale_colour_colorblind() (from the ggthemes package)
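Outside ggplot, a similar effect can be approximated by drawing categorical colours from a palette designed with colour-vision deficiency in mind. The sketch below uses the eight-colour Okabe-Ito palette as one such choice; the helper name `assign_colors` and the category names are hypothetical and introduced here only for illustration.

```python
# Okabe-Ito palette: eight colours chosen to remain distinguishable
# under the most common forms of colour-vision deficiency.
OKABE_ITO = [
    "#000000",  # black
    "#E69F00",  # orange
    "#56B4E9",  # sky blue
    "#009E73",  # bluish green
    "#F0E442",  # yellow
    "#0072B2",  # blue
    "#D55E00",  # vermillion
    "#CC79A7",  # reddish purple
]


def assign_colors(categories, palette=OKABE_ITO):
    """Map each distinct category to its own palette colour, in order of appearance."""
    distinct = list(dict.fromkeys(categories))  # de-duplicate, keep order
    if len(distinct) > len(palette):
        raise ValueError("more categories than colours: consider small multiples")
    return dict(zip(distinct, palette))


# Hypothetical categories.
colors = assign_colors(["Sweden", "Nigeria", "Germany", "Sweden"])
print(colors)
```

The resulting hex codes can be passed to whichever plotting library is in use; refusing to recycle colours keeps two categories from ever sharing a colour.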
67-69. DATA VISUALISATION IS USED FOR TWO MAIN PURPOSES
Data exploration
Graphs as visual tests, comparisons (short time to build and to read)
Data representation
Summaries, storytelling (long time to build, short time to read)
The problem is that:
“Communicating implies simplification; data exploration implies exhaustiveness”
70-71. TABLES VS GRAPHICS?
Several papers have discussed the issue: Gelman et al. (2002), Gelman (2011) and Friendly and Kwan (2012).
Here, descriptive statistics of continuous variables.
Graph version of the table. From Gelman (2011)
72. Definitions Typologies Good vs bad Tables Principles Before After Visual perception An example What to remember Référ
GRAPHICS REVEAL DATA: ANSCOMBE (1973) QUARTET
We use here four pairs of variables: (X1, Y1), (X2, Y2), (X3, Y3) and (X4, Y4). All four data sets have (almost exactly) the same descriptive statistics.

Xs   Mean   Std. Dev.   Ys   Mean   Std. Dev.   corr(Xi, Yi)   N
X1   9      3.32        Y1   7.5    2.03        0.8164         11
X2   9      3.32        Y2   7.5    2.03        0.8162         11
X3   9      3.32        Y3   7.5    2.03        0.8163         11
X4   9      3.32        Y4   7.5    2.03        0.8165         11
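The table above can be checked with a few lines of code. The sketch below is written in Python (an assumption: the slides themselves point to R tools) and uses only the standard library; the values are the published Anscombe (1973) data, while the names `quartet`, `pearson` and `summary` are illustrative.

```python
import statistics as st

# Anscombe (1973) quartet; the first three data sets share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = st.mean(x), st.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# (mean of X, sd of X, mean of Y, sd of Y, correlation) for each data set
summary = {
    k: (st.mean(x), st.stdev(x), st.mean(y), st.stdev(y), pearson(x, y))
    for k, (x, y) in quartet.items()
}
for k, row in summary.items():
    print(k, [round(v, 4) for v in row])
```

The four printed rows agree to two decimals, which is exactly why the summary table alone cannot tell the data sets apart.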
ANSCOMBE (1973) QUARTET
All four data sets are described by the same linear model ($Y_i = \alpha + \beta X_i + \varepsilon_i$), apparently revealing the same relationships:

                      Dependent variable:
                      Y1          Y2          Y3          Y4
X_i, i = 1,...,4      0.500∗∗∗    0.500∗∗∗    0.500∗∗∗    0.500∗∗∗
Constant              3.000∗∗     3.001∗∗     3.002∗∗     3.002∗∗
R2                    0.667       0.666       0.666       0.667
Resid. Std. Error     1.237       1.237       1.236       1.236
F Statistic           17.990∗∗∗   17.966∗∗∗   17.972∗∗∗   18.003∗∗∗
Note: Data from Anscombe (1973). ∗ p < 0.1; ∗∗ p < 0.05; ∗∗∗ p < 0.01
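The near-identical coefficients can be reproduced with the closed-form least-squares formulas. This is a minimal Python sketch (the language and the helper name `ols` are assumptions, not the author's code); it only recomputes slope, intercept and R2, not the standard errors or F statistics of the table.

```python
# Anscombe (1973) quartet as (x, y) pairs; data sets 1 to 3 share the same x.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = [
    (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
     [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
]


def ols(x, y):
    """Simple linear regression of y on x: returns (slope, intercept, R2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


fits = [ols(x, y) for x, y in quartet]
for i, (slope, intercept, r2) in enumerate(fits, start=1):
    print(f"Y{i}: slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f}")
```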
A simple scatter plot (regression overlaid) shows something
very different.
FIGURE – Scatter plots with the fitted regression line for each data set: regression of Y1 on X1, Y2 on X2, Y3 on X3 and Y4 on X4 (each with constant)
NB: Plots of the residuals also show the same differences.
FIGURE – Residual vs fitted plots for the four data sets (x-axis: fitted values; y-axis: residuals)
TABLES AND MATRICES
Data with many 0/1 variables (indicators for towns)
Bertin (1981)
AND IN MANY DIMENSIONS ?
TABLES AND MATRICES
From Munzner (2014)
REGRESSION TABLES ARE GRAPHICS !
(Mod. 1) (Mod. 2)
Special Special
i_under18 -0.0692∗ -0.119∗∗∗
(-2.25) (-3.57)
log_income 0.116∗∗∗ 0.102∗∗∗
(4.31) (3.51)
i_car 0.00131 -0.112∗
(0.03) (-2.00)
b08_locenv_water 0.0624∗∗∗ 0.0583∗∗
(4.99) (4.28)
i_can 0.710∗∗∗
(23.27)
Constant -1.467∗∗∗ -0.961∗∗
(-5.38) (-3.24)
Classical "visualisation" of regressions
(Mod. 1) (Mod. 2)
Special Special
i_under18 -0.0692 -0.119
(-2.25) (-3.57)
log_income 0.116 0.102
(4.31) (3.51)
i_car 0.00131 -0.112
(0.03) (-2.00)
b08_locenv_water 0.0624 0.0583
(4.99) (4.28)
i_can 0.710
(23.27)
Constant -1.467 -0.961
(-5.38) (-3.24)
Without the stars, the same table is harder to scan: stars are used as preattentive visual variables!
REGRESSION AS A GRAPHIC
GOOD GRAPHICS ?
In the excellent Handbook of Data Visualization (Chen et al., 2007), we find some good questions:
What to whom, how and why?
A graphic may be linked to three pieces of text: its caption, a headline and an article it accompanies. Ideally, all three should be consistent and complement each other.
Present or explore data?
Different purposes, different requirements!
Choice of graphical form?
The choice depends on the type of data to be displayed (e.g. univariate continuous data, bivariate categorical data, etc.) and on what is to be shown.
Unique solution?
There is not always a unique optimal choice; alternatives can be equally good, or good in different ways, emphasizing different aspects of the same data.
EDWARD R. TUFTE’S RULES
In his seminal book, Tufte (2001) proposes some principles for displaying quantitative information.
Data: Above all, show the data
Question: Induce the viewer to think about the substance rather than about methodology or graphic design. Encourage the eye to compare different pieces of data.
Data-ink ratio: Maximize the data-ink ratio. Erase all non-data ink; erase redundant information
Integrity: Avoid distorting what the data have to say
General to specific: Reveal the data at different levels of detail (from broad picture to fine structure)
Context: Graphical display should be closely integrated with the statistical and verbal descriptions of the data set.
PRACTICAL EXAMPLE : DATA-INK RATIO
Let’s start with a classical graph (the default R boxplot)
FIGURE – Distribution of a continuous variable on five groups (x-axis: Group, g1 to g5; y-axis: Response, from 98 to 112)
ERASE ALL NON DATA INK
ERASE ALL REDUNDANT !
GOING FURTHER...
AND SHOW THE DATA...
FIGURE – Distribution of a continuous variable on five groups, reduced to its essentials, with the median printed for each group: 101.0, 100.0, 101.0, 103.8, 109.1 (groups 1 to 5)
HAVE WE LOST SOMETHING ?
FIGURE – Distribution of a continuous variable on five groups: the default boxplot and the minimal version with printed medians (101.0, 100.0, 101.0, 103.8, 109.1)
Did you notice that group 1 and group 3 have the same median (101.0)? See the ggplot theme + theme_tufte()
INTEGRITY : THE LIE FACTOR
\[ \text{Lie Factor} = \frac{\text{Size of effect shown in graphic}}{\text{Size of effect in data}} \tag{1} \]
A Lie Factor different from 1 indicates a distortion of the data (a substantial one when it is far from 1)
FIGURE – Fuel economy standards. (E. Tufte - from NY Times 1978)
FIGURE – Fuel economy standards (revisited)
The "18 mpg" line measures 1.5 cm (in 1978); the "27.5 mpg" line measures 13 cm (in 1985)
Effect shown in graphic: (13 − 1.5)/1.5 ≈ 767%; effect in data: (27.5 − 18)/18 ≈ 53%
−→ Lie factor ≈ 14.5!
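The Lie Factor of equation (1) is a ratio of two relative changes, which makes it easy to compute. A minimal Python sketch with the measurements quoted on this slide; the function names `relative_change` and `lie_factor` are illustrative assumptions.

```python
def relative_change(first, second):
    """Relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(shown_first, shown_second, data_first, data_second):
    """Size of the effect shown in the graphic divided by its size in the data."""
    return relative_change(shown_first, shown_second) / relative_change(
        data_first, data_second
    )


# Line lengths in cm (1.5 -> 13) against the standards in mpg (18 -> 27.5).
factor = lie_factor(1.5, 13.0, 18.0, 27.5)
print(round(factor, 1))  # 14.5
```

A truthful graphic would give a value close to 1; here the drawing exaggerates the change by a factor of about 14.5.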
BERTIN’S APPROACH : A VISUAL LANGUAGE
If graphs are used to communicate, they are a form of language. Any language has a grammar, “words” and a logic. Let us study the science that deals with signs or sign language: “semiology”.
TABLE – Bertin’s definition of 8 visual variables
Position (x, y)
Size
Value
Texture
Colour
Orientation
Shape
THESE VARIABLES SERVE DIFFERENT GOALS
Visual variable syntactics designate each visual variable as suited or not to each level of measurement: equivalence, differences, order, proportions.
Variable suited for :
Position (x, y) = O ∝
Size = O ∝
Value = O ∝
Texture = O
Colour =
Orientation =
Shape ≡
≡ : Equivalence, = : Differences, O : Order, ∝ : Proportions
EXAMPLE : SHAPE IS NOT SUITABLE FOR
PROPORTIONALITY
Price of land in the East of France Bertin (1970)
EXAMPLE : SIZE IS SUITABLE FOR PROPORTIONALITY
Price of land in the East of France Bertin (1970)
A NOTE ON COLORS
“Colors” are not suited for ordering !
Try putting the following hues in order from low to high.
These colors are easy to order from low to high.
Few (2008) provides meaningful solutions for choosing palettes
of colours, for example for heatmaps.
See also the ggplot theme theme_few()
A NOTE ON COLORS (FINAL)
Colors are sometimes a graphic puzzle Tufte (2001).
Your eyes will go back and forth from the graph to the legend...
(source : http://viz.wtf/image/135265269618)
CONJUNCTION OF COLOURS AND PROPORTIONALITY
Productivity of Airlines
(Demo with googleVis)
FLASH QUIZ:
If 100% of the US prisoners are represented by the big square... what is the percentage for each group?
FIGURE – Ethnic composition of prisoners in jail in 2008 in the USA. (Le Monde 5/12/2014)
NOT SO SIMPLE...
VERIFICATION
OR...
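One reason the quiz is hard is that a share is encoded as an area, while the eye tends to compare side lengths. A minimal Python sketch of that relation; the shares used here are hypothetical and are not read from the Le Monde figure, and `side_for_share` is an illustrative name.

```python
import math


def side_for_share(share, total_side=1.0):
    """Side of a square whose area is `share` of a square of side `total_side`."""
    return total_side * math.sqrt(share)


# Hypothetical shares, for illustration only.
for share in (0.10, 0.25, 0.40):
    side = side_for_share(share)
    print(f"share {share:.0%} -> side {side:.2f} of the big square's side")
```

A group accounting for 25% of the total is drawn with a side half as long as the big square, so judging by length overestimates small shares.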
IT MATTERS BECAUSE MANY HIGH-DIMENSIONAL VISUALISATIONS USE AREA...
Spinograms
A spinogram is area-proportional just like the histogram, but
allows a non-linear x-axis and thus can make all boxes of equal
height. Theus and Urbanek (2009)
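The geometry described above can be sketched in a few lines: each bin gets a width proportional to its count, every bar has the same (unit) height, and a highlighted subgroup fills a fraction of that height. This Python sketch uses hypothetical counts and an illustrative function name, `spinogram_bars`; it computes the rectangles only and does not draw them.

```python
def spinogram_bars(counts, highlighted):
    """Return (left, width, highlighted_share) for each bin of a spinogram.

    Widths are proportional to the bin counts and sum to 1; every bar has
    unit height, so area stays proportional to the count.
    """
    total = sum(counts)
    bars = []
    left = 0.0
    for n, h in zip(counts, highlighted):
        width = n / total            # non-linear x-axis: width follows the count
        share = h / n if n else 0.0  # fraction of the bar height that is highlighted
        bars.append((left, width, share))
        left += width
    return bars


# Hypothetical bin counts and highlighted counts (not taken from the slides).
bars = spinogram_bars([10, 30, 60], [5, 15, 15])
for left, width, share in bars:
    print(f"x from {left:.2f} to {left + width:.2f}, highlighted share {share:.2f}")
```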
MOSAIC PLOTS
Step 1 of the construction of a mosaic plot (Similar to spineplot
here). All surviving passengers are highlighted in all plots.
Theus and Urbanek (2009)
Step 2 of the construction of a mosaic plot. Conditioning on Age. Theus and Urbanek (2009)
Step 3 of the construction of a mosaic plot. Conditioning on Age and Gender. Theus and Urbanek (2009)
Final step of the construction of a mosaic plot. Explicit mention of Survived as highlighted. Theus and Urbanek (2009)
SCHWABISH (JEP, 2014) BEFORE-AFTER
FIGURE – An Unbalanced Chart - Original
FIGURE – An Unbalanced Chart - Revised
FIGURE – A Clutterplot Example - Original
FIGURE – A Clutterplot Example - Revised
“GOOD” OR “BAD” GRAPHICS ?
“There are no “good” or “bad” graphics (...), there are graphics answering legitimate questions and graphics that do not answer any question at all”
Bertin (1981)
It is easy to criticize ... but are there some rules ?
A NOTE ON PERCEPTION
A bird (Duck, Toucan ?) on the X axis, a rabbit on the Y axis !
Source: http://flowingdata.com/2014/06/25/duck-vs-rabbit-plot/
“PREATTENTIVE” VARIABLES
How many "3" in that sequence ? (from Ware (2012))
AND NOW...
Find the red dot !
TEST : FIND THE RED DOT !
HARDER : IS THERE A "STRANGER" ?
THAT WASN’T EASY
Preattentive concept, Treisman (1985) and Healey (2007)
Some visual elements or patterns are detected immediately
But there may be interferences (colour and form)
Very useful (detection, explanatory and presentation)
Helpful to highlight a message !
TOO MUCH VARIATION DOESN’T HELP
From Ware (2012)
MOST PREATTENTIVE VISUAL VARIABLES
From Ware (2012)
VISUAL PERCEPTION AND PIE CHARTS
https://twitter.com/freakonometrics/status/6127423301609512
VISUAL PERCEPTION AND LINES
From Cairo (2012)
When was the biggest negative (positive) difference ?
THE CLEVELAND-MCGILL EFFECT
From Cleveland and McGill (1984)
WEBER’S LAW AND FRAMED BOXES
From Cleveland and McGill (1984)
THE CLEVELAND-MCGILL SCALE
http://hcil2.cs.umd.edu/trs/99-20/99-20.html
PARTIAL CONCLUSION
Gordon and Finch (2015) give some nice principles:
1. Show the data clearly
2. Use simplicity in design
3. Use good alignment on a common scale for quantities to be compared
4. Keep visual encoding transparent
5. Use graphical forms consistent with those principles
We may add some others (use preattentive elements, integrity, ...)
Do not forget the big picture
CASE STUDY : VISUALIZING THE WHOLE AND THE
DETAILS !
2588 dairy farmers over 11 years.
One variable is estimated: risk aversion (AR)
6 regions of study
We do not know the results in advance
https://xtophedataviz.shinyapps.io/ShinyParallel/
CASE STUDY: RISK AVERSION
Simple plot: Median value over time.
Simple plot: Median value with dispersion visualized.
Classical boxplot: There are changes over time.
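The "median with dispersion" view boils down to a per-year summary of the panel. Below is a minimal Python sketch, assuming the data come as (year, risk aversion) pairs; the observations are hypothetical and the quartiles are just one possible measure of dispersion, not necessarily the one used in the slides.

```python
import statistics as st
from collections import defaultdict

# Hypothetical (year, risk aversion) observations, one per farm and year.
observations = [
    (2001, 0.20), (2001, 0.35), (2001, 0.40), (2001, 0.55), (2001, 0.90),
    (2002, 0.10), (2002, 0.30), (2002, 0.50), (2002, 0.60), (2002, 0.70),
]

# Group the values by year.
by_year = defaultdict(list)
for year, value in observations:
    by_year[year].append(value)

# (first quartile, median, third quartile) per year.
summary = {}
for year, values in sorted(by_year.items()):
    q1, _, q3 = st.quantiles(values, n=4)  # default "exclusive" method
    summary[year] = (q1, st.median(values), q3)

for year, (q1, med, q3) in summary.items():
    print(year, f"median={med:.2f}", f"IQR=[{q1:.3f}, {q3:.3f}]")
```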
CASE STUDY: HOW TO VISUALIZE FARMS?
Points over time: Too much overlapping
Points over time: Jitter helps!
Farms over time: Jitter helps!
Farms over time: Spaghetti plots!
Farms over time: Spaghetti plots with some brushing!
Farms over time by region: Multiple spaghetti plots!
Farms over time by region: Highlighting spaghetti plots!
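Jittering, mentioned above as a remedy for overlapping points, simply adds a small random offset to a discrete coordinate before plotting. A minimal Python sketch, assuming a uniform offset and a hypothetical `width` parameter (plotting libraries may use other defaults or distributions):

```python
import random


def jitter(values, width=0.4, seed=None):
    """Add a uniform random offset in [-width/2, +width/2] to each value."""
    rng = random.Random(seed)
    return [v + rng.uniform(-width / 2, width / 2) for v in values]


# Hypothetical x coordinates: five farms observed in each of two years.
years = [2001] * 5 + [2002] * 5
jittered = jitter(years, width=0.4, seed=1)
for original, moved in zip(years, jittered):
    print(original, round(moved, 3))
```

Only the display coordinate is perturbed; the underlying values stay unchanged, and the width should remain small enough for the years to stay visually separated.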
WHAT TO REMEMBER
Data visualisation serves at least two main purposes
Data exploration
Graphs as visual tests, comparisons (short time to build and to read)
Data representation
Summaries, storytelling (long time to build, short time to read)
The problem is that:
“Communicating implies simplification; data exploration implies exhaustivity”
195.
WHAT TO REMEMBER
From the viewer’s point of view, data visualisations are, implicitly or explicitly, comparisons or even tests (in the statistical sense)
Graphics should help raise questions
They should provide the elements needed to answer them (the data, at least)
If the question implies a comparison, they should show that comparison truthfully
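One way to make "a graph as a test" concrete is a so-called lineup: the plot of the real data is hidden among plots of data in which any association has been destroyed by shuffling. The sketch below only prepares the panels; the function name `lineup`, its parameters and the toy values are assumptions made for illustration, not something taken from the slides.

```python
import random


def lineup(x, y, n_panels=20, seed=0):
    """Hide the real (x, y) data among panels where y has been shuffled.

    Returns the list of panels and the position of the real one.
    """
    rng = random.Random(seed)
    real_position = rng.randrange(n_panels)
    panels = []
    for k in range(n_panels):
        if k == real_position:
            panels.append((list(x), list(y)))  # the observed data, untouched
        else:
            shuffled = list(y)
            rng.shuffle(shuffled)  # breaks any link between x and y
            panels.append((list(x), shuffled))
    return panels, real_position


# Hypothetical toy data with an obvious increasing relationship.
panels, real_position = lineup([1, 2, 3, 4, 5], [2, 4, 5, 8, 9], n_panels=20, seed=1)
```

Each panel would then be drawn as the same kind of plot, for instance a scatter plot. If there were no association between x and y, a viewer would pick out the real panel among 20 with probability 1/20, which is what gives the visual comparison its test-like reading.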
198.
WHAT TO REMEMBER
Many “data visualisations” are useless, meaningless or stupid!
Some are simply poor:
Graphs as visual tests and comparisons (short time to build and to read)
Some are funny:
Many are ridiculous:
199.
CHALLENGES: NETWORKS
Relationships among all the characters of Victor Hugo’s "Les Misérables".
http://bl.ocks.org/mbostock/4062045
200.
NETWORKS: ADJACENCY MATRIX PLOT
An adjacency matrix, where each cell ij represents an edge from vertex i to vertex j. Here, vertices represent characters in a book, while edges represent co-occurrence in a chapter.
201.
NETWORKS: ADJACENCY MATRIX PLOT
Here again, sorting the rows and columns is very useful!
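The two network slides can be sketched in a few lines: build a symmetric adjacency matrix from co-occurrence pairs, then reorder rows and columns together. The edge list below is a hypothetical toy example, not the actual "Les Misérables" dataset, and sorting by degree is only one possible ordering, chosen here as an assumption.

```python
def adjacency_matrix(names, edges):
    """Build a symmetric 0/1 matrix: cell [i][j] is 1 if i and j co-occur."""
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    matrix = [[0] * n for _ in range(n)]
    for a, b in edges:
        i, j = index[a], index[b]
        matrix[i][j] = 1
        matrix[j][i] = 1  # co-occurrence has no direction
    return matrix


def sort_by_degree(names, matrix):
    """Reorder rows and columns together, most connected vertex first."""
    degree = [sum(row) for row in matrix]
    order = sorted(range(len(names)), key=lambda i: -degree[i])
    sorted_names = [names[i] for i in order]
    sorted_matrix = [[matrix[i][j] for j in order] for i in order]
    return sorted_names, sorted_matrix


# Hypothetical toy edges, for illustration only.
names = ["Cosette", "Javert", "Valjean", "Marius"]
edges = [("Valjean", "Javert"), ("Valjean", "Cosette"),
         ("Valjean", "Marius"), ("Cosette", "Marius")]

matrix = adjacency_matrix(names, edges)
sorted_names, sorted_matrix = sort_by_degree(names, matrix)
```

The same permutation is applied to rows and columns, so the matrix still describes the same network. Only the visual pattern changes, which is why the ordering matters so much in a matrix plot.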
204.
WHAT TO REMEMBER: THERE ARE RULES
Data visualisation is a visual language, so there are:
Elements of language
Rules of use (spelling)
Grammar
208.
WHAT TO REMEMBER: GOOD TECHNIQUE DOES NOT PRECLUDE GOOD COMMON SENSE!
Let’s...
KISS
Keep It Simple, Stupid!
Keep It Statistical, Stupid!
Keep It Statistical and Simple!
209. Definitions Typologies Good vs bad Tables Principles Before After Visual perception An example What to remember Référ
REFERENCES I
Anscombe, F. J. (1973). Graphs in statistical analysis. The American
Statistician, 27(1) :17–21.
Bertin, J. (1970). La graphique. Communications, 15(1) :169–185.
Bertin, J. (1981). Théorie matricielle de la graphique. Communication et
langages, 48(1) :62–74.
Bertin, J. (1983). Semiology of graphics, translation from sémilogie graphique
(1967).
Bertin, J. (2005). Sémiologie graphique : Les diagrammes, les réseaux, les cartes. Les
Réimpressions des Éditions de l’École des Hautes Études en Sciences
Sociales. Éditions de l’École des Hautes Études en Sciences Sociales.
Buja, A., Cook, D., Hofmann, H., Lawrence, M., Lee, E.-K., Swayne, D. F., and
Wickham, H. (2009). Statistical inference for exploratory data analysis and
model diagnostics. Philosophical Transactions of the Royal Society of London
A : Mathematical, Physical and Engineering Sciences, 367(1906) :4361–4383.
Cairo, A. (2012). The Functional Art : An introduction to information graphics and
visualization. Voices That Matter. Pearson Education.
210. Definitions Typologies Good vs bad Tables Principles Before After Visual perception An example What to remember Référ
REFERENCES II
Chen, C.-h., Härdle, W. K., and Unwin, A. (2007). Handbook of data
visualization. Springer Science & Business Media.
Cleveland, W. S. (1994). The Elements of Graphing Data. Hobart Press,
Summit : NJ, 2 edition.
Cleveland, W. S. and McGill, R. (1984). Graphical perception : Theory,
experimentation, and application to the development of graphical
methods. Journal of the American Statistical Association, 79(387) :531–554.
Few, S. (2008). Practical rules for using color in charts. Visual Business
Intelligence Newsletter, (11).
Friendly, M. and Kwan, E. (2012). Comment. Journal of Computational and
Graphical Statistics.
Gelman, A. (2004). Exploratory data analysis for complex models. Journal of
Computational and Graphical Statistics, 13(4).
Gelman, A. (2011). Why tables are really much better than graphs. Journal of
Computational and Graphical Statistics, 20(1) :3–7.
Gelman, A., Pasarica, C., and Dodhia, R. (2002). Let’s practice what we
preach : turning tables into graphs. The American Statistician,
56(2) :121–130.
211. Definitions Typologies Good vs bad Tables Principles Before After Visual perception An example What to remember Référ
REFERENCES III
Gelman, A. and Unwin, A. (2011). Visualization, graphics, and statistics.
Statistical Computing and graphics, 22(1) :9–12.
Gordon, I. and Finch, S. (2015). Statistician heal thyself : Have we lost the
plot ? Journal of Computational and Graphical Statistics, 24(4) :1210–1229.
Healey, C. (2007). Perception in visualization.
Huff, D. (1993). How to Lie with Statistics. W. W. Norton & Company.
Munzner, T. (2014). Visualization Analysis and Design. AK Peters Visualization
Series. A K Peters/CRC Press, 1 edition.
Theus, M. and Urbanek, S. (2009). Interactive graphics for data analysis :
principles and examples. Series in computer science and data analysis. CRC
Press.
Treisman, A. (1985). Preattentive processing in vision. Computer Vision,
Graphics, and Image Processing, 31(2) :156–177.
Tufte, E. R. (2001). The Visual Display of Quantitative Information. Graphics
Press, 2 edition.
Tukey, J. W. (1977). Exploratory data analysis. Reading, Mass.
Ware, C. (2012). Information visualization : perception for design. Elsevier.