Histograms are a great way to graphically represent
a specific type of data known as “frequency data.”
A histogram of frequency data is constructed by
first identifying a number of equal “intervals” that
will become your bins (e.g. 0-1, 1-2, 2-3; or 0-2,
2-4, 4-6, and so on).
[Figure: Website Visit Duration Histogram. x-axis: Visit Duration in Minutes (bins 1 to 10, plus "More"); y-axis: Number of visits (0 to 25)]
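The binning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to build the chart: the function name `bin_counts`, the sample `durations`, and the half-open bin convention (each bin includes its lower edge but not its upper edge) are all assumptions made for the example.

```python
def bin_counts(values, width, start=0.0):
    """Count values per equal-width bin [start + i*width, start + (i+1)*width).

    Hypothetical helper for illustration; assumes every value is >= start.
    """
    n_bins = int((max(values) - start) // width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // width)] += 1
    return counts


# Made-up visit durations in minutes, not data from the slides.
durations = [0.4, 0.9, 1.2, 1.5, 1.7, 2.1, 2.8, 3.3, 5.0]

one_minute_bins = bin_counts(durations, width=1.0)  # bins 0-1, 1-2, 2-3, ...
two_minute_bins = bin_counts(durations, width=2.0)  # bins 0-2, 2-4, 4-6, ...

# Print a simple text histogram for the one-minute bins.
for i, n in enumerate(one_minute_bins):
    print(f"{i}-{i + 1} min: {'#' * n}")
```

Every value lands in exactly one bin, so the counts always add up to the number of observations, whichever width is chosen.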
The challenge is selecting the correct set of equal
intervals that sufficiently reduces the information
into a readable form but also provides enough
variability to enable viewers to see the shape of the
distribution (i.e. the story of the data). We’ll
explore appropriate bin sizes in Module 3.
[Figure: Website Visit Duration Histogram. x-axis: Visit Duration in Minutes (bins 1 to 10, plus "More"); y-axis: Number of visits (0 to 25)]
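To see why the choice of interval matters, the sketch below draws the same data with three different bin widths. It is only an illustration under stated assumptions: `text_histogram` is a hypothetical helper, the `durations` are invented, and the widths 1, 3, and 10 are arbitrary picks rather than recommended sizes.

```python
def text_histogram(values, width):
    """Return one text row per equal-width bin, starting at 0 (illustrative helper).

    Assumes all values are non-negative.
    """
    n_bins = int(max(values) // width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int(v // width)] += 1
    return [
        f"{i * width:>4g}-{(i + 1) * width:<4g} {'#' * n}"
        for i, n in enumerate(counts)
    ]


# Made-up visit durations in minutes, not data from the slides.
durations = [0.5, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 9]

for width in (1, 3, 10):
    print(f"bin width = {width}")
    print("\n".join(text_histogram(durations, width)))
    print()
```

With a width of 10 every visit collapses into a single bar and the shape of the distribution disappears; with a width of 1 the same data spreads over ten bins and the shape becomes visible.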
“Frequency data” is the hallmark of histograms. This
is the type of data that results from counting how often
a specific event happens within an interval. For
example, standardized test scores reported as
“percentiles” (e.g. 90th percentile) are derived from
frequency data: roughly, a percentile reflects the share
of scores that fall below a given score.
[Figure: Website Visit Duration Histogram. x-axis: Visit Duration in Minutes (bins 1 to 10, plus "More"); y-axis: Number of visits (0 to 25)]
BAR CHARTS are often used for quantitative or
categorical data, not frequency distributions. The
chart below shows categorical data.
Confusingly, bar charts can also be used for
frequency distributions, so long as the number of
unique scores in the data set is not large.
[Figure: Average Page Duration bar chart. Categories: Home page, Products, Services, About Us, Contact Us, Capabilities, News; vertical axis: 0 to 14]
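The case where a bar chart stands in for a frequency distribution can be sketched as follows: with only a handful of unique scores, each score gets its own bar and no binning is needed. The `scores` below are invented, and the cutoff of 10 unique values is a hypothetical rule of thumb chosen for the example, not a figure from the slides.

```python
from collections import Counter

# Made-up survey scores on a 1-5 scale, not data from the slides.
scores = [3, 4, 4, 5, 3, 4, 5, 5, 4, 2]

# One bar per unique score: workable because there are only a few of them.
frequency = Counter(scores)
for score in sorted(frequency):
    print(f"score {score}: {'#' * frequency[score]}")

# Hypothetical rule of thumb; the cutoff of 10 is an assumption, not from the source.
MAX_UNIQUE_FOR_BARS = 10
use_bar_chart = len(frequency) <= MAX_UNIQUE_FOR_BARS
print("bar chart is workable" if use_bar_chart else "consider binning into a histogram")
```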
In business, since both histograms and bar charts can be
effectively used to represent the same data set, there can be
a lot of confusion over which chart to use or even which one
you’re looking at! For example, what kind of chart is the one
below, a histogram or a bar chart? It’s not as easy as it
seems. The following slides will demonstrate the similarities
and differences between histograms and bar charts.
[Figure: Average Page Duration bar chart. Categories: Home page, Products, Services, About Us, Contact Us, Capabilities, News; vertical axis: 0 to 14]
SAMPLE HISTOGRAM
Numerical Element                                Continuous Range
[Figure: histogram labeled "Frequency (Number of Batches)". Bins: 0-5, 6-10, 11-15, 16-20, 21-25, 26-30, 31-35, 36-40; vertical axis: 0 to 16]
SAMPLE BAR CHART
Discrete Element                             Non-Continuous Item
[Figure: Average Page Duration bar chart. Categories: Home page, Products, Services, About Us, Contact Us, Capabilities, News; vertical axis: 0 to 14]
HEAD TO HEAD
Histogram                                Bar Chart
---------------------------------------  -----------------------------------------
Method of presenting comparisons         Visual representation of comparing values
Grouped or continuous items              Non-grouped, non-continuous items
Usually numbers presented in ranges      Visual comparison of discrete elements
Bars need to be touching in display      Bars don’t need to be touching in display
(thanks Microsoft for not doing this!)
CRITICAL THINKING: Can you think of a situation
that you’ve experienced where a histogram
would provide more useful information than a
bar chart (think large data set and frequency
distribution)?
[Figure: histogram with bins 1 to 10, plus "More"; vertical axis: 0 to 25]

Module 1.3
