Excel for journalists


Part 1: analyze your data




       Copyright - Hille van der Kaa
        www.deuitgeeffabriek.nl
Microsoft Excel is a powerful tool for analyzing data and discovering
interesting patterns. Journalists can use it to:

• sort data
• filter data
• calculate data
• make pivot tables

(and so many other things I will not discuss in this presentation…)


• In this case, we want to analyze the number of
  stolen bikes and cars in five different cities and
  two provinces in the Netherlands.
• We start by sorting the number of stolen bikes
  and cars per city.
• All numbers are fictional.



Please enter these fictional data in your spreadsheet




• Excel organizes your data in
  table form, with rows and
  columns.
• The columns (which are
  labeled A, B, C…) list the
  variables (in this case city,
  province, bike, car).
  Typically, the first row holds
  the names of the variables.
• The rest of the rows are for
  the individual records or
  cases being analyzed. Each
  cell (like B2) holds a piece of
  data.
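If it helps to see the same layout outside Excel, here is a minimal Python sketch. The city names and all numbers are invented for illustration (the real example sheet is not reproduced here), and the helper `cell` is a hypothetical name, not an Excel feature.

```python
# Hypothetical stand-in for the spreadsheet; every value is invented.
header = ["city", "province", "bike", "car"]      # row 1: variable names
records = [                                       # rows 2-6: one record per city
    ["Breda", "Noord-Brabant", 7, 5],
    ["Goes", "Zeeland", 3, 0],
    ["Middelburg", "Zeeland", 9, 2],
    ["Tilburg", "Noord-Brabant", 5, 3],
    ["Vlissingen", "Zeeland", 2, 1],
]

def cell(ref):
    """Return the value at an Excel-style reference such as 'B2'."""
    column = ord(ref[0].upper()) - ord("A")   # 'A' -> 0, 'B' -> 1, ...
    row = int(ref[1:])                        # spreadsheet rows start at 1
    grid = [header] + records
    return grid[row - 1][column]

print(cell("B1"))  # the variable name in column B
print(cell("B2"))  # the first record's value for that variable
```

The column letter picks the variable and the row number picks the record, which is all a cell reference like B2 means.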

Sort data
• In journalism, we are usually interested in
  extremes: the least, the most, the biggest
  or the smallest.
• Excel helps you find these by sorting the
  data into a revealing order.
• In this case, we would like to sort the data in
  descending order of the number of stolen bikes,
  so that the city with the most bike thefts is
  at the top.

• There are two methods of sorting. The first
  method is quick and can be used for sorting by
  a single variable.




Put the cursor in the column you wish to sort and then click the Z-A button.




Now you have the number of stolen bikes in descending order
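For comparison, the same single-variable sort can be sketched in Python. The rows below are invented stand-ins for the example sheet, and `by_bike` is just an illustrative name.

```python
# Hypothetical records; all numbers are invented for illustration.
rows = [
    {"city": "Breda", "province": "Noord-Brabant", "bike": 7, "car": 5},
    {"city": "Goes", "province": "Zeeland", "bike": 3, "car": 0},
    {"city": "Middelburg", "province": "Zeeland", "bike": 9, "car": 2},
    {"city": "Tilburg", "province": "Noord-Brabant", "bike": 5, "car": 3},
    {"city": "Vlissingen", "province": "Zeeland", "bike": 2, "car": 1},
]

# Sort whole records by the "bike" variable, largest first (the Z-A button).
by_bike = sorted(rows, key=lambda r: r["bike"], reverse=True)

for r in by_bike:
    print(r["city"], r["bike"])
```

Each record moves as a unit, so the city names stay attached to their own numbers.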




Beware!
• Put the cursor in a cell in the column; do not
  select the whole column by clicking its letter
  (C, in this case) before you sort.
• Sorting a selected column reorders only the data
  in that column and scrambles your records!
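To see why this matters, here is a small Python sketch of both situations. The two lists stand in for columns A and C of a hypothetical sheet; the numbers are invented.

```python
# Two columns of a hypothetical sheet; all numbers are invented.
cities = ["Breda", "Goes", "Middelburg", "Tilburg", "Vlissingen"]
bikes = [7, 3, 9, 5, 2]

# Wrong: sorting one column on its own leaves the other column where it was,
# so every city ends up next to somebody else's number.
scrambled = list(zip(cities, sorted(bikes, reverse=True)))

# Right: keep each record together and sort the records.
intact = sorted(zip(cities, bikes), key=lambda rec: rec[1], reverse=True)

print(scrambled)
print(intact)
```

In the scrambled version Breda appears to have 9 stolen bikes, although its own record says 7.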




• The other method of sorting is useful when
  you want to sort by more than one variable.
• For instance, suppose we wish to sort the
  crime data first by province in alphabetical
  order, and then by “bike” in descending order
  within each province.



To do this, look for the toolbar, click on “Data” and then “Sort”…




choose the variables by which you wish to sort. Then click “OK”.




Now you have the number of stolen bikes in descending order per province.
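A sort on two variables can be sketched in Python with a two-part sort key. Again, the rows are invented stand-ins for the example sheet.

```python
# Hypothetical records; all numbers are invented for illustration.
rows = [
    {"city": "Breda", "province": "Noord-Brabant", "bike": 7},
    {"city": "Goes", "province": "Zeeland", "bike": 3},
    {"city": "Middelburg", "province": "Zeeland", "bike": 9},
    {"city": "Tilburg", "province": "Noord-Brabant", "bike": 5},
    {"city": "Vlissingen", "province": "Zeeland", "bike": 2},
]

# First key: province, A to Z. Second key: bike, largest first
# (negating the number reverses the order for that key only).
by_province_then_bike = sorted(rows, key=lambda r: (r["province"], -r["bike"]))

for r in by_province_then_bike:
    print(r["province"], r["city"], r["bike"])
```

Note that Middelburg has the most stolen bikes overall in this made-up data, yet it appears third, because the province comes first in the sort order.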




Filter data
• Sometimes you want to examine only
  particular records from a large collection of
  data.
• In this case, suppose we only wish to see the
  records from Middelburg. For this, you can use
  Excel’s Filter tool.



On the toolbar, go to “Data” and then “Filter”. Small buttons will appear at
the top of each column.




Click on the button at the top of the city column and choose Middelburg
from the list.




Now you have filtered the results from Middelburg.
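The same filter, sketched in Python on invented rows: keep only the records whose city equals Middelburg.

```python
# Hypothetical records; all numbers are invented for illustration.
rows = [
    {"city": "Breda", "province": "Noord-Brabant", "bike": 7, "car": 5},
    {"city": "Goes", "province": "Zeeland", "bike": 3, "car": 0},
    {"city": "Middelburg", "province": "Zeeland", "bike": 9, "car": 2},
    {"city": "Tilburg", "province": "Noord-Brabant", "bike": 5, "car": 3},
    {"city": "Vlissingen", "province": "Zeeland", "bike": 2, "car": 1},
]

# Keep only the records from Middelburg; the other rows are hidden, not deleted.
middelburg = [r for r in rows if r["city"] == "Middelburg"]

print(middelburg)
print(len(rows))  # the full data set is still there
```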




• More complicated filters are possible. For
  instance, suppose you wish to see only
  records in which “bike” is greater than or
  equal to 3.
• ‘Undo’ your previous filter by clicking on the undo button
  above your toolbar and…



click on the bike filter button and choose “Number Filters”…




Now enter ‘3’….




…. and you will only see the cities with 3 or more stolen bikes.
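A number filter is just a condition on a value. Here is a short Python sketch of “bike is greater than or equal to 3”, using invented numbers.

```python
# Hypothetical records; all numbers are invented for illustration.
rows = [
    {"city": "Breda", "bike": 7},
    {"city": "Goes", "bike": 3},
    {"city": "Middelburg", "bike": 9},
    {"city": "Tilburg", "bike": 5},
    {"city": "Vlissingen", "bike": 2},
]

# "Greater than or equal to 3": a city with exactly 3 stolen bikes is included.
three_or_more = [r["city"] for r in rows if r["bike"] >= 3]

print(three_or_more)
```

Watch the boundary: with “greater than” instead of “greater than or equal to”, Goes would drop out of this made-up list.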




Calculate data
• Excel has many built-in functions for
  performing calculations.
• For instance, assume that we wish to calculate
  the total number of stolen bikes in all the
  provinces.




Go to the bottom of Column C, skip a row, and then enter this
formula in Cell C9: =SUM(C2:C7)

The equals sign (=) is necessary for all formulas. The colon (:)
means “all the cells from Cell C2 to Cell C7”.




Now you have the sum of the numbers in C2 to C7
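As a quick cross-check outside Excel, the same total can be sketched in Python. The list below plays the role of the range C2:C7; it holds five invented values rather than the real sheet.

```python
# Hypothetical "bike" column; the values are invented for illustration.
bike = [7, 3, 9, 5, 2]

# Equivalent of =SUM(...) over the whole range.
total = sum(bike)

print(total)  # 26
```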




• Often you will want to do a calculation on
  each row of your data table.
• For instance, you might want to calculate the
  crime rate (the number of crimes per 100,000
  population), which would let you compare the
  crime problem in cities of different sizes.



Please enter the (fictional) population figures in your spreadsheet.




Create a new variable called “Crime Rate” in Column F. Then, in
Cell F2, enter this formula: =(C2/E2)*100000

Notice that there are no spaces and no thousands separators in the formula.




This divides the number of stolen bikes by the population, then multiplies
the result by 100,000. This is your bike crime rate.
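Here is the same calculation as a small Python sketch. The function name `crime_rate` and the input numbers are assumptions made for illustration; only the formula itself comes from the slide.

```python
def crime_rate(crimes, population):
    """Crimes per 100,000 inhabitants, mirroring =(C2/E2)*100000."""
    return (crimes / population) * 100000

# Hypothetical row: 9 stolen bikes in a city of 48,000 inhabitants.
rate = crime_rate(9, 48000)

print(rate)
```

A city with few thefts can still have a high rate if its population is small, which is exactly why the rate is a fairer comparison than the raw count.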




Select the column and use ‘Format Cells’ to adjust the number of
decimal places (2, for example) if you like.




Now you have the crime rate with two decimals.
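Keep in mind that formatting changes only how a number is displayed, not the value stored in the cell. A short Python sketch of the same distinction, with invented inputs:

```python
# Hypothetical row: 3 stolen bikes in a city of 37,000 inhabitants.
rate = (3 / 37000) * 100000

# Display with two decimals, like the 'Format Cells' setting.
shown = f"{rate:.2f}"

print(shown)  # 8.11
print(rate)   # the underlying value keeps all its decimals
```

Later calculations use the stored value, so totals may differ slightly from what you would get by adding up the rounded figures on screen.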




• Excel has a way to rapidly copy a formula
  down a column of cells. To do that, you move
  the cursor (normally a white cross) to the
  bottom right corner of the cell containing the
  formula.




When it is in the right spot, the cursor will change to a small black cross. At
that point, you can double-click and the formula will copy down the column
until it reaches a blank cell in the column to the left.
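Copying a formula down a column amounts to applying it to every record until the data runs out. A rough Python sketch, with invented numbers and `None` standing in for a blank cell:

```python
# Hypothetical (bike, population) pairs; None marks a blank cell.
sheet = [
    (7, 180000),
    (3, 37000),
    (9, 48000),
    (5, 210000),
    (2, 44000),
    (None, None),   # first blank row: the fill stops here
]

rates = []
for bike, population in sheet:
    if population is None:        # blank neighbor, so stop filling
        break
    rates.append((bike / population) * 100000)

for value in rates:
    print(f"{value:.2f}")
```

The row number in the formula shifts with each copy (C2/E2, then C3/E3, and so on), which the loop mirrors by moving to the next pair.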




You can also copy a formula to the right, by dragging the small black cross across the row.




Now, if we sort by ‘crime rate bike’ in descending order, we see
         the cities with the worst bike crime problems:




Excel has various functions that can be used in similar ways:

=AVERAGE – gives you the arithmetic mean of a column or row of numbers
=COUNT – counts the cells that contain numbers in a column or row
=MAX – finds the largest value in a column or row
=MIN – finds the smallest value in a column or row

You can add, subtract, multiply or divide by using the symbols + - * and /.
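For readers who want to double-check their spreadsheet results, these functions have close counterparts in Python. The column below is invented, and includes one text cell to show that a count of numbers skips it.

```python
from statistics import mean

# Hypothetical column with one non-numeric cell; values are invented.
column = [7, 3, 9, 5, 2, "n/a"]

# Keep only the numeric cells, as =COUNT, =AVERAGE, =MAX and =MIN do.
numbers = [v for v in column if isinstance(v, (int, float))]

average = mean(numbers)     # like =AVERAGE
count = len(numbers)        # like =COUNT
largest = max(numbers)      # like =MAX
smallest = min(numbers)     # like =MIN

print(average, count, largest, smallest)
```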
Pivot table
• A pivot table creates an interactive cross-tabulation
  of the data by category.
• It summarizes the data in categories and gives you
  a useful table for playing around with your data.




Make sure your cursor is in some cell in the table. Then go to the
toolbar and click on “Insert” and then “Pivot Table”.




Choose the data you want to analyze




Here you will have your pivot table




• Our example data shows 5 cities in 2
  provinces of the Netherlands.
• Imagine that you want to know the total
  number of stolen cars in each province. The list
  that would answer that question would show
  each province, with the total number of
  stolen cars next to each name.


Pick up “province” from the list of variables in the floating box on the
right and place it in the “row labels” box. Then take the “car” variable
and put it in the “values” box.




• Using a pivot table is a great way to explore
  your data. Make a new pivot table for each
  question. This will keep you from getting
  lost in your data.




• Want to know more
  about using Excel for
  your journalistic
  research?

•   Contact me at info@deuitgeeffabriek for
    workshops.
•   Read the Dutch ‘Handboek
    Datajournalistiek’.
•   Follow me on Twitter @Hillevanderkaa
•   Or just look for the enormous number of
    tutorials on YouTube!


