1850 Agricultural Production in Pennsylvania

Methodology report for creating the 1850 Agricultural Production in Pennsylvania interactive dashboard:
http://tabsoft.co/2oMy0AU

Published in: Data & Analytics

  1. Interactive Dashboard Report
     1850 Agricultural Production in Pennsylvania
     Heather Myers
     DAAN 871: Data Visualization
  2. Table of Contents
     Project Background (p. 3)
     Data Preparation (p. 4)
       Data Extraction (p. 4)
       Data Compilation (p. 5)
       Data Dictionary (p. 5)
     Interactive Dashboard (p. 7)
       Purpose (p. 7)
       Data (p. 7)
       Analytic Questions (p. 7)
       Design Principles (p. 8)
     Detailed Methodology (p. 8)
       County Boundaries (p. 8)
       Map Georeferencing (p. 10)
       Imagery and Typography (p. 12)
       Limitations and Future Enhancements (p. 14)
     Dashboard Link (p. 15)
     Dashboard Screenshot (p. 16)
     References (p. 17)
     Appendix A: Explanation of Schedule 4 – Agriculture (p. 18)
     Appendix B: List of Software (p. 19)
  3. Project Background
     In 2007, the Pennsylvania Historical and Museum Commission (PHMC) began a multi-year project to document the agricultural history of the Commonwealth of Pennsylvania. The project “includes narrative histories describing the evolution of different farming systems around the state, historic census data, a field guide to historic farm buildings and landscapes, and bibliographic resources.” Project staff digitized manuscripts from the agricultural schedules of the 1850 federal census and compiled the data for all counties and municipalities in Pennsylvania. A few computed average fields were also included to aid in documentation for National Register of Historic Places (NRHP) nominations. The tabulated census data was published online in PDF format for each county.
     Shortly after I joined PHMC in 2011, I conveyed to staff the value of converting and releasing the agricultural census data in a format that would allow users to analyze the data directly and create visualizations. Due to limited staff resources, the data conversion project remained in the idea phase. The DAAN 871 interactive dashboard project presented the opportunity to complete the data conversion.
     Figure: Example of census data PDF file
  4. Data Preparation
     Data Extraction
     The original Excel files for 38 counties were retrieved from the Pennsylvania State Historic Preservation Office (PA SHPO), a bureau of PHMC. Tabula was used to extract the data from the PDF files for 24 counties. Data for Philadelphia municipalities was unavailable in Excel or PDF format, so I manually compiled the data from the original manuscripts.
     Figure: Example of census manuscript with column totals for a borough in Philadelphia County
  5. Data Compilation
     An Excel file was created with a tab for each county. Only the original fields from the census were retained. The computed averages were removed because the choice and number of computed average fields varied across counties. In the future, a standard list of computed average fields chosen by PA SHPO will be added back to the county tabs to aid with NRHP documentation.
     The original Excel and PDF files were inconsistent in listing whether a municipality was a city, borough, district or township. A few counties have both a township and a borough with the same name, so including a field for municipality type is important. The municipality type is listed on the census manuscript web pages, so I added that information to the county tabs. During that process, I discovered that many boroughs were missing from the tabulated data files. PA SHPO explained that the primary purpose of the data was NRHP documentation and, since very few farms currently exist in boroughs, the borough data was not compiled. I manually compiled the data from the original manuscripts for the boroughs that were missing.
     A new tab was created to include data from all of the counties so that the data can be analyzed for the entire state rather than only at the individual county level.
     Data Dictionary
     A data dictionary was created with help from the 1850 census documentation. Additional notes were included for clarity. See Appendix A for a detailed explanation of the Schedule 4 headings.
     Figure: Headings for the 1850 Federal Census, Schedule 4 – Productions of Agriculture
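As a rough illustration of the compilation step, the sketch below stacks per-county tabs into one statewide table and flags places where the same municipality name appears with more than one municipality type. The actual work was done in Excel; the data structure, field names, and values here are hypothetical placeholders, not census figures.

```python
# Hypothetical per-county "tabs": county name -> list of municipality rows.
# All names and numbers are placeholders, not real census values.
county_tabs = {
    "County A": [
        {"municipality": "Springfield", "type": "Township", "farms": 120},
        {"municipality": "Springfield", "type": "Borough", "farms": 3},
    ],
    "County B": [
        {"municipality": "Milltown", "type": "Borough", "farms": 5},
    ],
}


def compile_statewide(tabs):
    """Stack every county tab into one list, tagging each row with its county."""
    statewide = []
    for county, rows in sorted(tabs.items()):
        for row in rows:
            statewide.append({"county": county, **row})
    return statewide


def duplicate_names(rows):
    """Return (county, municipality) pairs that occur with more than one type."""
    types_by_place = {}
    for row in rows:
        key = (row["county"], row["municipality"])
        types_by_place.setdefault(key, set()).add(row["type"])
    return sorted(key for key, types in types_by_place.items() if len(types) > 1)


statewide_rows = compile_statewide(county_tabs)
ambiguous = duplicate_names(statewide_rows)
print(len(statewide_rows), ambiguous)
```

Pairs returned by `duplicate_names` are the cases where the municipality type field is needed to tell two rows apart.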
  6. Figure: Data dictionary, notes about the data
     Figure: Data dictionary
  7. Interactive Dashboard
     Purpose
     The Pennsylvania Agricultural History Project explains the significance of the industry: “Farming has guided Pennsylvania's economic growth and cultural development and has profoundly shaped the lands and people of the Commonwealth.” The United States Census Bureau writes that the “census of 1850 was the first for which a special agricultural schedule was provided.” That census was the first time in history that data was collected on agricultural production at a national scale.
     The purpose of the 1850 Agricultural Production in Pennsylvania interactive dashboard is to give a high-level overview of agriculture in Pennsylvania during the mid-19th century. The primary user has a casual interest in the subject matter and would like to explore areas of highest agricultural production in Pennsylvania during that era.
     Data
     The dataset includes 14 measures for acreage, farm machinery and livestock and 32 measures for produce. Each measure is tabulated for 1,273 municipalities across 63 counties. The dashboard includes data aggregated at the county level for all 14 acreage, machinery and livestock measures and for the top 11 produce measures. Each of these 11 produce measures has a grand total of 1 million or more bushels, pounds or tons, which is sufficient for a high-level overview. The raw data will be available for those interested in a deeper dive into all of the produce measures. A shapefile of historical county boundaries for Pennsylvania is inner joined to the census data.
     Analytic Questions
     The interactive dashboard answers these analytic questions about agricultural production in Pennsylvania in 1850:
     • How many farms were in Pennsylvania? Or each county?
     • Which counties had higher numbers of farms? Or lower numbers of farms?
     • What was the cash value of farms in Pennsylvania? Or each county?
     • What was the value of farm machinery in Pennsylvania? Or each county?
     • What was the value of live stock in Pennsylvania? Or each county?
     • What was the value of animals slaughtered in Pennsylvania? Or each county?
     • How many of each type of live stock did Pennsylvania have? Or each county?
       o Horses
       o Milch Cows
       o Asses and Mules
       o Working Oxen
       o Sheep
       o Swine
       o Other Cattle
  8. • How many acres of improved and unimproved land were in Pennsylvania? Or each county?
     • How many total acres of land were in Pennsylvania? Or each county?
     • How many units of each type of produce did Pennsylvania yield? Or each county?
       o Buckwheat
       o Indian Corn
       o Irish Potatoes
       o Oats
       o Rye
       o Wheat
       o Hay
       o Butter
       o Cheese
       o Maple Sugar
       o Wool
     Design Principles
     The centerpiece of the dashboard is a map of Pennsylvania, overlaid on a historical map, that displays the number of farms in each county in corresponding shades of green. A county filter at the top of the dashboard controls which data is shown on all of the worksheets below it. The original spelling and spacing are retained from the census headings: for example, milch cows instead of milk cows and live stock instead of livestock. The background design is adorned with historic lithographs, and headings are set in the Abraham Lincoln font to simulate typography from the era. A section at the bottom includes a few facts about 1850 as well as definitions and a link to learn more about Pennsylvania agricultural history. The detailed methodology section has additional information about the dashboard design and construction.
     Detailed Methodology
     County Boundaries
     The 1850 Pennsylvania agricultural census data is loaded into Tableau, then the farms measure and county dimension are projected onto a map. Because the 2017 county boundaries differ from the 1850 county boundaries, there are gaps in the map where Cameron, Forest, Lackawanna and Snyder counties are currently located. Cameron, Lackawanna and Snyder counties did not yet exist in 1850. Forest County existed but was mostly forest area with no agricultural production.
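The county-level aggregation and the 1 million cutoff described under Data can be sketched as follows. This is a minimal illustration, not the Tableau workflow itself: the rows, measure names, and numbers are hypothetical placeholders, and only the threshold value comes from the report.

```python
from collections import defaultdict

# Hypothetical municipality-level rows; values are placeholders, not census figures.
rows = [
    {"county": "County A", "wheat": 600_000, "hops": 100},
    {"county": "County A", "wheat": 300_000, "hops": 50},
    {"county": "County B", "wheat": 250_000, "hops": 25},
]
PRODUCE_MEASURES = ["wheat", "hops"]
THRESHOLD = 1_000_000  # grand-total cutoff stated in the Data section


def county_totals(rows, measures):
    """Sum each measure over all municipalities within a county."""
    totals = defaultdict(lambda: dict.fromkeys(measures, 0))
    for row in rows:
        for measure in measures:
            totals[row["county"]][measure] += row[measure]
    return dict(totals)


def measures_over_threshold(totals, measures, threshold):
    """Keep the measures whose statewide grand total meets the threshold."""
    grand = {m: sum(t[m] for t in totals.values()) for m in measures}
    return [m for m in measures if grand[m] >= threshold]


by_county = county_totals(rows, PRODUCE_MEASURES)
selected = measures_over_threshold(by_county, PRODUCE_MEASURES, THRESHOLD)
print(by_county, selected)
```

In this toy data only the placeholder wheat measure clears the cutoff, so it is the only produce measure that would be carried into the dashboard.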
  9. Figure: 1850 Census: Total Farms Per County (2017 boundaries)
     The Atlas of Historical County Boundaries by The Newberry Library includes datasets of county boundaries for each state. The KMZ file is loaded into Google Earth to limit the Pennsylvania county boundaries to only those that existed in 1850. The pared dataset is exported as a KML file for use in Tableau. Because Tableau would not accept the KML file, the file is converted to a shapefile (SHP). The shapefile data is inner joined with the agricultural census dataset.
     Figure: Tableau data connections
     The geometry measure for the 1850 county boundaries is added to the number of farms worksheet. All counties are now filled with data except Forest County, which did exist but did not have any agricultural production in 1850.
  10. Most of the county labels are slightly adjusted to align within the older county boundaries.
      Figure: 1850 Census: Total Farms Per County (1850 boundaries)
      Map Georeferencing
      A map dated 1853 by Ensign & Phelps, N.Y. is selected from Historical Maps of Pennsylvania. The main map section within the Pennsylvania borders is washed out in Photoshop to prevent the county colors on the map from conflicting with the colors on the visualization when data is filtered to an individual county. The map is georeferenced in QGIS and imported into Mapbox to create a custom map background. The Mapbox API is used in Tableau to connect to the map service and use the map in the dashboard.
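Returning to the inner join from the County Boundaries step: an inner join keeps only the counties present in both the boundary data and the census data, which is why a county with a boundary but no census rows stays unfilled on the map. The sketch below illustrates that behavior with plain dictionaries. It is not how Tableau performs the join, and the polygon strings and farm counts are hypothetical placeholders.

```python
# Hypothetical inputs: boundary records and county-level farm counts, both keyed
# by county name. The polygon strings and numbers are placeholders.
boundaries = {
    "County A": "polygon-A",
    "County B": "polygon-B",
    "Forest": "polygon-F",  # boundary exists, but no agricultural rows
}
farms = {"County A": 1200, "County B": 800}


def inner_join(left, right):
    """Keep only keys present on both sides, pairing their values."""
    return {key: (left[key], right[key]) for key in left if key in right}


joined = inner_join(boundaries, farms)
unmatched = sorted(set(boundaries) - set(joined))
print(sorted(joined), unmatched)
```

Listing the unmatched boundary names is a quick way to confirm that the only gap left on the map is the expected one.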
  11. Figure: 1853 map of Pennsylvania by Ensign & Phelps, N.Y.
      Figure: Georeferencing the washed out map
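Georeferencing ties pixel positions on the scanned map to real-world coordinates through control points. As a minimal sketch of the idea, the code below fits a first-order (affine) transformation to three control points and measures the residual at a fourth check point, which is one way to quantify misalignment. This is not the QGIS implementation; the function names are hypothetical and the control points are made-up values rather than points from the 1853 map.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, v):
    """Solve the 3x3 linear system m * x = v with Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("control points are collinear")
    solution = []
    for col in range(3):
        replaced = [row[:col] + [v[r]] + row[col + 1:] for r, row in enumerate(m)]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(pixels, coords):
    """Fit x = a*px + b*py + c and y = d*px + e*py + f from three point pairs."""
    m = [[px, py, 1.0] for px, py in pixels]
    x_params = solve3(m, [x for x, _ in coords])
    y_params = solve3(m, [y for _, y in coords])
    return x_params, y_params


def apply_affine(params, pixel):
    """Map a pixel position to map coordinates with fitted parameters."""
    (a, b, c), (d, e, f) = params
    px, py = pixel
    return a * px + b * py + c, d * px + e * py + f


# Made-up control points: pixel (column, row) -> (x, y) map coordinates.
control_pixels = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
control_coords = [(-80.0, 42.0), (-79.0, 42.0), (-80.0, 41.0)]
params = fit_affine(control_pixels, control_coords)

# A fourth point with known coordinates shows how far the fit is off.
predicted = apply_affine(params, (100.0, 100.0))
known = (-79.0, 41.05)
residual = math.hypot(predicted[0] - known[0], predicted[1] - known[1])
print(predicted, residual)
```

A large residual at check points in one part of the map suggests that more control points, or a different transformation type, may be needed there.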
  12. Figure: Final map in Tableau with custom background and number of farms per county
      Imagery and Typography
      While searching for historic photos related to agriculture and Pennsylvania in the mid-1800s, I discovered lithographs of diplomas awarded by local agricultural societies. Two public domain prints are selected for the dashboard background:
      • Diploma awarded by the Doylestown Agricultural and Mechanics Institute, which depicts “scenes from farm and country life, as well as examples of agricultural produce across the top and on both sides.”
      • Diploma awarded by the Luzerne County Agricultural Society, which shows “rural views of a farm, farmers, and livestock, also arrangements of farm produce, and a farmer driving a horse-drawn reaper, and a railroad at a factory or processing plant.”
      The Abraham Lincoln font by Frances MacLeod is chosen for dashboard heading text. The typeface description reads: “Inspired by the proportions of the 16th President of the USA, and advertisements/playbills of the 1800s, Abraham Lincoln is a humanistic display face with moderate contrast and sturdy serifs.” Times New Roman is used for all visualization text and non-heading text, including tooltips.
  13. Figure: Diploma awarded by the Doylestown Agricultural and Mechanics Institute (c. 1867)
      Figure: Diploma awarded by the Luzerne County Agricultural Society (c. 1857)
  14. Figure: Example of Abraham Lincoln font by Frances MacLeod
      Limitations and Future Enhancements
      The map georeferencing is slightly out of alignment, which is most noticeable in the southeastern part of the state when the map data is added to the custom map background. Several attempts were made to better georeference the map by using different transformation types and resampling methods and by varying the number of points. In the future, I can work with a colleague at PHMC who is a GIS specialist to improve my georeferencing skills and fix the map.
      Forest County did not have any agricultural production in 1850, so the map is blank in that spot. An annotation is used to fill in the county name, but I cannot find an easy method to hide the annotation when only one county is filtered. There is also no easy method to change the font size for the selected text in the filter dropdown.
      Figure: Tiny font size that cannot be changed
      Figure: Preferred font size
      Visualizations in Tableau cannot have a transparent background, and the color pickers do not include an option to input exact RGB or hexadecimal color codes. Dashboards do not have a “snap to grid” feature. Due to those limitations, background imagery could not be perfectly blended or aligned with the visualizations. I may work with a graphic designer or Tableau specialist in the future to learn more effective methods of blending dashboard backgrounds and components.
  15. I cannot find an easy method to add pop-up informational boxes to the dashboard. Learning how to “swap and pop” sheets in a Tableau dashboard is a future goal. As a workaround, an information icon (lowercase i inside of a circle) is included next to the visualization for the measures that require additional information (animals slaughtered and unimproved/improved land), with corresponding definitions at the bottom of the dashboard.
      Additional datasets can be added in the future for 1850 population and square miles per county for further comparisons. Counties with larger populations and areas tend to have more agricultural production. As an example, the agricultural production per 100 people or per 50 square miles can be compared.
      Despite these limitations, the interactive dashboard for 1850 Agricultural Production in Pennsylvania turned out well and I am pleased with the final product.
      Dashboard Link
      The dashboard has been published to Tableau Public: http://tabsoft.co/2oMy0AU
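The comparison per 100 people or per 50 square miles proposed under Limitations and Future Enhancements amounts to dividing a county total by its population or area and scaling the result. The sketch below shows that arithmetic. The county names, field names, and figures are hypothetical placeholders, since the population and area datasets have not yet been added.

```python
def per_unit(total, base, unit):
    """Scale a total to a rate per `unit` of the base (for example, per 100 people)."""
    if base <= 0:
        raise ValueError("base must be positive")
    return total / base * unit


# Hypothetical county records; all figures are placeholders, not census values.
counties = {
    "County A": {"wheat_bushels": 500_000, "population": 50_000, "square_miles": 1_000},
    "County B": {"wheat_bushels": 200_000, "population": 10_000, "square_miles": 400},
}

rates = {
    name: {
        "per_100_people": per_unit(c["wheat_bushels"], c["population"], 100),
        "per_50_sq_mi": per_unit(c["wheat_bushels"], c["square_miles"], 50),
    }
    for name, c in counties.items()
}
print(rates)
```

In this toy data the county with the smaller raw total has the higher output per 100 people, which is the kind of difference that raw totals alone would hide.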
  16. Dashboard Screenshot
  17. References
      Abraham Lincoln by Frances MacLeod. Lost Type Co-Op, n.d. Web. 01 Apr. 2017. <http://www.losttype.com/font/?name=Abraham%20Lincoln>.
      Agricultural Schedules, 1850 to 1900. United States Census Bureau, n.d. Web. 01 Apr. 2017. <https://www.census.gov/history/pdf/agcensusschedules.pdf>.
      Atlas of Historical County Boundaries. The Newberry Library, 2010. Web. 01 Apr. 2017. <http://publications.newberry.org/ahcbp/index.html>.
      Ensign & Phelps, N. Y. 1853 Pennsylvania. Historical Maps of Pennsylvania, n.d. Web. 01 Apr. 2017. <http://www.mapsofpa.com/antiquemaps35.htm>.
      Federal Decennial Census, 1850. National Archives, Washington; Record Group 029, National Archives and Records Service, General Services Administration.
      Lebergott, Stanley. "Labor Force and Employment, 1800–1960." Output, Employment, and Productivity in the United States after 1800. Ed. Dorothy S. Brady. N.p.: National Bureau of Economic Research, 1966. 117-21. Web. 01 Apr. 2017. <http://www.nber.org/chapters/c1567.pdf>.
      Pennsylvania Agricultural History Project. Pennsylvania Historical and Museum Commission, n.d. Web. 01 Apr. 2017. <http://phmc.info/PaAgHistory>.
      “Pennsylvania. Agriculture – Farms and Implements, Stock, Products, Home Manufactures, &c.” The Seventh Census of the United States: 1850. United States Census Office, 1853. 194-198. Web. 01 Apr. 2017. <http://usda.mannlib.cornell.edu/usda/AgCensusImages/1850/1850a-08.pdf>.
      P.S. Duval & Son, Printer, and James Fuller Queen. Diploma Awarded to [blank] by the Doylestown Agricultural and Mechanics Institute ... / James Queen ; P.S. Duval, Son & Co. The Library of Congress, n.d. Web. 01 Apr. 2017. <https://www.loc.gov/item/2015647823/>.
      P.S. Duval & Son, Printer, and James Fuller Queen. This diploma was awarded by the Luzerne County Agricultural Society at their blank annual fair / P.S. Duval & Son's lith. Philada. The Library of Congress, n.d. Web. 01 Apr. 2017. <https://www.loc.gov/item/2014648443/>.
  18. Appendix A: Explanation of Schedule 4 – Agriculture
  19. Appendix B: List of Software
      Software          Purpose
      Tabula            Data extraction from PDF files
      Microsoft Excel   Data organization and compilation
      Tableau           Data visualization and interactive dashboard creation
      Photoshop         Image editing and creation
      Google Earth      Data reduction of boundaries for PA counties that existed in 1850
      QGIS              Map georeferencing
      Mapbox            Custom map style creation
