Extract and Analyze Culture/Trend Data from SJPL Digital Collection

A result of Open Data Hack SJ
Saturday, February 21, 2015 from 9:30 AM to 5:00 PM (PST)
San Jose, CA

Published in: Data & Analytics

  1. Extract and Analyze Culture/Trend Data from SJPL Digital Collection • Theme: Helping San Jose Public Library figure out how to make the California Room Digital Collections more open, engaging, hackable, linkable, browsable, tag-able, map-able and responsive. • Open Data Hack SJ, Saturday, February 21, 2015 from 9:30 AM to 5:00 PM (PST), San Jose, CA • Hiroyuki Sato @sa2hi
  2. OPEN DATA • SJPL Digital Collections • California Room • School Yearbooks • http://www.sjpl.org/yearbooks • Digital data is available for the San Jose High School Yearbook (1902-1929) • http://digitalcollections.sjlibrary.org/cdm/landingpage/collection/sjplyb
  3. What I wanted to do / what I did • Can we see a culture or trend in the Digital Collection? • Extract the data related to athletic teams • Manually count the number of people on each sports team in the yearbooks… • https://docs.google.com/spreadsheets/d/1GhCA-I6mRZ1rs-ORHNt1ktfrd3qn9OB4h7JeqJlaYn4/edit#gid=0 • Visualize the data
  4. An Example of Original Data (1)
  5. An Example of Original Data (2)
  6. [Chart] Change in the number of team members for each sport (x-axis: 1905 to 1925)
  7. Issues • Many years are missing • More metadata is needed • Need technologies for automated, detailed metadata extraction from pictures and text • Need the school's population (total number of people) to compare data from one year with data from another • Need digital data from other schools
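
The count-and-compare step described in slides 3 and 7 could be sketched roughly as follows. This is only an illustration under stated assumptions: the CSV layout (year, sport, members) is hypothetical and not the actual layout of the spreadsheet, the sample values are placeholders rather than figures from the yearbooks, and the function names are made up. Because total school enrollment is not available, the sketch falls back to each sport's share of that year's counted team members, which is a stand-in and not the normalization the slides ask for.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout: one row per (year, sport) with a manually counted team size.
# These values are placeholders for illustration, NOT figures from the yearbooks.
SAMPLE_CSV = """year,sport,members
1905,Football,15
1905,Baseball,11
1910,Football,18
1910,Basketball,8
"""


def load_counts(csv_text):
    """Return {year: {sport: members}} parsed from CSV text."""
    counts = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[int(row["year"])][row["sport"]] = int(row["members"])
    return dict(counts)


def to_shares(counts):
    """Convert raw counts to each sport's share of that year's counted team members."""
    shares = {}
    for year, by_sport in counts.items():
        total = sum(by_sport.values())
        # Years with no counted members get an empty mapping instead of dividing by zero.
        shares[year] = {s: n / total for s, n in by_sport.items()} if total else {}
    return shares


counts = load_counts(SAMPLE_CSV)
shares = to_shares(counts)
for year in sorted(counts):
    print(year, counts[year])
```

Years that are missing from the input simply do not appear in the result, so gaps in the collection stay visible instead of being filled in. If enrollment figures become available, dividing by them in place of the per-year total would give the comparison across years that slide 7 calls for.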