
Code4 lib 20141129 python


Description of the Python class from LITA 2014 for Code4Lib BC

Published in: Data & Analytics

  1. The LITA Forum & library data in Python
  2. Library and Information Technology Association (LITA)
  3. Nov 5-8 LITA Forum Albuquerque
  4. Learn Python by Playing with Library Data By Francis Kayiwa & Eric Phetteplace
  5. Github
  6. BitBucket
  7. Main class https://bitbucket.org/fkayiwa/litaconf/overview
  8. PyMARC scripts By Eric Phetteplace https://github.com/phette23/pymarc-ebooks-scripts
  9. • count-tag.py: find out how many records have a particular tag
     • dual856.py: find all your records with multiple 856 (electronic location) tags
     • ebooks-to-csv.py: save all your ebook (defined as anything with an 856 $u) titles to a CSV file
     • gmd-counter.py: count the number of occurrences of different General Material Designations (245 $h) in a collection of records. Example JSON output included.
     • pymarc-notes.md: some very minimal notes on using pymarc, mostly links to documentation
     • python-on-windows.md: notes on getting set up on a Windows machine
     • proxy-ebooks.py: the main script I wrote, others were basically tests leading up to this. We were implementing a proxy server and this cleaned up our 856 fields while proxying appropriate vendor URLs.
     • search-gmd.py: find titles of records with a certain GMD
     • subfield-counter.py: count subfields used in all records? I actually don't know, this is horrible code, Eric.
     • web-links.py: output stats on 856 fields in records
     • webfeet.py: find records with "[selected by Web Feet]" in the title, since at some point we imported one of these misguided attempts to catalog "the good parts" of the Internet
     • write856s.py: write records with multiple 856 fields out to a separate MARC file
  10. MARCkbart https://github.com/lpmagnuson
  11. EZProxy Analysis https://github.com/robincamille/ezproxy-analysis
  12. Analyzes EZproxy-generated log files and spits out a CSV with this info:
      • Filename of log being analyzed
      • # total connections
      • # on-campus connections (as determined by IP addresses starting with "10." -- may be different for your campus)
      • % on-campus connections of total
      • # off-campus connections
      • % off-campus connections of total
      • # library connections (as determined by IP addresses starting with "10.11" and "10.12" -- will almost certainly be different for your campus)
      • % library of on-campus connections
      • % library of total connections
      • # student sessions off-campus
      • % student sessions of total off-campus
      • # fac/staff sessions off-campus
      • % fac/staff sessions of total off-campus
  13. Beautiful Soup
  14. Real world
  15. Real world
  16. TIPS: Don’t use Python 3
  17. Albuquerque is lovely and small
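
The tag-counting scripts on slide 9 (count-tag.py, dual856.py) can be sketched roughly as below. This is not the code from the pymarc-ebooks-scripts repository; it is a minimal illustration that assumes records expose pymarc's `get_fields()` method, and the function names and the file path argument are hypothetical.

```python
def count_records_with_tag(records, tag):
    """Count records that carry at least one field with the given MARC tag."""
    count = 0
    for record in records:
        if record is None:
            # pymarc's reader can yield None for a record it could not parse
            continue
        if record.get_fields(tag):
            count += 1
    return count


def records_with_repeated_tag(records, tag="856"):
    """Return the records that have more than one field with the given tag."""
    return [
        record
        for record in records
        if record is not None and len(record.get_fields(tag)) > 1
    ]


def count_in_file(path, tag):
    """Count matching records in a binary MARC file (path is hypothetical)."""
    # Third-party dependency, imported here so the helpers above work on
    # any objects that provide get_fields(): pip install pymarc
    from pymarc import MARCReader

    with open(path, "rb") as handle:
        return count_records_with_tag(MARCReader(handle), tag)
```

Something like `count_in_file("ebooks.mrc", "856")` would then report how many records in that (assumed) file have at least one electronic location field.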

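The on-campus and library counts listed on slide 12 come down to matching IP address prefixes. The sketch below is not taken from the ezproxy-analysis repository; it assumes the client IP is the first whitespace-separated token on each log line, and the function name, the prefixes' trailing dots, and the result keys are illustrative choices. As the slide notes, the prefixes will differ for your campus.

```python
def summarize_connections(lines, campus_prefix="10.",
                          library_prefixes=("10.11.", "10.12.")):
    """Tally total, on-campus, off-campus, and library connections."""
    total = on_campus = library = 0
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        ip = parts[0]  # assumption: the IP address leads each log line
        total += 1
        if ip.startswith(campus_prefix):
            on_campus += 1
            # str.startswith accepts a tuple of candidate prefixes
            if ip.startswith(library_prefixes):
                library += 1
    off_campus = total - on_campus

    def pct(part, whole):
        # Guard against empty logs so we never divide by zero
        return round(100.0 * part / whole, 1) if whole else 0.0

    return {
        "total": total,
        "on_campus": on_campus,
        "pct_on_campus": pct(on_campus, total),
        "off_campus": off_campus,
        "pct_off_campus": pct(off_campus, total),
        "library": library,
        "pct_library_of_on_campus": pct(library, on_campus),
        "pct_library_of_total": pct(library, total),
    }
```

The resulting dictionary could be written out as one CSV row per log file with `csv.DictWriter`. The student and fac/staff session counts from the slide are left out here because they depend on how your EZproxy instance records user groups.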