Here’s What Python Does for Us: What Can it Do for Your Library?
Katharine Frazier
NC State University Libraries
The following presentation is being shared
with EBSCO’s permission. Certain functions
shown in the slides will only be available to
GOBI customers.
● Large public university
(>34,000 FTE)
● STEM-focused
● Collections & Research
Strategy department
Getting Started
Identifying project opportunities
Learning & implementing Python
Diving In
Automating web interactions (GOBI)
Automating data manipulation
Spreading the word about code
Your Turn!
How do YOU want to use Python to improve
your workflows?
Getting Started
The most important step:
Realizing there’s a problem.
Example:
Time-consuming
monthly reports!
● Not very accurate
● Limited to just one type of
search at a time
● Searching by ISBN is no good!
● Getting better…
● Requires users to repeatedly
navigate back to standard search
Results are not
ordered by date or
edition, and include
duplicative items
1. Search for and identify matches in GOBI
2. Select only same or newer editions
3. Add purchase information to reports
...and do it faster!
Identifying solutions
not in plain sight
“Python is good for repetitive work, and
working with large sets of data
programmatically instead of by hand.”
I want to do this task today...   There’s a Python library for that!
Data manipulation                 Pandas
Web automation                    Selenium
Spreadsheet writing               OpenPyxl/Pandas
String matching                   Fuzzywuzzy
Project-Based Learning
The learning never
stops!
Let’s see a demo!
1. Takes a list of items (holdings data) and
searches GOBI for matches
2. Grabs the price, binding, year of publication for
each match
3. Selects newest edition, matches to original list
of holdings data
Diving in
Pandas: a library for creating and
manipulating dataframes, reading
from and writing to files (.xlsx, .csv)
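As a minimal sketch of that description, the example below builds a small DataFrame and round-trips it through CSV. The column names and values are invented for illustration, and an in-memory buffer stands in for a real file so the example runs anywhere.

```python
import io

import pandas as pd

# Hypothetical holdings data; the column names are invented for this example.
holdings = pd.DataFrame(
    {
        "title": ["Python Basics", "Data Science Handbook"],
        "isbn": ["9780000000001", "9780000000002"],
        "year": [2015, 2017],
    }
)

# Write CSV text to an in-memory buffer, then read it back. With real files,
# pass a path such as "holdings.csv" instead of the buffer.
buffer = io.StringIO()
holdings.to_csv(buffer, index=False)
buffer.seek(0)

# dtype={"isbn": str} keeps ISBNs as text rather than converting them to integers.
reloaded = pd.read_csv(buffer, dtype={"isbn": str})

print(reloaded)
```

For .xlsx files, pd.read_excel and DataFrame.to_excel play the same roles, provided an Excel engine such as openpyxl is installed.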
Selenium: a library for automating
web interactions
Get requests: Selenium’s get() method
navigates the browser to a web page
ex: browser.get('http://www.lib.ncsu.edu')
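A hedged sketch of the navigation step: the helper name fetch_page_title is hypothetical, and the code assumes Selenium 4 plus Firefox and a matching driver are available on the machine. It only loads a page and reads its title; it does not reflect how GOBI's pages are structured.

```python
def fetch_page_title(url):
    """Open url in a Selenium-controlled browser and return the page title."""
    # Imported inside the function so this file still loads where Selenium
    # is not installed; a top-level import is fine in a real script.
    from selenium import webdriver

    # Assumption: Firefox is installed. webdriver.Chrome() works the same way.
    browser = webdriver.Firefox()
    try:
        browser.get(url)      # navigate to the page
        return browser.title  # text of the page's <title> element
    finally:
        browser.quit()        # always close the browser, even on errors
```

Calling fetch_page_title("http://www.lib.ncsu.edu") would open a browser window, load the page, and return its title. The try/finally block is there so a failed page load does not leave stray browser windows open.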
Fuzzywuzzy: a module for creating
fuzzy (inexact) string matches
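To illustrate inexact matching without extra installs, the sketch below uses difflib from Python's standard library as a stand-in for Fuzzywuzzy; it produces a 0-100 score that is similar in spirit to Fuzzywuzzy's. The function names, the threshold of 80, and the sample titles are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score, ignoring case and surrounding spaces."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return round(ratio * 100)


def best_match(title, candidates, threshold=80):
    """Return the candidate most similar to title, or None if below threshold."""
    scored = [(similarity(title, c), c) for c in candidates]
    score, candidate = max(scored, default=(0, None))
    return candidate if score >= threshold else None


# Made-up search results for illustration.
results = [
    "Introduction to Algorithms, 4th ed.",
    "Algorithms Unlocked",
    "Introduction to Probability",
]
print(best_match("Introduction to algorithms", results))
print(best_match("Organic Chemistry", results))
```

The threshold is a judgment call: set it too low and unrelated titles match, too high and minor differences such as edition statements cause misses.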
Re: a module for creating regular
expressions (search pattern)
text = "Pub date: 2019"
output = re.search(r"\d{4}", text)
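A slightly fuller sketch of the same idea, showing how the four-digit year might be pulled out of the match object. The variable names are illustrative.

```python
import re

text = "Pub date: 2019"

# \d{4} matches exactly four consecutive digits; the raw string (r"...")
# keeps the backslash from being treated as a string escape.
match = re.search(r"\d{4}", text)

# re.search returns None when nothing matches, so check before using it.
year = int(match.group()) if match else None
print(year)
```

Checking for None matters in practice, since some records may have no publication date at all.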
Xlsxwriter: a module for creating
Excel documents within Python
● Use regular expressions to find desired values
● Use fuzzy string matching to identify title matches
● Send matching title information to lists
● Create DataFrame from dictionary
● Create sub-frames for print and ebook options
● Sort each DataFrame by descending date
● Drop duplicates: keep only the newest copy
available
● Merge print and ebook DataFrames
● This results in one DataFrame with both print and
ebook purchase information, and only one option
for each title
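The DataFrame steps above can be sketched roughly as follows. The data, column names, and the choice of an outer merge are assumptions for illustration, not the presenter's actual code; the helper newest_per_title is hypothetical.

```python
import pandas as pd

# Made-up search results; column names and values are illustrative only.
matches = {
    "title": ["Python Basics", "Python Basics", "Python Basics", "Data Science Handbook"],
    "binding": ["Cloth", "Cloth", "eBook", "eBook"],
    "year": [2015, 2019, 2018, 2020],
    "price": [55.00, 60.00, 48.00, 72.50],
}
df = pd.DataFrame(matches)


def newest_per_title(frame):
    """Sort by descending year and keep only the newest row for each title."""
    ordered = frame.sort_values("year", ascending=False)
    return ordered.drop_duplicates(subset="title", keep="first")


# Sub-frames for print and ebook options, each reduced to the newest copy.
print_df = newest_per_title(df[df["binding"] != "eBook"])
ebook_df = newest_per_title(df[df["binding"] == "eBook"])

# An outer merge keeps titles that are available in only one format.
combined = print_df.merge(
    ebook_df, on="title", how="outer", suffixes=("_print", "_ebook")
)
print(combined)
```

Each title ends up on a single row, with print details in the _print columns and ebook details in the _ebook columns; a format with no match shows as NaN.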
Spreading the
word about
code projects
Sharing with the library community
● Code4Lib listserv
● Conferences (MDLS, Internet
Librarian, etc…)
● LITA, technology groups
Licensing code
● MIT License
● General Public License (GPL)
● Apache License
● https://choosealicense.com/licenses/
Your Turn!
Think about...
● Your own project needs
● What types of Python activities (data
manipulation, automation) would help?
Idea Generation…
● Code4Lib Journal
● Library Carpentry
● PyPI.org
● Stack Overflow
How do YOU want to use Python to
improve your workflows?
Thank you!
Katharine Frazier | kfrazie2@ncsu.edu
https://github.com/kchasefray/GOBI_Searching

NCompass Live: Here’s What Python Does for Us: What Can it Do for Your Library?