Ingesting	Metadata
Some Informal Remarks from the Library Perspective
NISO-BISG	Forum,	ALA	June	23,	2017
Diana	Brooking
University	of	Washington	Libraries,	Seattle
Outline
• Background
• The	effect	of	a	variety	of	sources	and	standards	on	library	staff
• Sharing	metadata
• Matching	records
• Some	thoughts	on	the	future
Background
From	microform	sets	to	ebook packages
• differences	in	stability	of	content	(imagine	a	publisher	comes	into	your	library	
with	scissors	to	remove	titles	from	microfilm	reels)
• expectations	for	catalog	access	(public	service	librarians	used	to	show	
tolerance	for	the	years	it	took	for	catalog	records	to	appear	for	individual	
titles	in	microform	collections;	now	when	a	new	ebook package	is	activated,	
catalog	records	are	expected	to	be	there	the	same	day)
• changes	in	library	procedures		
• vendor services also went through the same learning curve (it took years for
the removal of titles or changes in URLs to be reflected in metadata; this is
still not always handled well)
Background
Ebooks vs.	e-serials	metadata
• linking	vs.	discovery
• Full	cataloging	not	as	important	for	ejournals;	the	unit	of	discovery	is	mostly	
the	article;	metadata	used	for	link	resolution	(ISSN,	title,	coverage	data)
• Full	cataloging	very	important	for	ebooks;	metadata	used	for	discovery (but	
some	KBs	still	only	supply	ISBN,	title)
• distribution	of	records
• eJournals:	the	CONSER	file	serves	as	a	primary	full-level	MARC	record	source	
for	most	commercially	published	journals	and	is	widely	distributed
• no	equivalent	of	CONSER	for	ebooks:	more	reliance	on	publisher	metadata	
and	local	cataloging	workflows
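As a rough illustration of the linking side of this distinction, the sketch below models a knowledge base entry for an ejournal and checks whether a cited article falls within its coverage. The class name, field names, and sample values (including the ISSN) are hypothetical; real knowledge bases carry more fields and also handle embargoes and volume/issue ranges.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SerialHolding:
    """Hypothetical minimal KB entry: just enough to resolve a link."""
    issn: str
    title: str
    coverage_start: int           # first year available
    coverage_end: Optional[int]   # None means "to present"


def normalize_issn(issn: str) -> str:
    """Drop hyphens and spaces, uppercase a trailing 'x' check character."""
    return issn.replace("-", "").replace(" ", "").upper()


def can_link(holding: SerialHolding, cited_issn: str, cited_year: int) -> bool:
    """True if the cited ISSN matches and the year is inside the coverage."""
    if normalize_issn(holding.issn) != normalize_issn(cited_issn):
        return False
    if cited_year < holding.coverage_start:
        return False
    return holding.coverage_end is None or cited_year <= holding.coverage_end


# Made-up sample entry, for illustration only.
holding = SerialHolding("1234-5679", "Example Journal", 1995, 2010)
```

Three fields are enough to resolve a link here, which is why brief records can work for ejournals. The same three fields would give a patron very little to search on for an ebook.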
Variety	of	sources	and	standards
• I	don’t	want	to	ingest	your	metadata	(at	least,	not	directly)
• I	want	an	integrated	experience	(librarians	are	no	different	than	their	
patrons)
• Too	much	DIY	in-house:	data	wrangling,	manual	cleanup
• different	procedures	for	each	package/file	(not	efficient,	cognitive	load)
• size	of	file	is	large	for	UW	(many	titles,	many	packages)
• lack of tools? too many tools? (I use 7 now); or not the right tools?
• escalating	level	of	skills	necessary	to	do	the	job
Sharing
• Distribution	issues	(any	barrier	to	free	flow	of	full	metadata	is	a	
problem)
• Ex Libris Community Zone (KB): commercial entity, share with other ExL
customers only
• OCLC	WorldCat and	WorldShare:	cooperative,	share	with	the	world
• But	do	publishers	share	everything	with	ExL?	With	OCLC?	What	may	get	lost	
in	the	process?
• OCLC	records	vs.	vendor	records
• As an OCLC member, we prefer OCLC records; vendor records are more difficult
for us to share and less likely to conform to library standards
Matching
• High	quality	metadata	for	ebooks often	exists
• Is	it	shared?	Can	it	be	found?
• Lessons	from	OCLC	KB	(aka	WorldShare Collection	Manager)
• Inadequate	matching	algorithms	(streaming	audio	matched	to	wax	cylinders)
• Early	on	libraries	told	OCLC	“any	record	better	than	nothing”
• Algorithm	now	much	improved
• Identifying	collections/packages	(extremely	difficult,	extremely	
important,	not	addressed)
• Standard	identifiers	possible?
• Consortial catalog	(and	beyond)
• control numbers of various kinds are another problem; ISBNs are not a reliable match point
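To make the ISBN point concrete, here is a minimal sketch of a cautious matcher. It is not OCLC's algorithm or any vendor's. It assumes a heavily simplified record with hypothetical fields (isbns, title, carrier) and accepts a match only when a normalized ISBN, the carrier type, and a normalized title all agree.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    """Hypothetical, heavily simplified bibliographic record."""
    isbns: List[str]
    title: str
    carrier: str   # e.g. "online", "print", "microform"


def to_isbn13(raw: str) -> str:
    """Normalize an ISBN string to 13 digits; return "" if unusable."""
    parts = raw.split()
    if not parts:
        return ""
    # Keep only the first token so qualifiers like "(ebook)" are ignored.
    s = parts[0].replace("-", "").upper()
    if len(s) == 13 and s.isdigit():
        return s
    if len(s) == 10 and s[:9].isdigit():
        core = "978" + s[:9]
        # ISBN-13 check digit: weights alternate 1, 3 across the 12 digits.
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
        return core + str((10 - total % 10) % 10)
    return ""


def title_key(title: str) -> str:
    """Lowercase and drop punctuation so trivial differences do not block a match."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", title.lower()).split())


def is_match(incoming: Record, candidate: Record) -> bool:
    """Require a shared ISBN AND the same carrier AND the same title key."""
    shared = {to_isbn13(i) for i in incoming.isbns} & {to_isbn13(i) for i in candidate.isbns}
    shared.discard("")
    if not shared:
        return False
    if incoming.carrier != candidate.carrier:
        return False   # same ISBN on a print record is not an ebook match
    return title_key(incoming.title) == title_key(candidate.title)
```

An ISBN alone is treated as a hint rather than proof because print ISBNs often appear on ebook records. A stricter carrier and title check trades some missed matches for fewer wrong ones, the opposite of "any record better than nothing."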
Future
• Adherence to “traditional” library standards has not been easy in
the past (e.g., MARC, Provider-Neutral, AACR2)
• Libraries are ingesting metadata from ever more sources with a
proliferating number of standards: will this help or hurt?
• Linked	data,	a	world	with	no	records?
• data	with	no	context?
• reconciliation, URIs from different sources
• more	matching	algorithms	than	ever?
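One way to picture reconciliation is as clustering: given assertions that two URIs from different sources identify the same thing, group everything that is connected. The sketch below does this with a small union-find. The function name and the example.org URIs are hypothetical, and it assumes the equivalence assertions are already available and trustworthy, which is the hard part in practice.

```python
from collections import defaultdict


def cluster_uris(same_as_pairs):
    """Group URIs connected by equivalence assertions (union-find)."""
    parent = {}

    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving keeps trees shallow
            u = parent[u]
        return u

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for uri in list(parent):
        clusters[find(uri)].add(uri)
    return list(clusters.values())


# Made-up URIs, for illustration only.
pairs = [
    ("https://example.org/vendor/123", "https://example.org/library/abc"),
    ("https://example.org/library/abc", "https://example.org/coop/999"),
    ("https://example.org/vendor/456", "https://example.org/library/def"),
]
groups = cluster_uris(pairs)
```

Each resulting group stands in for what used to be a single matched record, so one bad equivalence assertion merges two clusters just as a bad match merges two records.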
Questions?
Diana	Brooking
Cataloging	and	Metadata	Services
University	of	Washington	Libraries,	Seattle
dbrookin@uw.edu
