UNIVERSITY OF WISCONSIN
SOYLENT SEM-WEB IS PEOPLE!
So hello, guten morgen. My name is Dorothea Salo, and I teach many nerdy things, linked data among them, at the
School of Library and Information Studies at the University of Wisconsin at Madison. I ﬁrst want to say vielen dank
-- thank you VERY much -- for inviting me here, and I hope I can kick off this conference in a fun and useful way.
Soylent Green DVD cover, c. 2007.
Fair use asserted.
In 1973 Charlton Heston starred in a science-ﬁction movie called Soylent Green. And it’s a terrible movie, talky
and preachy and weirdly acted and often just dumb. So I don’t feel too bad about spoiling the big plot twist. In the
movie, the environment has degraded so badly that food can’t be grown, so what everybody eats is artiﬁcial foods
called Soylent Whatever -- Soylent Red, Soylent Yellow, and the brand-new Soylent Green.
What they don’t know, until Charlton Heston yells it at the end of the movie, is that Soylent Green Is People! More
speciﬁcally, Soylent Green is what happens when you make people into food. Ew. But the total nastiness of
cannibalism aside, what’s interesting about this movie is that you have this whole society that has absolutely NO
IDEA that it’s completely dependent on people for its survival!
It’s the year 2013... Data are still the same.
We’ll do anything to make sense of them.
And for that we need PEOPLE.
Now, we’re not cannibals here in Hamburg, we don’t actually eat people. I hope. No, but seriously, the parallel I
want to draw here is that the original Semantic Web vision curiously lacked PEOPLE, except maybe as the end-user
beneﬁciaries of linked data. I mean, you can go back and look at what Berners-Lee and his cronies wrote, and you
have all these people booking travel and getting health care or whatever because of all the nice clean shiny RDF
data whizzing around in nice clean shiny server rooms, sure. But the data whizzes around all by itself. Doesn’t
need people. There are no people. Just data.
And I just think this is a counterproductive, even dangerous, way to frame the Semantic Web. And still much too
common. *CLICK* So I assert that the Soylent Semantic Web Is People! Because I want a HUMAN semantic web. A
HUMANE semantic web. Technology without people is just dead metal and silicon. Data without people is just
It’s the year 2013... Data are still the same.
We’ll do anything to make sense of them.
And for that we need LIBRARIANS.
And more, since we’re here at Semantic Web in Libraries, I will assert that Soylent Semantic Web Is Librarians! We
are the Semantic Web, and the Semantic Web is us!
And I know that isn’t completely news -- we invented SKOS, we invented Dublin Core, we have Karen Coyle and
Diane Hillmann and Ed Summers, just for starters -- but if you had to ask me why this speciﬁc conference is
important? That’s what I’d say. The Soylent Semantic Web Is Librarians.
• Server room: Alex, “servers”
• Bulldozer: GlacierNPS, “The rotary plow”
• (snow-thrower, not bulldozer, but who’s counting?)
• Graph: Jörg Kanngießer, “Graph of klick.JÖrg - Die Homepage”
• Librarians: Charles Greenberg, “IMGP0420”
• All these images licensed CC-BY. Thank you, creators!
While we glance at the photo credits, I’ll tell you that what I want to do today is explain my thoughts about why
the Semantic Web is not soylent, not made of librarians, not made of people. I want to explain why it SHOULD be
soylent. And I want to challenge you in speciﬁc ways to MAKE it soylent. My ultimate goal, which I imagine you
share, is strengthening library adoption of linked data.
So let’s decide, in approved RDF-triple style, just what properties we can assert about librarians and linked data.
And the usual properties I would expect people at this conference to suggest would be the technical ones.
Librarians MODEL linked data. Librarians CROSSWALK TO linked data. Maybe as simple as librarians MAKE linked
data. Librarians HOST linked data. Librarians ARCHIVE linked data. Librarians BUILD SYSTEMS FOR, and around,
linked data.
But none of those properties really belong to the Soylent Semantic Web, the Semantic Web made of people. These
properties are about the DATA, not the PEOPLE.
Here are some things librarians do, as people, in the Soylent Semantic Web. We INVESTIGATE linked data. We
DISCUSS linked data, sometimes not as knowledgeably as linked-data advocates might like. We LEARN ABOUT
linked data. We TEACH ABOUT linked data. We ADVOCATE FOR linked data. Or don’t. And now we get to the
crucial point: we ADOPT linked data.
Or we don’t. And we don’t because the Semantic Web community, librarians included, hasn’t acknowledged that it
needs to be soylent. We forget that the Semantic Web is made of people, lots of different kinds of people, some of
them people who are not like us and do not do the same work we do and do not have the same understandings
we have. We forget that we NEED our own librarian colleagues to help us make the Semantic Web, and put library
data into it -- and when we forget our librarian colleagues, our librarian colleagues forget us, and forget linked
data. And that’s not good.
Anfuehrer, “Der Fleischwolf bei der Arbeit” CC-BY-SA
And as I talk to librarians about linked data, what I hear back is that they feel ground up into hamburger -- sorry,
sorry, I had to -- by the whole thing, because the way it’s usually explained to them, it’s so abstract and so
divorced from the actual library work they know. The linked data movement can show them graphs, but it can’t
show them interfaces for doing their work. It can tell them about triples, but it’s not telling them how the catalog
will work if their Internet connection fails. It can explain ontologies, but not how they’ll navigate them.
After one explanatory talk I gave, I had one cataloger tell me with immense frustration, “I just don’t see how this
will WORK!” And I didn’t have a good answer for her. Because I don’t see that either.
THIS HAS HAPPENED
Now, switching away from Soylent Green brieﬂy to -- anybody recognize this? I took it from the remade Battlestar
Galactica television series, which has this catchphrase, “this has happened before.” This is not the ﬁrst time an
upstart technology has tried to upend an entire established infrastructure, along with the people using it.
At the turn of the century, I was working in publishing. Speciﬁcally, electronic publishing. Even MORE speciﬁcally,
ebooks. And while some of the big journal publishers climbed onto the XML bandwagon, many other journals
didn’t, and the trade publishing industry just never did. I remember sitting in an ebook conference next to a high-level editor from a Big New York Publisher, and we were listening to a fairly basic, fairly standard introduction to
XML, and I heard her sigh “This is just not my world any more.” She felt alienated. She felt ALIEN. Is there anybody
in this room who hasn’t heard a colleague express that alienation?
Even worse, XML didn’t make publishers’ lives easier -- it made them harder! Editing, typesetting, indexing, all
these workﬂows got hugely more complicated for what looked at the time like super-dubious returns. And the
XML community took no notice whatever of their difficulties, the difficulties ACTUAL PEOPLE were having doing
ACTUAL WORK with XML. Why? Because the XML community was having way too much fun loudly proclaiming
XML’s superiority over everything ever, and going off into corners to have arcane technical arguments about XML
namespaces. Not very soylent! Not humane! Not made of people!
Now, publishers did still make some XML, I grant you. I saw a lot of it. Forgive my language, but trade publisher
XML was CRAP. It was garbage. You wouldn’t feed it to your pet Cylon, it was so bad. Which goes to show that
technology that doesn’t ﬁt into real people’s environments won’t be used properly, if it’s used at all.
How many of you knew this slide was coming? Go ahead, raise your hands. Yeah. If you know me, you know that I
am just so sad and angry about institutional repositories. In Europe, I know, it hasn’t been quite so bad, but in the
States, it’s been WRETCHED.
But it was the same thing again. There was this technology that was going to make EVERYTHING BETTER, only the
people making the technology forgot all about the people who were supposedly going to use it! So we got these
stupid unusable unﬁxable systems that did stupid things, and no big surprise, nobody willingly put anything in
them! Because they weren’t soylent! They weren’t made of people!
Incidentally, what happened to the people running institutional repositories? People like me? Well, we got blamed.
And I, for one, got OUT. I will NEVER work on an institutional repository again. This is a thing that happens when
systems don’t treat worker-people well. Worker-people abandon those systems, even people who truly believed in
them and had high hopes for them.
So when we lose catalogers, I think it’s a serious problem.
THIS WILL HAPPEN
So we have plenty of history of technologies not succeeding because they aren’t people-conscious enough. This
will happen again, to linked data, if we’re not careful. If the Semantic Web doesn’t remember that it’s soylent -- made of people. I don’t want that. You don’t want that. But that’s what’s going to happen if we can’t bring more
PEOPLE to linked data.
“RDF is built from XML.”
It’s the year 2013... RDF is still the same.
Why do people who should know better
still believe RDF is based on XML?
Just as an example, I was at ASIST a couple of weeks ago, the big annual conference for the Association for
Information Science and Technology. And I went to a session on linked data -- and I won’t be any more clear than
that, because I’m not here to embarrass any specific person -- and I saw this on a slide. *CLICK*. RDF is built from
XML.
This kind of thing makes me think that eating people alive might actually be an interesting lifestyle choice! Maybe
you too? *CLICK* Because my gosh, it’s twenty-thirteen, RDF never was built from XML, so why on earth do people
who really should know better still believe this strongly enough to put it on a presentation slide?!
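For readers who want to see the point concretely, here is a minimal sketch in plain Python, offered as an illustration rather than as anything shown in the talk. It treats one RDF statement as a bare subject-predicate-value triple and writes it out in two non-XML syntaxes, N-Triples and Turtle. The `example.org` URI and the book title are hypothetical; the Dublin Core terms namespace is the real one. Only plain-string values are handled, to keep the sketch short.

```python
from typing import NamedTuple

DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace


class Triple(NamedTuple):
    """One RDF statement whose object is a plain string literal."""
    subject: str    # URI of the thing described (hypothetical below)
    predicate: str  # URI of the property
    literal: str    # the string value


def _quote(text: str) -> str:
    # Escape backslashes and double quotes, then wrap in double quotes.
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_ntriples(t: Triple) -> str:
    # N-Triples: full URIs in angle brackets, one statement per line.
    return f"<{t.subject}> <{t.predicate}> {_quote(t.literal)} ."


def to_turtle(t: Triple, prefix: str, namespace: str) -> str:
    # Turtle: same statement, with the predicate shortened by a prefix.
    if not t.predicate.startswith(namespace):
        raise ValueError("predicate is not in the given namespace")
    local = t.predicate[len(namespace):]
    return (f"@prefix {prefix}: <{namespace}> .\n"
            f"<{t.subject}> {prefix}:{local} {_quote(t.literal)} .")


# Hypothetical record: an invented book URI with an invented title.
statement = Triple("http://example.org/book/1", DCTERMS + "title", "An Example Book")

print(to_ntriples(statement))
print(to_turtle(statement, "dcterms", DCTERMS))
```

The triple itself is the RDF; N-Triples, Turtle, and RDF/XML are just different ways of writing the same statement down, and neither syntax printed here involves any XML.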
So clearly education, even REALLY BASIC education, is a problem here. And it’s a PEOPLE problem, not a data
problem.
Rex Pe, “student teacher” CC-BY
And as an educator, it’s MY PROBLEM, right? I think of education as my major role in furthering the adoption of
linked data in libraries. Educating future librarians and archivists and other information professionals. Educating
CURRENT ones, which I also do.
I gotta tell you, though, that current linked data infrastructure is NOT making this easy for me.
HTML5 / CSS
Teaching time for minimal competence
Give me forty-ﬁve minutes, and I can drag a roomful of complete HTML novices through making an extremely
basic web page. I know this because I’ve done it! Give me another forty-ﬁve minutes, and I can drag those same
people through the basics of CSS. Again, I know this because I’ve DONE it. And yeah, they won’t be web designers
after that, but they can go and practice usefully on their own and get better, and there’s a TON of resources on
the web to help them.
XML is a bit harder to explain and work with. But. If my roomful of people is actually a roomful of librarians or
library-school students? I can drag them through being able to make a basic MODS record in two and a half hours
or so. I know this. I’ve done it.
*CLICK* Here’s the thing. I don’t know how much time it takes to drag a roomful of novices through minimal RDF
competence. I’m not even sure what minimal RDF competence LOOKS like! So essentially it might as well be
inﬁnite time. I’ve tried, I really have. I just don’t think I’ve succeeded. What are the problems I’m running into?
Dave Hosford, “Diving Board Catch” CC-BY
Part of my problem is that the training materials I have to work with force my librarian learners into stunts like
trying to catch a ball while jumping off a diving board. Really, a lot of the stuff that’s out there, even I bounce
right off of -- and I supposedly know RDF well enough to keynote a semantic-web conference!
Davide Palmisano, “Introduction to Linked Data.” Fair use asserted.
Here’s a linked-data introduction from Cambridge Semantics -- and in fairness to them, they didn’t make this for
librarians, but it’s still one of the best things out there. But look at it. Just the ﬁrst sentence *CLICK* and we’ve
already brought in H-T-T-P and T-C-P-I-P without deﬁning them, much less explaining why they’re important in
this context. My learners? My librarians and library-school students? They don’t know about the alphabet-soup
plumbing of the Internet! They might have heard H-T-T-P and T-C-P-I-P mentioned (quite likely by me, in
another class), but that doesn’t mean they KNOW. They’re just going to bounce right off this, or get distracted by
something that’s actually a pretty minor and useless detail.
It gets worse. What’s the metaphor this intro picked out, to explain linked data? *CLICK* The relational database!
Speaking of things a lot of my learners don’t know about!
So this extremely well-intentioned and well-written tutorial is useless to me. It won’t help the people I have to
teach, so it’s NOT SOYLENT.
Sarah Deer, “duh” CC-BY
The answer to this dilemma is not to call my learners stupid. I warn you, I am not even going to LISTEN to that, so
don’t anybody try it.
I’m also not going to listen to any suggestion that librarians can’t learn about linked data until they learn T-C-P-I-P and H-T-T-P and relational databases and XML and at least three programming languages. That’s ridiculous.
I’ve been teaching tech to future librarians since oh-seven, and trust me, with most things you can meet them
where they are -- which can, yes, be a REALLY low skill level -- and still teach them a lot.
B.S. Wise, “humanity. love. respect” CC-BY
How does that work? The answer -- the SOYLENT answer, the answer that acknowledges my learners’ HUMANITY
and their LOVE for what they do -- the answer is respect. Primarily, respect for librarians’ existing knowledge
base. And this is the principle I try to build my lessons on -- draw from what my learners already know.
I start with
because they get it.
So I try to teach linked data based on my learners’ interest in it. No surprise, for most of them, their interest has a
lot to do with linked data replacing MARC. *CLICK* The rest of them are digital librarians and archivists, or
aspiring digital librarians at any rate, and for them I keep library metadata practices in mind.
*CLICK* So, for the sake of time, let’s just stick to MARC. What happens when I try to translate MARC skills and
practices into a linked-data context?
xlibber, “Bad Parking” CC-BY
What happens is the same thing that happened with publishers and XML -- I crash my little linked-data car RIGHT
INTO all the work that libraries now do, all the work that forms the FOUNDATIONS of library data, that is just
IMPOSSIBLE to even DEMONSTRATE with linked data.
I won’t tell you all my tales of woe -- I have a lot of them! -- but here’s one. I teach this continuing-education
course that introduces XML and linked data to working librarians. This fall I wanted to add a couple of weeks on
Open Reﬁne to it. Because I thought that data cleanup was important to teach, for starters. And I thought that
reconciling some random spreadsheet metadata with existing linked data stores would be a cool demo, with
pretty obvious relevance to real-world librarian work.
So naturally I thought about name authority control. Right?! Because it’s just so basic to what librarians do.
Because it’s something the rest of the linked-data world is totally learning to do from libraries! Because even in
the States -- where we’re kind of behind Europe in linked-data experimentation -- even in the States we have
these great name authority linked-datastores, VIAF and the Library of Congress, so I thought this would be EASY.
I learned very quickly, of course, that I can’t use VIAF from Open Reﬁne, because there’s no SPARQL endpoint for
it. And I’m on the record here, so I’ll just say -- YOU tell ME why not.
So, okay, that doesn’t work, what about the Library of Congress? Naturally I went right to the source, Ed Summers,
because who wouldn’t?
Oops. Can’t do authority-control reconciliation THAT way either. And this is where I confess the limits of my own
knowledge: I don’t KNOW how to build a web-available triplestore with a SPARQL endpoint off somebody else’s
data! And this lesson I was working on was two weeks from going live -- I didn’t have time to ﬁgure it out!
So I asked if anybody else had maybe done authority control with Open Reﬁne and could show me how. I just
needed a simple demo!
I heard nothing.
So let me just say, trying to put together a useful lesson about how to do ACTUAL LIBRARY WORK with linked data?
Was NOT a super-humane experience. I felt annoyed. I felt stupid. I felt frustrated. I felt like hey, if the Semantic
Web is so soylent, HOW ABOUT I JUST EAT UP ALL YOU LINKED DATA NERDS?
And I am a vegetarian!
Authority control is basic, basic stuff, folks. Many librarians consider it a touchstone of library practice, something
CENTRAL to our professional identities. (So to speak.) If I can’t do authority control with linked data, do not even
TALK to me about how linked data is more ﬂexible, linked data is wonderful, linked data is superior -- linked data
is USELESS. It is useless for librarians in practical terms. That’s not a problem with librarians. That’s a problem
with linked data.
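As a rough illustration of what "reconciling names against an authority file" involves at its simplest, here is a small Python sketch. It is not the Open Refine reconciliation mechanism and it makes no calls to VIAF or the Library of Congress; the authority headings and their `example.org` URIs are invented for the example, and the matching rule (strip accents and punctuation, undo "Last, First" inversion) is an assumption chosen for brevity.

```python
import re
import unicodedata

# Hypothetical local authority file: authorized heading -> URI.
# These are invented entries, not real VIAF or LoC records.
AUTHORITY = {
    "Example, Ada": "http://example.org/authority/n0001",
    "Sample, Grace M.": "http://example.org/authority/n0002",
}


def normalize(name: str) -> str:
    """Reduce a name to a comparable key."""
    # Strip accents by decomposing and dropping combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Turn inverted "Last, First" order into "First Last".
    if "," in plain:
        last, first = plain.split(",", 1)
        plain = f"{first} {last}"
    # Lowercase and keep only letter/digit runs, dropping punctuation.
    return " ".join(re.findall(r"[a-z0-9]+", plain.lower()))


# Index the authority file by normalized key for exact-key lookup.
INDEX = {normalize(heading): (heading, uri) for heading, uri in AUTHORITY.items()}


def reconcile(raw_name: str):
    """Return (authorized heading, URI) for a spreadsheet name, or None."""
    return INDEX.get(normalize(raw_name))


# Names as they might appear in a messy spreadsheet column.
for raw in ["ada example", "Grace M. Sample", "SAMPLE, Grace M", "Unknown Person"]:
    print(raw, "->", reconcile(raw))
```

A real workflow would need fuzzy matching, dates, and human review of ambiguous hits; the sketch only shows the shape of the task, which is turning a messy string into a stable identifier.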
The end of the story, just to add insult to injury, is that THIS happened. Though I was able to ﬁx it, after some
searching and ﬁddling. And that leads me to another thing I want to talk about, which is the state of tools
available for just messing around with linked data.
These are the instructions for installing the RDF extension for Open Refine -- which, by the way, I think is
great and I want more things like it. These are the LONG instructions, mind you -- there’s a shorter set on the
*CLICK* There’s a major error in these; you can’t actually get to the workspace directory from the Open Reﬁne
start page, because the start page starts on the Create tab, not the Open tab. I ﬂatter myself I’m pretty tech-savvy,
but I had to click around and swear a bit before I ﬁgured out what these instructions were getting at.
So I wrote my own installation instructions, that seemed to work pretty well. You’re welcome. PLEASE don’t make
me do this again. Wrong installation instructions are just NOT SOYLENT. And this installation method? Is ridiculous
on its face. Not soylent at all.
If there are better tools -- tools that help me... help my learners... get ACTUAL LIBRARY WORK DONE with linked
data, I do not know what they are. I’m not sure they even EXIST. And that’s a gigantic problem for me as an
educator, and ultimately it’s a gigantic problem for you and for linked data. If I fail at my job, you know what
It’s what happened with XML and publishing, where XML did NOT HELP get publishing work done.
It’s what happened with institutional repositories, which basically didn’t help ANYBODY get any work done.
Rob Boudon, “Jamie Lyon - YAY WOW” CC-BY
Soylent technologies, technologies that are so respectful of people that people jump for joy about using them,
HELP THOSE PEOPLE GET STUFF DONE. It’s as simple as that. And this needs to be true for people who are NOT
linked data nerds and NOT programmers.
Look, fundamentally, this is the same reason programmers hate MARC! MARC gets in the way of programmers
getting useful work done, right? But if linked data puts every other librarian on earth in the position that library
programmers are currently in? That’s not going to help linked-data adoption in libraries.
Colby Stopa, “Path” CC-BY
So to sum up here... because I can’t educate people well, and because the tools are so bad, and because
practically nobody can actually get library work done with linked data, linked data is stuck *CLICK* in what I’ve
seen called NEGATIVE PATH DEPENDENCE. What’s negative path dependence? I quote from a recent report on data
sharing: “Because of high switching costs, inferior technologies can become so dominant that even superior
technologies cannot surpass them in the marketplace.” Sounds like XML in publishing, right, compared to PDF?
Sounds like institutional repositories against journals, right?
I’m afraid it sounds like linked data against MARC, too. Meaning no disrespect at all to the great Henriette Avram,
MARC is the inferior technology here! I really believe that! But linked data, despite its superiority, can’t get library
work done at this point without ridiculous costs, so it can’t replace MARC.
But. It doesn’t have to be this way. This I also believe.
So I’ll close with four challenges for the Soylent Semantic Web, the Semantic Web that is made of librarians and
other people. I hope -- and I believe! -- that presenters at this conference will answer these challenges, and I look
forward to seeing that... and I also hope that all of you take these challenges home and work on them.
for linked data
Here is my linked-data heresy. Feel free to turn me into hamburger for it later: I don’t CARE about your ontology. I
don’t care about ANYBODY’s ontology, or data model, or graph, or whatever. I. Do not. CARE. Why should I? We’ve
done library work without ontologies and picture-perfect data models for hundreds of years. Somehow or other.
Can we just get off ontologies already?
What I care about? I care about the WORK I can do with linked data, and the work librarians can do with linked
data, and the work my learners can do with linked data. I care about the tools that help them do that work. I care
about the work skills I can realistically teach my learners that someone will pay them for -- and before you say
anything, “knowing an ontology” is NOT something employers are gonna pay for!
So I don’t need ontologies. I need well-documented linked-data tools that I can use and teach. I need linked-data
workﬂows, based on real-world problems and real-world solutions, that I can demonstrate and imitate. I need
linked-data systems that do REAL LIBRARY WORK, right out of the box. And very little of this exists today, because
too much of the linked-data community is off in corners having arcane discussions about OWL same-as and H-T-T-P range fourteen. Just like XML namespaces back in the day! And I’m saying, STOP THAT. Before you write ONE
MORE LINE of OWL or R-D-F Schema, write code that lets real live people do real-world work with linked data.
WHAT YOU CAN DO
WHAT I CAN DO
with linked data
When I was running institutional repositories, I went to conferences about them, as ya do. And at those
conferences I saw a LOT of demos of new and innovative software hacks. And a lot of those demos were
absolutely amazing -- but they were completely irrelevant to me, because they were impossible to implement in
my environment. So I challenge everyone here, because you are all experts already, to stop thinking about what
YOU can do with linked data *CLICK* and instead think about what *I* can do with linked data.
And what my learners can do. And what catalogers and metadata librarians and digital-library managers and
institutional-repository managers and reference librarians can do! Because if YOU are the only one who can do
what you do with linked data, librarianship writ large will NEVER be able to do it. And if you think this is a stealth
demand for better tool usability, you’re absolutely right, it is! But that’s not all it is.
This means that you need to learn about what I do, and what I CAN do. And what catalogers and metadata
librarians and all the rest of us do, right? Maybe actually watching us do it? Maybe doing some of it yourselves?
Yeah. So I challenge you to be curious about my work environment, as an educator. And catalogers’ work
environments. And digital-library work environments. Find out about those, ﬁrsthand, and use what you learn to
build linked-data systems that all librarians and libraries beneﬁt from.
with linked data
My third challenge, and I’m quite hopeful about this one, actually -- make me say WOW! about something you did
with linked data. *CLICK* And why stop at me? I challenge you to wow all of librarianship with linked data!
Some of you may remember the rollout of the Endeca-based library catalog at North Carolina State University in
the mid-two-thousands. For those of you who don’t recall, it was this ONE CATALOG that started the whole
discovery-layer movement. And what I remember most about that was that the new catalog got basically zero
pushback from librarianship generally. Even though it was a HUGE change where you’d normally expect a lot of
negative path dependence to kick in.
Instead, everybody said WOW. Wow, I want that! Wow, look, facets for narrowing searches! Wow, check it out, you
can actually start a query by drilling down through subject headings! Wow, de-duplicated records! Wow, relevance
ranking! It was just a giant leap forward from what we had. Forget negative path dependence, people wanted this
I challenge you to make something for libraries with linked data that has as much wow as that original Endeca
catalog did. So much wow that nobody even argues about linked data because everybody wants what it can do.
with linked data
Okay, I’m just gonna say this: If we want MARC dead? And we do! We’re gonna have to kill it ourselves and eat the
evidence. But I have a different idea about how to do this than I think most librarians in the linked-data space do. I
see linked-data effort focusing on big national libraries, big academic libraries, big consortia, nothing but big-big-big.
I’m not sure that’s the right strategy all by itself, to be honest. And I’m sorry for using the word “disrupt” because
I know it’s a giant cliché now, but I’m serious about it. Let me explain what I mean.
w.marsh, “old shelby park library” CC-BY
Last summer I taught another continuing-education course for public librarians, about acquiring books from
independent publishers and people who self-publish. And one of my learners, who is a public librarian in a small-town public library like the one I’m showing here -- she said a very sad thing. There was NO WAY her library
would be able to buy indie or self-published books, not print and not electronic. Just no way. Why not, I asked?
Because there are only two employees at that library, she said, so they can’t do ANY original cataloging!
That librarian and her little tiny two-person library? They’re what disruption theory calls an “underserved market.”
MARC is no good for her -- it’s too complicated and too expensive. If you can make a simple linked-data system
that’s cheaper and easier and more convenient for her, and lets her put in all the books she wants, including indie
books, and lets her patrons ﬁnd all the books they want, SHE WILL USE IT. And so will a LOT of little tiny libraries
that just can’t do MARC. And if linked data is so easy and so great that little tiny libraries with two employees use
it, what’s everybody else’s excuse, right? If linked data starts small, it can take over the world from MARC! I really
Library linked data
FOR GREAT JUSTICE
So if you say linked data is so much better than MARC, I’m saying prove it, for great justice!
Okay, okay, last nerd joke, I promise. But the serious point behind the joke is that there really is a social justice
issue here! Linked data shouldn’t be something that only helps big libraries and their librarians. Let’s build small
ﬁrst, and build up from there, and then we can help ALL libraries, all librarians, and ALL library patrons.
I think a linked-data catalog... that small libraries and their librarians can actually USE, and is demonstrably better
than what they have... can be built. Right now, today, it can be built. I challenge you to build it, for great justice -- including justice within librarianship for linked data.
This presentation is available under a
Creative Commons 3.0 Attribution
United States license.
Please respect CC licenses on photos if you reuse.
So once again, thanks for having me, and I look forward to the rest of the conference!