Wikipedia and Special Collections:
A Special Relationship
Bob Kosovsky, Curator of Rare Books and Manuscripts
Music Division - New York Public Library

Presented on June 25, 2013 at the preconference of the Rare Books and Manuscripts Section (RBMS) of the Association of College and Research Libraries (ACRL), a division of the American Library Association.

  • Questions to the audience: Who has used WP – ever? [keep hands up] Who has used WP in the last year? Who has edited WP? Who has created an article on WP? So you’re somewhat familiar with the website: How many articles in Wikipedia? How many language Wikipedias? How many articles in all language Wikipedias?
  • The English-language Wikipedia has over 4.2 million articles. There are Wikipedias in 286 different languages--all different from one another--which together have a total of over 30 million articles. According to Alexa (a leading measure of web traffic), Wikipedia ranks sixth among all websites worldwide and has an estimated 365 million readers. Wikipedia has become the largest and most popular general reference work on the Internet. And regarding the people who edit Wikipedia…
  • ...as of November 2011, 31.7 million people had registered for accounts. 270,000 of those were active in a given month. For English Wikipedia, there are 33,511 active Wikipedians who make 3.3 million edits per month. One of the results is the creation of 910 new articles PER DAY.
  • Wikipedia results often come near the top of the results in a Google search. Since Wikipedia’s text is freely available for re-use, numerous websites use it as the basis for some of their own entries. Google’s Knowledge Graph makes generous use of Wikipedia.
  • The New York Times is re-envisioning the part of their website currently called “Times Topics.” Still under development, the new Times Topics will be not just a list of recent articles but a portal balancing and blending information and media from the Times with Wikipedia and other freely available information. Google and the NY Times are only a few of the many sites whose content is (in part) derived from Wikipedia. Freebase, Answers.com, Reference.com, and others have content based on Wikipedia. My sister is a veterinary surgeon. One of her hobbies is birding. She introduced me to...
  • ...Bird-o-pedia, which is also based on Wikipedia. So it should be clear that Wikipedia is not just a website, but one whose content permeates many websites, raising its influence AND importance throughout the Internet. Wikipedia also has its critics and controversies. As an example, let’s take the recent incident involving the famous novelist, Philip Roth.
  • In September 2012, The New Yorker published Roth’s “An Open Letter to Wikipedia.” According to Roth, the Wikipedia article on his novel “The Human Stain” said that the narrative was based on the life of literary critic Anatole Broyard. Roth said this was wrong, that in fact, the story was inspired by an incident in the life of his friend, Melvin Tumin, a Princeton professor. Treating it like a traditional publication, Roth demanded that Wikipedia remove what he claimed was an incorrect statement. His literary agent then registered on Wikipedia to remove the statement, only to see his edit “reverted” - the word used on Wikipedia for undoing an edit by returning an article to a previous version. Claiming frustration and not knowing where to turn, Roth wrote this article. Roth ran into the problem of not understanding what Wikipedia is. People assume Wikipedia is simply an online version of the kind of encyclopedias that have existed for centuries. Although the two are superficially similar, there is a difference. Quoting from the article on Wikipedia in Wikipedia:
  • “Wikipedia intends to convey only knowledge that is already established and recognized. It must not present new information or original research. A claim that is likely to be challenged, requires a reference to a reliable source. [QUOTE BEGINS HERE] Among Wikipedia editors, this is often phrased as "verifiability, not truth" to express the idea that the readers, not the encyclopedia, are ultimately responsible for checking the truthfulness of the articles and making their own interpretations.” These ideas are codified in Wikipedia’s Core Content Policies:
  • Neutral point of view; Verifiability; No original research. “… all material in Wikipedia must be attributable to a reliable, published source. Articles may not contain any new analysis or synthesis of published material that serves to advance a position not clearly advanced by the sources.” In reviewing Roth’s novel, at least SIX different critics associated the leading character with Anatole Broyard. Once Roth’s open letter was published, the Wikipedia article on the novel was amended; it still mentions the six critics who equated his story with Anatole Broyard (what Roth wanted removed), but now also incorporates Roth’s published claim that the source of inspiration was Melvin Tumin. It took me a long time before I finally visited the website in 2006. I probably made a few edits as an unregistered user. Many of my edits were reverted--presumably because I wasn’t adhering to the guidelines. I finally registered in July 2006, but my real relationship with Wikipedia began when I established my talk page on August 30 of that year.
  • Having a user and talk page allows one to receive feedback from others. Shortly after I created this page I received a message informing me of Wikipedia’s 5 pillars concerning how to edit pages. How did I approach editing Wikipedia? I did it with small edits - adding a sentence or two here and there - nothing very extensive. Once I gained enough confidence in editing, I embarked upon a project.
  • Realizing the opportunity to give archival holdings greater exposure, I added external links in Wikipedia to our archival collections. I know a number of institutions have done similarly. Then in the fall of 2008, I was put on a digital project.
  • The Music Division of the New York Public Library has a collection of about half-a-million pieces of sheet music, much of it uncataloged and uninventoried. We were digitizing titles published between 1890 and 1901 -- that’s 4,700 pieces of music. I was one of two people charged with creating metadata. The music was arranged chronologically by year of copyright, and then alphabetically by title. As I was going through the songs from 1891, I noticed certain shows kept on showing up. One of them was the musical “Wang” composed by Woolson Morse, a forgotten composer. I found the information I needed in reference books, but then I thought: There must be other people who want to know about this. So I decided to create a page about Wang. It was one of the first articles I created.
  • I included the songs from the show, so that when I came across them, I could easily copy the performance information into metadata. I haven’t gone back to enhance the article although several others have added to it. What of the composer, Woolson Morse? I consider myself knowledgeable about Broadway musicals, but I had never heard of him. As I came across more songs by him from other shows, I decided to create an article on him.
  • Since he was a fairly obscure composer I had to dig to find sources, such as obituaries and reviews of his shows. An example of how far I went is at the bottom right of the page. There you can see a photograph of his unmarked grave at the famous Green-Wood Cemetery in Brooklyn, NY (it’s the same cemetery where his famous relative, Samuel Morse, inventor of Morse Code, is also buried). Genealogy is one of my hobbies, so I was eager to contact the Green-Wood Cemetery office. They provided me with the location of his grave, enabling me to take this picture. Since it’s an unmarked grave, Wikipedia editors asked me for justification for including it. So in the metadata for the picture, I included the section number, lot number and grave number which the cemetery staff had provided me, verifying that this is Woolson Morse’s grave. An ancillary to the musical “Wang” and Woolson Morse was the singer Della Fox.
  • She was occasionally mentioned on sheet music covers I encountered in the digitization project, and she was the leading lady in two of Morse’s musicals, “Wang” and “Panjandrum.” So I created an article on her. You can tell that this article looks more elaborate. That’s because Della Fox also appeared in some of the earliest American productions of Gilbert and Sullivan operettas. Since Wikipedia has a very active and extremely passionate group of Gilbert & Sullivan devotees, they helped flesh out the article with numerous sources and good editing. The result is that this article contains much more information on Della Fox and her context than any other source. I started with just a piece of sheet music; why did I bother to create articles on the musical, the composer and the leading actor?
  • By creating these articles, I was going beyond just providing a point-to-point connection between article and link (as I had been doing with finding aids). I was providing the broader intellectual context for the works in our collection. As a hypertext resource, the intellectual value of any individual article increases as more links are created to other articles and topics, creating a network of “shared value.” So often I encounter readers who choose to see things as discrete individual items or topics. How much more powerful is their research when they recognize the myriad associations and connections between various kinds of information. Some of you might remember this slide from last year’s talk by Jon Voss on Linked Data that shows the many connections among websites. Of course, DBpedia, which is the data format of Wikipedia, is at the center, the source of much of the world of linked data. Every Wikipedia article involves a collaboration. When I made mistakes, people corrected what I had done. When I didn’t understand an emendation, they explained it and when necessary, referred me to various help files. When I had questions or suggestions, they responded to them. Where previously I was unable to add a link for our archival holdings because there was no article, I now had the confidence to create articles on other items in our collection. Then I decided to be bold.
  • I began to think of creating articles for some of our more notable manuscripts. This is a manuscript known as Drexel 4257, also known as “John Gamble, his book” (after the first owner and composer of some of the songs in it). Dating from about 1659, it’s one of the prized possessions of the Music Division and of the New York Public Library, for it is the largest source of English song from the first half of the 17th century.
  • For many years the Music Division retained permissions for reproductions of its manuscripts. Over time, this file had grown to include letters and articles concerning particular manuscripts. This gave me a head start in locating articles relevant to Drexel 4257. In addition, a dissertation had been written on this one manuscript. I collated all these sources and created...
  • ...the article on Drexel 4257 for Wikipedia. Because I had a lot of information at my disposal, this article is rather detailed. (READ OUT THE SECTIONS FROM THE TABLE OF CONTENTS)
  • You’ll note this part of the article includes a photo of the binder’s label from 1944. When I unveiled the article, people asked me about the binding. I said that none of the authors who had written about it mentioned the binding, so I couldn’t write anything about it. (Remember: Wikipedia generally needs published and verifiable information.) So one of the editors gave me an idea: If I could take a picture of the binder’s label, upload it to Wikimedia Commons, it would be essentially published, available for all to see, consult, and use. I could then use the information from the published photograph in the body of the article. So that’s what I did, incorporating information about the binding into the article based on this photograph.
  • [reading out sections] Dating, Provenance, Organization,...
  • Handwriting, Politics,...
  • [continuing to read out sections] Topical or literary content, Musical content and style...
  • Of course it’s necessary to have a contents list for a music manuscript. One of the most useful and practical things about a list of songs in a Wikipedia article is that you can add information that would not fit comfortably in a contents note in a bibliographic record.
  • When people tell me that Wikipedia has all these untrue statements, I tell them that, ideally, nearly every statement in Wikipedia should be sourced. Seeing the sources for an article is one of the ways you can determine its quality. Here’s my list of footnotes followed by a bibliography, showing my strong belief in sourced statements. Because the resolution is so small here, I don’t think you can see that Wikipedia has a special footnote script so that all citations to the same page are combined into a single note; so even though you see 65 numbered footnotes, the actual number of citations is 110.
  • Earlier I touched upon the nature of a web encyclopedia, and that of its linkage. A small article on John Gamble already existed in Wikipedia. But since this manuscript once belonged to him and contains 28 of his compositions, I thought it appropriate to expand the article by including a paragraph about the manuscript and an accompanying picture.
  • To enhance this relatedness, I created an article on the collection of which Drexel 4257 is a part - the Drexel Collection, named for Joseph Drexel, a member of the Philadelphia banking family. Joseph Drexel’s collection, a founding collection of the New York Public Library, contains over 6,000 books and scores (many of them rare and unique), and a number of important music manuscripts. The point of creating and connecting these related articles is to foster the linkage between them which enhances their shared intellectual content.
  • In 1987, Garland Publishing came out with a facsimile of the manuscript. Because of that publication, a corporate name authority record was established. Because it is in the Library of Congress’s authority file...
  • ...it is also part of VIAF, the Virtual International Authority File. Merrilee will have more to say about VIAF.
  • Thanks to OCLC’s Wikipedian-in-Residence, there is a template to link entries in VIAF with Wikipedia articles. Here at the bottom of that article on Drexel 4257, is the link to VIAF. So how is anyone going to know that this article was created and exists? Let me show you one method.
  • This is the top portion of the main Wikipedia page on a typical day. Notice the 4 sections, going clockwise from upper left: From today’s featured article, In the news, On this day.... AND at the bottom left: Did you know. You saw at the outset that approximately 910 articles are created each day. “Did you know” is a way to promote these new articles. Editors voluntarily create one-sentence hooks in the form of questions and submit them for consideration. I created the Drexel 4257 article on August 9, 2012 and simultaneously submitted my hook.
  • On August 16 the hook for Drexel 4257 appeared on the front page of Wikipedia. “Did You Know” hooks are also mirrored on Twitter. Thanks to “Did You Know” ...
  • The article received nearly 2,000 page visits over the course of 3 days. Think of it: 2,000 page visits for a three-and-a-half-century-old manuscript: Would that be possible in a library catalog, or even a finding aid?
  • These are the traffic statistics for April of this year [2013]. Admittedly the numbers are far lower than 2,000; no single day reaches 30 visits. Yet it shows that someone or something views the article nearly every day. The average seems to be between 4 and 10 hits; I don’t know what led to the various spikes of between 21 and 26 page views on some days, but it’s interesting to ponder. The manuscript was microfilmed in 1947 and many copies of the film have been sold to various libraries, 12 of which show up in WorldCat, so that, too, could be a source of traffic. There’s a video on YouTube of one of the songs that mentions Drexel 4257 in its title, so that also could be a source of traffic.
  • This is one example of a case study, showing that the presence of links in Wikipedia articles does increase library use of particular items. Has it increased usage of Drexel 4257? The fragility of the manuscript is such that we keep it locked up for use only by qualified and experienced researchers. These researchers need to convince us of the need to see the physical item. Since creating this article last year, only one person has asked to see the manuscript, and upon further questioning, that person just wanted to investigate one song for which the facsimile was sufficient. You may wonder: “Why create a Wikipedia article for a locked-up manuscript if there is no corresponding increase in demand for the item?” But I feel the question is misguided. WIKIPEDIA IS NOT A MARKETING TOOL. In fact, there are strict rules against promotion and warnings where there might be a conflict of interest.
  • Rather, this is an opportunity to let information about your unique holdings escape the confines of your ILS, so it can be embedded within Wikipedia, which, as we’ve seen, pollinates much of the web and lays the foundation for the nascent semantic web. Would it not be a fabulous thing – if our unique items, our collections, our libraries, our profession, could be found where information is easily accessible and disseminated? So why aren’t more special collections librarians contributing to Wikipedia?
  • This is the entire Wikipedia article on special collections. It’s pitifully brief, isn’t it? There’s only one citation.
  • This is a sentence from that Special collections article. An editor was apparently puzzled by the idiom “hidden collections” and wanted amplification. Couldn’t we write an entire article on hidden collections? Might the idea of hidden collections receive more interest and support if everyone could go to a Wikipedia article that explains the existence and issues surrounding hidden collections?
  • There is no Wikipedia article on G. Thomas Tanselle. The man is a major figure in the world of scholarship, and there is no article on him. Who is going to supply all the information for these and other articles that have yet to be created?
  • It’s up to all of us to ensure that Wikipedia covers our profession adequately and when possible, thoroughly. Getting involved in Wikipedia starts with registering your username.
  • In my case, I took things very gradually, and watched for feedback which I took seriously and in good faith. A new visual editor is being rolled out next month and that should improve the site’s accessibility.
  • As with meeting new people and learning to exchange trust, so too does becoming familiar with the site’s social network take skill. You never make Wikipedia articles in isolation, but in collaboration with others. The Wikipedia Teahouse was set up to help new editors forge relationships and learn from each other.
  • Remember: Wikipedia is made by people like you. Every bit of content you add to Wikipedia becomes a valued contribution to the universe of knowledge. In a local sense, the benefits of engaging with Wikipedia will be greater exposure for your library, your collections, your manuscripts, your unique knowledge, AND for special collections librarianship in general.
  • From a broader perspective, we have an opportunity to not only improve the documentation of our profession, but to disseminate it throughout the Internet and show its connections and relationships to numerous fields. [SLOWER] I can think of no better conclusion than the words of David Ferriero, Archivist of the United States, who has said on several occasions:
  • “If Wikipedia is good enough for the Archivist of the United States, maybe it should be good enough for you.”
  • Thank you – now, go register!

    1. Wikipedia and Special Collections: A Special Relationship. Bob Kosovsky, Curator of Rare Books and Manuscripts, Music Division - New York Public Library. Preconference of the Rare Books and Manuscripts Section of ACRL / ALA, June 25, 2013
    2. • Over 4.2 million articles in English Wikipedia • 286 Wikipedias by language • Over 30 million articles in all Wikipedias • 6th most-used website on Earth [1] • 365 million readers worldwide • 8th most-used website in the United States • “Most popular reference work on the Internet” [2] Notes: [1] Alexa.com, accessed 10 June 2013; [2] “Wikipedia” in Wikipedia, footnotes 5-9, accessed 10 June 2013
    3. Editors of Wikipedia • As of November 2011, 31.7 million registered users; 270,000 active any given month. For English Wikipedia (figures as of April 2013): • 33,511 active Wikipedians • 3.3 million edits per month • 910 new articles per day. Source: http://stats.wikimedia.org/EN/TablesWikipediaEN.htm#wikipedians
    4. Google’s Knowledge Graph uses data from Wikipedia
    5. Beta620.nytimes.com – the NY Times’ website for experimentation
    6. Even Birdopedia is derived in part from Wikipedia
    7. “Verifiability, not truth” Among Wikipedia editors, this is often phrased as "verifiability, not truth" to express the idea that the readers, not the encyclopedia, are ultimately responsible for checking the truthfulness of the articles and making their own interpretations.
    8. Wikipedia:Core Content Policies • Neutral point of view • Verifiability • No original research • “…all material in Wikipedia must be attributable to a reliable, published source. Articles may not contain any new analysis or synthesis of published material that serves to advance a position not clearly advanced by the sources.” http://en.wikipedia.org/wiki/Wikipedia:Core_content_policies
    9. My user talk page as of August 30, 2006
    10. Digital project: creating metadata for thousands of songs. I wanted to include performance information for the 1891 musical “Wang” by composer Woolson Morse.
    11. Della Fox appeared in several musicals of the 1890s, including two composed by Woolson Morse
    12. The universe of Linked Data: DBpedia (Wikipedia in data format) at the center
    13. Drexel 4257 – “John Gamble, his booke”
    14. Music Division’s Permission File →
    15. Linked article
    16. Linked article
    17. Drexel 4257 had been established as a corporate authority heading
    18. ...therefore was present in VIAF...
    19. …and now in Wikipedia: VIAF link in the article on Drexel 4257
    20. If it’s
    21. “Did You Know...?” for August 16, 2012
    22. “Catalogs are not the methods by which the community learns about things” -- Katherine Reagan, Division of Rare and Manuscript Collections, Cornell University
    23. The Wikipedia article on Special collections – only ONE citation!
    24. From the article on Special Collections: There is currently no article on hidden collections. How can we highlight the issues with hidden collections if it’s not possible to find a definition and explanation of the phrase?
    25. There is no Wikipedia article on G. Thomas Tanselle
    26. Behind the articles, Wikipedia is also a social network dedicated to a mission – don’t be shy asking for help
    27. The universe of linked data: DBpedia (Wikipedia in data format) at the center
    28. some “If Wikipedia is good enough for the Archivist of the United States, maybe it should be good enough for you” – David Ferriero, Archivist of the United States
    29. Thank you!
