Shared Data & Big Data for Libraries

Brief overview of open data, big data, and sharing data; discussion followed (based on Alistair Croll's presentation at ALA). Robin Fay @georgiawebgurl; Peter Murray (Lyrasis)

Published in: Technology, Education

  1. Shared Data: What it Means for the Future of Libraries
     Robin Fay @georgiawebgurl, Head, DBM/Cataloging / UGA Libraries
     Peter Murray, Lyrasis
     Draft Content for Discussion group 05.01.2013 / robinfay
  2. Agenda
     • Overview of big data
       ▫ What is big data? What is shared data?
       ▫ Implications and challenges
       ▫ Background: Alistair Croll talk at ALA Midwinter http://www.youtube.com/watch?v=Ic_BlPesEls
     • Discussion framed around Alistair’s presentation topics
     Draft Content for Discussion group 04.30.2013
  3. How did our data get big?
     • Technology that has unforeseen consequences
     • Technology changes.
     • We leave digital trails wherever we go.
     • Think: internet browsing history, email, medical records, bank transactions, buying history at shopping sites, Amazon reviews, Facebook photos, comments on websites, and much more.
  4. How did our data get big?
     • “Collectively, the data that we leave behind is Big Data.”
     • And of course, there is the data that others (people and machines) create about us.
     • Big Data is about us and has far-reaching consequences.
  5. What is Big Data?
     • It is not a technology; it is a shift in how we view and use information
     • Taking large amounts of information spread across many different resources in different formats and making them explorable
     • It doesn’t have to be “that big, just bigger than what you can go through by hand”
  6. 3 attributes of Big Data
     • Large
     • Fast (manual time needed)
     • and unstructured (formats differ)
     = 3 Vs of Big Data (volume, velocity, variety)
  7. Big Data
     • Relational (relationships) database – our ILSs are often relational databases
     • Mathematical database – computations
     • Big Data is the intersection of the two
     • Health – analyzing health records to identify allergies, sickness, etc.
     • Philanthropy (DataKind) – analyzing the behavior of farmers and knowledge workers to evaluate the impact (ROI) of philanthropic work
     • Think about the potential for library use: we have patron data, bibliographic data, and more!
  8. Concerns of Big Data
     • Privacy – erodes privacy, potentially leaking private information
     • Can justify stereotypes (data can be misused or used in a negative way) and polarize social groups
     • Facebook Open Graph search – pulling together information from diverse sources to get lists of seemingly innocent things, such as movie-watching habits or music, can be used in negative ways to reinforce stereotypes or draw conclusions about people
     • “Personalization can look like prejudice”
     • We live in grey areas
     • Computers do not understand that
  9. Which side of the fence?
     • Big Data is going to change our lives!
     • Are you:
     • A semantic idealist? If we can taxonomize and organize it, we can make sense of it
       ▫ Wolfram Alpha – we can ask it and it will reason (mathematical)
     • A chaotic nihilist? Algorithms will handle it – correct data will bubble up given enough information
       ▫ Watson – doesn’t know answers but will analyze to interpret an answer
  10. So, how would you file a cup of coffee?
      • Depends upon how you will use the information!
      • Understandings do not take advantage of digital information, which slows semantic idealism – much information is not organized, so we have to rely on algorithms (for now), but it is vulnerable.
      • Tagging is often done by machines – even in libraries we batch load, harvest, and update data globally.
  11. Humans and technology
      • Our reasoning can be flawed – we make decisions evolutionarily – we look at simple correlations and patterns (false positives)
      • If comments after a post are highly negative, responders are more likely to take polarizing viewpoints
      • Even when the math is good, data can be wrong
  12. Shared data
      • We are a mosaic of data from other resources
      • Unified digital history – a record of all of our data; it could aggregate health information and share it with doctors (just one example)
      • Veracity (can verify) and Value (how we can make sense of our data)
      • Shared data: connecting networks will collect data; algorithms will tag and assign metadata, but it will be up to humans to add value – this can then be shared in ways that are useful
  13. Linked data makes it possible
      • Linked data keeps us from having to re-enter or copy information
      It makes data:
      • reusable
      • easy to correct (correct one record instead of multiples)
      • efficient
      • and potentially useful to others
  14. Linked data makes it possible
      • It can build relationships in different ways – allowing us to create temporary collections (a user could organize their search results in a way that makes sense to them) or more permanent ones (collocating ALL works by a particular author more easily; pulling together photographs more easily)
      • It can help make sense of Big Data and facilitate sharing data.
  15. Linked data makes it possible
      • Linked data keeps us from having to re-enter or copy information
      It makes data:
      • reusable
      • easy to correct (correct one record instead of multiples)
      • efficient
      • and potentially useful to others
  16. Thinking of data in the library environment
      • Automation and new technologies
      • The web has changed
      • Large scale bibliographic databases
      • User expectations and needs
      • Patron data
      • Cooperative cataloging
      • Greater variety of media in library collections (electronic!)
      • FRBR is our data model – semantic web friendly!
  17. Discussion points
      1. Obviously, WorldCat is a shared data resource we have all been using for years. What are some other examples of big data, shared data, or linked data that libraries use now?
      2. What are some examples of data that libraries could share that we aren't sharing already?
      3. What are some of the pitfalls of data sharing on a massive scale?
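
The "Linked data makes it possible" slides (13 to 15) say that linked data lets us correct one record instead of multiples and collocate all works by a particular author. A minimal sketch of that idea follows, assuming a toy in-memory catalog; the `Author` and `Work` classes, the `author:1` identifier, and the sample titles are all hypothetical and are not taken from the slides or from any real library system.

```python
from dataclasses import dataclass


@dataclass
class Author:
    """Hypothetical authority record for a person."""
    name: str


@dataclass
class Work:
    """Hypothetical bibliographic record that links to an author by identifier."""
    title: str
    author_id: str  # a link to the shared record, not a copied name string


# One shared author record; the misspelling is deliberate for the demo.
authors = {"author:1": Author(name="Austin, Jane")}

# Each work points at the author record instead of re-entering the name.
works = [
    Work(title="Emma", author_id="author:1"),
    Work(title="Persuasion", author_id="author:1"),
]


def display(work):
    """Build a display string by following the link to the author record."""
    return f"{work.title} / {authors[work.author_id].name}"


def works_by(author_id):
    """Collocate every work that links to the given author identifier."""
    return [w.title for w in works if w.author_id == author_id]


# Correct the name once; every linked work picks up the fix.
authors["author:1"].name = "Austen, Jane"

for work in works:
    print(display(work))
print(works_by("author:1"))
```

Because each work stores only an identifier, the spelling fix is made in one place and the author's works are gathered by matching that identifier rather than by comparing name strings. Real linked data would use URIs and a shared vocabulary rather than a Python dictionary.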
