Transcript

  • 1. Digital.Humanities@Oxford Summer School, 3 July 2012. Humanities Research Data – Rate me! Wolfram Horstmann
  • 2. The Research Data Question http://www.flickr.com/photos/desconciertos/160752180/ Data-driven research is called the 4th Paradigm in the Sciences. Where are the humanities in the current discussion about research data?
  • 3. Ratings, Skepticism & Anxiety http://www.flickr.com/photos/komoda/7187391601/ The Research Excellence Framework is a reality. But it is objected that: “Humanities research threatened by demands for economic impact” (Guardian, 13 October 2009)
  • 4. Outline The current awareness of the importance of research data provides opportunities for the humanities to show their value. ~ The challenge is to communicate what research data means for the humanities. ~ The proposal is to state the obvious more clearly: text and images as research data of the humanities and libraries as humanities research facilities.
  • 5. HUMANITIES AND LIBRARIES AS SOULMATES
  • 6. Texts and Images as Data http://www.Flickr.com/photos/gorgmorg/9944210/ Humanities work with texts and images as other subject areas work with matter, wetware, hardware or numbers.
  • 7. Libraries as Research Facilities http://vi.sualize.us/carl_spitzweg_bucherworm_1850_books_library_ladder_reading_picture_2Qp9.html The humanities institutionalized their research facilities centuries ago; other subject areas did so much later, with labs and centers like CERN or EMBL.
  • 8. The Advent of the Digital http://www.flickr.com/photos/flex/27334821/ http://tei.oucs.ox.ac.uk/Talks/2008-08-kazan/exercise-2.xml http://www.bodley.ox.ac.uk/librarian/rpc/manchesterpres/slide15.jpg Transforming the physical research facilities into digital ones is a laborious and expensive exercise – and its potential is not yet exploited.
  • 9. Digital Humanities & Libraries http://adamcrymble.blogspot.com.es/2012/01/is-old-bailey-online-film-or-science.html World Data Centers or the EBI are centralized – can Humanities Data Centers be at each institution?
  • 10. SOME EXAMPLES
  • 11. Digital Resources in the Bodleian ~ approaching petabyte scale of highly structured storage for texts and images ~ 2,000,000 digitized images, another million to come in the next 3 years, plus 350,000 Google Books REFERENCE MISSING ~ 100 virtual machines … and by far most of these are resources of the Humanities.
  • 12. Cultures of Knowledge http://www.history.ox.ac.uk/coj/ An example of highly structured, intellectually curated data: more than 12,000 unique people and 3,500 locations identified in 60,000 letters with 25,000 annotations.
  • 13. What’s the Score? http://www.whats-the-score.org/ In only a few months over 10,000 scores have been described by the public.
  • 14. Broadside Ballads http://ballads.bodley.ox.ac.uk Collaborative research introduces novel qualities into humanities research data management.
  • 15. Google Books at the Bodleian
      Weeks covered: 30 Apr - 6 May, 28 May - 2 Jun, 26 Mar - 1 Apr, 14-20 May, 21-27 May, 12-18 Mar, 19-25 Mar, 16-22 Apr, 23-29 Apr, 7-13 May, 9-15 Apr, 2-8 Apr
      Total                 5150  3338  7111  3010  3955  4528  6901  4566  6883  5300  5165  2844
      .uk                   1202  2088  5950  1705  2532  3360  5386  3445  3667  2704  3092  1347
      .ac.uk                1033  1328  5751  1610  1262  2970  4482  3123  2988  2525  2803  1194
      .ox.ac.uk              991  1296  5636  1559  1249  2938  4435  3111  2973  2498  2737  1186
      Bodleian Libraries     291   464   516   306   319   524   562   680   552   499   649   224
      .bodley                  0     0    15     3     3     8    14     8     6    21     7     4
      .bodleian                0     0     0     0     0     0     0     0     0     1     0     0
      .ouls                  106    48    43    26    15    88    89    94    39    50   112    39
      .sers                   79   187   102    63    64   154   105   131   139   181   126    26
      .library-public          0     4     0     3     3     0     3     3     3     0     2     0
      .bodley-open             3     9    17     4     7    18    10    14    11     6    17     5
      .bodley-public           5    14    14    12    19    28    21    32    18    21    30    18
      .odl                     0     0     0     0     0     0     0     0     0     0     0     0
      .ouls-open              98   202   325   195   205   223   313   381   322   212   348   128
      .saclib                  0     0     0     0     2     0     1    14    10     4     3     1
      .taylor                  0     0     0     0     1     5     6     3     4     3     4     3
      Approaching one download a minute: 350,000 Google books with an estimated 10,000,000 pages and 25,000,000,000 words
  • 16. THE STORY SO FAR
  • 17. Size matters! http://randommization.com/2011/03/08/library-has-giant-books-for-facade/ Even though the humanities often use qualitative and hermeneutic methodology – rather than quantitative – the size of data is significant.
  • 18. Structure matters! http://cacm.acm.org/magazines/2010/4/81499-the-data-structure-canon/fulltext Sizable numbers will not give a thorough idea of digital humanities data – structure is equally important. This can only be understood by example.
  • 19. Collaboration matters! http://www.flickr.com/photos/ludovicmauduit/2646525907 Involvement of colleagues in collaborative research and the public in crowdsourcing makes a difference.
  • 20. RESEARCH DATA CHALLENGES IN THE HUMANITIES
  • 21. 1st Challenge: Diversity http://www.ucl.ac.uk/archaeology/studying/undergraduate/courses/ARCL2037 Humanities have a varied typology of research data, often requiring idiographic approaches. Thus, standardization is difficult (cf. citation), and so is finding computational skills.
  • 22. 2nd Challenge: Openness http://www.flickr.com/photos/uncene/364730693/ As with all researchers, competition, privacy and exploitation are impediments to data sharing. Do the humanities, more than others, keep the “ivory tower” attitude?
  • 23. Accessibility of Humanities Texts Waltinger, U., Mehler, A., Lösch, M., & Horstmann, W. (2011). Hierarchical Classification of OAI Metadata Using the DDC Taxonomy. In Chambers et al (Eds.), Advanced Language Technologies for Digital Libraries (Vol. 6699, pp. 29-40). Berlin / Heidelberg: Springer. Lösch, M., Waltinger, U., Horstmann, W., & Mehler, A. (2011). Building a DDC-annotated Corpus from OAI Metadata. Journal of Digital Information, 12(2). From some 30,000,000 bibliographic records it is hard to fill the humanities corpus. This might constrain discoverability of Humanities resources.
  • 24. 3rd Challenge: Inherent Obstacles Humanities research data show some peculiarities. An extreme example is the closure of archaeological data to protect sites against tomb raiders. Hogenaar, A., H. Tjalsma, & M. Priddy. 2011. “Research in the Humanities and Social Sciences.” http://dx.doi.org/10.2390/PUB-2011-7
  • 25. 4th Challenge: Implementing Policy “Deposit of resources or datasets: Grant Holders in all areas must make any significant electronic resources or datasets created as a result of research funded by the Council available in an accessible and appropriate depository for at least three years after the end of their grant. The choice of depository should be appropriate to the nature of the project and accessible to the targeted audiences for the material produced.” http://www.ahrc.ac.uk/FundingOpportunities/Documents/Research%20Funding%20Guide.pdf Funders' policies are an approach for opening up data – but the humanities produce much data outside of the regular project life cycle.
  • 26. RESEARCH DATA OPPORTUNITIES IN THE HUMANITIES
  • 27. 1st Opportunity: Public Understanding http://www.queenvictoriasjournals.org/home.do Humanities research data are often more easily understood by the public than science data. The “Impact Regime” may even be an advantage for the humanities.
  • 28. 2nd Opportunity: Cultural Heritage http://www.europeana.eu/portal/ Humanities research data are more likely to be accessed and preserved than research data in other subject areas.
  • 29. 3rd Opportunity: Infrastructure (image: National Library of China) The requirements of infrastructure for many humanities research data resemble those of digital libraries. No new research facilities have to be built.
  • 30. 4th Opportunity: New Metrics http://newsinfo.iu.edu/pub/libs/images/usr/9584_h.jpg It is likely that humanities research data have a web impact advantage. High societal interest could result in higher webometric and usage statistics ratings.
  • 31. CONCLUSION  
  • 32. Another mindset? …to see text & images as humanities research data. ~ …to see the humanities as data intensive. ~ …to see a web impact advantage for the humanities. ~ …to see libraries as humanities research facilities.
  • 33. Recommendations Exploit the good accessibility of humanities research themes through newspapers, exhibitions, crowdsourcing and citizen science. ~ Make as many research outputs web accessible as possible. ~ Invest in and support new metrics such as usage statistics and web impact. ~ Strengthen partnerships between the humanities, other disciplines and libraries.
  • 34. Suggestion: Rate your data!
  • 35. Thank you