The HathiTrust Research Center (HTRC): An Overview and Demo
IU Librarians’ Day | IUPUI Libraries | 06.07.13

Robert H. McDonald - @mcdonald - IU Libraries
Yiming Sun – IU Data to Insight Center
Miao Chen – IU Data to Insight Center
Tweet US - @HathiTrust #HTRC
Speaker Deck Slides: http://bit.ly/13gWD7C
06.07.13  IU Librarians’ Day  #HTRC @HathiTrust
  
HTRC Mission
•  Public research arm of the HathiTrust
•  Help researchers world-wide to accomplish tera-scale text data-mining and analysis
   – Develop cutting-edge software tools for processing, analyzing text
   – Develop cyberinfrastructure to enable HPC access to the HathiTrust Digital Library
•  Established: July 2011
•  Collaborative center: Indiana University & University of Illinois
  
	
  
HTRC Governance
•  Reports to the HathiTrust Board of Governors
•  HTRC Executive Committee
   –  J. Stephen Downie (Co-director), Professor and Associate Dean for Research, University of Illinois GSLIS
   –  Beth Plale (Co-director and Chair), Director, Data To Insight Center and professor in the School of Informatics and Computing at Indiana University
   –  Robert H. McDonald, Associate Dean of Libraries/Deputy Director, Data to Insight Center at Indiana University
   –  Beth Sandore Namachchivaya, Associate University Librarian for Information Technology Planning & Policy at the University of Illinois
   –  John Unsworth, Vice Provost for Library & Technology Services and Chief Information Officer at Brandeis University
•  HTRC Advisory Board (see members on next slide)
•  Google Public Domain agreement – in place for IU and UIUC
  
HTRC Advisory Board
•  Cathy Blake, University of Illinois, Urbana-Champaign
•  Beth Cate, Indiana University
•  Greg Crane, Tufts University
•  Laine Farley, California Digital Library
•  Brian Geiger, University of California at Riverside
•  David Greenbaum, University of California at Berkeley
•  Fotis Jannidis, University of Würzburg, Germany
•  Matthew Jockers, Stanford University
•  Jim Neal, Columbia University
•  Bill Newman, Indiana University
•  Bethany Nowviskie, University of Virginia
•  Andrey Rzhetsky, University of Chicago
•  Pat Steele, University of Maryland
•  Craig Stewart, Indiana University
•  David Theo Goldberg, University of California at Irvine
•  John Towns, National Center for Supercomputing Applications
•  Madelyn Wessel, University of Virginia
  
HTRC Timeline
•  Phase I: 18-month development cycle
   – Began 01 July 2011
   – Demo of capability September 2012 (14 mo mark) at HTRC UnCamp I
•  Phase II: broad availability of resource, begins 31 March 2013
   – New HTRC Asst. Director for Education and Outreach (Miao Chen)
   – New listserv to drive user input: htrc-usergroup-l@list.indiana.edu
  
HTRC Next Steps
•  Phase 2 availability of resource 31 March 2013
•  Thanks to:

Photos from HTRC UnCamp 9.10.12 at Indiana University

HTRC Phase 2: Current Thrusts
•  Grow HTRC User-base
   – Outreach and Engagement
      •  Input from HTRC Advisory Board
      •  Input from HT BOG
   – Town Hall Groups at DH, JCDL, JADH
   – Online Town Hall Groups
•  Develop New Specifications from User-Based Agile Development Methodology
•  Develop and Integrate Sloan Cloud Components into the HTRC Infrastructure
  
	
  
HTRC Architecture Overview
[Architecture diagram. Labels shown: Portal; Blacklight; Data API access interface; Security (OAuth2, WSO2 IS); Algorithms and Worksets; Registry (WSO2 GR); Application submission; Audit; Cassandra cluster volume store; Solr index; High level apps (Entity Extraction, Topic Modeling, Sentence Tokenizer, Word position, Latent semantic analysis); Compute resources; Storage resources.]
  
 
	
  
  
HTRC Non-Consumptive Research Access (Sloan Cloud)
[Diagram. Labels shown: user; SSH; VM Request; VM Image Manager; VM Image Store; VM Image Builder; VM Manager; VM instance; Sloan Cloud; Non-consumptive Output Storage.]
  
HTRC Demonstration
•  Yiming Sun – Lead Technical Architect HTRC
  
  
Metadata Enhancement
•  Current metadata fields are MARC-based
   – E.g. publication date, authors, title, subject
•  MARC fields are fundamental
•  Needed more fields of users’ interest for granular analytics (Metadata Enhancement)
•  Solicit user requirements and prioritize for implementation
   – Mainly digital humanities uses now
  
  
Top Metadata Enhancement Items
•  In the 1st round of user requirement collection, the top 3 items were metadata related:
   – Word frequency count and document length for a volume
   – Metadata de-duplication
   – Author Gender Analysis
•  These top 3 items are being implemented in the current production system and will be available in the next quarter.
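
As a rough illustration of the first item above, per-volume word frequency counts and document length could be computed along the following lines. This is a minimal sketch and not HTRC's implementation: the function name `volume_stats`, the regex-based tokenizer, and the sample text are all assumptions made for the example.

```python
import re
from collections import Counter


def volume_stats(text):
    """Return (word_counts, document_length) for one volume's plain text.

    Hypothetical helper: tokens are lowercase runs of letters, digits,
    and apostrophes, which simplifies real-world tokenization.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens), len(tokens)


# Made-up sample text standing in for a volume's OCR output.
counts, length = volume_stats("The whale. The sea, the ship!")
print(counts.most_common(1), length)
```

Document length here is simply the token count; a production system would likely need language-aware tokenization and OCR clean-up first.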
  
  
Other Metadata Enhancement Items
•  Stats analysis: tf-idf
•  Readability score
•  Language
•  Topic modeling (e.g. LDA probability)
•  Genre
•  Era of compilation
•  Book length (e.g. short or long)
•  Concordance index (indexing with context)
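
Assuming the "Stats analysis" item refers to tf-idf (term frequency times inverse document frequency), the weighting could be sketched as below. The function name `tf_idf`, the volume ids, and the token lists are hypothetical, and the formula used (relative term frequency times the natural log of N/df) is one common variant rather than anything the slides specify.

```python
import math
from collections import Counter


def tf_idf(volumes):
    """Compute tf-idf weights for each term in each volume.

    `volumes` maps a hypothetical volume id to a list of tokens.
    tf is the term count divided by the volume length; idf is
    log(N / df), where df is the number of volumes containing the term.
    """
    n_volumes = len(volumes)
    doc_freq = Counter()
    for tokens in volumes.values():
        doc_freq.update(set(tokens))  # count each term once per volume

    weights = {}
    for vol_id, tokens in volumes.items():
        counts = Counter(tokens)
        weights[vol_id] = {
            term: (count / len(tokens)) * math.log(n_volumes / doc_freq[term])
            for term, count in counts.items()
        }
    return weights


# Made-up token lists standing in for two volumes.
sample = {
    "vol1": ["whale", "sea", "whale"],
    "vol2": ["sea", "ship"],
}
scores = tf_idf(sample)
print(scores["vol1"]["whale"], scores["vol1"]["sea"])
```

A term that appears in every volume (here "sea") gets weight zero, which is what makes the measure useful for surfacing terms distinctive to a volume.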
  
  
HTRC Upcoming Events
•  DH 2013 – July 16-29, 2013
•  JCDL 2013 – July 22-26, 2013
•  HathiTrust Research Center UnCamp – Sept 8-9, 2013 – University of Illinois
•  Catapult Symposium – IUB – Sept 2013
•  JADH 2013 – September 19-21, 2013
•  Ohio State University – Library Symposium – October 2013
•  Educause 2013 – October 15-18, 2013
  
Thank You
•  This presentation was made possible with content provided by many HTRC colleagues: John Unsworth, J. Stephen Downie, Robert H. McDonald, Beth Sandore, Yiming Sun, Guangchen Ruan, Loretta Auvil, Kirk Hess, and many others…
•  The HTRC Non-Consumptive Research Grant is graciously funded by the Alfred P. Sloan Foundation
•  IU D2I-PTI is graciously funded by The Lilly Endowment, Inc.
•  HTRC - http://www.hathitrust.org/htrc
•  IU D2I Center - http://d2i.indiana.edu/
•  UIUC GSLIS - http://www.lis.illinois.edu/
  
	
  
  
Contact Information
•  General
   – Robert H. McDonald, HTRC Executive Committee
   – rhmcdona@indiana.edu
•  Technical
   – Yiming Sun, Chief Architect, yimsun@indiana.edu
•  Requests for capability, interest
   – Miao Chen, HTRC Asst. Director of Education and Outreach, miaochen@indiana.edu
  
  

More Related Content

What's hot

Support When It Counts - library roles in public access to federally-funded r...
Support When It Counts - library roles in public access to federally-funded r...Support When It Counts - library roles in public access to federally-funded r...
Support When It Counts - library roles in public access to federally-funded r...Hilary Davis
 
Reveal Digital: innovative library crowdfunding model for open access digita...
Reveal Digital:  innovative library crowdfunding model for open access digita...Reveal Digital:  innovative library crowdfunding model for open access digita...
Reveal Digital: innovative library crowdfunding model for open access digita...PaolaMarchionni
 
Keystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenanceKeystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenancePaolo Missier
 
Linked Open Data: Identifying Opportunities
Linked Open Data: Identifying OpportunitiesLinked Open Data: Identifying Opportunities
Linked Open Data: Identifying OpportunitiesLibrary_Connect
 
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...Micah Altman
 
The FOSTER project - general overview
The FOSTER project - general overviewThe FOSTER project - general overview
The FOSTER project - general overviewMartin Donnelly
 
The Embedded Data Librarian
The Embedded Data LibrarianThe Embedded Data Librarian
The Embedded Data LibrarianLibrary_Connect
 
Realizing the Potential of Research Data by Carole L. Palmer
Realizing the Potential of Research Data by Carole L. Palmer Realizing the Potential of Research Data by Carole L. Palmer
Realizing the Potential of Research Data by Carole L. Palmer carolelynnpalmer
 
"From Reading Rooms to Research Commons" Sheila Corrall, DARTS4
"From Reading Rooms to Research Commons" Sheila Corrall, DARTS4"From Reading Rooms to Research Commons" Sheila Corrall, DARTS4
"From Reading Rooms to Research Commons" Sheila Corrall, DARTS4ARLGSW
 
The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...
The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...
The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...African Open Science Platform
 
Building the Archive of DH Research
Building the Archive of DH ResearchBuilding the Archive of DH Research
Building the Archive of DH ResearchHarriett Green
 
2015CVKurtWagner
2015CVKurtWagner2015CVKurtWagner
2015CVKurtWagnerKurt Wagner
 

What's hot (20)

Support When It Counts - library roles in public access to federally-funded r...
Support When It Counts - library roles in public access to federally-funded r...Support When It Counts - library roles in public access to federally-funded r...
Support When It Counts - library roles in public access to federally-funded r...
 
Reveal Digital: innovative library crowdfunding model for open access digita...
Reveal Digital:  innovative library crowdfunding model for open access digita...Reveal Digital:  innovative library crowdfunding model for open access digita...
Reveal Digital: innovative library crowdfunding model for open access digita...
 
Llauferseiler "OU Libraries: Opportunities Supporting Research and Education"
Llauferseiler "OU Libraries: Opportunities Supporting Research and Education"Llauferseiler "OU Libraries: Opportunities Supporting Research and Education"
Llauferseiler "OU Libraries: Opportunities Supporting Research and Education"
 
Keystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenanceKeystone summer school 2015 paolo-missier-provenance
Keystone summer school 2015 paolo-missier-provenance
 
Linked Open Data: Identifying Opportunities
Linked Open Data: Identifying OpportunitiesLinked Open Data: Identifying Opportunities
Linked Open Data: Identifying Opportunities
 
Goldman "Collaboratively Build Data Science Services and Skills"
Goldman "Collaboratively Build Data Science Services and Skills"Goldman "Collaboratively Build Data Science Services and Skills"
Goldman "Collaboratively Build Data Science Services and Skills"
 
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
WORLDMAP: A SPATIAL INFRASTRUCTURE TO SUPPORT TEACHING AND RESEARCH (BROWN BA...
 
The FOSTER project - general overview
The FOSTER project - general overviewThe FOSTER project - general overview
The FOSTER project - general overview
 
NISO Working Group Connection Live! Research Data Metrics Landscape: An Updat...
NISO Working Group Connection Live! Research Data Metrics Landscape: An Updat...NISO Working Group Connection Live! Research Data Metrics Landscape: An Updat...
NISO Working Group Connection Live! Research Data Metrics Landscape: An Updat...
 
NISO — Cutting Edges with Company: Emerging Technologies as a Collective Effort
NISO — Cutting Edges with Company: Emerging Technologies as a Collective EffortNISO — Cutting Edges with Company: Emerging Technologies as a Collective Effort
NISO — Cutting Edges with Company: Emerging Technologies as a Collective Effort
 
The Embedded Data Librarian
The Embedded Data LibrarianThe Embedded Data Librarian
The Embedded Data Librarian
 
Orcutt ivey New Needs New Approaches: Libraries as Technology Collaborators
Orcutt ivey New Needs New Approaches: Libraries as Technology CollaboratorsOrcutt ivey New Needs New Approaches: Libraries as Technology Collaborators
Orcutt ivey New Needs New Approaches: Libraries as Technology Collaborators
 
Realizing the Potential of Research Data by Carole L. Palmer
Realizing the Potential of Research Data by Carole L. Palmer Realizing the Potential of Research Data by Carole L. Palmer
Realizing the Potential of Research Data by Carole L. Palmer
 
Williams Open Refine for Librarians
Williams Open Refine for LibrariansWilliams Open Refine for Librarians
Williams Open Refine for Librarians
 
"From Reading Rooms to Research Commons" Sheila Corrall, DARTS4
"From Reading Rooms to Research Commons" Sheila Corrall, DARTS4"From Reading Rooms to Research Commons" Sheila Corrall, DARTS4
"From Reading Rooms to Research Commons" Sheila Corrall, DARTS4
 
NISO Webinar: Understanding Critical Elements of E-books: Part 2: Heritage Lo...
NISO Webinar: Understanding Critical Elements of E-books: Part 2: Heritage Lo...NISO Webinar: Understanding Critical Elements of E-books: Part 2: Heritage Lo...
NISO Webinar: Understanding Critical Elements of E-books: Part 2: Heritage Lo...
 
The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...
The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...
The African Open Science Platform: Policy, Infrastructure, Skills and Incenti...
 
Building the Archive of DH Research
Building the Archive of DH ResearchBuilding the Archive of DH Research
Building the Archive of DH Research
 
Allard - Research Data Services in Libraries
Allard - Research Data Services in LibrariesAllard - Research Data Services in Libraries
Allard - Research Data Services in Libraries
 
2015CVKurtWagner
2015CVKurtWagner2015CVKurtWagner
2015CVKurtWagner
 

Similar to The HathiTrust Research Center (HTRC): An Overview and Demo

JCDL 2015 Tutorial Opening Slides
JCDL 2015 Tutorial Opening SlidesJCDL 2015 Tutorial Opening Slides
JCDL 2015 Tutorial Opening SlidesRobert H. McDonald
 
The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
The HathiTrust Research Center: Big Data Analytics in a Secure Data FrameworkThe HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
The HathiTrust Research Center: Big Data Analytics in a Secure Data FrameworkRobert H. McDonald
 
OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.OCLC
 
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina SmithWorkshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina SmithAfrican Open Science Platform
 
Teaching Data Science to Undergraduate Students
Teaching Data Science to Undergraduate StudentsTeaching Data Science to Undergraduate Students
Teaching Data Science to Undergraduate StudentsNicole Vasilevsky
 
Digital Scholarly Communication @Claremont Colleges
Digital Scholarly Communication @Claremont CollegesDigital Scholarly Communication @Claremont Colleges
Digital Scholarly Communication @Claremont CollegesAshley Sanders, Ph.D.
 
Challenges and Opportunities in Customizing Library Repository User Interfaces
Challenges and Opportunities in Customizing Library Repository User InterfacesChallenges and Opportunities in Customizing Library Repository User Interfaces
Challenges and Opportunities in Customizing Library Repository User InterfacesRachel Vacek
 
Slides | Research data literacy and the library
Slides | Research data literacy and the librarySlides | Research data literacy and the library
Slides | Research data literacy and the libraryColleen DeLory
 
Slides | Research data literacy and the library
Slides | Research data literacy and the librarySlides | Research data literacy and the library
Slides | Research data literacy and the libraryLibrary_Connect
 
"Filling the digital preservation gap" with Archivematica
"Filling the digital preservation gap" with Archivematica"Filling the digital preservation gap" with Archivematica
"Filling the digital preservation gap" with ArchivematicaJenny Mitcham
 
Research into Practice case study 2: Library linked data implementations an...
	Research into Practice case study 2:  Library linked data implementations an...	Research into Practice case study 2:  Library linked data implementations an...
Research into Practice case study 2: Library linked data implementations an...Hazel Hall
 
Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014Beth Plale
 
Institutional Repositories
Institutional RepositoriesInstitutional Repositories
Institutional RepositoriesSridhar Gutam
 
Identifiers for Researchers and Data: Increasing Attribution and Discovery– J...
Identifiers for Researchers and Data: Increasing Attribution and Discovery– J...Identifiers for Researchers and Data: Increasing Attribution and Discovery– J...
Identifiers for Researchers and Data: Increasing Attribution and Discovery– J...ALISS
 
Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries? Robin Rice
 
Next generation data services at the Marriott Library
Next generation data services at the Marriott LibraryNext generation data services at the Marriott Library
Next generation data services at the Marriott LibraryRebekah Cummings
 

Similar to The HathiTrust Research Center (HTRC): An Overview and Demo (20)

JCDL 2015 Tutorial Opening Slides
JCDL 2015 Tutorial Opening SlidesJCDL 2015 Tutorial Opening Slides
JCDL 2015 Tutorial Opening Slides
 
The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
The HathiTrust Research Center: Big Data Analytics in a Secure Data FrameworkThe HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework
 
OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.OCLC Research Update at ALA Chicago. June 26, 2017.
OCLC Research Update at ALA Chicago. June 26, 2017.
 
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina SmithWorkshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
 
Sept 18 NISO Webinar: Research Data Curation, Part 2: Libraries and Big Data ...
Sept 18 NISO Webinar: Research Data Curation, Part 2: Libraries and Big Data ...Sept 18 NISO Webinar: Research Data Curation, Part 2: Libraries and Big Data ...
Sept 18 NISO Webinar: Research Data Curation, Part 2: Libraries and Big Data ...
 
Pace "How the Community Wants to Serve Its Constituents"
Pace "How the Community Wants to Serve Its Constituents"Pace "How the Community Wants to Serve Its Constituents"
Pace "How the Community Wants to Serve Its Constituents"
 
Teaching Data Science to Undergraduate Students
Teaching Data Science to Undergraduate StudentsTeaching Data Science to Undergraduate Students
Teaching Data Science to Undergraduate Students
 
Digital Scholarly Communication @Claremont Colleges
Digital Scholarly Communication @Claremont CollegesDigital Scholarly Communication @Claremont Colleges
Digital Scholarly Communication @Claremont Colleges
 
Challenges and Opportunities in Customizing Library Repository User Interfaces
Challenges and Opportunities in Customizing Library Repository User InterfacesChallenges and Opportunities in Customizing Library Repository User Interfaces
Challenges and Opportunities in Customizing Library Repository User Interfaces
 
Challenges & Opportunities in Customizing Library IR User Interfaces
Challenges & Opportunities in Customizing Library IR User InterfacesChallenges & Opportunities in Customizing Library IR User Interfaces
Challenges & Opportunities in Customizing Library IR User Interfaces
 
Open Science and Open Data for Librarians
Open Science and Open Data for LibrariansOpen Science and Open Data for Librarians
Open Science and Open Data for Librarians
 
Slides | Research data literacy and the library
Slides | Research data literacy and the librarySlides | Research data literacy and the library
Slides | Research data literacy and the library
 
Slides | Research data literacy and the library
Slides | Research data literacy and the librarySlides | Research data literacy and the library
Slides | Research data literacy and the library
 
"Filling the digital preservation gap" with Archivematica
"Filling the digital preservation gap" with Archivematica"Filling the digital preservation gap" with Archivematica
"Filling the digital preservation gap" with Archivematica
 
Research into Practice case study 2: Library linked data implementations an...
	Research into Practice case study 2:  Library linked data implementations an...	Research into Practice case study 2:  Library linked data implementations an...
Research into Practice case study 2: Library linked data implementations an...
 
Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014Plale HathiTrust El Colegio de Mexico May2014
Plale HathiTrust El Colegio de Mexico May2014
 
Institutional Repositories
Institutional RepositoriesInstitutional Repositories
Institutional Repositories
 
Identifiers for Researchers and Data: Increasing Attribution and Discovery– J...
Identifiers for Researchers and Data: Increasing Attribution and Discovery– J...Identifiers for Researchers and Data: Increasing Attribution and Discovery– J...
Identifiers for Researchers and Data: Increasing Attribution and Discovery– J...
 
Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?Research data support: a growth area for academic libraries?
Research data support: a growth area for academic libraries?
 
Next generation data services at the Marriott Library
Next generation data services at the Marriott LibraryNext generation data services at the Marriott Library
Next generation data services at the Marriott Library
 

More from Robert H. McDonald

ER&L The Role of Choice in the Future of Discovery Evaluations Panel
ER&L The Role of Choice in the Future of Discovery Evaluations PanelER&L The Role of Choice in the Future of Discovery Evaluations Panel
ER&L The Role of Choice in the Future of Discovery Evaluations PanelRobert H. McDonald
 
The HathiTrust Research Center: Enabling New Knowledge Through Shared Infras...
The HathiTrust Research Center: Enabling New Knowledge Through Shared Infras...The HathiTrust Research Center: Enabling New Knowledge Through Shared Infras...
The HathiTrust Research Center: Enabling New Knowledge Through Shared Infras...Robert H. McDonald
 
Academic Libraries and Big Data: Trends in Collection, Publication, Preservat...
Academic Libraries and Big Data: Trends in Collection, Publication, Preservat...Academic Libraries and Big Data: Trends in Collection, Publication, Preservat...
Academic Libraries and Big Data: Trends in Collection, Publication, Preservat...Robert H. McDonald
 
TLT Discussion on "Saving My Stuff" - 06.05.15
TLT Discussion on "Saving My Stuff" - 06.05.15TLT Discussion on "Saving My Stuff" - 06.05.15
TLT Discussion on "Saving My Stuff" - 06.05.15Robert H. McDonald
 
Elephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research CenterElephant in the Room: Scaling Storage for the HathiTrust Research Center
Elephant in the Room: Scaling Storage for the HathiTrust Research CenterRobert H. McDonald
 
Creating Sustainable Communities in Open Data Resources: The eagle-i and VIVO...
Creating Sustainable Communities in Open Data Resources: The eagle-i and VIVO...Creating Sustainable Communities in Open Data Resources: The eagle-i and VIVO...
Creating Sustainable Communities in Open Data Resources: The eagle-i and VIVO...Robert H. McDonald
 
ER&L 2015 Closing Keynote Slides
ER&L 2015 Closing Keynote SlidesER&L 2015 Closing Keynote Slides
ER&L 2015 Closing Keynote SlidesRobert H. McDonald
 
HathiTrust Research Center Data Capsule Overview 09.10.14
HathiTrust Research Center Data Capsule Overview 09.10.14HathiTrust Research Center Data Capsule Overview 09.10.14
HathiTrust Research Center Data Capsule Overview 09.10.14Robert H. McDonald
 
Owning the Discovery Experience for Your Patrons
Owning the Discovery Experience for Your PatronsOwning the Discovery Experience for Your Patrons
Owning the Discovery Experience for Your PatronsRobert H. McDonald
 
Kuali OLE: Enabling Choices for Libraries
Kuali OLE: Enabling Choices for LibrariesKuali OLE: Enabling Choices for Libraries
Kuali OLE: Enabling Choices for LibrariesRobert H. McDonald
 
Charleston Seminar Being Earnest with our Collections - Legacy to Cloud
Charleston Seminar Being Earnest with our Collections - Legacy to CloudCharleston Seminar Being Earnest with our Collections - Legacy to Cloud
Charleston Seminar Being Earnest with our Collections - Legacy to CloudRobert H. McDonald
 
SEAD Datanet and Sustainability Science
SEAD Datanet and Sustainability Science SEAD Datanet and Sustainability Science
SEAD Datanet and Sustainability Science Robert H. McDonald
 
New Perspectives for Business Intelligence: Library and Research Technologies...
New Perspectives for Business Intelligence: Library and Research Technologies...New Perspectives for Business Intelligence: Library and Research Technologies...
New Perspectives for Business Intelligence: Library and Research Technologies...Robert H. McDonald
 
Kuali OLE: Deep Library Collaboration and the Release of a Community-Sourced ...
Kuali OLE: Deep Library Collaboration and the Release of a Community-Sourced ...Kuali OLE: Deep Library Collaboration and the Release of a Community-Sourced ...
Kuali OLE: Deep Library Collaboration and the Release of a Community-Sourced ...Robert H. McDonald
 
GOKb & KB+: An International Partnership to leverage Open Access and Communit...
GOKb & KB+: An International Partnership to leverage Open Access and Communit...GOKb & KB+: An International Partnership to leverage Open Access and Communit...
GOKb & KB+: An International Partnership to leverage Open Access and Communit...Robert H. McDonald
 
HathiTrust Research Center: The Fast Version
HathiTrust Research Center: The Fast VersionHathiTrust Research Center: The Fast Version
HathiTrust Research Center: The Fast VersionRobert H. McDonald
 
Building a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability ScienceBuilding a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability ScienceRobert H. McDonald
 

The HathiTrust Research Center (HTRC): An Overview and Demo

  • 1. The HathiTrust Research Center (HTRC): An Overview and Demo
      IU Librarians’ Day | IUPUI Libraries | 06.07.13
      Robert H. McDonald - @mcdonald - IU Libraries
      Yiming Sun - IU Data to Insight Center
      Miao Chen - IU Data to Insight Center
      Tweet us - @HathiTrust #HTRC
  • 2. Speaker Deck Slides
      http://bit.ly/13gWD7C
      06.07.13 | IU Librarians’ Day | #HTRC @HathiTrust
  • 3. HTRC Mission
      • Public research arm of the HathiTrust
      • Help researchers worldwide accomplish tera-scale text data mining and analysis
          – Develop cutting-edge software tools for processing and analyzing text
          – Develop cyberinfrastructure to enable HPC access to the HathiTrust Digital Library
      • Established: July 2011
      • Collaborative center: Indiana University & University of Illinois
  • 4. HTRC Governance
      • Reports to the HathiTrust Board of Governors
      • HTRC Executive Committee
          – J. Stephen Downie (Co-director), Professor and Associate Dean for Research, University of Illinois GSLIS
          – Beth Plale (Co-director and Chair), Director of the Data to Insight Center and Professor in the School of Informatics and Computing at Indiana University
          – Robert H. McDonald, Associate Dean of Libraries/Deputy Director of the Data to Insight Center at Indiana University
          – Beth Sandore Namachchivaya, Associate University Librarian for Information Technology Planning & Policy at the University of Illinois
          – John Unsworth, Vice Provost for Library & Technology Services and Chief Information Officer at Brandeis University
      • HTRC Advisory Board (see members on the next slide)
      • Google Public Domain agreement: in place for IU and UIUC
  • 5. HTRC Advisory Board
      • Cathy Blake, University of Illinois, Urbana-Champaign
      • Beth Cate, Indiana University
      • Greg Crane, Tufts University
      • Laine Farley, California Digital Library
      • Brian Geiger, University of California at Riverside
      • David Greenbaum, University of California at Berkeley
      • Fotis Jannidis, University of Würzburg, Germany
      • Matthew Jockers, Stanford University
      • Jim Neal, Columbia University
      • Bill Newman, Indiana University
      • Bethany Nowviskie, University of Virginia
      • Andrey Rzhetsky, University of Chicago
      • Pat Steele, University of Maryland
      • Craig Stewart, Indiana University
      • David Theo Goldberg, University of California at Irvine
      • John Towns, National Center for Supercomputing Applications
      • Madelyn Wessel, University of Virginia
  • 6. HTRC Timeline
      • Phase I: 18-month development cycle
          – Began 01 July 2011
          – Demo of capability September 2012 (14-month mark) at HTRC UnCamp I
      • Phase II: broad availability of resource, begins 31 March 2013
          – New HTRC Asst. Director for Education and Outreach (Miao Chen)
          – New listserv to drive user input: htrc-usergroup-l @ list.indiana.edu
  • 7. HTRC Next Steps
      • Phase 2 availability of resource 31 March 2013
      • Thanks to:
      [Photos from HTRC UnCamp 9.10.12 at Indiana University]
  • 8. HTRC Phase 2: Current Thrusts
      • Grow HTRC User-base
          – Outreach and Engagement
              • Input from HTRC Advisory Board
              • Input from HT BOG
          – Town Hall Groups at DH, JCDL, JADH
          – Online Town Hall Groups
      • Develop New Specifications from User-Based Agile Development Methodology
      • Develop and Integrate Sloan Cloud Components into the HTRC Infrastructure
  • 9. HTRC Architecture Overview
      [Architecture diagram. Labeled components: Portal; Blacklight; Data API access interface; Security (OAuth2, WSO2 IS); Algorithms and Worksets Registry (WSO2 GR); Application submission; Audit; Cassandra cluster volume store; Solr index; High level apps (Entity Extraction, Topic Modeling, Sentence Tokenizer, Word position, Latent semantic analysis); Compute resources; Storage resources]
  • 10. HTRC Non-Consumptive Research Access (Sloan Cloud)
      [Diagram. Labeled components: user; VM Request; VM Image Manager; VM Image Store; VM Image Builder; VM Manager; VM instance; SSH; Sloan Cloud; Non-consumptive Output Storage]
  • 11. HTRC Demonstration
      • Yiming Sun – Lead Technical Architect, HTRC
  • 12. Metadata Enhancement
      • Current metadata fields are MARC-based
          – E.g. publication date, authors, title, subject
      • MARC fields are fundamental
      • More fields of interest to users are needed for granular analytics (Metadata Enhancement)
      • Solicit user requirements and prioritize them for implementation
          – Mainly digital humanities uses for now
  • 13. Top Metadata Enhancement Items
      • In the 1st round of user requirement collection, the top 3 items were metadata related:
          – Word frequency count and document length for a volume
          – Metadata de-duplication
          – Author Gender Analysis
      • These top 3 items are being implemented in the current production system and will be available in the next quarter.
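As an illustration of the first item above (word frequency count and document length for a volume), here is a minimal sketch. It is not HTRC's implementation: the function name `volume_stats`, the representation of a volume as a list of page strings, and the naive regex tokenizer are all assumptions made for the example.

```python
import re
from collections import Counter


def volume_stats(pages):
    """Return (word_counts, document_length) for one volume.

    `pages` is a hypothetical representation: a list of page-text strings.
    """
    counts = Counter()
    for page in pages:
        # Naive tokenizer (assumption): lowercase runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", page.lower()))
    # Document length here means the total number of tokens in the volume.
    return counts, sum(counts.values())


pages = ["The whale, the whale!", "Call me Ishmael."]
counts, length = volume_stats(pages)
```

A production system would need language-aware tokenization and OCR cleanup; the sketch only shows the shape of the two derived metadata fields.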
  • 14. Other Metadata Enhancement Items
      • Stats analysis: tf-idf
      • Readability score
      • Language
      • Topic modeling (e.g. LDA probability)
      • Genre
      • Era of compilation
      • Book length (e.g. short or long)
      • Concordance index (indexing with context)
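The tf-idf statistic in the list above can be sketched as follows. This uses the common textbook form (term frequency times the natural log of N over document frequency), which is an assumption; the slides do not say which variant HTRC intends. The function name `tf_idf` and the input layout are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute tf-idf scores.

    `docs` maps a volume id to its list of tokens (hypothetical layout).
    Returns {volume_id: {term: score}}.
    """
    n_docs = len(docs)
    # Document frequency: in how many volumes each term appears.
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    scores = {}
    for vol_id, tokens in docs.items():
        term_freq = Counter(tokens)
        total = len(tokens)
        scores[vol_id] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
    return scores


docs = {"v1": ["whale", "sea", "sea"], "v2": ["sea", "ship"]}
scores = tf_idf(docs)
```

A term that occurs in every volume (here "sea") scores zero, which is why tf-idf tends to surface vocabulary distinctive to a volume.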
  • 15. HTRC Upcoming Events
      • DH 2013 – July 16-29, 2013
      • JCDL 2013 – July 22-26, 2013
      • HathiTrust Research Center UnCamp – Sept 8-9, 2013 – University of Illinois
      • Catapult Symposium – IUB – Sept 2013
      • JADH 2013 – September 19-21, 2013
      • Ohio State University – Library Symposium – October 2013
      • Educause 2013 – October 15-18, 2013
  • 16. Thank You
      • This presentation was made possible with content provided by many HTRC colleagues, including John Unsworth, J. Stephen Downie, Robert H. McDonald, Beth Sandore, Yiming Sun, Guangchen Ruan, Loretta Auvil, Kirk Hess, and many others.
      • The HTRC Non-Consumptive Research Grant is graciously funded by the Alfred P. Sloan Foundation
      • IU D2I-PTI is graciously funded by The Lilly Endowment, Inc.
      • HTRC - http://www.hathitrust.org/htrc
      • IU D2I Center - http://d2i.indiana.edu/
      • UIUC GSLIS - http://www.lis.illinois.edu/
  • 17. Contact Information
      • General
          – Robert H. McDonald, HTRC Executive Committee
          – rhmcdona@indiana.edu
      • Technical
          – Yiming Sun, Chief Architect, yimsun@indiana.edu
      • Requests for capability, interest
          – Miao Chen, HTRC Asst. Director of Education and Outreach, miaochen@indiana.edu