The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework 
pti.iu.edu/sc14 
@hathitrust #SC14 
Beth Plale | @bplale 
Director Data to Insight Center | Indiana University 
Robert H. McDonald | @mcdonald 
Deputy Director Data to Insight Center | Indiana University
Outline 
• What is the HTRC? 
• Non-Consumptive Research Paradigm 
• Current Architecture 
• Future Architecture 
• Advanced Collaborative Support (RFP) 
• HTRC Science on a Sphere 
• HTRC @ Events
HathiTrust Digital Library 
• HathiTrust is a partnership of 
90+ academic & research 
institutions, offering a collection 
of millions of digitized titles. 
• http://hathitrust.org 
– IU is a founding member of the 
HathiTrust along with University of 
Michigan, University of California, 
and the University of Virginia
HathiTrust Research Center 
Mission 
• Public research arm of HathiTrust 
• Goal: enable researchers world-wide to accomplish tera-scale 
text data-mining and analysis 
– Develop cutting-edge software tools for processing and analyzing 
text 
– Develop cyberinfrastructure to enable HPC access to the 
HathiTrust Digital Library 
• Established: July, 2011 
• Collaborative center: Indiana University & University of 
Illinois
HTRC Timeline 
• Phase I: development, 01 Jul 2011 – 31 Mar 2013 
– HTRC software and services release v1.0: 
https://github.com/htrc 
• Phase II: outreach, 01 Apr 2013 – 30 Jun 2014 
– 2nd HTRC UnCamp, Sep 2013 
• Phase III: operations, 01 Jul 2014 – present
HTRC Current Users 
Projected Use 2019 
• Digital Humanities (60) 
• Education (60) 
• Informatics (60) 
• Observers (20) 
194 existing user accounts 
Lots of user accounts; good 
starting point. 
Improve: 
• Increase amount of real work 
being accomplished as 
measured by usage on HTRC’s 
compute resources Quarry and 
Big Red II at IU 
• Develop educational uses 
• Develop informatics uses 
• Decrease number of observers 
to 10% 
→ Project 200 users at any one time, 
of which 90% are doing relevant 
education/scholarship 
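The projection on this slide can be checked with a few lines of arithmetic. The sketch below uses only the four category counts shown here; the variable names are illustrative and not part of any HTRC tool.

```python
# Projected 2019 user counts by category, as listed on the slide.
projected_users = {
    "Digital Humanities": 60,
    "Education": 60,
    "Informatics": 60,
    "Observers": 20,
}

total = sum(projected_users.values())
observer_share = projected_users["Observers"] / total
active_share = 1 - observer_share

print(f"Total projected users: {total}")
print(f"Observer share: {observer_share:.0%}")
print(f"Share doing relevant education/scholarship: {active_share:.0%}")
```

The category counts sum to the 200 projected users, and the 20 observers correspond to the 10% target, leaving 90% doing relevant work.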
Non-Consumptive Research 
Paradigm 
• No action or set of actions on the part of users, 
either acting alone or in cooperation with other 
users, over the duration of one or multiple sessions, 
can result in sufficient information being gathered from 
the collection of copyrighted works to reassemble 
pages from the collection. 
• The definition disallows collusion between users and 
accumulation of material over time. 
• It differentiates a human researcher from a proxy, 
which is not a user. Users are human beings.
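One way to read this definition is as a single exposure budget per page, shared by all users and all sessions. The sketch below illustrates that reading only; it is not HTRC's enforcement mechanism. The class name `ExposureLedger`, the `max_fraction` threshold, and the token-position model are all hypothetical assumptions made for illustration.

```python
from __future__ import annotations

from collections import defaultdict


class ExposureLedger:
    """Hypothetical ledger of how much of each page has been released.

    Exposure is tracked per page, not per user or per session, so
    collusion between users and accumulation over time both draw on
    the same budget.
    """

    def __init__(self, max_fraction: float = 0.1) -> None:
        # max_fraction is an assumed policy knob, not an HTRC parameter.
        self.max_fraction = max_fraction
        self._released: dict[str, set[int]] = defaultdict(set)
        self.audit: list[tuple[str, str, bool]] = []

    def request_release(self, user: str, page_id: str,
                        positions: set[int], page_length: int) -> bool:
        """Approve a release only if the page stays within the budget."""
        combined = self._released[page_id] | positions
        approved = len(combined) / page_length <= self.max_fraction
        if approved:
            self._released[page_id] = combined
        # Every request is logged, whether approved or denied.
        self.audit.append((user, page_id, approved))
        return approved


ledger = ExposureLedger(max_fraction=0.1)
# Two different users each ask for 20 tokens of the same 300-token page.
first = ledger.request_release("alice", "vol1:p7", set(range(0, 20)), 300)
second = ledger.request_release("bob", "vol1:p7", set(range(20, 40)), 300)
print(first, second)
```

Because the budget is keyed by page rather than by user, the second request is denied even though it comes from a different account: together the two releases would exceed the assumed threshold.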
HTRC 
[Figure: HTRC (all the complexity) behind a complexity-hiding interface. Labels: Request; Spatial plots; Statistical plots; Tabular info]
HTRC Version 2.0
HTRC Goals 
• Provide a persistent and sustainable structure to 
enable original and cutting-edge research. 
– Leverage data storage and computational infrastructure at Indiana 
& Illinois 
– Stimulate community development of new functionality and tools 
– Use tools to enable discoveries that would not be possible without 
the HTRC 
• Enable scholars to fully utilize content of 
HathiTrust Library while preventing intellectual 
property misuse within U.S. copyright law. 
– Provision secure computational and data environment for scholars 
to perform research using HathiTrust Digital Library.
HTRC Organization 2014-18 
• HTRC Executive Mgmt 
• Administrative Support 
• Core Development 
• Advanced Research 
• Advanced Collaborative Support 
• Scholarly Commons
HTRC Data Capsule 
HTRC Data Capsule@IU Team 
• Beth Plale (PI) 
• Jiaan Zeng 
• Guangchen Ruan 
HTRC Data Capsule@Michigan Team 
• Atul Prakash (PI) 
• Alexander Crowell 
Jiaan Zeng, Guangchen Ruan, Alexander Crowell, Atul Prakash, and 
Beth Plale. 2014. Cloud computing data capsules for non-consumptive use 
of texts. In Proceedings of the 5th ACM Workshop 
on Scientific Cloud Computing (ScienceCloud '14). ACM, New York, 
NY, USA, 9-16. DOI: 10.1145/2608029.2608031 
http://doi.acm.org/10.1145/2608029.2608031 
Special Thanks to 
• Samitha Liyanage 
• Milinda Pathirage 
• Zong Peng 
• Earlence Fernandes 
• Ajit Aluri
HTRC Data Capsule 
[Architecture diagram in three tiers: web front end, web service, backend. Components shown: User Authentication, Web UI, Web Services, Hypervisor Scripts, Database, Firewall, Audit, Image Store, Volume Store, and hosts Host-1 … Host-N, each running VM-1 … VM-k]
HTRC Data Capsule Workflow 
Data Capsule Screenshots 
Maintenance Mode 
Secure Mode
HTRC Science on a Sphere #SC14 
1. Texts published per country 
2. HathiTrust Member Institutions 
3. HT Google analytics
HTRC Advanced Collaborative Support 
• ACS will be offered on a rolling basis over the next 
four years, 2014-18 
• 1st RFP call deadline is Jan 8, 2015, 5:00pm 
Eastern 
– RFP - http://www.hathitrust.org/htrc/acs-rfp 
• For more info on the Advanced Collaborative 
Support please contact: 
htrc.acs.awards@gmail.com
HTRC@Events 
• DHCS 2014, Oct 22, 2014 
Evanston, IL 
• SC14 – IU Booth, Nov 17-19, 
2014, New Orleans, LA 
• CLIR/CNI Workshop on 
Expanded Access to 
Collections, Dec. 7, 2014, 
Washington, DC 
• HTRC UnCamp 2015 – March 
30-31, 2015 Ann Arbor, MI
Thank You 
HTRC IU Team 
• Beth Plale (PI) 
• Robert H. McDonald 
• Miao Chen 
• Guangchen Ruan 
• Zong Peng 
• Milinda Pathirage 
• Samitha Liyanage 
• Leena Unnikrishnan 
• Nicholae Cline 
HTRC UIUC Team 
• J. Stephen Downie (PI) 
• Beth Namachchivaya 
• Megan Senseney 
• Sayan Bhattacharyya 
• Colleen Fallaw 
• Loretta Auvil 
• Boris Capitanu 
• Harriet Green
More Information on HTRC 
• For details: http://www.hathitrust.org/htrc/faq 
• General contact info 
– J. Stephen Downie, Co-Director HTRC, 
jdownie@Illinois.edu 
– Beth Plale, Co-Director HTRC, plale@indiana.edu 
• Requests for capability, interest 
– Miao Chen, Asst. Director for Outreach HTRC 
miaochen@indiana.edu
The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework 
For more on HTRC: http://www.hathitrust.org/htrc 
For these slides go to:

Kuali OLE: Deep Library Collaboration and the Release of a Community-Sourced ...
Kuali OLE: Deep Library Collaboration and the Release of a Community-Sourced ...Kuali OLE: Deep Library Collaboration and the Release of a Community-Sourced ...
Kuali OLE: Deep Library Collaboration and the Release of a Community-Sourced ...
 
GOKb & KB+: An International Partnership to leverage Open Access and Communit...
GOKb & KB+: An International Partnership to leverage Open Access and Communit...GOKb & KB+: An International Partnership to leverage Open Access and Communit...
GOKb & KB+: An International Partnership to leverage Open Access and Communit...
 
Kuali OLE @ LITA Forum 2012
Kuali OLE @ LITA Forum 2012Kuali OLE @ LITA Forum 2012
Kuali OLE @ LITA Forum 2012
 
HathiTrust Research Center: The Fast Version
HathiTrust Research Center: The Fast VersionHathiTrust Research Center: The Fast Version
HathiTrust Research Center: The Fast Version
 
HTRC Architecture Overview
HTRC Architecture OverviewHTRC Architecture Overview
HTRC Architecture Overview
 
Building a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability ScienceBuilding a Data Discovery Network for Sustainability Science
Building a Data Discovery Network for Sustainability Science
 
Panel Session: VIVO and the data culture of universities-VIVO@IU
Panel Session: VIVO and the data culture of universities-VIVO@IUPanel Session: VIVO and the data culture of universities-VIVO@IU
Panel Session: VIVO and the data culture of universities-VIVO@IU
 

Recently uploaded

CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
BhavyaRajput3
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
MIRIAMSALINAS13
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
Steve Thomason
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
Celine George
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
Jheel Barad
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
EduSkills OECD
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
bennyroshan06
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Introduction to Quality Improvement Essentials
Introduction to Quality Improvement EssentialsIntroduction to Quality Improvement Essentials
Introduction to Quality Improvement Essentials
Excellence Foundation for South Sudan
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Tamralipta Mahavidyalaya
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 

Recently uploaded (20)

CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
 
How to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS ModuleHow to Split Bills in the Odoo 17 POS Module
How to Split Bills in the Odoo 17 POS Module
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
 
Introduction to Quality Improvement Essentials
Introduction to Quality Improvement EssentialsIntroduction to Quality Improvement Essentials
Introduction to Quality Improvement Essentials
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 

The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework

  • 1. The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework pti.iu.edu/sc14 @hathitrust #SC14 Beth Plale | @bplale Director, Data to Insight Center | Indiana University Robert H. McDonald | @mcdonald Deputy Director, Data to Insight Center | Indiana University
  • 2. Outline • What is the HTRC? • Non-Consumptive Research Paradigm • Current Architecture • Future Architecture • Advanced Collaborative Support (RFP) • HTRC Science on a Sphere • HTRC @ Events
  • 3. HathiTrust Digital Library • HathiTrust is a partnership of 90+ academic & research institutions, offering a collection of millions of digitized titles. • http://hathitrust.org – IU is a founding member of the HathiTrust along with the University of Michigan, the University of California, and the University of Virginia
  • 4. HathiTrust Research Center Mission • Public research arm of HathiTrust • Goal: enable researchers world-wide to accomplish tera-scale text data-mining and analysis – Develop cutting-edge software tools for processing, analyzing text – Develop cyberinfrastructure to enable HPC access to the HathiTrust Digital Library • Established: July 2011 • Collaborative center: Indiana University & University of Illinois
  • 5. HTRC Timeline • Phase I: development, 01 Jul 2011 – 31 Mar 2013 – HTRC software and services release v1.0 https://github.com/htrc • Phase II: outreach, 01 Apr 2013 – 30 Jun 2014 – 2nd HTRC UnCamp Sep ’13 • Phase III: operations, 01 Jul 2014 – present
  • 6. HTRC Current Users • 194 existing user accounts: lots of user accounts; good starting point. • Projected Use 2019: Digital Humanities (60), Education (60), Informatics (60), Observers (20) • Improve: – Increase amount of real work being accomplished, as measured by usage on HTRC’s compute resources Quarry and Big Red II at IU – Develop educational uses – Develop informatics uses – Decrease number of observers to 10% → Project 200 users at any one time, of which 90% are doing relevant education/scholarship
  • 7. Non-Consumptive Research Paradigm • No action or set of actions on the part of users, either acting alone or in cooperation with other users, over the duration of one or multiple sessions, can result in sufficient information being gathered from the collection of copyrighted works to reassemble pages from the collection. • Definition disallows collusion between users, or accumulation of material over time. Differentiates human researcher from proxy, which is not a user. Users are human beings.
  • 8. [Diagram; labels: HTRC (all the complexity), Complexity-hiding interface, Request, Spatial plots, Statistical plots, Tabular info]
  • 10. HTRC Goals • Provide a persistent and sustainable structure to enable original and cutting-edge research. – Leverage data storage and computational infrastructure at Indiana & Illinois – Stimulate community development of new functionality and tools – Use tools to enable discoveries that would not be possible without the HTRC • Enable scholars to fully utilize content of HathiTrust Library while preventing intellectual property misuse within U.S. copyright law. – Provision secure computational and data environment for scholars to perform research using HathiTrust Digital Library.
  • 11. HTRC Organization 2014-18 [Org chart; labels: HTRC Executive Mgmt, Administrative Support, Core Development, Advanced Research, Advanced Collaborative Support, Scholarly Commons]
  • 12. HTRC Data Capsule HTRC Data Capsule@IU Team • Beth Plale (PI) • Jiaan Zeng • Guangchen Ruan HTRC Data Capsule@Michigan Team • Atul Prakash (PI) • Alexander Crowell Reference: Jiaan Zeng, Guangchen Ruan, Alexander Crowell, Atul Prakash, and Beth Plale. 2014. Cloud computing data capsules for non-consumptive use of texts. In Proceedings of the 5th ACM workshop on Scientific cloud computing (ScienceCloud '14). ACM, New York, NY, USA, 9-16. DOI=10.1145/2608029.2608031 http://doi.acm.org/10.1145/2608029.2608031 Special Thanks to • Samitha Liyanage • Milinda Pathirage • Zong Peng • Earlence Fernandes • Ajit Aluri
  • 13. HTRC Data Capsule [Architecture diagram; labels: User Authentication, Web UI, Web Services, Database, Hypervisor, Scripts, Firewall, Audit, Image Store, Volume Store; hosts Host-1 … Host-N, each with VM-1 … VM-k; tiers: Web front end, Web service, Backend]
  • 14. HTRC Data Capsule Workflow
  • 15. Data Capsule Screenshots: Maintenance Mode, Secure Mode
  • 16. HTRC Science on a Sphere 1. Texts published per country 2. HathiTrust Member Institutions 3. HT Google analytics
  • 17. HTRC Advanced Collaborative Support • ACS will be offered on a rolling basis over the next four years, 2014-18 • 1st RFP Call Deadline is Jan 8, 2015, 5:00pm Eastern – RFP: http://www.hathitrust.org/htrc/acs-rfp • For more info on Advanced Collaborative Support, please contact: htrc.acs.awards@gmail.com
  • 18. HTRC@Events • DHCS 2014, Oct 22, 2014, Evanston, IL • SC14 – IU Booth, Nov 17-19, 2014, New Orleans, LA • CLIR/CNI Workshop on Expanded Access to Collections, Dec. 7, 2014, Washington, DC • HTRC UnCamp 2015 – March 30-31, 2015, Ann Arbor, MI
  • 19. Thank You HTRC IU Team • Beth Plale (PI) • Robert H. McDonald • Miao Chen • Guangchen Ruan • Zong Peng • Milinda Pathirage • Samitha Liyanage • Leena Unnikrishnan • Nicholae Cline HTRC UIUC Team • J. Stephen Downie (PI) • Beth Namachchivaya • Megan Senseney • Sayan Bhattacharyya • Colleen Fallaw • Loretta Auvil • Boris Capitanu • Harriet Green
  • 20. More Information on HTRC • For details: http://www.hathitrust.org/htrc/faq • General contact info – J. Stephen Downie, Co-Director HTRC, jdownie@Illinois.edu – Beth Plale, Co-Director HTRC, plale@indiana.edu • Requests for capability, interest – Miao Chen, Asst. Director for Outreach HTRC, miaochen@indiana.edu
  • 21. The HathiTrust Research Center: Big Data Analytics in a Secure Data Framework pti.iu.edu/sc14 For more on HTRC: http://www.hathitrust.org/htrc For these slides go to:

Editor's Notes

  1. HTRC hides the complexity of analytics. In this sense, it is like Google search, a simple interface that hides the complexity of searching billions of pages. An HTRC interaction returns spatial relationships of words (and their frequencies), statistical plots, or tabular information.
  2. Shifting the complexity-hiding interface to the right, we open up the cloud to see what’s inside. HTRC at its simplest has 1) algorithms, drawn from SEASR and from other analysis tool suites including Mahout and MapReduce; 2) the HT corpus (and subsets of the corpus that users either hold personally as part of a workset or that are publicly available); and 3) other data sets that are used. HTRC brokers the bringing together of these pieces so that computation can take place on a resource like Big Red II (or XSEDE). Note that there is an arrow from the compute engine to the complexity-hiding interface. This is because researcher interaction with the texts isn’t an automated workflow; it requires levels of interaction with the computation as it is running.
  3. Jiaan Zeng, Guangchen Ruan, Alexander Crowell, Atul Prakash, and Beth Plale. 2014. Cloud computing data capsules for non-consumptive use of texts. In Proceedings of the 5th ACM workshop on Scientific cloud computing (ScienceCloud '14). ACM, New York, NY, USA, 9-16. DOI=10.1145/2608029.2608031 http://doi.acm.org/10.1145/2608029.2608031
  4. 1.) Texts published per country: Data were from the Gender metadata work, used because it has volume authors and country-of-publication information. The total is 60K volume records, with some country fields missing. 2.) HathiTrust Member Institutions: Maps geolocations of the UnCamp 2013 participants; the text band shows UnCamp ’15 and 13M books available for non-consumptive use soon. 3.) HT Google analytics: Shows HT webpage use over time, aggregated by quarter. A drop around summer 2013 could possibly be caused by summer break.
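
The quarterly aggregation mentioned for the HT Google analytics layer can be sketched roughly as below. This is only an illustration under assumptions: the function names `quarter_key` and `aggregate_by_quarter` are hypothetical, and the page-view numbers are invented placeholders, not HathiTrust data. The notes do not say how the original aggregation was implemented.

```python
from collections import defaultdict
from datetime import date


def quarter_key(d: date) -> str:
    """Return a label such as '2013-Q3' for the calendar quarter containing d."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def aggregate_by_quarter(daily_views):
    """Sum (date, views) pairs into per-quarter totals, ordered by quarter label."""
    totals = defaultdict(int)
    for day, views in daily_views:
        totals[quarter_key(day)] += views
    return dict(sorted(totals.items()))


# Placeholder figures for illustration only (not real analytics data).
sample = [
    (date(2013, 4, 15), 120),
    (date(2013, 6, 30), 80),
    (date(2013, 7, 1), 40),
    (date(2013, 9, 10), 35),
]
quarterly = aggregate_by_quarter(sample)
print(quarterly)
```

With totals per quarter in hand, a dip such as the one noted around summer 2013 would show up as a lower value for the third-quarter label.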