The HathiTrust Research Center: Enabling New Knowledge Through Shared Infrastructure
Robert McDonald - HathiTrust Research Center Executive committee member; Associate Dean for Library Technologies, Indiana University
November 18, 2015 NISO Webinar: Text Mining: Digging Deep for Knowledge
1. The HathiTrust Research Center: Enabling New Knowledge Through Shared Infrastructure
NISO Webinar| 11.18.15
Robert H. McDonald | Associate Dean | Indiana University
Miao Chen | Assistant Director for Outreach | HTRC
Eleanor Dickson | HTRC DH Specialist | University of Illinois
Tweet us - @HathiTrust #HTRC
HATHI TRUST RESEARCH CENTER
3. Today’s Outline
• Overview of HTRC and Services
• Challenges for Text Mining in HT
• Use Cases – Distant Reading
• Scholars Commons and HTRC
• Upcoming Events
4. About the HathiTrust Digital Library
• Repository
– 13+ million volumes | 3+ billion pages
– 50% of volumes are in English
– Material from the 15th C. on | 20th C. concentration
– 70% in copyright or undetermined | 30% open
• Interface
– Search and read books in the public domain
10. Non-Consumptive Research Paradigm
• No action or set of actions on the part of users, whether acting alone or in cooperation with other users, over the duration of one or multiple sessions, can result in sufficient information being gathered from a collection of copyrighted works to reassemble pages from the collection.
• The definition disallows collusion between users and the accumulation of material over time. It differentiates a human researcher from a proxy, which is not a user. Users are human beings.
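One ingredient of this paradigm can be sketched in a few lines of Python: if a page is reduced to unordered token counts, the word order needed to reassemble the page is gone. This is only an illustrative sketch, not HTRC's implementation; the function name `page_features` and the sample sentences are hypothetical, and token counts alone do not address collusion or accumulation over time.

```python
import re
from collections import Counter


def page_features(page_text: str) -> Counter:
    """Reduce one page to unordered token counts (a bag of words)."""
    tokens = re.findall(r"[a-z]+", page_text.lower())
    return Counter(tokens)


# Two hypothetical "pages" that use the same words in a different order.
page_a = "the whale struck the ship"
page_b = "the ship struck the whale"

features_a = page_features(page_a)
features_b = page_features(page_b)

# The counts are identical, so the features alone cannot tell us
# which of the two sentences was on the page.
print(features_a == features_b)  # prints True
```

A researcher can still count, compare, and model vocabulary from such features, which is the kind of analysis the definition is meant to permit.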
13. HT Points of Access
• HathiTrust Digital Library
– HT data request
– HathiFiles
– Bib and Data APIs
• HTRC datasets
– Extracted Features dataset
• https://sharc.hathitrust.org/features
– Genre in English Language Literature (Underwood)
• http://bit.ly/1LYGqtw
• HTRC Data Capsule
• HTRC Portal and Workset Builder
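As a rough sketch of the kind of analysis that page-level feature data makes possible, the Python below sums per-page token counts into volume-level counts and computes a term's relative frequency. The record layout, volume identifiers, and function names are hypothetical and deliberately simplified; they are not the actual Extracted Features schema, which should be checked against the HTRC documentation.

```python
from collections import Counter

# Hypothetical, simplified records: one dict per volume, each holding
# per-page token counts. This is NOT the real Extracted Features schema.
volumes = [
    {"id": "vol-1", "pages": [{"whale": 3, "sea": 1}, {"whale": 1, "ship": 2}]},
    {"id": "vol-2", "pages": [{"garden": 4, "sea": 1}]},
]


def volume_counts(volume: dict) -> Counter:
    """Sum page-level token counts into one volume-level Counter."""
    total = Counter()
    for page in volume["pages"]:
        total.update(page)
    return total


def relative_frequency(volume: dict, term: str) -> float:
    """Share of a volume's tokens accounted for by `term` (0.0 if empty)."""
    counts = volume_counts(volume)
    n_tokens = sum(counts.values())
    return counts[term] / n_tokens if n_tokens else 0.0


for vol in volumes:
    print(vol["id"], round(relative_frequency(vol, "sea"), 3))
```

Comparing such frequencies across many volumes is one simple form of the distant reading mentioned in the outline; real work would add tokenization choices, metadata such as publication year, and far larger worksets.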
22. Scholarly Commons and Outreach
• Design, Development and Delivery
• Outreach Activities
• Education Activities
23. Scholarly Commons: Education
• “Digging Deeper, Reaching Further: Libraries Empowering Users to Mine the HT DL Resources”
• IMLS Laura Bush 21st Century Librarian Program
• Led by Harriett Green (Illinois)
• Collaborating institutions:
• University of Illinois (CIC-HT)
• Indiana University (CIC-HT)
• University of North Carolina (Chapel Hill) (HT)
• Lafayette College (HT)
• Northwestern University (CIC-HT)
24. HTRC Resources
• Step-by-step tutorial for HTRC Data Capsule
– https://wiki.htrc.illinois.edu/pages/viewpage.action?pageId=22085965 (on HTRC Knowledge Base)
• To use Data Capsule, sign up and log into the HTRC portal
– https://sharc.hathitrust.org
26. HTRC@Events
• SuperComputing 15
– (Nov 16-19, 2015)
• Chicago Colloquium on Digital Humanities & Computer Science
– (Nov 13-15, 2015)
• Modern Language Association
– (Jan 7-10, 2016)
• DHSI Summer Institute Workshop
– (June 5, 2016)
27. Many thanks …
HTRC IU Team
• Beth Plale (PI)
• Robert H. McDonald
• Miao Chen
• Guangchen Ruan
• Zong Peng
• Milinda Pathirage
• Samitha Liyanage
• Jiaan Zeng
• Leena Unnikrishnan
• Nicholae Cline
• Leanne Mobley
HTRC UIUC Team
• J. Stephen Downie (PI)
• Beth Namachchivaya
• Megan Senseney
• Sayan Bhattacharyya
• Loretta Auvil
• Boris Capitanu
• Harriett Green
• Eleanor Dickson
28. Photo by Marcus Ramberg - Creative Commons Attribution-NonCommercial License https://www.flickr.com/photos/40021607@N00 Created with Haiku Deck
The Scholarly Commons component of HTRC involves librarians and informatics professionals at Indiana, Illinois, and other institutions collaborating to provide baseline education and research consultations that help researchers get their text data mining research off the ground.
Focus on improving the user experience through a User Requirements Study: seven interviews have been conducted, with plans to complete 15-20 interviews through Fall 2015.
Outreach activities:
In addition to the annual HTRC UnCamp outreach event, HTRC representatives gave more than 35 presentations or workshops in 7 countries this year.
6 workshops were taught at venues both local and national.
Local workshops were presented at Illinois and Indiana.
Conference workshops: HASTAC 2015, the Modern Language Association Annual Convention 2015, and the 2015 Joint Conference on Digital Libraries.
29 talks or lectures were presented at professional conferences across a number of disciplines, including: Supercomputing 2014, the 2015 Joint Conference on Digital Libraries (papers and a keynote address), and Digital Humanities 2015.
128 people from 24 institutions attended the HT/HTRC UnCamp held March 2015 in Ann Arbor, MI.
Education activities: the development of workshops in the spring of 2015 facilitated a burst of national and international workshop activity this summer.
We hope to deliver these workshops at Committee on Institutional Cooperation (CIC) libraries. The idea is to begin with the CIC institutions, gain feedback, refine the workshops, and then expand to the broader HT member library community.
IMLS Laura Bush 21st Century Librarian grant:
Led by HTRC associate Harriett Green, together with librarians at Indiana University, Northwestern University, the University of North Carolina at Chapel Hill, and Lafayette College, the HTRC has just been awarded an Institute of Museum and Library Services Laura Bush 21st Century Librarian grant entitled “Digging Deeper, Reaching Further: Libraries Empowering Users to Mine the HathiTrust Digital Library Resources.” The project, with an award amount of $398,845, runs from October 1, 2015 to September 30, 2018. It is intended to develop a shared curriculum for use in academic libraries, including a train-the-trainer series designed to help librarians get started with the tools, services, and related research methodologies of the HTRC.