Copyright & Fair Use for Digital Projects
Text Data Mining & Publishing
UC Berkeley Library
Rachael Samberg, J.D., MLIS
Stacy Reardon, MA, MLS
What you can do, not what you can’t
Scholars are turning content into data
But scholars (and academic staff, in supporting them) face questions about rights
The Basics of TDM
“Text mining is the use of automated tools, techniques or
technology to process large volumes of digital content
that is often not well structured - to identify and
select relevant information; to extract information from
the content, to identify relationships within / between /
across documents and incidents or events for
meta-analysis.”
- from Text & Data Mining - A Librarian Overview by Ann
Okerson (2013)
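To make the definition above concrete, here is a minimal sketch of one of the simplest text mining steps: turning unstructured text into counts. The two-document corpus and the function names `tokenize` and `term_frequencies` are invented for illustration and do not come from the slides; real TDM projects typically use far larger corpora and more careful tokenization.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_frequencies(documents):
    """Count how often each token appears across a list of documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


# Hypothetical corpus, made up for this example.
corpus = [
    "Poetry therapy supports research.",
    "Research on poetry is growing.",
]

freqs = term_frequencies(corpus)
print(freqs["poetry"], freqs["research"], freqs["therapy"])
```

The output of this step is data about the content (counts), not the content itself.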
TDM Literacies
Contracts
Privacy
Copyright
Ethics & Policy
Other Statutes/Use Cases
Copyright
Exclusive rights to original
expression for limited
periods of time
Exclusive Rights
▪Reproduction
▪Derivative works
▪Distribution
▪Public performance
▪Public display
Public Domain
War and Peace, Tolstoy, English translation 1899
CDC report
Facts & Ideas
Nicholas Mazza,
Poetry therapy: Toward a
research agenda for the 1990s,
The Arts in Psychotherapy,
Volume 20, Issue 1, 1993, 51-59.
Content
Data about the content
TDM researchers can use copyrighted content!
Fair Use
17 U.S.C. § 107
“The fair use of a
copyrighted work…for
purposes such as
criticism, comment,
news reporting,
teaching…, scholarship,
or research, is not an
infringement of
copyright.”
Four-Factor Balancing Test
1. Purpose & character of use
“Transformativeness” often
dominates
2. Nature of copyrighted work
Whether factual/scholarly work
3. Amount and substantiality
Size & importance of portion
4. Effect on potential market
Whether it supplants market
Authors Guild v. HathiTrust
755 F.3d 87 (2d Cir. 2014)
Textual analysis that digital
library enabled was
transformative under factor
one, and overall fair
Authors Guild v. Google
804 F.3d 202 (2d Cir. 2015)
Creation of full-text
searchable database with
“snippet view” and “ngram
viewer” [search strings]
were fair uses
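As a rough illustration of what an "ngram viewer" style analysis computes, the sketch below counts word n-grams per publication year. It is not Google's implementation; the `records` list and the function names are hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts_by_year(records, n=2):
    """Map each year to a Counter of the n-grams found in that year's texts."""
    by_year = defaultdict(Counter)
    for year, text in records:
        by_year[year].update(ngrams(text.lower().split(), n))
    return by_year


# Hypothetical (year, text) records, made up for this example.
records = [
    (1899, "war and peace"),
    (1993, "poetry therapy and research"),
    (1993, "poetry therapy in practice"),
]

counts = ngram_counts_by_year(records, n=2)
print(counts[1993][("poetry", "therapy")])
```

The result exposes aggregate frequencies over time rather than readable passages of the underlying works.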
A.V. v. iParadigms, 562 F.3d 630
(4th Cir. 2009)
Plagiarism detection
software that replicated
content to detect
similarities was fair use
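One common way to detect textual similarity is to compare overlapping word sequences ("shingles") between two documents. The sketch below uses Jaccard similarity over three-word shingles; this is an assumed, generic technique chosen for illustration, not a description of how the software in the case actually works. The sample sentences and function names are hypothetical.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences that occur in the text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical texts, made up for this example.
submitted = "the quick brown fox jumps over the lazy dog"
reference = "the quick brown fox leaps over the lazy dog"

score = jaccard(shingles(submitted), shingles(reference))
print(round(score, 2))
```

A comparison like this needs a stored copy of the reference text, but it reports only a similarity score.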
From research
to publishing
Fox News v. TVEyes,
883 F.3d 169 (2d Cir. 2018)
Basic functionality and
archiving features were
fair use, but making
available 10-minute
clips was not
Takeaways
● Likely fair to digitize to
conduct text data mining
(w/security precautions)
● May not be fair to republish
large portions of content
● May not be fair to circulate
the digitized texts/corpus
● Case-by-case
Contracts
Database Agreements
Challenges:
- Terms
- Visibility
Archives
Agreement
“I understand that
permission to publish, or
otherwise publicly use,
materials . . . must be
[granted by library]
I understand further that
the University makes no
representation that it is
the owner of the
copyright... and that
permission to publish must
also be obtained from the
owner of the copyright.”
Website Terms
“If you intend to
quote extensive
amounts of text, use
other original
content, or
reproduce images
from this site,
please contact us
for permission.”
California Digital Library’s Model Database Language
Authorized Users may use the Licensed Materials to
perform and engage in text and/or data mining
activities for academic research, scholarship, and
other educational purposes... and may utilize and
share the results of text and/or data mining in their
scholarly work and make the results available for use
by others, so long as the purpose is not to create a
product for use by third parties that would substitute
for the Licensed Materials.
CDL Model License: Preserving Fair Use
Notwithstanding the foregoing, nothing in this
agreement shall otherwise restrict uses of the
material that would be fair use pursuant to 17 U.S.C. §
107 et seq.
Takeaways
● Agreements may constrict uses that
would otherwise be fair
● Familiarize yourself with the
agreement(s), ask for help,
evaluate risk
● Alternatives:
○ Check to see if site has an API
○ Negotiate with content providers
/ ask permission
Other Statutes/Use Cases
Other Issues
- Computer Fraud & Abuse Act
- Digital Rights Management (DRM) & Digital Millennium Copyright Act
Privacy
Rights of Privacy
● © protects copyright holders'
property rights
● Privacy protects people who are
subjects of works
● Fed’l (FERPA, HIPAA) vs. State
● State limits
○ Expire at death
○ Newsworthiness and permission
are defenses
Ethics & Policy
- Indigenous knowledge
- Cultural heritage materials
- Endangered species protection
Exercise
http://ucblib.link/rw
UC Berkeley Library
Rachael Samberg, J.D., MLIS
Stacy Reardon, MA, MLS
Text Data Mining & Publishing
Text Data Mining Guide (Library)
guides.lib.berkeley.edu/text-mining
TDM Access Help
tdm-access@berkeley.edu
