Data and the Law

  1. 1. Data and the law Dorothea Salo
  2. 2. Copyright and the digital humanities
  3. 3. Why do you care? •Do you want to USE something that may be under legal restrictions? •Publishing a photo in a book or ebook •Text-mining (“non-consumptive use”) •Extensive quotation •Classroom use •Do you want to MAKE something? •... and let other people legally use it? •... and let other people legally use it, but only under certain conditions? •... without them bugging you by email all the time? •Then you NEED the basics of copyright.
  4. 4. DH and copyright •DHers study and use a lot of copyrighted objects, in ways that sometimes create risk of infringement (or perceived infringement). •Copyright creates barriers to accessing materials that DHers would like to study. • Librarians can sometimes help break down these barriers. •DHers therefore need a base-level understanding of copyright, and a willingness to research beyond the base level. • And sometimes a willingness to take risks!
  5. 5. Sound off! •News or projects that have copyright implications for DHers?
  6. 6. Stuff you can use with relative ease •Public domain stuff •... if you can figure out what that is, cf. “Happy Birthday” lawsuit •Federal-government stuff •Openly-licensed stuff •Creative Commons is your friend! •Other licensed stuff •... but you better follow the terms of the license! •Other stuff, to an extent: “fair use”
  7. 7. How does copyright work?
  8. 8. What is it (in the US)? •A limited monopoly granted by federal law •over “original works of authorship” that are “fixed in a tangible medium of expression”* •‘To promote the progress of science and the useful arts’ •Is it still doing that? You decide. But I think not, on the whole. •Not unlimited! Not forever! By design! *yes, the Internet counts as “tangible” for copyright purposes
  9. 9. A copyrightable item must minimally... •Be original • Feist v. Rural Telephone Service: “sweat of the brow” does not suffice to make something copyrightable • This is one reason you’ll hear “data can’t be copyrighted.” •Be fixed in some “tangible” form • yes, the Internet counts!!!! •That’s it. But it wasn’t always. • Registration used to be required, not optional. • If you didn’t renew? You snooze, you lose. • Didn’t put an explicit copyright notice on it? Oops.
  10. 10. Copyright does not cover... •Ideas (only their fixed expression) •Databases and other fact collections! (notably different overseas) •Methods, processes, systems (patent!) •Recipes are uncopyrightable. Bet you didn’t know that. •Messy exception: software. •Words. (trademark!) Titles. Recipes. •Invented languages? Nobody’s sure. •Natural languages? Nope. •Works by the federal government •Works already in the public domain •no takebacks! ... except Golan v. Holder.
  11. 11. Copyright DOES cover... •Unpublished material • more strictly than published! and with different time rules! •Images and photographs • Are copyrighted! Just like text! • It’s not “fair use” just because you found it on the Internet. • It’s not “fair use” because you give credit; US copyright law says nothing about credit! •Sound (and fury)... • Same idea. • Except pre-1972 sound recordings do not fall under federal copyright at present! Patchwork of state law; talk of “harmonization.” • (Please don’t ask me about sampling. ARGH.)
  12. 12. Copyright lasts... •For something created 1978 or later: • Life of author plus 70 years • For corporate-created works (often “works for hire”), 120 years after creation or 95 years after publication, whichever comes first. • Copyright Act of 1976 •For something created between 1923 and 1977: • ... that’s a really good question, because of all the former copyright formalities that don’t exist now. •Pre-1923: probably public domain •Once copyright expires, the item is in the “public domain.”
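The term arithmetic for works created in 1978 or later can be sketched in a few lines of Python. This is a rough illustration only: the function name and parameters are hypothetical, the “whichever comes first” rule for works for hire is taken from the statute, and the sketch ignores joint authorship, anonymous works, and everything created before 1978 (where the old formalities make the answer depend on facts a formula cannot capture).

```python
from typing import Optional


def last_year_of_protection(
    created: int,
    author_death: Optional[int] = None,
    published: Optional[int] = None,
    work_for_hire: bool = False,
) -> int:
    """Return the last calendar year of US copyright for a post-1977 work.

    Hypothetical helper for illustration; it covers only the two simple
    cases named on the slide.
    """
    if created < 1978:
        # Pre-1978 works depend on registration, renewal, and notice.
        raise ValueError("pre-1978 works need case-by-case research")

    if work_for_hire:
        # 120 years after creation or 95 after publication,
        # whichever comes first.
        candidates = [created + 120]
        if published is not None:
            candidates.append(published + 95)
        return min(candidates)

    if author_death is None:
        # Life plus 70 cannot be computed while the author is alive
        # or the death year is unknown.
        raise ValueError("term depends on the author's death year")

    # Life of the author plus 70 years.
    return author_death + 70


# A novel written in 1980 by an author who died in 2000:
print(last_year_of_protection(1980, author_death=2000))
# A corporate report created in 1980 and published in 1985:
print(last_year_of_protection(1980, published=1985, work_for_hire=True))
```

The two `ValueError` branches are the honest part of the sketch: they mark exactly where the arithmetic stops and the research starts.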
  13. 13. What’s copyrightable?
  14. 14. Copyfraud •Claiming a copyright that either doesn’t exist, or is someone else’s. • Bridgeman v. Corel: “slavish copying” of a physical item, as in a photographic or digitized reproduction, fails copyright’s originality test. •Not illegal, sadly. •ENDEMIC. Don’t believe every copyright notice you read! • GLAM are not immune to copyfraud. • Remember: we have the rights we USE and DEFEND. You may have to intervene with your publishers! •Recommended: Jason Mazzone
  15. 15. Doctrine of first sale •Owning copyright in a work does not confer control over legally-made physical copies of that work. •Buyers who buy legally can share, lend, and resell their copies freely. •They can’t make copies of their legally-obtained copies without incurring copyright-litigation risk, however. •Kirtsaeng v. Wiley: copies purchased legally overseas ARE subject to first-sale, CAN be imported and resold legally. •There is no right of first sale in digital materials. Only physical ones.
  16. 16. Important note •Everything I’ve told you is for US works. •Copyright works differently elsewhere! (Yes, despite Berne.) •“Moral rights” of authors •Copyright term length •What is copyrightable •This is a wretched headache. •If you have an international-copyright question, SEE A LAWYER. Really. I mean it!
  17. 17. What good is copyright?
  18. 18. What can you do* with your copyrighted work? Copy Perform All rights sold separately! Republish Translate Adapt “derivative work” Broadcast Arrange Use as part of a new work Allow or restrict access Write a sequel * and prevent others from doing without permission
  19. 19. What can you do with your copyright? •Sell it, in whole or in part. •Sign it away without payment. •For the most part, this is what faculty do with their journal articles. •License it (i.e. give others permission to use some or all of your rights) •for broad or narrow purposes •temporarily or permanently •“exclusive”ly or non- •free or for pay. •It’s just like any other license. You negotiate it! (With a lawyer around.)
  20. 20. A “copyright transfer agreement” is what it sounds like! Once you transfer your exclusive copyright over a work to someone else, YOU NO LONGER OWN THE WORK. You have no say whatever in what is done with or to it, AND YOU CANNOT USE IT AS THOUGH YOU OWNED ITS COPYRIGHT. Publishers ask you to sign these. KNOW WHAT YOU ARE SIGNING.
  21. 21. Libraries and licensing •All those nifty ebooks and ejournals the library gets you access to? •The library pays ridiculous boatloads of money to LICENSE (not own!) them. •No first-sale! These are digital! •And their use is subject to whatever terms the publisher/aggregator and library signed. •You can’t treat ‘em like print, sorry!
  22. 22. Working with licensed materials •E.g. text mining, visualizations, etc. • here’s that “non-consumptive use” “distant reading” thing again... •Please don’t Just Do It! • Licensors monitor use. If you download a whole bunch at once, they’ll notice, and they’ll yank access for all of campus. • The worst-case consequences for you could be severe. We learned this, sadly, from what happened to Aaron Swartz. •Talk to your librarians. • A given aggregator may have a research program you can join. • Or the library may be able to work out a deal. • Or the licensor, when contacted through the library’s channels, may say “Oh. Huh. Sure, why not?”
  23. 23. “Orphan work” •Copyright can leave its original owner (via sale or other transfer), in part or in whole. • Authors die. So do publishers. Wills? Don’t make me laugh. •Copyright registration has been optional for many years. • It’s not optional if you actually want to sue! But you can still register after an infringement has taken place. •Result: large body of copyrighted works whose owners are unknown or unclear. • Especially from the mid-to-late 20th century. •What about digitization? DH work? Preservation?
  24. 24. Digital copyright
  25. 25. Copyright and the digital realm •Suddenly it’s a lot easier to make perfect copies! •Some of the workings of the Internet require copies! •Your web browser makes a copy of every page you see •Exception: “streaming media” •Current media business model is founded upon the difficulty of making perfect copies. •Solution (?): DRM!
  26. 26. Digital rights management •Technological jiggery-pokery that locks a digital file into certain uses • By device • By time or number-of-use limits • By software • By user or geography • Examples: various ebook schemes, DVD “zoning” •Eschenfelder: “technological protection measures” • DRM (“hard” TPM) plus heightened annoyance factors (“soft” TPM)
  27. 27. DRM and the library •DRMed files present a substantial digital preservation risk •E-journals and databases could use DRM on their materials... •... but mostly haven’t, preferring proxy servers and “annoyance factor” tricks (obfuscation, omission, polyglot, frustration) •And preservation practices for these are fairly well-established. •Ebooks, however, are another story.
  28. 28. DRM and the law: DMCA •Digital Millennium Copyright Act (1998) •Illegal to circumvent DRM •For us too! No exceptions for GLAM. Or fair use. Or research. •No, not even for preservation. •ISPs must take down allegedly copyright-infringing content when notified •Notable chilling effects •Sklyarov case (2001), Felten case (cryptography), Sony rootkit case •YouTube and other web properties are still struggling with how to manage DMCA at scale. This has bitten some DHers!
  29. 29. CFAA •Computer Fraud and Abuse Act •Meant to go after black-hat hackers •Loose enough wording for prosecutors to attack any terms-of-service violation •Used against Aaron Swartz, others •“Aaron’s Law” just introduced in Congress
  30. 30. Advocacy •Librarians are political animals, especially around intellectual-property and privacy law. •We have to be! •Faculty: please make common cause with us. We need more voices! •And humanists tend to be... less helpful than we’d like. •In the hopper: US copyright “reform,” international treaties, ebook access for the blind, open access to federally-funded research •Twitterers/Tumblarians: watch @ARLPolicy
  31. 31. Exceptions and workarounds to copyright
  32. 32. Copyright permits... •Copying for certain socially-approved uses •Library preservation and patron service (“section 108”) •Classroom use (“the TEACH Act”) •Limited copying for other reasons: “fair use” (“section 107”) •Scholarship •Parody/satire •Etc.
  33. 33. Fair use •Possibly the least-understood concept in copyright! •An “affirmative defense” in a copyright lawsuit. •Though Kevin Smith notably disagrees with this analysis... •Principles and guidelines, not hard-and-fast rules.
  34. 34. How to know for sure whether a use is fair, in four simple steps 1.Copy a copyrighted work. 2.Get yourself sued by the work’s legitimate copyright owner. 3.Assert fair use as your defense. 4.Win the case. AFAIK, this is the only way.
  35. 35. I’m thinking you think this is a loony way to proceed. Good. I agree with you. But that means that what we’re doing is risk management.
  36. 36. Risk is never zero. I wish it could be too. I’m sorry. (but if it makes you feel any better, many copyright risks are overblown)
  37. 37. Four-factor fair use test •Character of the use •“Transformative use” finding favor with judges lately. •Nature of the work •Amount of the work copied •often considered as a percentage of the whole •also, “heart of the work” matters •Effect on the market for that work, if everybody did what you’re doing •part of this is asking whether there IS a market for the work in the first place!
  38. 38. Community fair-use principles •Started with documentarists •who couldn’t get insurance for their work because of perceived copyright-infringement risk... which, given litigious idiots who sue over background noise... was a rational stance. •So they published a “how documentarists use fair use” document. •Courts took notice. So more such documents have been created. •Academic libraries (ARL), journalism (Center for Social Media) •There isn’t one for DH. There should be. Talk to your professional organizations!
  39. 39. Creative Commons •What if you WANT people to reuse your stuff? • You could grant it to the public domain... • ... but then anybody can do anything with it. •Creative Commons is a middle ground. • Boilerplate language and machine-readable techniques for licensing copyrighted works to all comers! • Under certain conditions... •N.b.: CC is predicated on owning a copyright. If you don’t, you can’t use a CC license! • If there’s a copyright, but it’s not yours. (Jointly-held with others is okay.) • If it’s not copyrightable to begin with
  40. 40. CC license provisions •BY: Must attribute to creator. • On all CC licenses except CC0 (public domain dedication) •ND: No derivative works. •NC: Non-commercial use only. • Looks better than it is. Avoid! •SA: Share-alike • Release new work under the same or more liberal license. •These can be combined! •CC0: total rights waiver. • Special resonance for data!
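“These can be combined!” has limits, and a small Python sketch makes them concrete. The function name and its flags are hypothetical; the one rule not stated on the slide, that ND and SA cannot appear together (share-alike only governs derivatives, which ND forbids), comes from the Creative Commons license suite itself.

```python
def cc_license_name(
    nc: bool = False,
    nd: bool = False,
    sa: bool = False,
    waive_all: bool = False,
) -> str:
    """Build a Creative Commons license label from the chosen provisions.

    Hypothetical helper for illustration only.
    """
    if waive_all:
        # CC0 is a total rights waiver; no provisions can ride along.
        if nc or nd or sa:
            raise ValueError("CC0 cannot be combined with NC, ND, or SA")
        return "CC0"

    if nd and sa:
        # Share-alike applies to derivative works; ND forbids them.
        raise ValueError("ND and SA cannot be combined")

    # BY is on every CC license except CC0.
    parts = ["BY"]
    if nc:
        parts.append("NC")
    if nd:
        parts.append("ND")
    if sa:
        parts.append("SA")
    return "CC " + "-".join(parts)


# Attribution plus share-alike:
print(cc_license_name(sa=True))
# Attribution, non-commercial, no derivatives:
print(cc_license_name(nc=True, nd=True))
```

Enumerating every valid flag combination yields the six standard licenses plus CC0, which is the whole menu.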
  41. 41. CC and the humanities •So, that thing with the UK history editors... •University-press editors are often not friends of openness either. •I really hope 2013 is the year we start calling these people on their, um, errors and misrepresentations. •DH is in a good position to do that. •It’s more open and public than much of the humanities. •And slightly (only slightly!) less dependent on traditional book publishing.
  42. 42. Okay, so? •The point of keeping data is to reuse it! •Okay, there are other points, such as reproducibility and fraud detection. Still. The central reason we’re talking about data so intently is reuse value. •Data with legal strings attached are harder to reuse. So fewer people reuse them. •Kinda defeats the purpose, no? •This is why, as a digital humanist, YOU NEED TO CARE about open access and the Creative Commons. •And advocate for them! Again, humanists have lagged here.
  43. 43. Openness and other policies
  44. 44. Open movements •There are a lot of them. Don’t mix them up. •I know, I know, everybody else does. Well, everybody else is stupid! Don’t be stupid! •Open source SOFTWARE •Open access JOURNAL ARTICLES •(and occasionally books, but mostly journal articles) •Open (government) DATA •Open (notebook) SCIENCE •which is larger than open data! It opens the process of doing the science as well.
  45. 45. Open access funder mandate: NIH •Congress: “Hi, NIH. We think taxpayers should be able to read the research they fund!” • NIH: “Cool. We’ll build a repository for it, then.” •NIH, mid-2000s: “Hi, researchers. Please put your final manuscripts in PubMed Central.” • You can guess how well THAT worked. ~3% deposit rate. •Congress: “Okay, NIH, voluntary didn’t work; how about mandatory?” • Current deposit rate: about 67%. • But the NIH has only started cracking down on slackers. (Grant cycles are long.)
  46. 46. Keep in mind: universities are also funders! •DH centers, IT support, and libraries don’t exactly come free! •But it’s not easy (maybe not possible) for a university to impose an open-access mandate the way a funder can. •Tradition of “faculty governance” forbids. •Are there university OA mandates? Yes! •But they’re by faculty (usually faculty senates, sometimes individual schools/departments) for faculty. Always. Anything else, and faculty howl. •Humanists are the loudest howlers. Make of that what you will.
  47. 47. NSF data-management plans •As of January 2011, all NSF grant proposals must include a two-page data-management plan. • Got no data? Using someone else’s? Say so! • Data sharing required? Not necessarily. Just data management. • Best practices? Standards? Depends on the discipline/directorate, but for the most part, not yet. • Digital data only? Absolutely not! If you’re taking physical samples, you need to talk about them too. •Why am I talking about this here and now? • Because the NEH’s Office of Digital Programs has a similar policy! • Because the OSTP Memo bids fair to extend this to many more agencies!
  48. 48. Now: OSTP Memo •Office of Science and Technology Policy (part of the executive branch) •Big federal funders have until the end of July to explain how they’ll achieve open access AND open data for research they fund. •The NEH is not subject to the memo (budget too small), but they have announced they plan to comply anyway. •Pass the popcorn. This should be good.
  49. 49. How can your library help? •Getting the word out •Offering consultation services • often in collaboration with other campus units, e.g. IT • usually includes an informational website •Offering institutional repositories as data home • This is... problematic, but it’s something. •Training •In a very few cases: planning for and working toward greater involvement • e.g. Purdue, Penn State, California Digital Library, University of Prince Edward Island
  50. 50. Local data policies
  51. 51. Who has policies? Photo: “Who Am I?” Ahmad Hammoud CC-BY •Non-profit grant funders, now and then •The federal government, more and more often •State governments, in limited situations •Your institution, sometimes •Journals, sometimes •(not usually in the humanities)
  52. 52. What might a data policy cover? •Who “owns” data •How long you need to keep data •When and with whom you need to share data (or are forbidden from doing so) •What data you need to keep secure, and (sometimes) standards for doing so •What happens to “your” data when you graduate or change jobs or institutions •PAY CLOSE ATTENTION TO THIS, graduate students! This can bite you! Photo: “Martha” Ford Buchanan CC-BY
  53. 53. Institutional policies •Not all institutions have them. •Not all institutions enforce them. •But if you get in trouble, the policy will be used to throw the book at you. •FIND OUT. Wherever you go, whatever you do, FIND OUT. Photo: “Rowlandsway House, Wythenshawe” Gene Hunt CC-BY
  54. 54. Open data ethics challenges
  55. 55. FERPA •If you want student records for your research, plan on getting student or parental consent (depending on student’s age). • Caveat: if you’re doing research FOR THE SCHOOL ITSELF, you’re probably off the hook, but you can’t use the data for anything else. •FERPA does not cover statistical data compilations in which students are not individually identifiable. •Graded assignments are covered (because the grade is protected). An assignment printout with no grade? Not covered!
  56. 56. IRB data questions •Institutional Review Board: ethics watchdog for research •Science has a pretty exploitative history. IRBs are designed to prevent harm to study subjects. •Still working to catch up, mentally, to the realities of e.g. Web research, open data •Consider referring ethics questions about data sharing to the IRB. They’re the last word. •Though realize you may have to educate them! IRBs are known to be... overzealous, many places.
  57. 57. “Extra risk” •Key variable for IRBs is “risk to participants.” •What are the additional risks of data retention and sharing? • Is Big Brother coming to get your study subjects? • Added deanonymization/reidentification risk? Cracking risk? • “If it’s on the open Internet, it’s fair game.” Well... •IRBs not entirely up on this just now. THEY WILL LEARN. •And more humanists, especially digital humanists, are doing work that falls under this kind of oversight.
  58. 58. Thanks! •Copyright 2011 by Dorothea Salo. •This lecture and slide deck are licensed under a Creative Commons Attribution 3.0 United States License.