Data and the Law


Published in: Education, Technology

  1. 1. Data and the law • Dorothea Salo
  2. 2. Copyright and the digital humanities
  3. 3. Why do you care? • Do you want to USE something that may be under legal restrictions? • Publishing a photo in a book or ebook • Text-mining (“non-consumptive use”) • Extensive quotation • Classroom use • Do you want to MAKE something? • ... and let other people legally use it? • ... and let other people legally use it, but only under certain conditions? • ... without them bugging you by email all the time? • Then you NEED the basics of copyright.
  4. 4. DH and copyright • DHers study and use a lot of copyrighted objects, in ways that sometimes create risk of infringement (or perceived infringement). • Copyright creates barriers to accessing materials that DHers would like to study. • Librarians can sometimes help break down these barriers. • DHers therefore need a base-level understanding of copyright, and a willingness to research beyond the base level. • And sometimes a willingness to take risks!
  5. 5. Sound off! • News or projects that have copyright implications for DHers?
  6. 6. Stuff you can use with relative ease • Public domain stuff • ... if you can figure out what that is, cf. “Happy Birthday” lawsuit • Federal-government stuff • Openly-licensed stuff • Creative Commons is your friend! • Other licensed stuff • ... but you better follow the terms of the license! • Other stuff, to an extent: “fair use”
  7. 7. How does copyright work?
  8. 8. What is it (in the US)? • A limited monopoly granted by federal law • over “original works of authorship” that are “fixed in a tangible medium of expression”* • ‘To promote the progress of science and the useful arts’ • Is it still doing that? You decide. But I think not, on the whole. • Not unlimited! Not forever! By design! • * yes, the Internet counts as “tangible” for copyright purposes
  9. 9. A copyrightable item must minimally... • Be original • Feist v. Rural Telephone Service: “sweat of the brow” does not suffice to make something copyrightable • This is one reason you’ll hear “data can’t be copyrighted.” • Be fixed in some “tangible” form • yes, the Internet counts!!!! • That’s it. But it wasn’t always. • Registration used to be required, not optional. • If you didn’t renew? You snooze, you lose. • Didn’t put an explicit copyright notice on it? Oops.
  10. 10. Copyright does not cover... • Ideas (only their fixed expression) • Databases and other fact collections! (notably different overseas) • Methods, processes, systems (patent!) • Recipes are uncopyrightable. Bet you didn’t know that. • Messy exception: software. • Words. (trademark!) Titles. Recipes. • Invented languages? Nobody’s sure. • Natural languages? Nope. • Works by the federal government • Works already in the public domain • no takebacks! ... except Golan v. Holder.
  11. 11. Copyright DOES cover... • Unpublished material • more straitly than published! and with different time rules! • Images and photographs • Are copyrighted! Just like text! • It’s not “fair use” just because you found it on the Internet. • It’s not “fair use” because you give credit; US copyright law says nothing about credit! • Sound (and fury)... • Same idea. • Except pre-1972 sound recordings do not fall under federal copyright at present! Patchwork of state law; talk of “harmonization.” • (Please don’t ask me about sampling. ARGH.)
  12. 12. Copyright lasts... • For something created 1978 or later: • Life of author plus 70 years • For corporate-created works (often “works for hire”), 120 years after creation or 95 years after publication, whichever comes first. • Copyright Act of 1976 • For something created between 1923 and 1977: • ... that’s a really good question, because of all the former copyright formalities that don’t exist now. • Pre-1923: probably public domain • Once copyright expires, the item is in the “public domain.”
  13. 13. What’s copyrightable?
  14. 14. Copyfraud • Claiming a copyright that either doesn’t exist, or is someone else’s. • Bridgeman v. Corel: “slavish copying” of a physical item, as in a photographic or digitized reproduction, fails copyright’s originality test. • Not illegal, sadly. • ENDEMIC. Don’t believe every copyright notice you read! • GLAM are not immune to copyfraud. • Remember: we have the rights we USE and DEFEND. You may have to intervene with your publishers! • Recommended: Jason Mazzone
  15. 15. Doctrine of first sale • Owning copyright in a work does not confer control over legally-made physical copies of that work. • Buyers who buy legally can share, lend, and resell their copies freely. • They can’t make copies of their legally-obtained copies without incurring copyright-litigation risk, however. • Kirtsaeng v. Wiley: copies purchased legally overseas ARE subject to first-sale, CAN be imported and resold legally. • There is no right of first sale in digital materials. Only physical ones.
  16. 16. Important note • Everything I’ve told you is for US works. • Copyright works differently elsewhere! (Yes, despite Berne.) • “Moral rights” of authors • Copyright term length • What is copyrightable • This is a wretched headache. • If you have an international-copyright question, SEE A LAWYER. Really. I mean it!
  17. 17. What good is copyright?
  18. 18. What can you do* with your copyrighted work? • Copy • Perform • Republish • Translate • Adapt (“derivative work”) • Broadcast • Arrange • Use as part of a new work • Allow or restrict access • Write a sequel • All rights sold separately! • * and prevent others from doing without permission
  19. 19. What can you do with your copyright? • Sell it, in whole or in part. • Sign it away without payment. • For the most part, this is what faculty do with their journal articles. • License it (i.e. give others permission to use some or all of your rights) • for broad or narrow purposes • temporarily or permanently • “exclusive”ly or non- • free or for pay. • It’s just like any other license. You negotiate it! (With a lawyer around.)
  20. 20. A “copyright transfer agreement” is what it sounds like! Once you transfer your exclusive copyright over a work to someone else, YOU NO LONGER OWN THE WORK. You have no say whatever in what is done with or to it, AND YOU CANNOT USE IT AS THOUGH YOU OWNED ITS COPYRIGHT. Publishers ask you to sign these. KNOW WHAT YOU ARE SIGNING.
  21. 21. Libraries and licensing • All those nifty ebooks and ejournals the library gets you access to? • The library pays ridiculous boatloads of money to LICENSE (not own!) them. • No first-sale! These are digital! • And their use is subject to whatever terms the publisher/aggregator and library signed. • You can’t treat ‘em like print, sorry!
  22. 22. Working with licensed materials • E.g. text mining, visualizations, etc. • here’s that “non-consumptive use” “distant reading” thing again... • Please don’t Just Do It! • Licensors monitor use. If you download a whole bunch at once, they’ll notice, and they’ll yank access for all of campus. • The worst-case consequences for you could be severe. We learned this, sadly, from what happened to Aaron Swartz. • Talk to your librarians. • A given aggregator may have a research program you can join. • Or the library may be able to work out a deal. • Or the licensor, when contacted through the library’s channels, may say “Oh. Huh. Sure, why not?”
  23. 23. “Orphan work” • Copyright can leave its original owner (via sale or other transfer), in part or in whole. • Authors die. So do publishers. Wills? Don’t make me laugh. • Copyright registration has been optional for many years. • It’s not optional if you actually want to sue! But you can still register after an infringement has taken place. • Result: large body of copyrighted works whose owners are unknown or unclear. • Especially from the mid-to-late 20th century. • What about digitization? DH work? Preservation?
  24. 24. Digital copyright
  25. 25. Copyright and the digital realm • Suddenly it’s a lot easier to make perfect copies! • Some of the workings of the Internet require copies! • Your web browser makes a copy of every page you see • Exception: “streaming media” • Current media business model is founded upon the difficulty of making perfect copies. • Solution (?): DRM!
  26. 26. Digital rights management • Technological jiggery-pokery that locks a digital file into certain uses • By device • By time or number-of-use limits • By software • By user or geography • Examples: various ebook schemes, DVD “zoning” • Eschenfelder: “technological protection measures” • DRM (“hard” TPM) plus heightened annoyance factors (“soft” TPM)
  27. 27. DRM and the library • DRMed files present a substantial digital preservation risk • E-journals and databases could use DRM on their materials... • ... but mostly haven’t, preferring proxy servers and “annoyance factor” tricks (obfuscation, omission, polyglot, frustration) • And preservation practices for these are fairly well-established. • Ebooks, however, are another story.
  28. 28. DRM and the law: DMCA • Digital Millennium Copyright Act (1998) • Illegal to circumvent DRM • For us too! No exceptions for GLAM. Or fair use. Or research. • No, not even for preservation. • ISPs must take down allegedly copyright-infringing content when notified • Notable chilling effects • Sklyarov case (2001), Felten case (cryptography), Sony rootkit case • YouTube and other web properties are still struggling with how to manage DMCA at scale. This has bitten some DHers!
  29. 29. CFAA • Computer Fraud and Abuse Act • Meant to go after black-hat hackers • Loose enough wording for prosecutors to attack any terms-of-service violation • Used against Aaron Swartz, others • “Aaron’s Law” just introduced in Congress
  30. 30. Advocacy • Librarians are political animals, especially around intellectual-property and privacy law. • We have to be! • Faculty: please make common cause with us. We need more voices! • And humanists tend to be... less helpful than we’d like. • In the hopper: US copyright “reform,” international treaties, ebook access for the blind, open access to federally-funded research • Twitterers/Tumblarians: watch @ARLPolicy
  31. 31. Exceptions and workarounds to copyright
  32. 32. Copyright permits... • Copying for certain socially-approved uses • Library preservation and patron service (“section 108”) • Classroom use (“the TEACH Act”) • Limited copying for other reasons: “fair use” (“section 107”) • Scholarship • Parody/satire • Etc.
  33. 33. Fair use • Possibly the least-understood concept in copyright! • An “affirmative defense” in a copyright lawsuit. • Though Kevin Smith notably disagrees with this analysis... • Principles and guidelines, not hard-and-fast rules.
  34. 34. How to know for sure whether a use is fair, in four simple steps: 1. Copy a copyrighted work. 2. Get yourself sued by the work’s legitimate copyright owner. 3. Assert fair use as your defense. 4. Win the case. AFAIK, this is the only way.
  35. 35. I’m thinking you think this is a loony way to proceed. Good. I agree with you. But that means that what we’re doing is risk management.
  36. 36. Risk is never zero. I wish it could be too. I’m sorry. (But if it makes you feel any better, many copyright risks are overblown.)
  37. 37. Four-factor fair use test • Character of the use • “Transformative use” finding favor with judges lately. • Nature of the work • Amount of the work copied • often considered as a percentage of the whole • also, “heart of the work” matters • Effect on the market for that work, if everybody did what you’re doing • part of this is asking whether there IS a market for the work in the first place!
  38. 38. Community fair-use principles • Started with documentarists • who couldn’t get insurance for their work because of perceived copyright-infringement risk... which, given litigious idiots who sue over background noise... was a rational stance. • So they published a “how documentarists use fair use” document. • Courts took notice. So more such documents have been created. • Academic libraries (ARL), journalism (Center for Social Media) • There isn’t one for DH. There should be. Talk to your professional organizations!
  39. 39. Creative Commons • What if you WANT people to reuse your stuff? • You could grant it to the public domain... • ... but then anybody can do anything with it. • Creative Commons is a middle ground. • Boilerplate language and machine-readable techniques for licensing copyrighted works to all comers! • Under certain conditions... • N.b.: CC is predicated on owning a copyright. If you don’t, you can’t use a CC license! • If there’s a copyright, but it’s not yours. (Jointly-held with others is okay.) • If it’s not copyrightable to begin with.
  40. 40. CC license provisions • BY: Must attribute to creator. • On all CC licenses except CC0 (public domain dedication) • ND: No derivative works. • NC: Non-commercial use only. • Looks better than it is. Avoid! • SA: Share-alike • Release derivative work under the same (or a compatible) license. • These can be combined! • CC0: total rights waiver. • Special resonance for data!
  41. 41. CC and the humanities • So, that thing with the UK history editors... • University-press editors are often not friends of openness either. • I really hope 2013 is the year we start calling these people on their, um, errors and misrepresentations. • DH is in a good position to do that. • It’s more open and public than much of the humanities. • And slightly (only slightly!) less dependent on traditional book publishing.
  42. 42. Okay, so? • The point of keeping data is to reuse it! • Okay, there are other points, such as reproducibility and fraud detection. Still. The central reason we’re talking about data so intently is reuse value. • Data with legal strings attached are harder to reuse. So fewer people reuse them. • Kinda defeats the purpose, no? • This is why, as a digital humanist, YOU NEED TO CARE about open access and the Creative Commons. • And advocate for them! Again, humanists have lagged here.
  43. 43. Openness and other policies
  44. 44. Open movements • There are a lot of them. Don’t mix them up. • I know, I know, everybody else does. Well, everybody else is stupid! Don’t be stupid! • Open source SOFTWARE • Open access JOURNAL ARTICLES • (and occasionally books, but mostly journal articles) • Open (government) DATA • Open (notebook) SCIENCE • which is larger than open data! It opens the process of doing the science as well.
  45. 45. Open access funder mandate: NIH • Congress: “Hi, NIH. We think taxpayers should be able to read the research they fund!” • NIH: “Cool. We’ll build a repository for it, then.” • NIH, mid-2000s: “Hi, researchers. Please put your final manuscripts in PubMed Central.” • You can guess how well THAT worked. ~3% deposit rate. • Congress: “Okay, NIH, voluntary didn’t work; how about mandatory?” • Current deposit rate: about 67%. • But the NIH has only started cracking down on slackers. (Grant cycles are long.)
  46. 46. Keep in mind: universities are also funders! • DH centers, IT support, and libraries don’t exactly come free! • But it’s not easy (maybe not possible) for a university to impose an open-access mandate the way a funder can. • Tradition of “faculty governance” forbids. • Are there university OA mandates? Yes! • But they’re by faculty (usually faculty senates, sometimes individual schools/departments) for faculty. Always. Anything else, and faculty howl. • Humanists are the loudest howlers. Make of that what you will.
  47. 47. NSF data-management plans • As of January 2011, all NSF grant proposals must include a two-page data-management plan. • Got no data? Using someone else’s? Say so! • Data sharing required? Not necessarily. Just data management. • Best practices? Standards? Depends on the discipline/directorate, but for the most part, not yet. • Digital data only? Absolutely not! If you’re taking physical samples, you need to talk about them too. • Why am I talking about this here and now? • Because the NEH’s Office of Digital Humanities has a similar policy! • Because the OSTP Memo bids fair to extend this to many more agencies!
  48. 48. Now: OSTP Memo • Office of Science and Technology Policy (part of the executive branch) • Big federal funders have until the end of July to explain how they’ll achieve open access AND open data for research they fund. • The NEH is not subject to the memo (budget too small), but they have announced they plan to comply anyway. • Pass the popcorn. This should be good.
  49. 49. How can your library help? • Getting the word out • Offering consultation services • often in collaboration with other campus units, e.g. IT • usually includes an informational website • Offering institutional repositories as data home • This is... problematic, but it’s something. • Training • In a very few cases: planning for and working toward greater involvement • e.g. Purdue, Penn State, California Digital Library, University of Prince Edward Island
  50. 50. Local data policies
  51. 51. Who has policies? • Non-profit grant funders, now and then • The federal government, more and more often • State governments, in limited situations • Your institution, sometimes • Journals, sometimes • (not usually in the humanities) • Photo: “Who Am I?” Ahmad Hammoud CC-BY
  52. 52. What might a data policy cover? • Who “owns” data • How long you need to keep data • When and with whom you need to share data (or are forbidden from doing so) • What data you need to keep secure, and (sometimes) standards for doing so • What happens to “your” data when you graduate or change jobs or institutions • PAY CLOSE ATTENTION TO THIS, graduate students! This can bite you! • Photo: “Martha” Ford Buchanan CC-BY
  53. 53. Institutional policies • Not all institutions have them. • Not all institutions enforce them. • But if you get in trouble, the policy will be used to throw the book at you. • FIND OUT. Wherever you go, whatever you do, FIND OUT. • Photo: “Rowlandsway House, Wythenshawe” Gene Hunt CC-BY
  54. 54. Open data ethics challenges
  55. 55. FERPA • If you want student records for your research, plan on getting student or parental consent (depending on student’s age). • Caveat: if you’re doing research FOR THE SCHOOL ITSELF, you’re probably off the hook, but you can’t use the data for anything else. • FERPA does not cover statistical data compilations in which students are not individually identifiable. • Graded assignments are covered (because the grade is protected). An assignment printout with no grade? Not covered!
  56. 56. IRB data questions • Institutional Review Board: ethics watchdog for research • Science has a pretty exploitative history. IRBs are designed to prevent harm to study subjects. • Still working to catch up, mentally, to the realities of e.g. Web research, open data • Consider referring ethics questions about data sharing to the IRB. They’re the last word. • Though realize you may have to educate them! IRBs are known to be... overzealous, many places.
  57. 57. “Extra risk” • Key variable for IRBs is “risk to participants.” • What are the additional risks of data retention and sharing? • Is Big Brother coming to get your study subjects? • Added deanonymization/reidentification risk? Cracking risk? • “If it’s on the open Internet, it’s fair game.” Well... • IRBs not entirely up on this just now. THEY WILL LEARN. • And more humanists, especially digital humanists, are doing work that falls under this kind of oversight.
  58. 58. Thanks! • Copyright 2011 by Dorothea Salo. • This lecture and slide deck are licensed under a Creative Commons Attribution 3.0 United States License.
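
The term rules on slide 12 ("Copyright lasts...") read like a small decision procedure, so here is a rough Python sketch of them. It only restates the slide's simplified US rules and is no substitute for real research (or a lawyer). The `Work` class, its field names, and `rough_us_term` are hypothetical names made up for this illustration, and the 1923 and 1978 cutoffs are the ones the slide gives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Work:
    """Hypothetical record for one work; the field names are illustrative."""
    created: int                        # year the work was created
    published: Optional[int] = None     # year of publication, if any
    author_death: Optional[int] = None  # year the author died, if known
    corporate: bool = False             # corporate-created ("work for hire")


def rough_us_term(work: Work) -> str:
    """Describe a work's status under the slide's simplified US rules."""
    if work.created < 1923:
        return "probably public domain"
    if work.created < 1978:
        # The slide's answer here is "that's a really good question".
        return "unclear: depends on the former copyright formalities"
    if work.corporate:
        # 120 years after creation or 95 after publication, whichever is first.
        ends = work.created + 120
        if work.published is not None:
            ends = min(ends, work.published + 95)
        return f"in copyright through {ends}"
    if work.author_death is None:
        return "in copyright: life of author plus 70 years"
    return f"in copyright through {work.author_death + 70}"


if __name__ == "__main__":
    print(rough_us_term(Work(created=1990, author_death=2000)))
    print(rough_us_term(Work(created=1980, published=1985, corporate=True)))
    print(rough_us_term(Work(created=1950)))
```

The middle branch deliberately refuses to compute a date: for 1923 to 1977 the answer turns on registration, renewal, and notice, which a three-field record cannot capture.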
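
Slide 40 says the CC license provisions "can be combined." As a rough illustration of what that means in practice, the Python sketch below builds license names from the BY, NC, ND, and SA elements. The function names `cc_license_name` and `all_cc_licenses` are invented for this example. The one constraint it encodes beyond the slide is that ND and SA are not offered together, since share-alike governs derivative works and ND allows none.

```python
from itertools import product
from typing import List


def cc_license_name(nc: bool = False, nd: bool = False, sa: bool = False) -> str:
    """Build a CC license name; BY is always present (CC0 is a waiver, not covered here)."""
    if nd and sa:
        # Share-alike applies to derivatives, which ND forbids.
        raise ValueError("ND and SA cannot be combined")
    parts = ["BY"]
    if nc:
        parts.append("NC")
    if nd:
        parts.append("ND")
    if sa:
        parts.append("SA")
    return "CC " + "-".join(parts)


def all_cc_licenses() -> List[str]:
    """Enumerate every valid combination of the optional elements."""
    names = []
    for nc, nd, sa in product([False, True], repeat=3):
        if nd and sa:
            continue  # skip the one invalid pairing
        names.append(cc_license_name(nc=nc, nd=nd, sa=sa))
    return names


if __name__ == "__main__":
    for name in all_cc_licenses():
        print(name)
```

Enumerating the combinations this way yields six licenses, which is why CC0 is best thought of separately: it waives rights rather than attaching conditions to them.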