June2014 brownbag privacy

This talk provides an overview of the changing landscape of information privacy with a focus on the possible consequences of these changes for researchers and research institutions.


Personal information continues to become more available, increasingly easy to link to individuals, and increasingly important for research. New laws, regulations and policies governing information privacy continue to emerge, increasing the complexity of management. Trends in information collection and management (cloud storage, “big” data, and debates about the right to limit access to published but personal information) complicate data management and make traditional approaches to managing confidential data decreasingly effective.

Information Science Brown Bag talks, hosted by the Program on Information Science, consist of regular discussions and brainstorming sessions on all aspects of information science and on uses of information science and technology to assess and solve institutional, social and research problems. These are informal talks. Discussions are often inspired by real-world problems being faced by the lead discussant.

Published in: Technology

  • Transcript

    • 1. Prepared for Program on Information Science – Brown Bag Talks MIT June 2014 Navigating the Changing Landscape of Information Privacy Dr. Micah Altman <escience@mit.edu> Director of Research, MIT Libraries Non-Resident Senior Fellow, Brookings Institution
    • 2. DISCLAIMER These opinions are my own, they are not the opinions of MIT, Brookings, any of the project funders, nor (with the exception of co-authored previously published work) my collaborators Secondary disclaimer: “It’s tough to make predictions, especially about the future!” -- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. Demille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc. Capturing Contributor Roles in Scholarly Publications
    • 3. Collaborators & Co-Conspirators • Privacy Tools for Sharing Research Data Team (Salil Vadhan, P.I.) http://privacytools.seas.harvard.edu/people • Research Support: Supported in part by NSF grant CNS-1237235
    • 4. Related Work • Vadhan, S., et al. 2010. “Re: Advance Notice of Proposed Rulemaking: Human Subjects Research Protections”. Available from: http://dataprivacylab.org/projects/irb/Vadhan.pdf Reprints available from: informatics.mit.edu
    • 5. Roadmap • Level setting: what is confidential information? • Little Data, Big Data, & Privacy • Elements of a Modern Framework for Managing Confidential Information
    • 6. What is confidential information?
    • 7. Personally identifiable private information is common v. 24 (January IAP Session 1)
        • Includes information from a variety of sources, such as…
            – Research data, even if you aren’t the original collector
            – Student “records” such as e-mail, grades
            – Logs from web-servers, other systems
        • Lots of things are potentially identifying:
            – Under some federal laws: IP addresses, dates, zipcodes, …
            – Birth date + zipcode + gender uniquely identify ~87% of people in the U.S. [Sweeney 2002] Try it: http://aboutmyinfo.org/index.html
            – With date and place of birth, can guess first five digits of social security number (SSN) > 60% of the time. (Can guess the whole thing in under 10 tries, for a significant minority of people.) [Acquisti & Gross 2009]
            – Analysis of writing style or eclectic tastes has been used to identify individuals
        • Tables, graphs and maps can also reveal identifiable information [Brownstein, et al., 2006, NEJM 355(16)]
        Managing Confidential Data
    • 8. Data Points on Data Privacy 2010
    • 9. What’s wrong with this picture?
        Diagram labels: Law, policy, ethics; Research design; …; Information security; Disclosure limitation
        Name           | SSN   | Birthdate | Zipcode | Gender | Favorite Ice Cream | # of crimes committed
        A. Jones       | 12341 | 01011961  | 02145   | M      | Raspberry          | 0
        B. Jones       | 12342 | 02021961  | 02138   | M      | Pistachio          | 0
        C. Jones       | 12343 | 11111972  | 94043   | M      | Chocolate          | 0
        D. Jones       | 12344 | 12121972  | 94043   | M      | Hazelnut           | 0
        E. Jones       | 12345 | 03251972  | 94041   | F      | Lemon              | 0
        F. Jones       | 12346 | 03251972  | 02127   | F      | Lemon              | 1
        G. Jones       | 12347 | 08081989  | 02138   | F      | Peach              | 1
        H. Smith       | 12348 | 01011973  | 63200   | F      | Lime               | 2
        I. Smith       | 12349 | 02021973  | 63300   | M      | Mango              | 4
        J. Smith       | 12350 | 02021973  | 63400   | M      | Coconut            | 16
        K. Smith       | 12351 | 03031974  | 64500   | M      | Frog               | 32
        L. Smith       | 12352 | 04041974  | 64600   | M      | Vanilla            | 64
        M. Smith       | 12353 | 04041974  | 64700   | F      | Pumpkin            | 128
        N. Smith-Jones | 12354 | 04041974  | 64800   | F      | Allergic           | 256
    • 10. What’s wrong with this picture?
        Name     | SSN   | Birthdate | Zipcode | Gender | Favorite Ice Cream | # of crimes committed
        A. Jones | 12341 | 01011961  | 02145   | M      | Raspberry          | 0
        B. Jones | 12342 | 02021961  | 02138   | M      | Pistachio          | 0
        C. Jones | 12343 | 11111972  | 94043   | M      | Chocolate          | 0
        D. Jones | 12344 | 12121972  | 94043   | M      | Hazelnut           | 0
        E. Jones | 12345 | 03251972  | 94041   | F      | Lemon              | 0
        F. Jones | 12346 | 03251972  | 02127   | F      | Lemon              | 1
        G. Jones | 12347 | 08081989  | 02138   | F      | Peach              | 1
        H. Smith | 12348 | 01011973  | 63200   | F      | Lime               | 2
        I. Smith | 12349 | 02021973  | 63300   | M      | Mango              | 4
        J. Smith | 12350 | 02021973  | 63400   | M      | Coconut            | 16
        K. Smith | 12351 | 03031974  | 64500   | M      | Frog               | 32
        L. Smith | 12352 | 04041974  | 64600   | M      | Vanilla            | 64
        M. Smith | 12353 | 04041974  | 64700   | F      | Pumpkin            | 128
        N. Smith | 12354 | 04041974  | 64800   | F      | Allergic           | 256
        Column annotations (as extracted): Identifier Sensitive Private Identifier Private Identifier Identifier Sensitive
        Callouts: Unexpected Response?; Mass resident; FERPA too?; Californian; Twins, separated at birth?
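The point that individually harmless fields become identifying in combination can be checked directly on the toy table above. The following is a minimal sketch: the rows are copied from the slide, while the helper name `unique_fraction` and the choice of which fields to compare are illustrative assumptions, not part of the talk.

```python
from collections import Counter

# Rows copied from the toy table: (name, birthdate MMDDYYYY, zipcode, gender).
ROWS = [
    ("A. Jones", "01011961", "02145", "M"),
    ("B. Jones", "02021961", "02138", "M"),
    ("C. Jones", "11111972", "94043", "M"),
    ("D. Jones", "12121972", "94043", "M"),
    ("E. Jones", "03251972", "94041", "F"),
    ("F. Jones", "03251972", "02127", "F"),
    ("G. Jones", "08081989", "02138", "F"),
    ("H. Smith", "01011973", "63200", "F"),
    ("I. Smith", "02021973", "63300", "M"),
    ("J. Smith", "02021973", "63400", "M"),
    ("K. Smith", "03031974", "64500", "M"),
    ("L. Smith", "04041974", "64600", "M"),
    ("M. Smith", "04041974", "64700", "F"),
    ("N. Smith", "04041974", "64800", "F"),
]


def unique_fraction(rows, key):
    """Fraction of rows whose key value is shared with no other row."""
    counts = Counter(key(r) for r in rows)
    return sum(1 for r in rows if counts[key(r)] == 1) / len(rows)


if __name__ == "__main__":
    # Hypothetical comparison: each field alone versus the combined triple.
    for label, key in [
        ("gender only", lambda r: r[3]),
        ("birthdate only", lambda r: r[1]),
        ("birthdate + zipcode + gender", lambda r: (r[1], r[2], r[3])),
    ]:
        print(f"{label}: {unique_fraction(ROWS, key):.2f} of rows are unique")
```

On this table no row is unique by gender alone, half are unique by birthdate alone, and every row is unique once the three fields are combined, which is the pattern behind the ~87% figure cited on slide 7.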
    • 11. The Ethical View of Confidentiality • Relates to the content, context, control, and use of information describing an individual. • Confidentiality is violated if something is learned about an individual outside of the intended context and use
    • 12. The Legal View of Privacy • Overlapping laws • Different laws apply to different cases • All affiliates subject to university policy (Not included: EU directive, foreign laws, classified data, …)
    • 13. It’s Complicated: Contract; Intellectual Property; Access Rights; Confidentiality; Copyright; Fair Use; DMCA; Database Rights; Moral Rights; Intellectual Attribution; Trade Secret; Patent; Trademark; Common Rule 45 CFR 46; HIPAA; FERPA; EU Privacy Directive; Privacy Torts (Invasion, Defamation); Rights of Publicity; Sensitive but Unclassified; Potentially Harmful (Archeological Sites, Endangered Species, Animal Testing, …); Classified; FOIA; CIPSEA; State Privacy Laws; EAR; State FOI Laws; Journal Replication Requirements; Funder Open Access; Contract; License; Click-Wrap; TOU; ITAR; Export Restrictions
    • 14. Current Approaches to Managing Confidential Information • Information technology/security – constrain access to information • Legal – Vet and bind those who can access information • Statistical – Anonymize/deidentify information
    • 15. Information Security Model • Information Security Control Selection Process [NIST 800-100, simplification of NIST 800-30]: System Analysis; Threat Modeling; Vulnerability Identification; Analysis (likelihood, impact, mitigating controls); Institute Selected Controls; Testing and Auditing • Diagram labels: Law, policy, ethics; Research design; …; Information security; Disclosure limitation
    • 16. SO MANY CONTROLS!!!
        • Access Control
            – Low (impact): Policies; Account management*; Access Enforcement; Unsuccessful Login Attempts; System Use Notification; Restrict Anonymous Access*; Restrict Remote Access*; Restrict Wireless Access*; Restrict Mobile Devices*; Restrict use of External Information Systems*; Restrict Publicly Accessible Content
            – Medium-High (impact), adds…: Information flow enforcement; Separation of Duties; Least Privilege; Session Lock
        • Security Awareness and Training
            – Low (impact): Policies; Awareness; Training; Training Records
        • Audit and Accountability
            – Low (impact): Policies; Auditable Events*; Content of Audit Records*; Storage Capacity; Audit Review, Analysis and Reporting*; Time Stamps*; Protection of Audit Information; Audit Record Retention; Audit Generation
            – Medium-High (impact), adds…: Audit Reduction; Non-Repudiation
        • Security Assessment and Authorization
            – Low (impact): Policies; Assessments*; System Connections; Planning; Authorization; Continuous Monitoring
    • 17. SO MANY CONTROLS!!!
        • Configuration Management
            – Low (impact): Policies; Baseline*; Impact Analysis; Settings*; Least Functionality; Component Inventory*
            – Medium-High (impact), adds…: Change Control; Access Restrictions for Change; Configuration Management Plan
        • Contingency Planning
            – Low (impact): Policies; Plan*; Training*; Plan Testing*; System backup*; Recovery & Reconstitution*
            – Medium-High (impact), adds…: Alternate storage site; Alternate processing site; Telecomm
        • Identification and Authentication
            – Low (impact): Policies; Organizational Users*; Identifier Management; Authenticator Management*; Authenticator Feedback; Cryptographic Module Authentication; Non-Organizational Users
            – Medium-High (impact), adds…: Device identification and authentication
        • Incident Response
            – Low (impact): Policies; Training; Handling*; Monitoring; Reporting*; Response Assistance; Response Plan
            – Medium-High (impact), adds…: Testing
        • Maintenance
            – Low (impact): Policies; Control*; Non-Local Maintenance Restrictions*; Personnel Restrictions*
            – Medium-High (impact), adds…: Tools; Maintenance scheduling/timeliness
    • 18. SO MANY CONTROLS!!!
        • Media Protection
            – Low (impact): Policies; Access restrictions*; Sanitization
            – Medium-High (impact), adds…: Marking; Storage; Transport
        • Physical and Environmental Protection
            – Low (impact): Policies; Access Authorizations; Access Control*; Monitoring*; Visitor Control*; Records*; Emergency Lighting; Fire protection*; Temperature, Humidity, water damage*; Delivery and removal
            – Medium-High (impact), adds…: Network access control; Output device Access control; Power equipment access, shutoff, backup; Alternate work site; Location of information system components; Information leakage
        • Planning
            – Low (impact): Policies, Plan, Rules of Behavior; Privacy Impact Assessment
            – Medium-High (impact), adds…: Activity planning
        • Personnel Security
            – Low (impact): Policies; Position categorization; Screening; Termination; Transfer; Access Agreements; Third-Parties; Sanctions
        • Risk Assessment
            – Low (impact): Policies; Categorization Assessment; Vulnerability Scanning*
    • 19. SO MANY CONTROLS!!!
        • System and Services Acquisition
            – Low (impact): Policies; Resource Allocation; Life Cycle Support; Acquisition*; Documentation; Software usage restrictions; User installed software restrictions; External information System Services restrictions
            – Medium-High (impact), adds…: Security Engineering; Developer configuration management; Developer security testing; Supply chain protection; Trustworthiness
        • System and Communications Protection
            – Low (impact): Policies; Denial of Service Protection; Boundary protection*; Cryptographic key Management; Encryption; Public Access Protection; Collaborative computing devices restriction; Secure Name resolution*
            – Medium-High (impact), adds…: Application Partitioning; Restrictions on Shared Resources; Transmission integrity & confidentiality; Network Disconnection Procedure; Public Key Infrastructure Certificates; Mobile Code management; VOIP management; Session authenticity; Fail in known state; Protection of information at rest; Information system partitioning
        • System and Information Integrity
            – Low (impact): Policies; Flaw remediation*; Malicious code protection*; Security Advisory monitoring*; Information output handling
            – Medium-High (impact), adds…: Information system monitoring; Software and information integrity; Spam protection; Information input restrictions & validation; Error handling
        • Program Management
            – Plan; Security Officer Role; Resources; Inventory; Performance Measures; Enterprise architecture; Risk management strategy; Authorization process; Mission definition
    • 20. So Many Laws!!! Contract; Intellectual Property; Access Rights; Confidentiality; Copyright; Fair Use; DMCA; Database Rights; Moral Rights; Intellectual Attribution; Trade Secret; Patent; Trademark; Common Rule 45 CFR 46; HIPAA; FERPA; EU Privacy Directive; Privacy Torts (Invasion, Defamation); Rights of Publicity; Sensitive but Unclassified; Potentially Harmful (Archeological Sites, Endangered Species, Animal Testing, …); Classified; FOIA; CIPSEA; State Privacy Laws; EAR; State FOI Laws; Journal Replication Requirements; Funder Open Access; Contract; License; Click-Wrap; TOU; ITAR; Export Restrictions
    • 21. Statistics is complicated, too!
        • Published Outputs:
            * Jones * * 1961 021*
            * Jones * * 1961 021*
            * Jones * * 1972 9404*
            * Jones * * 1972 9404*
            * Jones * * 1972 9404*
        • Modal Practice: “The correlation between X and Y was large and statistically significant”
        • Summary statistics
        • Contingency table
        • Public use sample microdata
        • Information Visualization
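The masked rows on this slide (birth year only, truncated zipcode) are an instance of generalization, which is commonly evaluated with k-anonymity [Sweeney 2002]: every released combination of quasi-identifiers should be shared by at least k records. The sketch below is an illustration under stated assumptions: `generalize` and `k_of` are hypothetical helper names, and a fixed three-digit zipcode prefix is used for simplicity rather than the mixed prefixes shown on the slide.

```python
from collections import Counter

# First five rows of the toy table: (birthdate MMDDYYYY, zipcode, gender).
RAW = [
    ("01011961", "02145", "M"),
    ("02021961", "02138", "M"),
    ("11111972", "94043", "M"),
    ("12121972", "94043", "M"),
    ("03251972", "94041", "F"),
]


def generalize(record, zip_digits=3):
    """Keep only the birth year and a zipcode prefix; suppress gender entirely."""
    birthdate, zipcode, _gender = record
    year = birthdate[4:]
    masked_zip = zipcode[:zip_digits] + "*" * (len(zipcode) - zip_digits)
    return (year, masked_zip)


def k_of(records):
    """Size of the smallest group of records sharing identical values (the k in k-anonymity)."""
    return min(Counter(records).values())


RELEASED = [generalize(r) for r in RAW]

if __name__ == "__main__":
    print("k before generalization:", k_of(RAW))
    print("k after generalization: ", k_of(RELEASED))
    for row in RELEASED:
        print(row)
```

A larger k reduces the risk of singling out a record, but as the slide title suggests, it does not by itself prevent disclosure: if everyone in a group shares the same sensitive value, that value is still revealed.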
    • 22. Practical Steps to Manage Confidential Research Data
        • Identify potentially sensitive information in planning
            – Identify legal requirements, institutional requirements, data use agreements
            – Consider obtaining a certificate of confidentiality
            – Plan for IRB review
        • Reduce sensitivity of collected data in design
        • Separate sensitive information in collection
        • Encrypt sensitive information in transit
        • Desensitize information in processing
            – Removing names and other direct identifiers
            – Suppressing, aggregating, or perturbing indirect identifiers
        • Protect sensitive information in systems
            – Use systems that are controlled, securely configured, and audited
            – Ensure people are authenticated, authorized, licensed
        • Review sensitive information before dissemination
            – Review disclosure risk
            – Apply non-statistical disclosure limitation
            – Apply statistical disclosure limitation
            – Review past releases and publicly available data
            – Check for changes in the law
            – Require a use agreement
    • 23. Little Data, Big Data, & Privacy
    • 24. Little Data – Big World • The “Favorite Ice Cream” problem -- public information that is not risky can help us learn information that is risky • The “Doesn’t Stay in Vegas” problem -- information shared locally can be found anywhere • The “Data Exhaust” problem -- wherever you go, there you are, and your data too!
    • 25. New Data – New Challenges
        • How to deidentify without completely destroying the data?
            – The “Netflix Problem”: large, sparse datasets that overlap can be probabilistically linked [Narayanan and Shmatikov 2008]
            – The “GIS” problem: fine geo-spatial-temporal data is impossible to mask when correlated with external data [Zimmerman and Pavlik 2008]
            – The “Facebook Problem”: it is possible to identify masked network data if only a few nodes are controlled [Backstrom et al. 2007]
            – The “Blog problem”: pseudonymous communication can be linked through textual analysis [Novak et al. 2004]
        [For more examples see Vadhan, et al. 2010] Source: [Calabrese et al. 2007; Real Time Rome Project 2007]
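The “Netflix Problem” can be illustrated with a toy linkage attack: an attacker who knows a few of a person's ratings from a public source looks for the pseudonymous record that matches them best. This is a deliberately simplified sketch, not the scoring algorithm of Narayanan and Shmatikov; the data, the function names, and the margin threshold of 2 are all hypothetical.

```python
# Hypothetical "anonymized" release: pseudonym -> {item: rating}.
RELEASED = {
    "u1": {"film_a": 5, "film_b": 1, "film_c": 4, "film_d": 2},
    "u2": {"film_a": 5, "film_e": 3},
    "u3": {"film_f": 4, "film_g": 2},
}


def overlap_score(aux, candidate):
    """Count (item, rating) pairs known to the attacker that also appear in a released record."""
    return sum(1 for item, rating in aux.items() if candidate.get(item) == rating)


def link(aux, released, margin=2):
    """Return the best-matching pseudonym if it beats the runner-up by `margin`, else None."""
    scored = sorted(
        ((overlap_score(aux, rec), pid) for pid, rec in released.items()),
        reverse=True,
    )
    best_score, best_pid = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0
    return best_pid if best_score - runner_up >= margin else None


if __name__ == "__main__":
    # Auxiliary knowledge, e.g. ratings the target posted publicly under a real name.
    print(link({"film_b": 1, "film_c": 4, "film_d": 2}, RELEASED))
    # A single popular item is not enough to single anyone out.
    print(link({"film_a": 5}, RELEASED))
```

Because the data are sparse, a handful of less common items is often enough to separate one record from all the others, which is why removing names alone does not deidentify this kind of dataset.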
    • 26. Algorithmic Discrimination: Help! I’ve been mugged by a mugshot database!
    • 27. Legal Differences Across the Pond
    • 28. Some Open Questions for Research Data
        • Do individuals still expect a degree of privacy in the information they publicly share on the internet, despite how US law defines information privacy? Should this expectation be reasonable?
        • Some individuals employ techniques to limit public exposure of data, such as sending photos directly to individuals through SMS or e-mail, using Facebook privacy controls to limit sharing to a specified group of friends, deleting information previously made public, or sharing only through services whose terms of service restrict data uses by others, and follow techno-social norms around information sharing that may not be recognized in non-internet contexts. How, if at all, should such conventions and behaviors shape legal expectations and ethical notions of privacy?
        • Government agencies often exercise discretion with respect to the scope of information they release publicly. What best practices should government actors follow in redacting information prior to release? And when researchers are aware that best practices have not been adopted, and/or that stated anonymization policies have not been followed, do they have any obligations with respect to this data?
        • Services and participants in them often span multiple states, countries, and supranational regions, which may have different legal regulations and restrictions on the use of data. When data is studied transnationally, are researchers obligated to honor the laws and regulations that may apply in the country where the user is located, where the services are hosted, and where the data is publicly accessed? To what extent are outcomes affected by stages in the data lifecycle, or by research that will be published in outlets with international reach?
        • Many web services, including social networks, have detailed terms of use that potentially restrict or prohibit secondary uses of data. When are researchers required to comply with these terms of use?
        • Should IRBs oversee human subjects research that uses only mined data? Should researchers have an ethical duty to safeguard such information in their studies? How does the sensitivity of information factor into a determination whether information is public or private for research purposes? Should individuals on the internet have the ability to opt out their information from studies?
    • 29. Elements of A Modern Framework for Managing Confidential Information
    • 30. Principles
        • The risks of informational harm are generally not a simple function of the presence or absence of specific fields, attributes, or keywords in the released set of data. Instead, much of the potential for harm stems from what one can learn or infer about individuals from the data release as a whole or when linked with available information.
        • Redaction, pseudonymization, coarsening, and hashing are often neither adequate nor appropriate practices, and releasing less information is not always a better approach to privacy. As noted above, simple redaction of information that has been identified as sensitive is often not a guarantee of privacy protection.
        • Thoughtful analysis with expert consultation is necessary in order to evaluate the sensitivity of the data collected, to quantify the associated re-identification risks, and to design useful and safe release mechanisms. Naïve use of any data sharing model, including those we describe above, is unlikely to provide adequate protection.
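One reason hashing is often inadequate is that identifiers drawn from a small space can be recovered by hashing every possible value and comparing. The sketch below assumes an unsalted SHA-256 hash and the five-digit toy SSNs from the earlier table; the function names are hypothetical.

```python
import hashlib


def pseudonymize(ssn):
    """Replace an identifier with its unsalted SHA-256 digest (a common but weak practice)."""
    return hashlib.sha256(ssn.encode("ascii")).hexdigest()


def reverse_by_enumeration(digest, digits=5):
    """Recover the identifier by hashing every candidate of the given length."""
    for n in range(10 ** digits):
        candidate = str(n).zfill(digits)
        if pseudonymize(candidate) == digest:
            return candidate
    return None


if __name__ == "__main__":
    released = pseudonymize("12345")  # what a "hashed" data release would contain
    print("released digest:", released)
    print("recovered value:", reverse_by_enumeration(released))
```

Real SSNs have nine digits, so the candidate space is a billion values, which is still small enough to enumerate on commodity hardware. A secret key or salt changes the attack, but the hashed column then remains a stable pseudonym that can be linked across releases.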
    • 31. Analysis Framework
        • Any analysis of information privacy should address
            – the scope of information covered,
            – the sensitivity of that information (its potential to cause individual, group, or social harm),
            – the risk that sensitive information will be disclosed (by re-identification or other means),
            – the availability of control and accountability mechanisms (including review, auditing, and enforcement), and
            – the suitability of existing data sharing models, as applied across the entire lifecycle of information use, from collection through dissemination and reuse.
    • 32. New Tools • Personal Data Stores • Information Accountability Framework • Interactive private computation • Multiparty computation • Synthetic information
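As one illustration of interactive private computation, the sketch below answers a counting query with the Laplace mechanism from the differential privacy literature (Dwork, McSherry, Nissim, and Smith 2006, in the reference list). Treating differential privacy as the intended meaning of this bullet is an assumption; the function names, the epsilon value, and the use of the toy table's crime counts are illustrative only.

```python
import random

# "# of crimes committed" column from the toy table.
CRIMES = [0, 0, 0, 0, 0, 1, 1, 2, 4, 16, 32, 64, 128, 256]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    """Answer 'how many values satisfy predicate?' with noise calibrated to sensitivity 1.

    Adding or removing one person changes a count by at most 1, so the noise
    scale is 1 / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random()  # unseeded: each query gets fresh noise
    for _ in range(3):
        answer = private_count(CRIMES, lambda c: c > 0, epsilon=1.0, rng=rng)
        print(f"noisy count of people with at least one crime: {answer:.2f}")
```

Smaller epsilon means more noise and stronger protection. Each answered query spends part of a privacy budget, which is why such systems mediate queries interactively instead of releasing the raw table.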
    • 33. Integrated Lifecycle Policy Management • Aligning Legal and Computational Concepts – Regulatory language • Legal requirements across lifecycle stages – Metadata Schemas – Implied requirements – Questionnaires and elicitation instruments • Legal instruments -- capturing scientific privacy concepts in legal instruments consistently across lifecycle – service level agreements – consent terms – deposit agreement – data usage agreements Policy Research Update
    • 34. Additional References
        • A. Acquisti, L. John, G. Loewenstein, 2009, "What is Privacy Worth", 21st Workshop in Information Systems and Economics.
        • A. Blum, K. Ligett, A. Roth, 2008. “A Learning Theory Approach to Non-Interactive Database Privacy”, STOC’08
        • L. Backstrom, C. Dwork, J. Kleinberg, 2007, Wherefore Art Thou R3579X? Anonymized Social Networks, Hidden Patterns, and Structural Steganography. Proc. 16th Intl. World Wide Web Conference, KDD 008
        • J. Brickell and V. Shmatikov, 2008. The Cost of Privacy: Destruction of Data-Mining Utility in Anonymized Data Publishing
        • P. Buneman, A. Chapman and J. Cheney, 2006, ‘Provenance Management in Curated Databases’, in Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (Chicago, IL: 2006), 539-550. http://portal.acm.org/citation.cfm?doid=1142473.1142534
        • F. Calabrese, M. Colonna, P. Lovisolo, D. Parata, C. Ratti, 2007, "Real-Time Urban Monitoring Using Cellular Phones: a Case-Study in Rome", Working paper # 1, SENSEable City Laboratory, MIT, Boston. http://senseable.mit.edu/papers/ [also see the Real Time Rome Project, http://senseable.mit.edu/realtimerome/]
        • D. Campbell, 2009, reported in D. Goodin 2009, Amazon's EC2 brings new might to password cracking, The Register, Nov 2, 2009, http://www.theregister.co.uk/2009/11/02/amazon_cloud_password_cracking/
        • I. Dinur and K. Nissim, 2003. Revealing information while preserving privacy. Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pages 202–210.
        • C. Dwork, M. Naor, O. Reingold, G. Rothblum, S. Vadhan, 2009. When and How Can Data be Efficiently Released with Privacy, STOC 2009.
        • C. Dwork, A. Smith, 2009. Differential Privacy for Statistics: What we know and what we want to learn, Journal of Privacy and Confidentiality 1(2): 135-54
        • C. Dwork, 2008, Differential Privacy: A Survey of Results. TAMC 2008, LNCS 4978, Springer Verlag. 1-19
        • C. Dwork, 2006. Differential privacy. Proc. ICALP.
        • C. Dwork, F. McSherry, and K. Talwar, 2007. The price of privacy and the limits of LP decoding. Proceedings of the thirty-ninth annual ACM Symposium on Theory of Computing, pages 85–94.
        • C. Dwork, F. McSherry, K. Nissim, and A. Smith, 2006. Calibrating Noise to Sensitivity in Private Data Analysis, Proceedings of the 3rd IACR Theory of Cryptography Conference.
        • A. Desrosieres, 1998. The Politics of Large Numbers, Harvard U. Press.
        • S.E. Fienberg, M.E. Martin, and M.L. Straf (eds.), 1985. Sharing Research Data, Washington, D.C.: National Academies Press.
        • S. Fienberg, 2010. Towards a Bayesian Characterization of Privacy Protection & the Risk-Utility Tradeoff, IPAM--Data 2010
        • B. C.M. Fung, K. Wang, R. Chen, P.S. Yu, 2010, Privacy Preserving Data Publishing: A Survey of Recent Developments, ACM CSUR 42(4)
        • A. G. Greenwald, D. E. McGhee, J. L. K. Schwartz, 1998, "Measuring Individual Differences In Implicit Cognition: The Implicit Association Test", Journal of Personality and Social Psychology 74(6): 1464-1480
        • C. Herley, 2009, So Long and No Thanks for the Externalities: The Rational Rejection of Security Advice by Users, NSPW 09
        • A. F. Karr, 2009, Statistical Analysis of Distributed Databases, Journal of Privacy and Confidentiality 1(2)
        • S. Vadhan, et al., 2010. “Re: Advance Notice of Proposed Rulemaking: Human Subjects Research Protections”. Available from: http://dataprivacylab.org/projects/irb/Vadhan.pdf
        • R. A. Popa, et al., 2011. "CryptDB: protecting confidentiality with encrypted query processing." Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles. ACM.
    • 35. Additional References
        • International Council For Science (ICSU), 2004. ICSU Report of the CSPR Assessment Panel on Scientific Data and Information. Report.
        • J. Klump, et al., 2006. “Data publication in the open access initiative”, Data Science Journal Vol. 5 pp. 79-83.
        • E.A. Kolek, D. Saunders, 2008. Online Disclosure: An Empirical Examination of Undergraduate Facebook Profiles, NASPA Journal 45(1): 1-25
        • N. Li, T. Li, and S. Venkatasubramanian, 2007. t-Closeness: privacy beyond k-anonymity and l-diversity. In Proceedings of the IEEE ICDE 2007.
        • A. Machanavajjhala, D. Kifer, J. Gehrke, M. Venkitasubramaniam, 2007, "l-Diversity: Privacy Beyond k-Anonymity", ACM Transactions on Knowledge Discovery from Data 1(1): 1-52
        • A. Meyerson, R. Williams, 2004. “On the complexity of Optimal K-Anonymity”, ACM Symposium on the Principles of Database Systems
        • Nature 461, 145 (10 September 2009) | doi:10.1038/461145a
        • A. Narayanan and V. Shmatikov, 2008, “Robust De-anonymization of Large Sparse Datasets”, Proc. of 29th IEEE Symposium on Security and Privacy (Forthcoming)
        • I. Neamatullah, et al., 2008, Automated de-identification of free-text medical records, BMC Medical Informatics and Decision Making 8:32
        • J. Novak, P. Raghavan, A. Tomkins, 2004. Anti-aliasing on the Web, Proceedings of the 13th international conference on World Wide Web
        • National Science Board (NSB), 2005, Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century, NSF. (NSB-05-40).
        • A. Acquisti, R. Gross, 2009, “Predicting Social Security Numbers from Public Data”, PNAS 106(27): 10975–10980
        • L. Sweeney, 2002, k-Anonymity: A Model for Protecting Privacy, International Journal on Uncertainty, Fuzziness, and Knowledge-based Systems, Vol. 10, No. 5, pp. 557–570.
        • T.M. Truta, B. Vinay, 2006, Privacy Protection: p-Sensitive k-Anonymity Property, International Workshop of Privacy Data Management (PDM2006), in conjunction with 22nd International Conference of Data Engineering (ICDE), Atlanta, Georgia.
        • O. Uzuner, et al., 2007, “Evaluating the State-of-the-Art in Automatic De-identification”, Journal of the American Medical Informatics Association 14(5): 550
        • W. Wagner & R. Steinzor, 2006. Rescuing Science from Politics, Cambridge U. Press.
        • S. Warner, 1965. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association 60(309): 63–9.
        • D.L. Zimmerman, C. Pavlik, 2008. "Quantifying the Effects of Mask Metadata, Disclosure and Multiple Releases on the Confidentiality of Geographically Masked Health Data", Geographical Analysis 40: 52-76
    • 36. Questions? E-mail: escience@mit.edu Web: informatics.mit.edu
    • 37. Questions? Web: informatics.mit.edu