Big Data and Attacks on
Privacy: How to Properly
Anonymize Social
Networks and Databases
(and Keep Them That Way)
AC 298r Final Presentation
Ryan Lee and Jeffrey Wang
Obligatory Social Network Stats
http://www.mediabistro.com/alltwitter/files/2013/11/growth-of-social-media-2013.jpg
Uses of Social Data: Research
Bollen et al. (2011).
CS109 Harvard Univ.
Fall 2013
Christakis & Fowler (2010). Christakis & Fowler (2007).
Uses of Social Data: Marketing
Facebook.com
Bio-Rad
Uses of Social Data: Government
Chang, R., Lee, A., Ghoniem, M., Kosara, R., Ribarsky, W., Yang, J., ... & Sudjianto, A. (2008). Scalable and interactive visual analysis of financial wire transactions for fraud detection. Information Visualization, 7(1), 63-76.
Challenge: Privacy
Naive Approach: Anonymization
Name | Favorite Pizza | Favorite Course
Ryan Lee | Supreme | AC298r
Jeffrey Wang | Pepperoni | AC298r
Daniel Weinstock | Anchovies | AC298r
Priority: Security
Concern: Digital Footprint
NSA Data Warehouse
Deanonymization is Possible
Sweeney, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002
Netflix Prize 2
Netflix De-anon: How they did it
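● The dataset of roughly 500,000 subscribers' ratings was extremely sparse, so a few ratings are enough to single out a user
Netflix “Anonymized” Data
Public Data (IMDb, Twitter, blogs, etc.)
Match if:
difference in rating time < threshold
difference in movie rating < threshold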
Names
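The matching rule on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the actual attack: the record layout, the function name `link_records`, the parameters `max_days`, `max_stars`, and `min_matches`, and the toy data are all hypothetical.

```python
from datetime import date

def link_records(anon, public, max_days=3, max_stars=2, min_matches=2):
    """Return {anon_id: name} for anonymized users whose ratings line up with a public profile.

    anon:   {anon_id: {movie: (stars, date)}}
    public: {name:    {movie: (stars, date)}}
    """
    links = {}
    for anon_id, anon_ratings in anon.items():
        best_name, best_score = None, 0
        for name, pub_ratings in public.items():
            score = 0
            for movie, (stars, day) in pub_ratings.items():
                if movie not in anon_ratings:
                    continue
                a_stars, a_day = anon_ratings[movie]
                # "Match if" rule from the slide: both differences under a threshold.
                if abs((a_day - day).days) < max_days and abs(a_stars - stars) < max_stars:
                    score += 1
            if score > best_score:
                best_name, best_score = name, score
        # Require several agreeing ratings before claiming a link (assumed cutoff).
        if best_score >= min_matches:
            links[anon_id] = best_name
    return links

# Toy data: user 101 rated the same movies as "alice" at nearly the same time.
anon = {
    101: {"Movie A": (5, date(2005, 3, 1)), "Movie B": (2, date(2005, 3, 9))},
    102: {"Movie A": (1, date(2005, 6, 2)), "Movie C": (4, date(2005, 6, 5))},
}
public = {
    "alice": {"Movie A": (5, date(2005, 3, 2)), "Movie B": (3, date(2005, 3, 10))},
}
matches = link_records(anon, public)
```

Because the data is sparse, even a small `min_matches` tends to leave at most one plausible candidate per public profile, which is what makes this kind of linkage work.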
Surnames in Genomic Sequences
TACATA is a real last name...
“Anonymized” Cell Phone Data
de Montjoye, Y. A., Hidalgo, C. A., Verleysen, M., & Blondel, V. D. (2013). Unique in the Crowd: The privacy bounds of human mobility. Scientific reports, 3.
Defenses (lol JK)
K-Anonymity
Sweeney, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002
A Tough Problem
Date of birth, gender, and ZIP code are enough to
uniquely identify 87% of the US population
Sweeney, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002
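A quick way to see why such a combination is dangerous is to count how many records are unique on it. The sketch below uses made-up records and hypothetical field names (`dob`, `gender`, `zip`); it only illustrates the measurement, not the census analysis behind the 87% figure.

```python
from collections import Counter

def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination appears exactly once."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)

# Made-up records for illustration only.
records = [
    {"dob": "1980-01-02", "gender": "F", "zip": "02138"},
    {"dob": "1980-01-02", "gender": "F", "zip": "02138"},
    {"dob": "1975-07-30", "gender": "M", "zip": "02138"},
    {"dob": "1991-11-11", "gender": "F", "zip": "02139"},
]
frac_all = unique_fraction(records, ["dob", "gender", "zip"])
frac_zip = unique_fraction(records, ["zip"])
```

Each attribute alone identifies few people, but the combination isolates many more, which is the pattern the slide describes.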
Solution?
First | Last | Age | Race
Harry | Stone | 34 | African American
John | Reyser | 36 | Caucasian
Beatrice | Stone | 34 | African American
John | Delgado | 22 | Hispanic
Sweeney, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002
Solution: Suppression and Generalization
First | Last | Age | Race
Harry | Stone | 34 | African American
John | Reyser | 36 | Caucasian
Beatrice | Stone | 34 | African American
John | Delgado | 22 | Hispanic
k=2: Polynomial Solution! (Simplex Matching)
k>=3: NP-Hard (Graph Decomposition)
Sweeney, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002
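As a rough sketch of suppression and generalization on the table above: within each group of rows, a column is released unchanged if all rows agree, replaced by a range if it is numeric, and suppressed with `*` otherwise. The grouping here is picked by hand, and the helper names (`merge_column`, `anonymize_group`, `is_k_anonymous`) are hypothetical. Choosing the grouping that loses the least information is the hard part that the complexity results on the slide refer to.

```python
from collections import Counter

rows = [
    ("Harry", "Stone", 34, "African American"),
    ("John", "Reyser", 36, "Caucasian"),
    ("Beatrice", "Stone", 34, "African American"),
    ("John", "Delgado", 22, "Hispanic"),
]

def merge_column(values):
    """Collapse one column of a group into a single released value."""
    if len(set(values)) == 1:
        return values[0]                       # all rows agree: release as-is
    if all(isinstance(v, int) for v in values):
        return f"{min(values)}-{max(values)}"  # generalization: age -> age range
    return "*"                                 # suppression

def anonymize_group(group):
    """Release every row of the group as the same merged tuple."""
    merged = tuple(merge_column(col) for col in zip(*group))
    return [merged] * len(group)

def is_k_anonymous(table, k):
    """True if every released row is identical to at least k-1 other rows."""
    return all(count >= k for count in Counter(table).values())

# Hand-picked grouping (an assumption): the two Stones together, the two Johns together.
released = anonymize_group([rows[0], rows[2]]) + anonymize_group([rows[1], rows[3]])
raw_ok = is_k_anonymous(rows, 2)
released_ok = is_k_anonymous(released, 2)
```

With this grouping the release contains two copies of `("*", "Stone", 34, "African American")` and two copies of `("John", "*", "22-36", "*")`, so each row is indistinguishable from one other row.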
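Differential Privacy
● Whether or not a user participates in the database, the probability of any query output changes by at most a factor of e^ε,
so the output reveals almost nothing about whether that user's record is present
Dwork, ICALP, 2006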
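One standard way to obtain this guarantee for a counting query is the Laplace mechanism: add noise with scale 1/ε to the true count, since adding or removing one person changes a count by at most 1. The sketch below is a minimal illustration, not code from the presentation; the function names, the toy data, and the chosen ε are assumptions.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon, rng):
    """Counting query released with Laplace noise of scale 1/epsilon (sensitivity 1)."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon, rng)

rng = random.Random(0)  # fixed seed so the example is repeatable
pizzas = ["Supreme", "Pepperoni", "Anchovies"]

# A single release is noisy; averaging many releases recovers the true count of 1,
# which is why repeated queries must be charged against a privacy budget.
noisy = [
    private_count(pizzas, lambda p: p == "Pepperoni", epsilon=0.5, rng=rng)
    for _ in range(20000)
]
mean_noisy = sum(noisy) / len(noisy)
```

Smaller values of ε mean more noise and stronger privacy; larger values give more accurate answers with a weaker guarantee.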
Anonymity in Social Networks
Peter S. Bearman, James Moody, and Katherine Stovel, Chains of
affection: The structure of adolescent romantic and sexual networks,
American Journal of Sociology 110, 44-91 (2004).
http://www-personal.umich.edu/~mejn/networks/addhealth.gif
High School Dating Network
Information-rich Network Structure
Backstrom, L., & Kleinberg, J. (2013). Romantic Partnerships and the Dispersion of Social Ties: A Network
Analysis of Relationship Status on Facebook. arXiv preprint arXiv:1310.6753.
Attacks on Social Networks
● Passive: Find yourselves
● Active: structural steganography
http://www.cse.psu.edu/~asmith/courses/privacy598d/www/lec-notes/Attacking%20Social%20Network%20FINAL.pdf
Planted subgraph must have no other isomorphic copy in the network
Planted subgraph must have no non-trivial automorphism
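The automorphism condition can be checked by brute force on small graphs. The sketch below is illustrative only: it tries every relabeling, so it is usable just for a handful of vertices, and the function name and both example graphs are assumptions rather than material from the lecture notes.

```python
from itertools import permutations

def automorphisms(nodes, edges):
    """All vertex relabelings that map the edge set onto itself (brute force, tiny graphs only)."""
    edge_set = {frozenset(e) for e in edges}
    found = []
    for perm in permutations(nodes):
        mapping = dict(zip(nodes, perm))
        if {frozenset((mapping[u], mapping[v])) for u, v in edges} == edge_set:
            found.append(mapping)
    return found

# A path a-b-c-d can be flipped end to end, so an attacker who planted it
# could not tell which end is which after names are stripped.
path_nodes = ["a", "b", "c", "d"]
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
path_auts = automorphisms(path_nodes, path_edges)

# A path 1-2-3-4-5 with an extra vertex 6 attached to 2 and 3 has only the
# identity automorphism, so every planted vertex can be told apart by structure.
asym_nodes = [1, 2, 3, 4, 5, 6]
asym_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (6, 2), (6, 3)]
asym_auts = automorphisms(asym_nodes, asym_edges)
```

The first graph has two automorphisms (the identity and the flip); the second has only the identity, which is the property an active attacker wants from the subgraph they plant.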
Obfuscating Social Networks
Zhou and Pei, KAIS, 2011
Step 1: Construct Min-DFS Tree for Each Neighborhood
Zhou and Pei, KAIS, 2011
2 Useful Properties
1. Vertex degrees in social networks follow a power-law distribution: most vertices have few neighbors and a few have very many
2. Social networks typically have a small diameter (6 degrees of separation)
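Both properties can be measured directly on a graph. The following is a small sketch on a made-up graph, with hypothetical helper names; it shows how a degree histogram and a breadth-first-search diameter are computed, not how the anonymization algorithm uses them.

```python
from collections import Counter, deque

def degree_histogram(adj):
    """Map degree -> number of vertices with that degree."""
    return Counter(len(neighbors) for neighbors in adj.values())

def eccentricity(adj, source):
    """Longest shortest-path distance from `source`, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def diameter(adj):
    """Largest eccentricity over all vertices (assumes a connected graph)."""
    return max(eccentricity(adj, v) for v in adj)

# Toy graph: one hub with five spokes, one of which has an extra friend.
adj = {
    "hub": {"a", "b", "c", "d", "e"},
    "a": {"hub", "f"},
    "b": {"hub"}, "c": {"hub"}, "d": {"hub"}, "e": {"hub"},
    "f": {"a"},
}
hist = degree_histogram(adj)
diam = diameter(adj)
```

Even in this tiny example most vertices have degree 1 while a single hub has degree 5, and no two vertices are more than three hops apart.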
Step 2: Anonymize Similar Vertices
Zhou and Pei, KAIS, 2011
Step 3: ??? => Step 4: Profit!
Zhou and Pei, KAIS, 2011
thanks
bye
