Differential Privacy:
Case Studies
Denny Lee, Microsoft SQLCAT Team | Best Practices
Case Studies
 Quantitative Case Study:
 Windows Live / MSN Web Analytics data
 Qualitative Case Study:
 Clinical Physicians Perspective
 Future Study
 OHSU/CORI data set to apply differential privacy to
a healthcare setting
Sanitization Concept
 Mask individuals within the data by creating a sanitization
point between user interface and data.
 The magnitude of the noise is given by the theorem. If many
queries f₁, f₂, … are to be made, noise proportional to Σᵢ Δfᵢ
suffices. For many sequences, we can often use less noise
than Σᵢ Δfᵢ. Note that Δ Histogram = 1, independent of the
number of cells.
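As a minimal sketch of the sanitization point, the following assumes the Laplace mechanism commonly used for differentially private histograms: each cell count receives independent noise with scale Δf/ε, with Δf = 1 as noted above. The slides do not name the noise distribution or a privacy parameter, so `epsilon`, `laplace_noise`, `noisy_histogram`, and the sample counts are illustrative assumptions.

```python
import random

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def noisy_histogram(counts, epsilon, sensitivity=1.0, seed=None):
    """Return counts with independent Laplace noise added to each cell.

    Assumption: Laplace mechanism with scale = sensitivity / epsilon.
    sensitivity=1.0 reflects the note that Δ Histogram = 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    # Round so the report still shows whole numbers.
    return [int(round(c + laplace_noise(scale, rng))) for c in counts]

# Hypothetical true counts for three histogram cells.
true_counts = [36, 22, 102]
print(noisy_histogram(true_counts, epsilon=1.0, seed=0))
```

A smaller `epsilon` means more noise and stronger masking. The fixed `seed` is only for a reproducible demonstration and would not be used in a real report.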
6/3/2016
Generating the noise
 To generate the noise, a pseudo-random number
generator will create a stream of numbers, e.g.:
0 0 1 1 1 … 1 0 0 0 0 1
 The resulting translation of this stream is:
- . 2 + 1 … + . . . . 6
Adding noise
Category  Value  Value with noise
A         36     34
B         22     23
…         …      …
N         102    108
• The stream of numbers above is applied
to the result set.
• While masking the individuals, it allows
accurate percentages and trending.
• Presuming the magnitude is small (i.e.
small error), the numbers are
themselves accurate within an
acceptable margin.
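To make "small error" concrete, the sketch below computes the relative error for the values shown in the Adding noise example above. It assumes the first set of values is the original result and the second is the result after noise. The elided rows ("…") are left out, and `relative_error` is a hypothetical helper name.

```python
# Values shown in the example above; the elided rows are omitted.
original = {"A": 36, "B": 22, "N": 102}
noisy = {"A": 34, "B": 23, "N": 108}

def relative_error(true_value, reported_value):
    """Absolute difference as a fraction of the true value."""
    return abs(reported_value - true_value) / true_value

errors = {k: relative_error(original[k], noisy[k]) for k in original}
for category, err in errors.items():
    print(f"{category}: {original[category]} -> {noisy[category]} ({err:.1%} off)")
```

For these three rows every category moves by less than 6 percent, which is the kind of margin an analyst would need to judge as acceptable or not for trending and percentage reports.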
Windows Live User Data
 Our initial case study is based on Windows Live
user data:
 550 million Passport users
 Passport has web site visitor self-reported data: gender, birth
date, occupation, country, zip code, etc.
 Web data has: IP address, pages viewed, page view duration,
browser, operating system, etc.
 Created two groups for this case study to study the
acceptability / applicability of differential privacy within
the WL reporting context:
 WL Sampled Users Web Analytics
 Customer Churn Analytics
Windows Live Example Report
 As per below, you can see the effect on the data
Sampled Users Web Analytics
Group
 New solution built on top of an existing Windows
Live web analytics solution to provide a sample
specific to Passport users.
 Built on top of an OLAP database to provide analysts
with a way to view the data from multiple dimensions.
 Built as well to showcase the privacy preserving
histogram for various teams including Channels,
Search, and Money.
Web Analytics Group Feedback
Country        Visitors
United States  202
Canada         31

Country        Gender  Visitors
United States  Female  128
               Male    75
               Total   203
Canada         Female  15
               Male    15
               Total   30
 Feedback was negative because customers
could not accept any amount of error.
 This group had been using reporting
systems for over two years that had
perceived accuracy issues.
 They were adamant that all of the totals
matched; the difference on the right was
not acceptable even though this data was
not used for financial reconciliation.
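The mismatch this group objected to can be checked directly from the two reports above. A likely cause, assumed here rather than stated in the slides, is that each reported cell receives its own noise, so the gender breakdown need not sum to the country-level figure. The function name `total_discrepancies` is illustrative.

```python
# Figures copied from the two reports above.
country_totals = {"United States": 202, "Canada": 31}
by_gender = {
    "United States": {"Female": 128, "Male": 75},
    "Canada": {"Female": 15, "Male": 15},
}

def total_discrepancies(totals, breakdown):
    """Return, per country, (sum of breakdown cells) - (reported total)."""
    return {
        country: sum(breakdown[country].values()) - totals[country]
        for country in totals
    }

discrepancies = total_discrepancies(country_totals, by_gender)
print(discrepancies)  # {'United States': 1, 'Canada': -1}
```

A difference of one visitor in either direction is small relative to the counts, yet it is exactly the kind of visible inconsistency that this group would not accept.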
Customer Churn Analysis
Group
 This reporting solution provided an OLAP cube, based on an
existing targeted marketing system, to allow analysts to
understand how services (Messenger, Mail, Search, Spaces,
etc.) are being used.
 A key difference between the groups is that this group did not
have access to any reporting (though it was requested for
many months).
 Within a few weeks of their initial request, CCA customers
received a working beta that let them interact with the data,
validate it, and provide feedback on its precision and accuracy.
Discussion
 The collaborative effort led to the customer
trusting the data, a key difference in comparison to
the first group.
 Because of this trust, the small amount of error
introduced into the system to ensure customer
privacy was well within a tolerable error margin.
 The CCA group is in direct marketing and hence had to
deal more regularly with customer privacy.
An important component to the
acceptance of privacy algorithms is
the users’ trust of the data.
Clinical Researchers Perceptions
 A pilot qualitative study on the perceptions of clinical
researchers was recently completed.
 It has noted three categories of six themes:
 Unaffected Statistics
 Understanding the privacy algorithms
 Can get back to the original data
 Understanding the purpose of the privacy algorithms
 Management ROI
 Protecting Patient Privacy
Unaffected Statistics
 The most important point – no point applying privacy
if we get faulty statistics.
 The primary concern is that healthcare studies involve smaller
numbers of patients than other studies.
 We are currently planning to provide in the near
future a healthcare template for the use of these
algorithms.
Understanding the privacy algorithms
 As we have done in these slides, we have described
the mathematics behind these algorithms only
briefly.
 But most clinical researchers are willing to accept the
science behind them without necessarily
understanding them.
 While this is good, it does pose the problem that one
may implement them without understanding them, and so
guarantee the privacy of patients incorrectly.
Can get back to the original data
 It is very important to get back to the original data set
if so required.
 Many existing privacy algorithms perturb the data so
while guaranteeing the privacy of an individual, it is
impossible to get back to the individual.
 Healthcare research always requires the ability to get
back to the original data to potentially inform
patients of new outcomes.
 The privacy preserving data analysis approach here
will allow this ability.
Understanding the purpose of the privacy
algorithms
 Most educated healthcare professionals understand
the issues, and providing case studies such as the Gov.
Weld case makes this more apparent.
 But we will still want to provide well-worded text
and/or confidence intervals below a chart or report
that has privacy algorithms applied.
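One way such a confidence statement could be derived is sketched below. It assumes the noise follows a Laplace distribution with scale b = Δf/ε, for which P(|noise| ≤ t) = 1 − exp(−t/b). The slides do not specify the distribution, so `laplace_margin`, `caption`, `epsilon`, and the caption wording are all hypothetical.

```python
import math

def laplace_margin(epsilon, confidence=0.95, sensitivity=1.0):
    """Half-width t such that |noise| <= t with the given probability.

    Assumption: noise ~ Laplace(0, b) with b = sensitivity / epsilon,
    for which P(|noise| <= t) = 1 - exp(-t / b).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    scale = sensitivity / epsilon
    return -scale * math.log(1.0 - confidence)

def caption(reported, epsilon, confidence=0.95):
    """Build a hypothetical footnote to place under a chart or report."""
    margin = laplace_margin(epsilon, confidence)
    return (f"Reported value {reported} includes privacy noise; "
            f"true value is within ±{margin:.1f} with {confidence:.0%} confidence.")

print(caption(202, epsilon=1.0))
```

The margin applies to the unrounded noise on a single reported cell. A report that combines several noised cells would need a wider interval.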
Management ROI
 We should be limiting the number of users who need
access to full data. So is there a good return-on-
investment to provide this extra step if you can
securely authorize the right people to access this
data?
 This is where standards from IRB, privacy & security
steering committees, and the government get
involved.
 Most importantly: the ability to share data.
Protecting Patient Privacy
For us to be able to analyze and mine
medical data so we can help patients
as well as lower the costs of
healthcare, we must first ensure
patient privacy.
Future Collaboration
 As noted above, we are currently working with OHSU
to build a template for the application of these
privacy algorithms to healthcare.
 For more information and/or interest in participating
in future application research, please email Denny
Lee at dennyl@microsoft.com.
Thanks
 Thanks to Sally Allwardt for helping implement the
privacy preserving histogram algorithm used in this
case study.
 Thanks to Kristina Behr, Lead Marketing Manager, for
all of her help and feedback with this case study.
Practical Privacy: The SuLQ Framework
 Reference paper “Practical Privacy: The SuLQ
Framework”
 Conceptually, this application of privacy can be
applied to:
 Principal component analysis
 k-means clustering
 ID3 algorithm
 Perceptron algorithm
 Apparently, all algorithms in the statistical queries learning
model.

Differential Privacy Case Studies (CMU-MSR Mindswap on Privacy 2007)

Big Data Analytics_Unit1.pptx
Big Data Analytics_Unit1.pptxBig Data Analytics_Unit1.pptx
Big Data Analytics_Unit1.pptx
PrabhaJoshi4
 
Running title TRENDS IN COMPUTER INFORMATION SYSTEMS1TRENDS I.docx
Running title TRENDS IN COMPUTER INFORMATION SYSTEMS1TRENDS I.docxRunning title TRENDS IN COMPUTER INFORMATION SYSTEMS1TRENDS I.docx
Running title TRENDS IN COMPUTER INFORMATION SYSTEMS1TRENDS I.docx
anhlodge
 
CS309A Final Paper_KM_DD
CS309A Final Paper_KM_DDCS309A Final Paper_KM_DD
CS309A Final Paper_KM_DDDavid Darrough
 
Big Data Risks and Rewards (good length and at least 3-4 references .docx
Big Data Risks and Rewards (good length and at least 3-4 references .docxBig Data Risks and Rewards (good length and at least 3-4 references .docx
Big Data Risks and Rewards (good length and at least 3-4 references .docx
tangyechloe
 
Big data analytics and its impact on internet users
Big data analytics and its impact on internet usersBig data analytics and its impact on internet users
Big data analytics and its impact on internet users
Struggler Ever
 
Descriptive Statistics and Interpretation Grading GuideQNT5.docx
Descriptive Statistics and Interpretation Grading GuideQNT5.docxDescriptive Statistics and Interpretation Grading GuideQNT5.docx
Descriptive Statistics and Interpretation Grading GuideQNT5.docx
theodorelove43763
 
Data Science-Data Analytics
Data Science-Data AnalyticsData Science-Data Analytics
Data Science-Data AnalyticsAlexander Kolker
 
1.  Patient Safety is a health care professionals’ duty. A sur.docx
1.    Patient Safety is a health care professionals’ duty. A sur.docx1.    Patient Safety is a health care professionals’ duty. A sur.docx
1.  Patient Safety is a health care professionals’ duty. A sur.docx
SONU61709
 
Achieving Privacy in Publishing Search logs
Achieving Privacy in Publishing Search logsAchieving Privacy in Publishing Search logs
Achieving Privacy in Publishing Search logs
IOSR Journals
 
Big Data Analytics: Challenges And Applications For Text, Audio, Video, And S...
Big Data Analytics: Challenges And Applications For Text, Audio, Video, And S...Big Data Analytics: Challenges And Applications For Text, Audio, Video, And S...
Big Data Analytics: Challenges And Applications For Text, Audio, Video, And S...
IJSCAI Journal
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
gerogepatton
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
ijscai
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
gerogepatton
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
gerogepatton
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
gerogepatton
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
ijscai
 
Self-service analytics risk_September_2016
Self-service analytics risk_September_2016Self-service analytics risk_September_2016
Self-service analytics risk_September_2016Leigh Ulpen
 
Big data analytics for life insurers
Big data analytics for life insurersBig data analytics for life insurers
Big data analytics for life insurers
dipak sahoo
 
Big_data_analytics_for_life_insurers_published
Big_data_analytics_for_life_insurers_publishedBig_data_analytics_for_life_insurers_published
Big_data_analytics_for_life_insurers_publishedShradha Verma
 
Predictive analytics-white-paper
Predictive analytics-white-paperPredictive analytics-white-paper
Predictive analytics-white-paperShubhashish Biswas
 

Similar to Differential Privacy Case Studies (CMU-MSR Mindswap on Privacy 2007) (20)

Big Data Analytics_Unit1.pptx
Big Data Analytics_Unit1.pptxBig Data Analytics_Unit1.pptx
Big Data Analytics_Unit1.pptx
 
Running title TRENDS IN COMPUTER INFORMATION SYSTEMS1TRENDS I.docx
Running title TRENDS IN COMPUTER INFORMATION SYSTEMS1TRENDS I.docxRunning title TRENDS IN COMPUTER INFORMATION SYSTEMS1TRENDS I.docx
Running title TRENDS IN COMPUTER INFORMATION SYSTEMS1TRENDS I.docx
 
CS309A Final Paper_KM_DD
CS309A Final Paper_KM_DDCS309A Final Paper_KM_DD
CS309A Final Paper_KM_DD
 
Big Data Risks and Rewards (good length and at least 3-4 references .docx
Big Data Risks and Rewards (good length and at least 3-4 references .docxBig Data Risks and Rewards (good length and at least 3-4 references .docx
Big Data Risks and Rewards (good length and at least 3-4 references .docx
 
Big data analytics and its impact on internet users
Big data analytics and its impact on internet usersBig data analytics and its impact on internet users
Big data analytics and its impact on internet users
 
Descriptive Statistics and Interpretation Grading GuideQNT5.docx
Descriptive Statistics and Interpretation Grading GuideQNT5.docxDescriptive Statistics and Interpretation Grading GuideQNT5.docx
Descriptive Statistics and Interpretation Grading GuideQNT5.docx
 
Data Science-Data Analytics
Data Science-Data AnalyticsData Science-Data Analytics
Data Science-Data Analytics
 
1.  Patient Safety is a health care professionals’ duty. A sur.docx
1.    Patient Safety is a health care professionals’ duty. A sur.docx1.    Patient Safety is a health care professionals’ duty. A sur.docx
1.  Patient Safety is a health care professionals’ duty. A sur.docx
 
Achieving Privacy in Publishing Search logs
Achieving Privacy in Publishing Search logsAchieving Privacy in Publishing Search logs
Achieving Privacy in Publishing Search logs
 
Big Data Analytics: Challenges And Applications For Text, Audio, Video, And S...
Big Data Analytics: Challenges And Applications For Text, Audio, Video, And S...Big Data Analytics: Challenges And Applications For Text, Audio, Video, And S...
Big Data Analytics: Challenges And Applications For Text, Audio, Video, And S...
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
 
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
BIG DATA ANALYTICS: CHALLENGES AND APPLICATIONS FOR TEXT, AUDIO, VIDEO, AND S...
 
Self-service analytics risk_September_2016
Self-service analytics risk_September_2016Self-service analytics risk_September_2016
Self-service analytics risk_September_2016
 
Big data analytics for life insurers
Big data analytics for life insurersBig data analytics for life insurers
Big data analytics for life insurers
 
Big_data_analytics_for_life_insurers_published
Big_data_analytics_for_life_insurers_publishedBig_data_analytics_for_life_insurers_published
Big_data_analytics_for_life_insurers_published
 
Predictive analytics-white-paper
Predictive analytics-white-paperPredictive analytics-white-paper
Predictive analytics-white-paper
 

More from Denny Lee

Azure Cosmos DB: Globally Distributed Multi-Model Database Service
Azure Cosmos DB: Globally Distributed Multi-Model Database ServiceAzure Cosmos DB: Globally Distributed Multi-Model Database Service
Azure Cosmos DB: Globally Distributed Multi-Model Database Service
Denny Lee
 
Spark to DocumentDB connector
Spark to DocumentDB connectorSpark to DocumentDB connector
Spark to DocumentDB connector
Denny Lee
 
Introduction to Azure DocumentDB
Introduction to Azure DocumentDBIntroduction to Azure DocumentDB
Introduction to Azure DocumentDB
Denny Lee
 
SQL Server Integration Services Best Practices
SQL Server Integration Services Best PracticesSQL Server Integration Services Best Practices
SQL Server Integration Services Best Practices
Denny Lee
 
SQL Server Reporting Services: IT Best Practices
SQL Server Reporting Services: IT Best PracticesSQL Server Reporting Services: IT Best Practices
SQL Server Reporting Services: IT Best Practices
Denny Lee
 
Introduction to Microsoft's Big Data Platform and Hadoop Primer
Introduction to Microsoft's Big Data Platform and Hadoop PrimerIntroduction to Microsoft's Big Data Platform and Hadoop Primer
Introduction to Microsoft's Big Data Platform and Hadoop Primer
Denny Lee
 
Yahoo!, Big Data, and Microsoft BI: Bigger and Better Together
Yahoo!, Big Data, and Microsoft BI: Bigger and Better TogetherYahoo!, Big Data, and Microsoft BI: Bigger and Better Together
Yahoo!, Big Data, and Microsoft BI: Bigger and Better Together
Denny Lee
 
SQL Server Reporting Services Disaster Recovery webinar
SQL Server Reporting Services Disaster Recovery webinarSQL Server Reporting Services Disaster Recovery webinar
SQL Server Reporting Services Disaster Recovery webinar
Denny Lee
 
Building and Deploying Large Scale SSRS using Lessons Learned from Customer D...
Differential Privacy Case Studies (CMU-MSR Mindswap on Privacy 2007)

  • 1. Differential Privacy: Case Studies Denny Lee, Microsoft SQLCAT Team | Best Practices
  • 2. Case Studies • Quantitative Case Study: Windows Live / MSN Web Analytics data • Qualitative Case Study: Clinical Physicians Perspective • Future Study: OHSU/CORI data set to apply differential privacy to Healthcare setting
  • 3. Sanitization Concept • Mask individuals within the data by creating a sanitization point between user interface and data. • The magnitude of the noise is given by the theorem. If many queries f₁, f₂, … are to be made, noise proportional to Σᵢ Δfᵢ suffices. For many sequences, we can often use less noise than Σᵢ Δfᵢ. Note that Δ Histogram = 1, independent of number of cells.
  • 4. Generating the noise • To generate the noise, a pseudo-random number generator will create a stream of numbers, e.g.: 0 0 1 1 1 … 1 0 0 0 0 1 • The resulting translation of this stream is: - . 2 + 1 … + . . . . 6
  • 5. Adding noise • Category / Value: A 36, B 22, …, N 102 → (noise) → Category / Value: A 34, B 23, …, N 108 • The stream of numbers above is applied to the result set. • While masking the individuals, it allows accurate percentages and trending. • Presuming the magnitude is small (i.e. small error), the numbers are themselves accurate within an acceptable margin.
  • 6. Windows Live User Data • Our initial case study is based on Windows Live user data: • 550 million Passport users • Passport has web site visitor self-reported data: gender, birth date, occupation, country, zip code, etc. • Web data has: IP address, pages viewed, page view duration, browser, operating system, etc. • Created two groups for this case study to study the acceptability / applicability of differential privacy within the WL reporting context: • WL Sampled Users Web Analytics • Customer Churn Analytics
  • 7. Windows Live Example Report • As per below, you can see the effect on the data
  • 8. Sampled Users Web Analytics Group • New solution built on top of an existing Windows Live web analytics solution to provide a sample specific to Passport users. • Built on top of an OLAP database to allow analysts to view the data from multiple dimensions. • Built as well to showcase the privacy preserving histogram for various teams including Channels, Search, and Money.
  • 9. Web Analytics Group Feedback • Visitors by country: United States 202, Canada 31. • Visitors by country and gender: United States: Female 128, Male 75, Total 203; Canada: Female 15, Male 15, Total 30. • Feedback was negative because customers could not accept any amount of error. • This group had been using reporting systems for over two years that had perceived accuracy issues. • They were adamant that all of the totals matched; the difference on the right was not acceptable even though this data was not used for financial reconciliation.
  • 10. Customer Churn Analysis Group • This reporting solution provided an OLAP cube, based on an existing targeted marketing system, to allow analysts to understand how services (Messenger, Mail, Search, Spaces, etc.) are being used. • A key difference between the groups is that this group did not have access to any reporting (though it was requested for many months). • Within a few weeks of their initial request, CCA customers received a working beta in which they were able to interact, validate, and provide feedback on the precision and accuracy of the data.
  • 11. Discussion • The collaborative effort led to the customer trusting the data, a key difference in comparison to the first group. • Because of this trust, the small amount of error introduced into the system to ensure customer privacy was well within a tolerable error margin. • The CCA group is in direct marketing and hence had to deal more regularly with customer privacy.
  • 12. An important component to the acceptance of privacy algorithms is the users’ trust of the data.
  • 13. Clinical Researchers Perceptions • A pilot qualitative study on the perceptions of clinical researchers was recently completed. • It has noted three categories of six themes: • Unaffected Statistics • Understanding the privacy algorithms • Can get back to the original data • Understanding the purpose of the privacy algorithms • Management ROI • Protecting Patient Privacy
  • 14. Unaffected Statistics • The most important point: there is no point applying privacy if we get faulty statistics. • The primary concern is that healthcare studies involve smaller numbers of patients than other studies. • We are currently planning to provide in the near future a healthcare template for the use of these algorithms.
  • 15. Understanding the privacy algorithms • As we have done in these slides, we have described the mathematics behind these algorithms only briefly. • But most clinical researchers are willing to accept the science behind them without necessarily understanding them. • While this is good, it does pose the problem that one may implement them without understanding them, thereby incorrectly guaranteeing the privacy of patients.
  • 16. Can get back to the original data • It is very important to get back to the original data set if so required. • Many existing privacy algorithms perturb the data so that, while the privacy of an individual is guaranteed, it is impossible to get back to the individual. • Healthcare research always requires the ability to get back to the original data to potentially inform patients of new outcomes. • The privacy preserving data analysis approach here will allow this ability.
  • 17. Understand the purpose of the privacy algorithms • Most educated healthcare professionals understand the issues, and providing case studies such as the Gov. Weld case makes this more apparent. • But we will still want to provide well-worded text and/or confidence intervals below a chart or report that has privacy algorithms applied.
  • 18. Management ROI • We should be limiting the number of users who need access to full data. So is there a good return-on-investment to provide this extra step if you can securely authorize the right people to access this data? • This is where standards from IRB, privacy & security steering committees, and the government get involved. • Most importantly: the ability to share data.
  • 19. Protecting Patient Privacy For us to be able to analyze and mine medical data so we can help patients as well as lower the costs of healthcare, we must first ensure patient privacy.
  • 20. Future Collaboration • As noted above, we are currently working with OHSU to build a template for the application of these privacy algorithms to healthcare. • For more information and/or interest in participating in future application research, please email Denny Lee at dennyl@microsoft.com.
  • 21. Thanks • Thanks to Sally Allwardt for helping implement the privacy preserving histogram algorithm used in this case study. • Thanks to Kristina Behr, Lead Marketing Manager, for all of her help and feedback with this case study.
  • 22. Practical Privacy: The SuLQ Framework • Reference paper “Practical Privacy: The SuLQ Framework” • Conceptually, this application of privacy can be applied to: • Principal component analysis • k-means clustering • ID3 algorithm • Perceptron algorithm • Apparently, all algorithms in the statistical queries learning model.
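
The noise-addition step described on slides 3 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the case study: it assumes Laplace noise with scale Δ/ε, takes Δ = 1 for a histogram as slide 3 states, and picks ε = 0.5 arbitrarily because the slides give no value. All function names are hypothetical. The last helper computes the half-width of an interval that contains the unrounded noise with a chosen probability, which is the kind of figure slide 17 suggests printing below a report.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(counts, epsilon, rng, sensitivity=1.0):
    # Assumption: scale = sensitivity / epsilon. Sensitivity 1 follows
    # slide 3 ("Delta Histogram = 1, independent of number of cells").
    scale = sensitivity / epsilon
    return {cell: round(value + laplace_noise(scale, rng))
            for cell, value in counts.items()}


def interval_half_width(epsilon, confidence=0.95, sensitivity=1.0):
    # For Laplace noise, P(|noise| > t) = exp(-t / scale), so the noise
    # stays within +/- t with probability `confidence` when
    # t = scale * ln(1 / (1 - confidence)).
    scale = sensitivity / epsilon
    return scale * math.log(1.0 / (1.0 - confidence))


rng = random.Random(0)                           # seeded only for repeatability
true_counts = {"A": 36, "B": 22, "N": 102}       # values from slide 5
released = noisy_histogram(true_counts, epsilon=0.5, rng=rng)
half_width = interval_half_width(epsilon=0.5)    # hypothetical epsilon
print(released, "+/-", round(half_width, 1))
```

A smaller ε widens both the noise and the reported interval, so the trade-off between privacy and the "acceptable margin" from slide 5 is explicit in one parameter.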

Editor's Notes

  1. This is based on the work of Cynthia Dwork and Frank McSherry from Microsoft Research (MSR). A carefully detailed algorithm is definitely important, and something we have and can show folks. Aside from the addition of noise, the main snafus are a) how much noise and b) where the randomness comes from. Both are fun and exciting questions that you could have neat policy answers to, but the safe answers are: a) standard deviation equal to the total number of queries and b) fresh randomness for every query. If they don't want to tell you the number of queries up front, then the standard deviation can be proportional to the square of the queries asked so far. By doing this, the algorithm will be able to address all attacks. Consequently, for each person, the increase in probability of them being attacked (or anyone else for that matter) due to the contribution of their data is nominal. The example given is foiled for two reasons: a) the addition of noise will (formally) complicate the polynomial reconstruction and b) the number of queries is limited by the degree of privacy guaranteed, and N is generally going to be way too many queries.
  2. The distribution used to create this noise can be Gaussian, which often works. To handle all situations, however, we should use distributions that provide more noise and/or are more complicated, such as Laplace (exponential), as noted in the previous slide.
  3. Windows Live User Data. Application: Windows Live can use the above data to provide customizable experiences for their users and understand how visitors are using these services. Microsoft is able to offer services like Search and Messenger at no charge to the consumer because the services are ad-funded, including ads that are targeted to be more relevant to the consumer. As the data is accumulated, it becomes easier to segment the population and potentially better identify individual users without directly using personally identifiable information. Potential issues: As noted above, the Windows Live user data has enough specifics to allow us to identify a web site visitor even through the aggregations. We need to worry about standard privacy issues: identity theft, fraud, and bad press (e.g. AOL releasing search queries which ended up being revealing of their users). If user expectations about privacy are not satisfied, consumers may no longer trust the services that we are so willing to provide.
  4. For example, reviewing the country Afghanistan, the “Unknown” value is 121561 in one case and 121599 in another. Because of the random noise, we do not know what the “real” value is.
  5. http://research.microsoft.com/research/sv/DatabasePrivacy/bdmn.pdf
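
The two "safe answers" in note 1 (noise sized to the number of queries declared up front, and fresh randomness for every query) can be sketched as a small sanitization point. This is an illustrative sketch only: the class name, the Laplace noise, and the rule scale = ΣΔf / ε are assumptions chosen to match the "noise proportional to Σᵢ Δfᵢ" statement on slide 3, and the case where the number of queries is not declared up front is not covered.

```python
import random


class NoisyQueryInterface:
    """Hypothetical sanitization point between the analyst and the data."""

    def __init__(self, sensitivities, epsilon, seed=None):
        # Assumption: every query gets noise scaled to the sum of the
        # sensitivities of all queries that will be asked.
        self._scale = sum(sensitivities) / epsilon
        self._remaining = len(sensitivities)
        self._rng = random.Random(seed)

    @property
    def scale(self):
        return self._scale

    def answer(self, true_value):
        # Refuse queries beyond the number declared up front.
        if self._remaining == 0:
            raise RuntimeError("query budget exhausted")
        self._remaining -= 1
        # Fresh randomness for every query: a new Laplace draw each time,
        # built as the difference of two exponentials with mean `scale`.
        rate = 1.0 / self._scale
        noise = self._rng.expovariate(rate) - self._rng.expovariate(rate)
        return true_value + noise


# Four count queries, each with sensitivity 1; epsilon = 2.0 is hypothetical.
interface = NoisyQueryInterface([1, 1, 1, 1], epsilon=2.0, seed=0)
print(interface.scale)
print([round(interface.answer(121561)) for _ in range(4)])
```

Asking the same question repeatedly returns a different value each time, which is the behaviour note 4 describes for the Afghanistan "Unknown" count, and the fifth call raises instead of answering.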