PriPeARL: A Framework for Privacy-Preserving Analytics and Reporting at LinkedIn
CIKM 2018
Krishnaram Kenthapadi, Thanh Tran
Data @ LinkedIn
Analytics Products at LinkedIn
Profile View Analytics
Content Analytics
Ad Campaign Analytics
All showing demographics
of members engaging with
the product
Product Requirements: Utility and Privacy
Utility
• Insights into the audience engaging with
the product (e.g., profile, article, or ad)
→ Desirable for the aggregate statistics
to be available and accurate.
• Different aspects of data consistency:
- Repeated queries
- Over time
- Total vs. Demographic breakdowns
- Hierarchy (e.g., time, entity)
Privacy
• Member actions could be considered
sensitive information (e.g., click on an
article or an ad).
→ Individual’s action cannot be
inferred from the results of analytics.
• Assume malicious use cases, e.g.,
attacker can set up ad campaigns to
infer the behavior of a certain member.
LMS
Application: LinkedIn Ads Analytics
Objective:
Compute robust, reliable analytics in a privacy-preserving
manner, while addressing the product desiderata such as utility,
coverage, and consistency.
[Diagram components: Advertiser, Ad, Ad Targeting, LI Ad Serving, Ad Analytics]
Possible Attacks
Targeting:
Senior directors in US, who studied at Cornell
Matches ~16k LinkedIn members
→ over minimum targeting threshold
Demographic breakdown:
E.g., company = X
Matches exactly one person
→ can determine whether the person
clicks on the ad or not
Enforcing minimum reporting threshold
Attacker could create fake profiles
E.g., if threshold is 10, create 9 fake
profiles that all click.
Rounding mechanism
E.g., report in increments of 10
Still amenable to attacks
E.g., using incremental counts over time
to infer individuals’ actions
Need rigorous techniques to preserve member privacy, not
revealing exact aggregate counts
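The fake-profile attack above can be made concrete with a small sketch. The function names, the threshold of 10, and the choice to round down are illustrative assumptions, not details taken from the slides; the point is only that a deterministic threshold or rounding rule lets an attacker who controls 9 clicks read off the tenth.

```python
from typing import Optional

def thresholded_report(true_count: int, threshold: int = 10) -> Optional[int]:
    """Report the count only when it reaches the minimum reporting threshold."""
    return true_count if true_count >= threshold else None

def rounded_report(true_count: int, step: int = 10) -> int:
    """Report the count rounded down to a multiple of `step` (assumed rounding rule)."""
    return (true_count // step) * step

FAKE_CLICKS = 9  # clicks from attacker-controlled fake profiles

def infer_target_click(report: Optional[int]) -> bool:
    """Attacker's inference: a report of 10 or more means the target member clicked."""
    return report is not None and report >= 10

for target_clicked in (0, 1):
    total = FAKE_CLICKS + target_clicked
    print(target_clicked,
          infer_target_click(thresholded_report(total)),
          infer_target_click(rounded_report(total)))
```

In both cases the attacker's guess matches the target's action exactly, which is why neither mechanism is sufficient on its own.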
Differential Privacy: Definition
● ε-Differential Privacy: For neighboring databases D and D’ (which differ by one record),
the distributions of the curator’s outputs on the two databases are nearly the same.
● Parameter ε (ε > 0) quantifies information leakage
○ Smaller ε, more private
Dwork, McSherry, Nissim, Smith [TCC 2006]
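In the standard formulation, "nearly the same" is made precise as follows. The symbols \mathcal{M} (the curator's randomized mechanism) and S (any set of possible outputs) do not appear on the slide and are introduced here for the statement only.

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S]
\quad \text{for all neighboring } D, D' \text{ and all output sets } S .
```

As ε approaches 0 the factor e^ε approaches 1, so the two output distributions become indistinguishable, which matches "smaller ε, more private".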
Differential Privacy: Random Noise Addition
● Achieving differential privacy via random noise addition.
● Common approach: noise drawn from the Laplace distribution.
○ Let s be the L1 sensitivity of the query function f:
$s = \max_{D, D'} \lVert f(D) - f(D') \rVert_1$, where D and D’ differ by one record,
○ and let ε be the privacy parameter.
○ Then the scale parameter of the Laplace distribution is s/ε.
Dwork, McSherry, Nissim, Smith [TCC 2006]
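A minimal sketch of Laplace noise addition, assuming a single numeric query answer. The function name and parameters are illustrative. The noise is sampled as the difference of two independent exponential draws with mean s/ε, which is one standard way to obtain a Laplace variate with scale s/ε.

```python
import random
from typing import Optional

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float,
                      rng: Optional[random.Random] = None) -> float:
    """Return true_value plus Laplace noise with scale sensitivity / epsilon."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    # The difference of two i.i.d. Exponential(rate = 1/scale) draws is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise

# A count query changes by at most 1 when one record changes, so sensitivity = 1.
print(laplace_mechanism(100, sensitivity=1.0, epsilon=1.0))
```

With sensitivity 1 and ε = 1 the expected absolute noise is 1; halving ε doubles it.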
Ad Analytics Canonical Queries
SELECT COUNT(*)
FROM table(statType, entity)
WHERE timestamp ≥ startTime AND timestamp ≤ endTime
AND dAttr = dVal
(table(statType, entity): e.g., clicks on a given ad; dAttr = dVal: e.g., Title = “Senior Director”)
● Application admits a predetermined query form.
● This query form also applies to other analytics applications.
● Preserving privacy by adding Laplace noise
○ Protect privacy at the event level
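As a rough illustration of the canonical query form, the sketch below evaluates it over an in-memory list of events. The Event record and its field names are hypothetical stand-ins for the stat type, entity, timestamp, and demographic attribute in the query; a real deployment would run this against an analytics store.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

@dataclass
class Event:
    stat_type: str               # e.g. "click" or "impression"
    entity_id: str               # e.g. an ad or campaign id
    timestamp: int               # any comparable time value
    attributes: Dict[str, str]   # demographic attributes of the acting member

def canonical_count(events: Iterable[Event], stat_type: str, entity_id: str,
                    start_time: int, end_time: int, d_attr: str, d_val: str) -> int:
    """COUNT(*) of matching events in [start_time, end_time] with d_attr = d_val."""
    return sum(
        1
        for e in events
        if e.stat_type == stat_type
        and e.entity_id == entity_id
        and start_time <= e.timestamp <= end_time
        and e.attributes.get(d_attr) == d_val
    )

events = [
    Event("click", "ad1", 5, {"title": "Senior Director"}),
    Event("click", "ad1", 15, {"title": "Senior Director"}),
    Event("impression", "ad1", 5, {"title": "Senior Director"}),
    Event("click", "ad1", 6, {"title": "Engineer"}),
]
print(canonical_count(events, "click", "ad1", 0, 10, "title", "Senior Director"))
```

Adding or removing one event changes this count by at most 1, which is the sensitivity used for event-level protection.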
PriPeARL: A Framework for Privacy-Preserving Analytics
Pseudo-random noise generation, inspired by differential privacy
Inputs to noise generation:
● Entity id (creative/campaign/campaign group/account)
● Demographic dimension
● Stat type (impressions, clicks)
● Time range
● Fixed secret seed
Pipeline: cryptographic hash of the inputs → normalize to (0,1) to get a uniformly random fraction → Laplace noise (fixed ε) → random noise; true count + random noise → reported count
To satisfy consistency requirements
● Pseudo-random noise → the same query returns the same result over time, which avoids averaging attacks.
● For non-canonical queries (e.g., time ranges, aggregates over multiple entities)
○ Use the hierarchy to partition them into canonical queries
○ Compute noise for each canonical query and sum up the noisy counts
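The pipeline on this slide can be sketched as below. This is an illustration under assumptions, not the production algorithm: HMAC-SHA256 stands in for the unspecified cryptographic hash, the 52-bit normalization and all function and parameter names are hypothetical, sensitivity is taken to be 1, and any rounding or clamping of the reported count is left out.

```python
import hashlib
import hmac
import math
from typing import Mapping

def uniform_fraction(secret_seed: bytes, *query_parts: str) -> float:
    """Hash the query parameters with the secret seed and normalize into (0, 1)."""
    message = "\x1f".join(query_parts).encode("utf-8")
    digest = hmac.new(secret_seed, message, hashlib.sha256).digest()
    top52 = int.from_bytes(digest[:8], "big") >> 12   # keep 52 bits
    return (top52 + 0.5) / 2**52                      # strictly between 0 and 1

def laplace_from_fraction(u: float, epsilon: float, sensitivity: float = 1.0) -> float:
    """Map a uniform fraction to Laplace noise via the inverse CDF."""
    scale = sensitivity / epsilon
    p = u - 0.5
    return -scale * math.copysign(1.0, p) * math.log(1.0 - 2.0 * abs(p))

def noisy_count(true_count: int, secret_seed: bytes, entity_id: str, demographic: str,
                stat_type: str, time_range: str, epsilon: float) -> float:
    """Canonical query: the same inputs always yield the same reported count."""
    u = uniform_fraction(secret_seed, entity_id, demographic, stat_type, time_range)
    return true_count + laplace_from_fraction(u, epsilon)

def noisy_range_count(daily_counts: Mapping[str, int], secret_seed: bytes, entity_id: str,
                      demographic: str, stat_type: str, epsilon: float) -> float:
    """Non-canonical query: sum the noisy counts of its canonical (per-day) parts."""
    return sum(
        noisy_count(count, secret_seed, entity_id, demographic, stat_type, day, epsilon)
        for day, count in sorted(daily_counts.items())
    )

seed = b"example-secret-seed"  # placeholder; a real seed must be kept secret
print(noisy_count(42, seed, "campaign:123", "title=Senior Director", "clicks", "2018-10-01", 1.0))
```

Because the noise depends only on the query parameters and the seed, repeating a query cannot average the noise away, and a multi-day total always equals the sum of the per-day values it is built from.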
System Architecture
Implemented and integrated into the Ads Analytics product.
Can be used for general analytics products.
Performance Evaluation: Setup
● Experiments using LinkedIn ad analytics data
○ Consider distribution of impression and click queries
across (account, ad campaign) and demographic
breakdowns.
● Examine
○ Tradeoff between privacy and utility
○ Effect of varying minimum threshold (non-negative)
○ Top-n queries
Performance Evaluation: Results
Privacy and Utility Tradeoff
● For ε = 1, average absolute and signed errors
are small for both queries.
● Variance is also small: ~95% of queries have an error of at most 2.
Top-N Queries
● Common use case in LinkedIn applications.
● Jaccard distance as a function of ε and n.
● (This shows the worst case, as queries with
return sets of size ≤ n and error = 0 were omitted.)
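For reference, the Jaccard distance between two top-n sets can be computed as in the sketch below. The slides do not spell out how the sets were built; this assumes the comparison is between the top-n items ranked by true counts and the top-n ranked by noisy counts, and the helper names and sample data are illustrative.

```python
from typing import Dict, Set

def top_n(counts: Dict[str, float], n: int) -> Set[str]:
    """Keys of the n largest counts, with ties broken by key for determinism."""
    ranked = sorted(counts, key=lambda k: (-counts[k], k))
    return set(ranked[:n])

def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |a ∩ b| / |a ∪ b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

true_counts = {"a": 50, "b": 40, "c": 30, "d": 5}
noisy_counts = {"a": 49, "b": 41, "c": 3, "d": 31}   # made-up noisy values
print(jaccard_distance(top_n(true_counts, 3), top_n(noisy_counts, 3)))
```

A distance of 0 means the noisy ranking returns exactly the true top-n set; a distance of 1 means the two sets share no items.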
Lessons Learned
● Lessons from privacy breaches → need “Privacy by Design”
● Consider business requirements and usability
○ Various consistency desiderata to ensure results are useful and insightful
● Scaling across analytics applications
○ Abstract away application specifics, build libraries, and optimize for
performance
Acknowledgements
▹ Team:
▸ AI/ML: Krishnaram Kenthapadi, Thanh T. L. Tran
▸ Ad Analytics Product & Engineering: Mark Dietz, Taylor Greason, Ian
Koeppe
▸ Legal / Security: Sara Harrington, Sharon Lee, Rohit Pitke
▹ Additional Acknowledgements
▸ Deepak Agarwal, Igor Perisic, Arun Swami, Ya Xu, Yang Zhou
Summary
▹ Framework to compute robust, privacy-preserving analytics
▸ Addressing challenges such as preserving member privacy, product
coverage, utility, and data consistency
▹ Future
▸ Utility maximization problem given constraints on the ‘privacy loss budget’ per user
⬩ E.g., noise with larger variance to impressions but less noise to clicks (or
conversions)
⬩ E.g., more noise to broader time range sub-queries and less noise to granular
time range sub-queries
▹ Tech Report: K. Kenthapadi, T. Tran, PriPeARL: A Framework for Privacy-Preserving
Analytics and Reporting at LinkedIn, ACM CIKM 2018
(https://arxiv.org/pdf/1809.07754)
What’s Next: Privacy for ML / Data Applications
▹ Hard open questions
▸ Can we simultaneously develop highly personalized models
and ensure that the models do not encode private information
of members?
▸ How do we guarantee member privacy over time without
exhausting the “privacy loss budget”?
▸ How do we enable privacy-preserving mechanisms for data
marketplaces?
▹ Thanks!
Appendix
Algorithm for Computing Noisy Analytics
Performance Evaluation: Results
Varying minimum thresholds
