In:Confidence 2019 - Regulation breeds innovation

Jason McFall, CTO, Privitar, talks about how privacy engineering can make data analysis easier and more robust on the In:Confidence 2019 main stage (April 4th at Printworks, London).

Published in: Technology

  1. Privitar.com. Regulation breeds innovation: How privacy engineering makes data analytics faster and easier. Jason McFall, CTO
  2. Privacy, Utility, Usability
  3. Privacy, Usability
  4. 1: Faster Access to Data
  5. No / Yes
  6. Privacy Engineering: Data Provisioning. [Diagram labels: Privacy Policy Assembly; Metadata Catalogue; ID Management and Access Rights; Data Request Service; DataSet Directory; Raw Data; Privacy Policy Execution; Secure Data Analytics Environment; Protected Data Domain]
  7. 2: Data Science Efficiency
  8. Location traces are highly sensitive, and hard to deidentify. [Map labels: Cell towers; The British Library; Green Park; South Bank; Sir John Soane’s Museum; Printworks; St. Paul’s Cathedral. Timestamps shown, all on 1/1/19: 10:35am, 10:07am, 11:12am, 10:44am, 2:01pm, 12:01pm]
  9. Source: Crowdflower/FigureEight Survey of data scientists
  10. Approved Standard Functions (applied to Sensitive Raw Data):

          # Calculate mean commute distance
          def commute_distance(location_trace):

          # evaluate brand loyalty
          def switching_propensity(transactions_vendor):

          # score braking safety
          def commute_distance(brake_sensor_data):

          # subsequence frequency
          def count_repeats(motif, gene_sequence):

      Analysis code:

          # Define evaluation function
          def evaluation_function(attributes, classes, batch_size):
              attributes = dict(attributes)
              if classes is None:
                  inputs = attributes
              else:
                  inputs = (attributes, classes)
              dataset = tf.data.Dataset.from_tensor_slices(inputs)
              dataset = dataset.batch(batch_size)
              return dataset.make_one_shot_iterator().get_next()

          # Evaluate the model.
          eval_result = classifier.evaluate(
              input_fn=lambda: evaluation_function(test_x, test_y, 100))

  11. 3: Machine Learning Robustness
  12. Cat / Dog
  13. Overfitting to fine details in the training set is harmful
  14. https://nypost.com/2017/11/08/these-adorable-cats-and-bunnies-are-purr-fect-doppelgangers/
  15. https://nypost.com/2017/11/08/these-adorable-cats-and-bunnies-are-purr-fect-doppelgangers/
  16. [Chart: Error against Model training cycles, with curves for Training Data and Test Data; annotations: "Stop here", "Overfitting"]
  17. 0.91, 0.88, 0.93, 0.89, 0.90. Source: Papernot et al., https://arxiv.org/pdf/1610.05755.pdf
  18. Privacy, Utility
  19. Summary: Deploying privacy engineering can make data analytics faster and easier. Faster Access To Data; Data Science Efficiency; Machine Learning Robustness
  20. Privacy, Usability. Design Privacy Engineering Into Systems Early
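
Slide 10 sets approved standard functions, which run against the sensitive raw data, apart from the analyst's own analysis code. The following is a minimal Python sketch of how such a split might look, not the speaker's or Privitar's implementation. The `APPROVED` registry, the `approved` decorator, `run_approved`, the trace format (a list of `(x_km, y_km)` points), the path-length definition of commute distance, and the toy data are all assumptions made for illustration; only the name `commute_distance(location_trace)` comes from the slide.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical registry of vetted functions (an assumption, not an API from the talk).
APPROVED: Dict[str, Callable] = {}


def approved(func: Callable) -> Callable:
    """Register func under its own name as an approved standard function."""
    APPROVED[func.__name__] = func
    return func


@approved
def commute_distance(location_trace: Sequence[Tuple[float, float]]) -> float:
    """Path length of one trace; points are assumed to be (x_km, y_km) pairs."""
    return sum(math.dist(a, b) for a, b in zip(location_trace, location_trace[1:]))


def run_approved(name: str, raw_records: Sequence) -> List:
    """Apply an approved function to every raw record and return only the
    derived values, so the caller never handles the raw records directly."""
    if name not in APPROVED:
        raise PermissionError(f"{name!r} is not an approved function")
    func = APPROVED[name]
    return [func(record) for record in raw_records]


# Assumed toy data standing in for sensitive raw location traces.
raw_traces = [
    [(0.0, 0.0), (3.0, 4.0)],
    [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
]

# Analysis code works on the derived values only.
distances = run_approved("commute_distance", raw_traces)
mean_distance = sum(distances) / len(distances)
```

In this sketch the analysis code receives one derived number per trace rather than the trace itself, and a request for a function outside the registry is rejected.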

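Slide 16 plots error against model training cycles for training and test data, and marks a "Stop here" point before overfitting sets in. A minimal sketch of that early-stopping rule follows, assuming the held-out error is recorded once per training cycle. The function name `early_stopping_cycle`, the `patience` parameter, and the error values are hypothetical and are not taken from the slide.

```python
from typing import Sequence


def early_stopping_cycle(held_out_errors: Sequence[float], patience: int = 2) -> int:
    """Return the index of the cycle with the lowest held-out error seen
    before the error fails to improve for `patience` consecutive cycles."""
    best_cycle = 0
    best_error = float("inf")
    for cycle, error in enumerate(held_out_errors):
        if error < best_error:
            # New best: remember where it happened.
            best_cycle, best_error = cycle, error
        elif cycle - best_cycle >= patience:
            # No improvement for `patience` cycles: stop looking.
            break
    return best_cycle


# Assumed illustrative values: held-out error falls, then rises as the model overfits.
held_out_errors = [0.52, 0.40, 0.33, 0.30, 0.32, 0.36, 0.41]
stop_at = early_stopping_cycle(held_out_errors)
```

With these assumed values the rule selects the cycle where the held-out error bottoms out, which corresponds to the "Stop here" marker on the chart.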