PUTTING DATA
SCIENCE INTO
PERSPECTIVE
PRIVACY &
ETHICS
SRAVAN ANKARAJU
FOUNDER & CEO
DIVERGENCE ACADEMY
June 29th, 2017
Is Fake News a Well-defined Machine Learning Problem?
Source
SRAVAN ANKARAJU
• Technology Leader focused on Strategy & Innovation, Risk and
Decision Management
• 13.5 years with Microsoft in Technology Integration consulting,
Developer Support – focus on DevOps & Agile Development
• Big Picture educator – start with concepts & then play to learn
advanced areas. Iterate and iterate often.
• Implementer of Gamification systems in Learning/Training
• Experienced in High volume & high performance transactional
systems.
• Data Analytics for various Fortune 100 companies
FOUNDER & PRESIDENT
CLAIM TO
FAME
WHY THIS
WHY NOW?
3 TRENDS SHAPING MACHINE LEARNING IN 2017
1. Algorithm Economy is on the Rise
2. Expect more Interaction between Machine and Humans
3. Giant companies will develop ML based AI systems
Source: http://www.datasciencecentral.com/profiles/blogs/trends-shaping-machine-learning-in-2017
EMERGING
TECHNOLOGY
HYPE CYCLE
FOR 2016
AGENDA
• What’s the catch if there is a ton of goodness in AI-based systems
• When do you get a human involved
• Global companies & Importance of May 25th, 2018
• Privacy Paradox & Distinctive aspects of Big Data
• Data Science Ethics Framework
• Where do you go from here / What you can do today
PUTTING DATA SCIENCE INTO PERSPECTIVE
WHAT IS
THE
CATCH?
To study possibly racist algorithms,
professors have to sue the US
http://arstechnica.com/tech-policy/2016/06/do-housing-jobs-sites-have-racist-algorithms-academics-sue-to-find-out/
http://moralmachine.mit.edu/
WHEN DO
YOU GET A
HUMAN
INVOLVED?
QUALITY ASSURANCE
When does securing AI against attacks or reverse-
engineering become more of an issue?
It’s an issue now. One of my biggest learnings from [chatbot] Tay was that
you need to build even AI that is resilient to attacks. It was fascinating to
see what happened on Twitter, but for instance we didn’t face the same
thing in China. Just the social conversation in China is different, and if you
put it in the US corpus it’s different. Then, of course, there was a concerted
attack. Just like you build software today that is resilient to a DDOS
(Distributed Denial of Service) attack, you need to be able to be resilient to
a corpus attack, that tries to pollute the corpus so that you pick up the
wrong thing in your AI learners.
- Satya Nadella, Microsoft CEO
HUMAN INTERVENTION
Whenever you have ambiguity and errors, you need to think
about how you put the human in the loop and escalate to the
human to make choices. That is the art form of an AI product. If
you have ambiguity and error rates, you have to be able to handle
exception. But first you have to detect that exception, and luckily
enough in AI you have confidence and probability and
distribution, so you have to use all of those to get the human in
the loop.
GOVERNANCE
- Satya Nadella, Microsoft CEO
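The escalation idea in the quote above can be sketched in a few lines. This is a minimal illustration, not a description of any Microsoft system: the `route` function, the `Decision` type, and the 0.80 threshold are all hypothetical names and values chosen for the example. It takes the class probabilities a model reports and flags the prediction for human review when the top probability is low.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical cut-off; a real product would tune this on validation data.
CONFIDENCE_THRESHOLD = 0.80


@dataclass
class Decision:
    label: str
    confidence: float
    needs_human: bool


def route(probabilities: Dict[str, float],
          threshold: float = CONFIDENCE_THRESHOLD) -> Decision:
    """Pick the most likely label and flag it for review when confidence is low."""
    label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    return Decision(label, confidence, needs_human=confidence < threshold)


if __name__ == "__main__":
    print(route({"approve": 0.95, "reject": 0.05}))  # handled automatically
    print(route({"approve": 0.55, "reject": 0.45}))  # escalated to a person
```

A single threshold is the simplest possible policy; the same shape extends to per-class thresholds or to checks on how far an input sits from the training distribution.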
GENERAL
DATA
PROTECTION
REGULATION
GDPR
Stricter rules will apply to the
collection and use of personal data.
Will apply from May 2018
GENERAL DATA PROTECTION REGULATION
Operational Impacts
1. Mandatory Data-breach Protection
2. Privacy Impact Assessments
3. Right to be Forgotten
4. Privacy by Design and Default
5. Mandatory Data Protection Officers
MANDATORY DATA-BREACH
PROTECTION
• Companies that experience data breaches will need to notify
regulators and individuals whose personal data was
compromised.
• Companies will most likely want to avoid the negative
publicity of these disclosures. Multinationals will gradually
ramp up:
• Comprehensive risk assessments
• End-to-end security enhancements
• Outsourced managed security services.
General Data Protection Regulation Operational Impact
PRIVACY IMPACT ASSESSMENTS
• Require companies to conduct data protection impact
assessments (DPIAs) where their data processing operations
are highly invasive.
• Include marketing activities based on advanced profiling and
analytics.
• Privacy operations may need to extend outside the legal
office, where it has traditionally resided, and into the day-to-
day processes of European businesses.
RIGHT TO BE FORGOTTEN
• Right to erasure could impose a significant burden on
companies with personal data stored across multiple
systems.
• Companies may need to –
• Maintain comprehensive data inventories
• Accelerate data-governance strategies
• Potentially re-architect key systems to process right-to-erasure
requests more efficiently.
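Why a data inventory matters for erasure can be shown with a small sketch. Everything here is hypothetical: the `inventory` layout (system name mapped to a store keyed by subject ID), the system names, and the `erase_subject` helper are illustrations, and real systems would sit behind databases or APIs rather than dictionaries.

```python
from typing import Dict, List

# Hypothetical inventory type: system name -> records keyed by data-subject ID.
Inventory = Dict[str, Dict[str, dict]]


def erase_subject(inventory: Inventory, subject_id: str) -> List[str]:
    """Delete one data subject from every inventoried system.

    Returns the names of the systems that actually held data for the subject,
    which can serve as a record of what the request touched.
    """
    touched = []
    for system, store in inventory.items():
        if store.pop(subject_id, None) is not None:
            touched.append(system)
    return touched


if __name__ == "__main__":
    demo: Inventory = {
        "crm": {"u1": {"email": "a@example.com"}, "u2": {"email": "b@example.com"}},
        "billing": {"u1": {"plan": "pro"}},
        "analytics": {"u3": {"visits": 4}},
    }
    print(erase_subject(demo, "u1"))  # systems that held data for u1
```

The loop only works because every system is listed in one place; personal data held in a system missing from the inventory would silently survive the request.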
PRIVACY BY DESIGN AND DEFAULT
• Privacy-friendly settings or postures, such as those that
minimize how much personal information is collected, retained,
and shared, will be built into new products, devices, and
business processes.
• The flip-side of DPIAs, privacy-design requirements may
give rise to a need for privacy engineers to embed privacy
features throughout the daily operations of their businesses.
MANDATORY DATA PROTECTION
OFFICERS
• Require large companies to appoint data protection officers
(DPOs), if their core activities consist of large-scale,
systematic monitoring of people.
• DPOs will have to exhibit expertise in technology and
business processes and project and program management,
such as risk assessment and compliance monitoring skills.
• Talent is in short supply.
PRIVACY
PARADOX
PRIVACY PARADOX
Price of using internet services
• People may express concerns about the impact on their privacy of ‘creepy’ uses of their
data, but in practice they contribute their data anyway via the online systems they use. In
other words, they provide the data because it is the price of using internet services.
RIGHT TO MY IDENTITY
Microsoft’s Digital Trends report 2015 noted a trend called Right to My
Identity which means that, rather than simply wishing to preserve
privacy through anonymity, a significant percentage of global consumers
now want to be able to control how long information they have shared
stays online, and are also interested in services that help them manage
their digital identity. This suggests consumers have increasing
expectations of how organizations will use their data and want to be able
to influence it.
A Lawyer and A Data Scientist
Walk Into A Bar
An organization wants to use data
generated in different regulatory
environments to learn about its customers
or to predict their behavior. Some
customers are in Germany, some are in
Switzerland, and others are in the U.S. and
Canada. How can a data scientist get the
most out of this data without breaking the
law, when each country has its own
regulations on what he or she can do with
the data?
“Where the data subject has provided the personal data and the
processing is based on consent or on a contract, the data subject shall
have the right to transmit those personal data and any other
information provided by the data subject and retained by an automated
processing system, into another one, in an electronic format which is
commonly used, without hindrance from the controller from whom the
personal data are withdrawn.”
GDPR CONTROVERSY
Data Portability Legalese
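In engineering terms, the portability clause quoted above asks for an export of one person's data in a commonly used electronic format. A minimal sketch, assuming JSON as that format and a hypothetical record layout with a `subject_id` field (neither is prescribed by the text):

```python
import json
from typing import List


def export_subject_data(records: List[dict], subject_id: str) -> str:
    """Return every record belonging to one data subject as a JSON document."""
    own = [r for r in records if r.get("subject_id") == subject_id]
    return json.dumps({"subject_id": subject_id, "records": own},
                      indent=2, sort_keys=True)


if __name__ == "__main__":
    demo = [
        {"subject_id": "u1", "type": "order", "total": 30},
        {"subject_id": "u2", "type": "order", "total": 12},
        {"subject_id": "u1", "type": "review", "stars": 5},
    ]
    print(export_subject_data(demo, "u1"))
```

Which fields count as "provided by the data subject" is exactly the legal question the slide flags as controversial; the code only shows the mechanical part.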
DISTINCTIVE ASPECTS OF BIG DATA
ANALYTICS
Potential implications for data protection
1. Use of algorithms
2. Opacity of processing
3. Tendency to collect all the data
4. Repurposing of data
5. Use of new type of data
#1 USE OF ALGORITHMS
• Thinking with data: Find correlations / system learns
• Acting with data: Applied to particular case in the
Application phase
Unpredictability by Design
#2 OPACITY OF THE PROCESSING
The ‘Black Box’ effect
• Deep learning involves feeding vast quantities of data
through non-linear neural networks that classify the data
based on the outputs from each successive layer.
• The complexity of the processing of data through such
massive networks creates a ‘black box’ effect.
• Makes it very difficult to understand the reasons for
decisions made as a result of deep learning.
#3 USING ALL THE DATA
n=all
In a retail context it could mean analyzing all the purchases
made by shoppers using a loyalty card, and using this to find
correlations, rather than asking a sample of shoppers to take
part in a survey.
#4 REPURPOSING DATA
Different than the original intent
• Geolocated Twitter data to infer people’s residence and mobility
patterns, to supplement official population estimates.
• Geotagged photos on Flickr, together with the profiles of
contributors, have been used as a reliable proxy for estimating
visitor numbers at tourist sites and where the visitors have come
from.
• Mobile-phone presence data to analyze the foot traffic into the
retail centers.
• Data about where shoppers have come from to plan advertising campaigns.
• Data about patterns of movement in an airport to set the rents for
shops and restaurants.
#5 NEW TYPES OF DATA
Tracking without permission
• Developments in technology such as IoT mean that the
traditional scenario in which people consciously provide their
personal data is no longer the only or main way in which
personal data is collected.
• For example, data may be gathered by tracking online activity
rather than being consciously provided by individuals. Another
example is the possibility of using data from domestic smart
meters to predict the number of people in a household and
whether they include children or older people.
DATA
SCIENCE
ETHICS
FRAMEWORK
4
ALGORITHMIC ACCOUNTABILITY
Five Principles
There needs to be a person with the authority to deal with an algorithm’s adverse individual or societal effects in
a timely fashion. This is not a statement about legal responsibility but, rather, a focus on
avenues for redress, public dialogue, and internal authority for change.
RESPONSIBILITY
Any decisions produced by an algorithmic system should be explainable to the people affected
by those decisions. These explanations must be accessible and understandable to the target
audience; purely technical descriptions are not appropriate for the general public.
EXPLAINABILITY
Algorithms make mistakes, whether because of data errors in their inputs (garbage in, garbage out)
or statistical uncertainty in their outputs. The principle of accuracy suggests that sources of error
and uncertainty throughout an algorithm and its data sources need to be identified, logged, and
benchmarked. Understanding the nature of errors produced by an algorithmic system can inform
mitigation procedures.
ACCURACY
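One way to make "identified, logged, and benchmarked" concrete is to report an error rate together with its statistical uncertainty every time a model is evaluated. The sketch below is an assumption about how that could look, not a method from the cited article: the function name is hypothetical and the interval is the simple normal approximation with z = 1.96.

```python
import logging
import math
from typing import Sequence, Tuple

logger = logging.getLogger("accuracy_benchmark")


def error_rate_with_interval(y_true: Sequence, y_pred: Sequence,
                             z: float = 1.96) -> Tuple[float, float, float]:
    """Return (error rate, lower bound, upper bound) and log the result.

    The bounds use the normal approximation to the binomial, which is only
    rough for small samples or rates close to 0 or 1.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    errors = sum(t != p for t, p in zip(y_true, y_pred))
    rate = errors / n
    half_width = z * math.sqrt(rate * (1 - rate) / n)
    low, high = max(0.0, rate - half_width), min(1.0, rate + half_width)
    logger.info("errors=%d n=%d rate=%.3f interval=[%.3f, %.3f]",
                errors, n, rate, low, high)
    return rate, low, high


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    error_rate_with_interval([1, 0, 1, 1], [1, 1, 1, 0])
```

Logging the interval alongside the point estimate keeps small evaluation sets from looking more conclusive than they are.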
https://www.technologyreview.com/s/602933/how-to-hold-algorithms-accountable/
The principle of auditability states that algorithms should be developed to enable third parties to
probe and review the behavior of an algorithm. Enabling algorithms to be monitored, checked, and
criticized would lead to more conscious design and course correction in the event of failure.
AUDITABILITY
As algorithms increasingly make decisions based on historical and societal data, existing biases
and historically discriminatory human decisions risk being “baked in” to automated decisions. All
algorithms making decisions about individuals should be evaluated for discriminatory effects. The
results of the evaluation and the criteria used should be publicly released and explained.
FAIRNESS
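A first-pass evaluation for discriminatory effects can be as simple as comparing how often each group receives the favourable outcome. The sketch below computes per-group selection rates and the ratio of the lowest rate to the highest (often called a disparate impact ratio). The function names and the sample data are hypothetical, and this is one metric among many rather than a complete fairness audit.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def selection_rates(groups: Sequence[Hashable],
                    outcomes: Sequence[int]) -> Dict[Hashable, float]:
    """Share of favourable outcomes (truthy values) per group."""
    totals: Dict[Hashable, int] = defaultdict(int)
    positives: Dict[Hashable, int] = defaultdict(int)
    for group, outcome in zip(groups, outcomes):
        totals[group] += 1
        positives[group] += int(bool(outcome))
    return {g: positives[g] / totals[g] for g in totals}


def disparate_impact_ratio(rates: Dict[Hashable, float]) -> float:
    """Lowest selection rate divided by the highest; 1.0 means parity."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favourable outcome
    return min(rates.values()) / highest


if __name__ == "__main__":
    rates = selection_rates(["a", "a", "b", "b"], [1, 1, 1, 0])
    print(rates, disparate_impact_ratio(rates))
```

A ratio below 0.8 is a commonly cited warning sign (the "four-fifths" rule of thumb from US employment guidance), but the appropriate criterion depends on the context, which is why the principle asks for the criteria to be published along with the results.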
WHERE DO
YOU GO
FROM
HERE?
DATA SCIENCE ACTIVITIES & ORG
MATURITY
Source: Booz Allen Hamilton
IMPLEMENTATION CONSTRAINTS
Source: Booz Allen Hamilton
OPERATING MODELS
Source: Booz Allen Hamilton
GDPR PREPARATION
“No legislation rivals the potential global impact of the EU’s General Data Protection
Regulation (GDPR), going into effect in May 2018. The new law will usher in cascading
privacy demands that will require a renewed focus on data privacy for US companies
that offer goods and services to EU citizens,” said Jay Cline, PwC’s US Privacy Leader.
“Businesses that do not comply with GDPR face a potential 4% fine of global revenues,
increasing the need to successfully navigate how to plan for and implement the
necessary changes.”
Source - http://www.pwc.com/us/en/press-releases/2017/pwc-gdpr-compliance-press-release.html
INFORMATION
SECURITY
TOP INITIATIVES
PRIVACY
POLICIES
GAP
ASSESSMENT
DATA
DISCOVERY
GOVERNANCE
FactGem is a platform that allows users to generate
their own visualization and analysis applications on top
of Neo4j, without the need to learn any other
programming language. FactGem makes data analysis
accessible to everyone, whether they’re a seasoned data
scientist or completely new to data science.
Through the integration of two platforms users can
access regulated data without worrying about the risk
of violating policies. This enables users to gain insight
into data without having to worry about writing code,
requesting data engineering support, or repercussions
for failing to add policies to data. This process
dramatically accelerates innovation across teams, as
the joint solution provides an end-to-end self-service
mechanism for analysts to exploit the most important
data within an organization.
BLOCKCHAIN IMPLEMENTATION
INTERNET OF EVERYTHING NEEDS LEDGER OF EVERYTHING
1. DECENTRALIZED (Shared
Control)
2. TRUSTED (Immutability /
Audit Trail)
3. PUBLIC (Tokens / Exchanges)
Algorithmic Law and Blockchain Enabled Automation
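The "immutability / audit trail" property listed above rests on hash chaining: each entry stores the hash of the one before it, so editing an old entry breaks every later link. The sketch below shows only that mechanism, using the standard library. It is a single-party log, not a decentralized blockchain, and the entry layout and function names are assumptions made for illustration.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(chain: List[dict], payload: dict) -> dict:
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev_hash, "payload": payload,
             "hash": _digest(prev_hash, payload)}
    chain.append(entry)
    return entry


def verify(chain: List[dict]) -> bool:
    """Recompute every hash and check that each entry points at its predecessor."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["prev"], entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log: List[dict] = []
    append_entry(log, {"model": "credit-v1", "decision": "reject"})
    append_entry(log, {"model": "credit-v1", "decision": "approve"})
    print(verify(log))
```

Shared control and public verifiability, the other two properties on the slide, come from replicating such a chain across independent parties, which this sketch does not attempt.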
RESOURCES
- Machine Learning: The High-interest Credit Card of Technical Debt
- Attacking discrimination with smarter machine learning
- Rules of Machine Learning [43 rules]
THANK YOU
WHERE DATA SCIENCE MEETS CYBERSECURITY

Key Features Of Token Development (1).pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 

Putting data science into perspective

  • 1. PUTTING DATA SCIENCE INTO PERSPECTIVE PRIVACY & ETHICS SRAVAN ANKARAJU FOUNDER & CEO DIVERGENCE ACADEMY June 29th, 2017
  • 2. Is Fake News a Well-defined Machine Learning Problem? Source
  • 3. SRAVAN ANKARAJU • Technology Leader focused on Strategy & Innovation, Risk and Decision Management • 13.5 years with Microsoft in Technology Integration consulting, Developer Support – focus on DevOps & Agile Development • Big Picture educator – start with concepts & then play to learn advanced areas. Iterate and iterate often. • Implementer of Gamification systems in Learning/Training • Experienced in High volume & high performance transactional systems. • Data Analytics for various Fortune 100 companies FOUNDER & PRESIDENT
  • 6. 3 TRENDS SHAPING MACHINE LEARNING IN 2017 1. Algorithm Economy is on the Rise 2. Expect more Interaction between Machine and Humans 3. Giant companies will develop ML based AI systems Source: http://www.datasciencecentral.com/profiles/blogs/trends-shaping-machine-learning-in-2017
  • 8.
  • 9. AGENDA • What’s the catch if there is a ton of goodness in AI-based systems • When do you get a human involved • Global companies & Importance of May 25th, 2018 • Privacy Paradox & Distinctive aspects of Big Data • Data Science Ethics Framework • Where do you go from here / What you can do today PUTTING DATA SCIENCE INTO PERSPECTIVE
  • 11. To study possibly racist algorithms, professors have to sue the US http://arstechnica.com/tech-policy/2016/06/do-housing-jobs-sites-have-racist-algorithms-academics-sue-to-find-out/
  • 13. WHEN DO YOU GET A HUMAN INVOLVED?
  • 14.
  • 15. QUALITY ASSURANCE When does securing AI against attacks or reverse-engineering become more of an issue? It’s an issue now. One of my biggest learnings from [chatbot] Tay was that you need to build even AI that is resilient to attacks. It was fascinating to see what happened on Twitter, but for instance we didn’t face the same thing in China. Just the social conversation in China is different, and if you put it in the US corpus it’s different. Then, of course, there was a concerted attack. Just like you build software today that is resilient to a DDOS (Distributed Denial of Service) attack, you need to be able to be resilient to a corpus attack, that tries to pollute the corpus so that you pick up the wrong thing in your AI learners. - Satya Nadella, Microsoft CEO
  • 16. HUMAN INTERVENTION Whenever you have ambiguity and errors, you need to think about how you put the human in the loop and escalate to the human to make choices. That is the art form of an AI product. If you have ambiguity and error rates, you have to be able to handle exception. But first you have to detect that exception, and luckily enough in AI you have confidence and probability and distribution, so you have to use all of those to get the human in the loop. GOVERNANCE - Satya Nadella, Microsoft CEO
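The escalation idea in the Human Intervention slide can be sketched in a few lines of Python. This is a minimal illustration, not anything shown in the talk: the `route` function, the `Decision` type, and the 0.85 threshold are all hypothetical names and values chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical cut-off; a real product would tune this per use case.
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class Decision:
    label: str
    confidence: float
    handled_by: str  # "model" or "human"


def route(probabilities: dict[str, float],
          threshold: float = CONFIDENCE_THRESHOLD) -> Decision:
    """Accept the model's top label only when its probability clears the threshold.

    Anything below the threshold is treated as an exception and escalated
    to a human reviewer.
    """
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    handler = "model" if confidence >= threshold else "human"
    return Decision(label, confidence, handler)


if __name__ == "__main__":
    print(route({"approve": 0.97, "reject": 0.03}))  # clear-cut case
    print(route({"approve": 0.55, "reject": 0.45}))  # ambiguous case
```

A single threshold on the top-class probability is the simplest possible detector of ambiguity; a production system might also look at the margin between the top two classes or at calibrated uncertainty.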
  • 18. GDPR Stricter rules will apply to the collection and use of personal data. They will apply from May 2018.
  • 19. GENERAL DATA PROTECTION REGULATION Operational Impacts 1. Mandatory Data-breach Protection 2. Privacy Impact Assessments 3. Right to be Forgotten 4. Privacy by Design and Default 5. Mandatory Data Protection Officers
  • 20. MANDATORY DATA-BREACH PROTECTION • Companies that experience data breaches will need to notify regulators and individuals whose personal data was compromised. • Companies will most likely want to avoid the negative publicity of these disclosures. Multinationals will gradually ramp up: • Comprehensive risk assessments • End-to-end security enhancements • Outsourced managed security services. General Data Protection Regulation Operational Impact
  • 21. PRIVACY IMPACT ASSESSMENTS • Require companies to conduct data protection impact assessments (DPIAs) where their data processing operations are highly invasive. • Include marketing activities based on advanced profiling and analytics. • Privacy operations may need to extend outside the legal office, where they have traditionally resided, and into the day-to-day processes of European businesses. General Data Protection Regulation Operational Impact
  • 22. RIGHT TO BE FORGOTTEN • Right to erasure could impose a significant burden on companies with personal data stored across multiple systems. • Companies may need to: • Maintain comprehensive data inventories • Accelerate data-governance strategies • Potentially re-architect key systems in order to process right-to-erasure requests more efficiently. General Data Protection Regulation Operational Impact
  • 23. PRIVACY BY DESIGN AND DEFAULT • Privacy-friendly settings or postures, such as those that limit how personal information is collected, retained, and shared, will be built into new products, devices, and business processes. • As the flip side of DPIAs, privacy-by-design requirements may give rise to a need for privacy engineers who embed privacy features throughout the daily operations of their businesses. General Data Protection Regulation Operational Impact
  • 24. MANDATORY DATA PROTECTION OFFICERS • Require large companies to appoint data protection officers (DPOs) if their core activities consist of large-scale, systematic monitoring of people. • DPOs will have to exhibit expertise in technology, business processes, and project and program management, including risk assessment and compliance monitoring skills. • Talent is in short supply. General Data Protection Regulation Operational Impact
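To make the Right to be Forgotten slide's point about data inventories concrete, here is a minimal Python sketch of an erasure request fanned out across several systems. Everything in it is assumed for illustration: the `crm` and `orders` dictionaries stand in for real data stores, and `DATA_INVENTORY`, `make_eraser`, and `erase_subject` are hypothetical names.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical in-memory "systems"; real ones would be databases, CRMs, backups.
crm = {"u1": {"email": "a@example.com"}, "u2": {"email": "b@example.com"}}
orders = {"u1": [{"sku": "X1"}], "u3": [{"sku": "Y2"}]}


def make_eraser(store: dict) -> Callable[[str], bool]:
    """Build an eraser that deletes one subject's records from one store."""
    def erase(subject_id: str) -> bool:
        # True if the store actually held data for this subject.
        return store.pop(subject_id, None) is not None
    return erase


# The data inventory: every system known to hold personal data,
# mapped to the routine that erases a subject from it.
DATA_INVENTORY: dict[str, Callable[[str], bool]] = {
    "crm": make_eraser(crm),
    "orders": make_eraser(orders),
}


def erase_subject(subject_id: str,
                  inventory: dict[str, Callable[[str], bool]] = DATA_INVENTORY
                  ) -> dict[str, bool]:
    """Run the erasure in every inventoried system and report what was removed."""
    return {name: erase(subject_id) for name, erase in inventory.items()}


if __name__ == "__main__":
    print(erase_subject("u1"))
```

The returned mapping doubles as a simple audit record of where the subject's data was found. A system missing from the inventory is never visited, which is why the slide lists comprehensive inventories first.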
  • 26. PRIVACY PARADOX Price of using internet services • People may express concerns about the impact on their privacy of ‘creepy’ uses of their data, but in practice they contribute their data anyway via the online systems they use. In other words they provide the data because it is the price of using internet services. RIGHT TO MY IDENTITY Microsoft’s Digital Trends report 2015 noted a trend called Right to My Identity which means that, rather than simply wishing to preserve privacy through anonymity, a significant percentage of global consumers now want to be able to control how long information they have shared stays online, and are also interested in services that help them manage their digital identity. This suggests consumers have increasing expectations of how organizations will use their data and want to be able to influence it.
  • 27. A Lawyer and A Data Scientist Walk Into A Bar
  • 28. An organization wants to use data generated in different regulatory environments to learn about its customers or to predict their behavior. Some customers are in Germany, some are in Switzerland, and others are in the U.S. and Canada. How can a data scientist get the most out of this data without breaking the law, when each country has its own regulations on what he or she can do with the data?
  • 29. “Where the data subject has provided the personal data and the processing is based on consent or on a contract, the data subject shall have the right to transmit those personal data and any other information provided by the data subject and retained by an automated processing system, into another one, in an electronic format which is commonly used, without hindrance from the controller from whom the personal data are withdrawn.” GDPR CONTROVERSY Data Portability Legalese
  • 30. DISTINCTIVE ASPECTS OF BIG DATA ANALYTICS Potential implications for data protection 1. Use of algorithms 2. Opacity of processing 3. Tendency to collect all the data 4. Repurposing of data 5. Use of new types of data
  • 31. #1 USE OF ALGORITHMS • Thinking with data: Find correlations / system learns • Acting with data: Applied to particular case in the Application phase Unpredictability by Design
  • 32. #2 OPACITY OF THE PROCESSING The ‘Black Box’ effect • Deep learning involves feeding vast quantities of data through non-linear neural networks that classify the data based on the outputs from each successive layer. • The complexity of processing data through such massive networks creates a ‘black box’ effect. • This makes it very difficult to understand the reasons for decisions made as a result of deep learning.
  • 33. #3 USING ALL THE DATA n=all In a retail context it could mean analyzing all the purchases made by shoppers using a loyalty card, and using this to find correlations, rather than asking a sample of shoppers to take part in a survey.
  • 34. #4 REPURPOSING DATA Different than the original intent • Geolocated Twitter data to infer people’s residence and mobility patterns, to supplement official population estimates. • Geotagged photos on Flickr, together with the profiles of contributors, have been used as a reliable proxy for estimating visitor numbers at tourist sites and where the visitors have come from. • Mobile-phone presence data to analyze the foot traffic in retail centers. • Data about where shoppers have come from to plan advertising campaigns. • Data about patterns of movement in an airport to set the rents for shops and restaurants.
  • 35. #5 NEW TYPES OF DATA Tracking without permission • Developments in technology such as IoT mean that the traditional scenario in which people consciously provide their personal data is no longer the only or main way in which personal data is collected. • For example, data may be gathered by tracking online activity rather than being consciously provided by individuals. One possibility under investigation is using data from domestic smart meters to predict the number of people in a household and whether they include children or older people.
  • 37. 4
  • 38. ALGORITHMIC ACCOUNTABILITY Five Principles RESPONSIBILITY: There needs to be a person with the authority to deal with an algorithmic system's adverse individual or societal effects in a timely fashion. This is not a statement about legal responsibility but, rather, a focus on avenues for redress, public dialogue, and internal authority for change. EXPLAINABILITY: Any decisions produced by an algorithmic system should be explainable to the people affected by those decisions. These explanations must be accessible and understandable to the target audience; purely technical descriptions are not appropriate for the general public. ACCURACY: Algorithms make mistakes, whether because of data errors in their inputs (garbage in, garbage out) or statistical uncertainty in their outputs. The principle of accuracy suggests that sources of error and uncertainty throughout an algorithm and its data sources need to be identified, logged, and benchmarked. Understanding the nature of errors produced by an algorithmic system can inform mitigation procedures. AUDITABILITY: The principle of auditability states that algorithms should be developed to enable third parties to probe and review the behavior of an algorithm. Enabling algorithms to be monitored, checked, and criticized would lead to more conscious design and course correction in the event of failure. FAIRNESS: As algorithms increasingly make decisions based on historical and societal data, existing biases and historically discriminatory human decisions risk being “baked in” to automated decisions. All algorithms making decisions about individuals should be evaluated for discriminatory effects. The results of the evaluation and the criteria used should be publicly released and explained. Source: https://www.technologyreview.com/s/602933/how-to-hold-algorithms-accountable/
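As one possible illustration of the fairness principle above, the Python sketch below compares favorable-outcome rates across groups. The slide does not prescribe a metric; the ratio of lowest to highest selection rate is an assumption chosen for simplicity, and the function names and the sample records are hypothetical.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of favorable outcomes per group.

    `records` is an iterable of (group, outcome) pairs, where outcome is
    1 for a favorable decision and 0 otherwise.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        favorable[group] += outcome
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Ratio of the lowest to the highest selection rate (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable outcome
    return min(rates.values()) / highest


if __name__ == "__main__":
    # Made-up decisions for two groups, for illustration only.
    decisions = [("A", 1), ("A", 1), ("A", 1), ("A", 1), ("A", 0),
                 ("B", 1), ("B", 1), ("B", 0), ("B", 0), ("B", 0)]
    rates = selection_rates(decisions)
    print(rates, disparate_impact_ratio(rates))
```

A ratio well below 1.0 flags a gap worth investigating. It does not by itself establish discrimination, so the evaluation criteria still need to be explained alongside the number, as the principle asks.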
  • 40. DATA SCIENCE ACTIVITIES & ORG MATURITY Source: Booz Allen Hamilton
  • 43. GDPR PREPARATION “No legislation rivals the potential global impact of the EU’s General Data Protection Regulation (GDPR), going into effect in [May] 2018. The new law will usher in cascading privacy demands that will require a renewed focus on data privacy for US companies that offer goods and services to EU citizens,” said Jay Cline, PwC’s US Privacy Leader. “Businesses that do not comply with GDPR face a potential 4% fine of global revenues, increasing the need to successfully navigate how to plan for and implement the necessary changes.” Source - http://www.pwc.com/us/en/press-releases/2017/pwc-gdpr-compliance-press-release.html INFORMATION SECURITY TOP INITIATIVES PRIVACY POLICIES GAP ASSESSMENT DATA DISCOVERY
  • 45. FactGem is a platform that allows users to generate their own visualization and analysis applications on top of Neo4j, without the need to learn any other programming language. FactGem makes data analysis accessible to everyone, whether they’re a seasoned data scientist or completely new to data science. Through the integration of two platforms users can access regulated data without worrying about the risk of violating policies. This enables users to gain insight into data without having to worry about writing code, requesting data engineering support, or repercussions for failing to add policies to data. This process dramatically accelerates innovation across teams, as the joint solution provides an end-to-end self-service mechanism for analysts to exploit the most important data within an organization.
  • 46. BLOCKCHAIN IMPLEMENTATION INTERNET OF EVERYTHING NEEDS LEDGER OF EVERYTHING 1. DECENTRALIZED (Shared Control) 2. TRUSTED (Immutability / Audit Trail) 3. PUBLIC (Tokens / Exchanges) Algorithmic Law and Blockchain Enabled Automation
  • 47. RESOURCES - Machine Learning: The High-interest Credit Card of Technical Debt - Attacking discrimination with smarter machine learning - Rules of Machine Learning [43 rules]
  • 48. THANK YOU WHERE DATA SCIENCE MEETS CYBERSECURITY