This was a talk I gave at SXSW 2016. It outlines the current state of applied ethics in data science as a profession, describes key reasons a code of ethics should be constructed, and proposes a framework to begin the discussion.
2. WHAT IS THIS?
‣ Advertisers and ethics… WTF!
‣ What, me ethical?
‣ Mapping the code.
‣ Why do this at all?
3. WHAT IS THIS NOT?
‣ An attempt to get you to Tweet about something
‣ A vision for Tim’s perfect future
‣ A shameless plug for any association, business
or way of thinking
11. WHAT IS A DATA SCIENTIST?
‣ Statistics
‣ Data Strategy
‣ Social Science
‣ Coding chops
‣ Good Looks
12. AND WE SEEM TO HAVE MORE AND MORE
OF THEM IN THE WORLD IN GENERAL
13. O’Reilly 2015 Data Science Salary Survey
http://duu86o6n09pv.cloudfront.net/reports/2015-data-science-salary-survey.pdf
THEY ARE ALSO A YOUNG BUNCH
Reported age, percent of +/- 600 respondents:
‣ <21: 1%
‣ 21-25: 9%
‣ 26-30: 23%
‣ 31-35: 25%
‣ 36-40: 14%
‣ 41-45: 13%
‣ 46-50: 6%
‣ 51-55: 5%
‣ 56<: 4%
14. AND THAT MAKES SENSE AS
IT IS A YOUNG PROFESSION
‣ 1996: Members of the International Federation of Classification Societies (IFCS) meet in Kobe, Japan. FIRST USE OF “DATA SCIENCE”
‣ 2001: William S. Cleveland publishes “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics.” THE PAPER THAT LAUNCHED A 1,000 NERDS
15. MOREOVER, NEW ENTRANTS INTO THE
FIELD ARE NOT GIVEN VERY MUCH
ETHICAL TRAINING
Surveyed Syllabi from 13 Intro to Data Science Courses
16. ONLY THREE HAVE AT LEAST ONE
MENTION OF AN “ETHICS” COMPONENT
IN THE SYLLABUS
23. Earl, I think Data
Science needs a code
of ethics.
Yup.
24. A CODE OF ETHICS WOULD
‣ Establish credibility and responsibility outside
of nerd-dom
‣ Provide a starting point to act as technology
changes
‣ Galvanize the disparate data practitioner
community
28. A TIMELINE OF ETHICAL CODES
‣ ~2300 BCE: EGYPTIAN CODE OF MA’AT
‣ ~1200 BCE: JEWISH TORAH
‣ ~500 BCE: HIPPOCRATIC OATH
‣ ~1000: BUSHIDO WARRIOR CODE
‣ ~1600: PIRATE’S CODE OF THE BRETHREN
‣ 1831: FRENCH FOREIGN LEGION CODE D'HONNEUR
‣ 1914: JOURNALIST’S CREED
‣ 1946: NUREMBERG CODE
‣ 1981: I.R.B. - EXEMPT COMMON RULE
‣ 1985: INTERNATIONAL STATISTICAL INSTITUTE
‣ 1992: ASSOCIATION FOR COMPUTING MACHINERY
‣ 1999: AMERICAN STATISTICAL ASSOCIATION
‣ 2005: DRAFT MODEL BIOETHICISTS CODE
Chart annotation: increase of professional codes
29. ETHICAL CODES ARE NOT ALL THE SAME
BUT THEY HAVE TWO CLASSES OF
CHARACTERISTICS
‣ Inward facing goals
‣ Outward facing goals
30. INWARD FACING GOALS
‣ Provide guidance when norms are not
explicit
‣ Reduce internal conflicts and build a
common purpose
‣ Establish professional behavior
‣ Deter unethical behavior with sanctions and
internal reporting structures
31. OUTWARD FACING GOALS
‣ Protect vulnerable populations who could be
harmed by profession’s activities
‣ Establish the profession as a distinct moral
community worthy of autonomy
‣ Serve as tool for disputes between member
and non-member parties
‣ Create institutions resilient to external
pressures
32. PROMOTE POSITIVE ENFORCEMENT
‣ Accept that the distributed nature of
professional communities creates too many
judicial problems for active regulation
‣ Construct the code with consensus
allowing for broad buy-in
‣ Set boundaries and expectations of the
practicing community, allowing for
self-affirming social control mechanisms
33. OVERALL A PROFESSIONAL
CODE OF ETHICS SHOULD:
‣ Mediate internal group needs and external
community interactions
‣ Adapt to future unknown circumstances
‣ Inspire collective identity supporting
adherence and adoption
34. OKAY PROFESSOR, SO WHAT IS THE
REAL REASON DATA SCIENCE NEEDS
AN ETHICAL CODE?
36. "In economics, moral hazard occurs
when one person takes more risks
because someone else bears the
burden of those risks."
– Wikipedia
https://en.wikipedia.org/wiki/Moral_hazard
40. DATA SCIENCE IS STEEPED IN
MORAL HAZARD
‣ Connections between data and the people
it represents are very abstracted
‣ Digital creations affect people we never
see
‣ Unintended algorithmic consequences are
almost never known or explored
‣ When was the last time an algorithm ever
“hurt” anybody?
43. “Data can be useful
or anonymous,
but never both.”
– Paul Ohm
“Broken Promises of Privacy: Responding to
the Surprising Failure of Anonymization,”
UCLA Law Review 57, p. 1702
44. THUS A CODE WOULD NEED
TO MAINTAIN THE UTILITY
OF DATA
WHILE BALANCING
CONTROL OF THAT DATA
45. A FRAMEWORK FOR A CODE IS
COMPOSED OF THREE CLUSTERS
Data Ethics Code
‣ Privacy: 3rd party usage, Business applications, Bio-information
‣ Community: Protection of subjects, Mathematical responsibility, Safety of used data & analysis
‣ Identity: Ownership, Verification, Right to be forgotten, Incorrect data correction
46. PRIVACY
‣ 3rd party data: Once you buy or sell data, what are the ethics around
using it? You did ‘buy it’, right?
‣ Business applications: What is the relationship between privacy of internet
exploration and advertisement of relevant products?
‣ Bio-information: Is data generated from your body owned differently?
47. COMMUNITY
‣ Protection of subjects: How do we protect the people our analysis affects
from negative consequences?
‣ Mathematical responsibility: Is there a system for correct use of professional
tools and continuing education?
‣ Safety of used data & analysis: Once data is used, how is it discarded and
sensitive analysis protected?
48. IDENTITY
‣ Ownership: Is there a need for a centralized personal data safe?
‣ Validation: How do means of validation affect access, privacy and safety?
‣ Incorrect data correction: What are the mechanisms to correct bad data?
49. THESE COMPONENTS PROVIDE THE
BASIS FOR CONVERSATION NOT A
HARD STRUCTURE
Data Ethics Code
‣ Privacy: 3rd party usage, Business applications, Bio-information
‣ Community: Protection of subjects, Mathematical responsibility, Safety of used data & analysis
‣ Identity: Ownership, Verification, Right to be forgotten, Incorrect data correction
61. ESTIMATED $100 MILLION - $500 MILLION
2006 - data theft
http://www.lifehealthpro.com/2015/06/18/the-10-most-expensive-data-breaches?t=regulatory&slreturn=1456110972&page=5
62. HIGH ESTIMATES: $4 BILLION
2011 - data breach of 75 client companies
http://www.eweek.com/c/a/Security/Epsilon-Data-Breach-to-Cost-Billions-in-WorstCase-Scenario-459480
marketing data
70. Some folks working on this:
‣ The Council for Big Data, Ethics and Society
‣ Certified Analytics Professionals
‣ Michael McFarland, S.J. - Computer Scientist
‣ Cynthia Dwork - Microsoft Research
‣ Kord Davis - Digital Strategist
READ MORE HERE