A number of recent milestones in AI have rekindled the faith that human-grade computer intelligence can fuel the next technological revolution. In parallel, and almost independently, the role of Data Scientist has become one of the hottest tickets in the technology sector. Despite the obvious overlap between Data Science and Artificial Intelligence, the two approaches are sufficiently distinct that choosing the wrong one might cause a product to fail or a hiring process to go wrong. This presentation offers some clarity and best practices for understanding what data analysis requirements you really have, as opposed to what you think you have.
Data Science versus Artificial Intelligence: a useful distinction
1. Data Science vs Artificial Intelligence: a useful distinction
Dr. Christoforos Anagnostopoulos
Founder and Chief Data Scientist, Mentat Innovations
ex-Assoc. Professor, Imperial College London
Mentat
2. Credentials
PhD in Machine Learning at Imperial College
Research Fellow, Statistical Laboratory, Cambridge U.
ex-Lecturer in Statistical Modelling at Imperial College
Numerous consulting projects (defence, web, social media)
Data journalism (The Independent, The Guardian, BBC, …)
Founder and Chief Scientist of Mentat (AI for Cybersecurity)
Many ideas in this talk were the result of conversations with:
Prof. David Hand, OBE (Chairman of Advisory Board of Mentat)
Renowned statistician, twice president of Royal Statistical Society
3. This talk
Machine Learning / AI
Data Science
Are the two technology trends different?
Does it matter?
4. Data Science vs AI: skill-sets
Courtesy of Cathy O’Neil and Rachel Schutt
5. Data Science: the origins
Many rediscoveries of data analysis in the last 20 years
1970s: Peter Naur introduces “data science” as a synonym for “computer science”.
1997: Jeff Wu claims “statisticians” are “data scientists”.
2001: William Cleveland introduces data science as an independent discipline, extending statistics.
2008: DJ Patil (LinkedIn) and Jeff Hammerbacher (Facebook) describe their job role as that of “Data Scientist”.
7. Artificial Intelligence: the origins
[Timeline figure, reconstructed from its labels]
1950: Turing Test (Turing)
1960s: Perceptron, Logic programming
1970s: AI winter (Minsky, Lighthill)
1980s: Rise and Fall of Expert Systems, Lisp
1997: Chess (Deep Blue)
2011: Jeopardy! (Watson)
2015: Big Data, Computational Stats, Bayes Revival, Machine Learning, Deep Learning, GPUs, Open Source
8. Big Data
Volume: SQL, HDFS
Velocity: complex event processing, Apache Storm, Apache Spark Streaming
Variety: structured, semi-structured, unstructured (social graphs, system logs, tweets/blogs, CCTV)
Many variables, sampling variability (e.g., spatiotemporal)
10. Big Data in Science
Models guided by theory
Well formulated questions
Big Data in the Commercial World
Little to no theory
“Needle in the haystack”
Often question is unclear (“fishing”)
Data quality low
11. The data value pyramid
[Figure: pyramid of increasing value. Layers: Access, Analytics, Fusion, Artificial Intelligence, Machine Learning, Data Science. Side labels: Big Data, query, learn, Value.]
12. The data value pyramid
Access
Example: “Give me all transactions by this user”
Tech: DB, HDFS, Query Languages, APIs
Analytics
Example: “How many transactions per country?”
Tech: Anything from Excel to Apache Spark. Mostly basic aggregations, strong visualisation component
Fusion
Example: “Give me the office building locations of all employees that visited this website yesterday”
Tech: Break through silos. Data Lakes, Big Data stacks
Plus: Tremendous amount of value unlocked in this process
Minus: retrospective, user-driven, manual
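
The three query-style layers on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the records, field names and the functions `access`, `analytics` and `fusion` are hypothetical, and a real deployment would sit on a database or Big Data stack rather than in-memory lists.

```python
from collections import Counter

# Hypothetical toy data; all field names are assumptions for illustration.
transactions = [
    {"user": "alice", "country": "UK", "amount": 30.0},
    {"user": "bob", "country": "GR", "amount": 12.5},
    {"user": "alice", "country": "GR", "amount": 7.0},
]
web_visits = [
    {"employee": "alice", "site": "example.org", "day": "yesterday"},
    {"employee": "bob", "site": "example.org", "day": "last week"},
]
# A second "silo": HR data mapping employees to office buildings.
offices = {"alice": "London HQ", "bob": "Athens Office"}


def access(user):
    """Access: 'Give me all transactions by this user'."""
    return [t for t in transactions if t["user"] == user]


def analytics():
    """Analytics: 'How many transactions per country?' (a basic aggregation)."""
    return Counter(t["country"] for t in transactions)


def fusion(site, day):
    """Fusion: join web logs with HR data that lives in a separate silo."""
    visitors = {v["employee"] for v in web_visits
                if v["site"] == site and v["day"] == day}
    return sorted(offices[e] for e in visitors if e in offices)


print(access("alice"))
print(analytics())
print(fusion("example.org", "yesterday"))
```

All three functions only retrieve or combine what has already been recorded, which is the sense in which these layers are retrospective and user-driven.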
13. The data value pyramid
Learning: forecasting
Example: “How many new users will I get next week?”
Tech: Predictive Analytics tools (ML/DS)
Learning: regression
Example: “How many emails should I expect a user of these characteristics to receive per day?”
Tech: Regression tools (machine learning / statistics)
Learning: classification
Example: “Given the email header, the email body, and the type of attachment, classify it as Spam or not.”
Tech: Classification tools (machine learning)
Learning: inference / anomaly detection
Example: “Why did we have a peak in traffic?”
Tech: Data Science
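
As a rough illustration of two of the learning tasks above, the sketch below forecasts next week's value with a least-squares linear trend and flags a traffic peak with a simple z-score rule. Both techniques are stand-ins chosen for brevity, not methods named in the talk; the function names, the threshold of 2.0 and the sample numbers are assumptions.

```python
from statistics import mean, pstdev


def fit_line(ys):
    """Ordinary least-squares fit of y = a + b*t for t = 0, 1, ..., n-1."""
    ts = range(len(ys))
    t_bar, y_bar = mean(ts), mean(ys)
    slope = (sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
             / sum((t - t_bar) ** 2 for t in ts))
    intercept = y_bar - slope * t_bar
    return intercept, slope


def forecast_next(ys):
    """Forecasting: extrapolate the fitted trend one step ahead."""
    intercept, slope = fit_line(ys)
    return intercept + slope * len(ys)


def anomalies(ys, threshold=2.0):
    """Anomaly detection: indices whose z-score exceeds the threshold."""
    mu, sigma = mean(ys), pstdev(ys)
    if sigma == 0:
        return []  # a constant series has no outliers under this rule
    return [i for i, y in enumerate(ys) if abs(y - mu) / sigma > threshold]


weekly_signups = [100, 110, 120, 130]           # hypothetical numbers
hourly_traffic = [10, 11, 9, 10, 50, 10, 11, 9]  # hypothetical numbers

print(forecast_next(weekly_signups))
print(anomalies(hourly_traffic))
```

Note that the z-score rule only locates the peak. Answering why it happened is the inference part, and typically needs a bespoke model and domain knowledge.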
14. What does success mean?
query what has been recorded: consistency
predict the future / the class of new examples: predictive accuracy
infer unobserved/unobservable attributes: model quality / causality
interact in controlled/semi-controlled environments: live trials / competitions
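
For the "predict" row, predictive accuracy is usually measured on examples the model has not seen. A minimal sketch, assuming a hypothetical rule-based spam classifier `is_spam` and a tiny hand-made held-out set (neither comes from the talk):

```python
def accuracy(predict, examples):
    """Fraction of held-out (features, label) pairs predicted correctly."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


def is_spam(email):
    """Hypothetical toy classifier: a keyword rule plus an attachment rule."""
    if "prize" in email["body"].lower() or email["attachment"] == "exe":
        return "spam"
    return "ham"


# Hypothetical held-out examples, never used to design the rule above.
held_out = [
    ({"body": "You won a PRIZE", "attachment": None}, "spam"),
    ({"body": "Meeting notes attached", "attachment": "pdf"}, "ham"),
    ({"body": "Invoice", "attachment": "exe"}, "spam"),
    ({"body": "Cheap loans today", "attachment": None}, "spam"),
]

print(accuracy(is_spam, held_out))  # 3 of 4 correct: 0.75
```

The other rows need different yardsticks: a query is judged on consistency with what was recorded, and an inference on the quality of the underlying model, neither of which a held-out score captures.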
20. Market is driving a “standardisation” of AI/ML APIs.
Tech Stack
If your problem fits one of these APIs, you’re 99% there.
If not, your data science pipeline might still use them.
21. Data Science vs AI: skill-sets
Courtesy of Cathy O’Neil and Rachel Schutt
24. Pipeline
[Figure: data science pipeline. Components shown: Big Data Stack, Data Cleaning, Data Moulding, Heuristics, Bespoke Statistical Models, Machine Learning Stack, Model Lifecycle Management, Interpretation / Validation of Results, Visualisation / Reporting / UX, Actionability. Skills shown: domain expertise, machine learning, data science, hacking.]
25. Futurology
Machines are outperforming humans in an increasingly broad array of everyday tasks.
Last time this happened was the Industrial Revolution.
No more call centres, truck drivers, shop assistants.
No more doctors? Not yet. But less looking at X-rays.
stone, iron, steam, electricity, AI
26. Overconfident machines
If true wisdom is to know what you don't know, machines are still pretty stupid.
“Complex Models fail in Complicated Ways”
Learning by Example / replicating human cognition is not always a good idea.
27. By way of conclusion
Chris Anderson quoting Peter Norvig:
“All models are wrong, and increasingly you can succeed without them.”
in “The End of Theory: the data deluge is making the scientific method obsolete”
Peter Norvig: “That's a silly statement, I didn't say it, and I disagree with it. […] Theory has not ended, it is expanding into new forms.”
info@ment.at
@canagnos