Data Science Seminar I
1. Introspective Systems
KAY AIKIN
kay.aikin@introspectivesystems.com
TREVOR GIONET, JR.
trevor.gionet@introspectivesystems.com
DR. CARYL JOHNSON
caryl.johnson@introspectivesystems.com
It’s not your grandmother’s data science anymore.
2. Introspective Systems
Data Science the New Wave
Agenda
• Introductions
• What is Data Science?
• What Data Science is not
• Understanding the problem
• Data Science approaches
INTRODUCTION THE WAVE WHAT IS DATA SCIENCE WHAT DATA SCIENCE IS NOT UNDERSTANDING THE PROBLEM DATA SCIENCE
APPROACHES
3. Introspective Systems
What is Data Science?
An interdisciplinary field of scientific methods, processes, algorithms
and systems working together to extract knowledge from data in any
form, structured and unstructured.
4. Introspective Systems
Why is it important?
Our world is more complex and data is
growing faster than our methods can handle
Complex systems
• Highly interconnected
• Heterogeneous device-human participation
• Extreme data
• Pervasive intelligence
• Increasing autonomy
• Real-time analytics
5. Introspective Systems
What Data Science is not
• Data Analysis
• Statistics
• Data Mining
• Artificial Intelligence
• Machine Learning
It is a combination of these
things and much more
6. Introspective Systems
Understanding the Problem
• Data organization: structured or unstructured
• Complexity: apparent, static, detail, inherent, or dynamic
• Structure: linear, relational, tree, or graph
• Dynamics: deterministic or stochastic
7. Introspective Systems
Data Organization
One of the fundamental drivers of data science is whether the data is
organized. The answer can drive the choice of databases, algorithms, and
visualizations.
Not how YOU structure the data, but
how the WILD data is structured.
If you take unorganized data and try to
force it into a structured form you will
often destroy data and make analytics
harder.
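A minimal sketch of the point above, assuming a few made-up "wild" records and a fixed two-column schema chosen up front. The record fields, the `SCHEMA` tuple, and both helper functions are hypothetical and only illustrate how forcing a structure can discard information.

```python
# Hypothetical "wild" records: the fields vary from one record to the next.
wild_records = [
    {"id": 1, "temp_c": 21.5, "note": "sensor recalibrated"},
    {"id": 2, "temp_c": 19.0, "humidity": 0.43},
    {"id": 3, "status": "offline", "last_seen": "2020-01-01"},
]

# A fixed schema decided before looking at the data (an assumption for this sketch).
SCHEMA = ("id", "temp_c")


def force_into_schema(record, schema=SCHEMA):
    """Keep only the schema columns; values the record lacks become None."""
    return {col: record.get(col) for col in schema}


def dropped_fields(record, schema=SCHEMA):
    """Return the fields the fixed schema throws away, in sorted order."""
    return sorted(set(record) - set(schema))


# The "structured" table, and what each record lost on the way in.
table = [force_into_schema(r) for r in wild_records]
lost = {r["id"]: dropped_fields(r) for r in wild_records}

for row in table:
    print(row, "lost:", lost[row["id"]])
```

Record 3 ends up as an id with an empty temperature, and the fact that the device was offline is gone, which is the kind of loss that makes later analytics harder.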
8. Introspective Systems
Complexity
There are five kinds of complexity, and you deal with each differently.
• Apparent - appears complex, but has simple patterns underneath
• Static - complex, but does not change once understood
• Detail - a great number of different parts
• Inherent - different parts with multiple connections and feedback
• Dynamic - multiple connections and parts that evolve
9. Introspective Systems
Structure
Structure: linear (e.g., time series), relational (e.g., tabular), tree, or
graph (nodes and edges)
How the data is organized drives how you structure your
solution, which in turn motivates the decision-making
methods.
If you take unstructured data with
multiple inter-connections and force it
into a relational structure the problem
becomes much harder.
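A small sketch of why the same question is harder in a relational form, assuming a made-up set of connections between nodes. The edge list and all function names are hypothetical. The graph version walks neighbours directly; the relational version has to re-scan the whole edge table (the equivalent of a repeated self-join) until nothing new turns up.

```python
from collections import deque

# Hypothetical connections between nodes (names are illustrative only).
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")]


def build_adjacency(edge_list):
    """Graph form: each node maps directly to its neighbours."""
    adjacency = {}
    for src, dst in edge_list:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)
    return adjacency


def reachable(adjacency, start):
    """Breadth-first search: every node connected to start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def reachable_relational(edge_list, start):
    """Relational form: scan the full edge table repeatedly until no new node appears."""
    seen = {start}
    passes = 0
    while True:
        passes += 1
        found = {d for s, d in edge_list if s in seen}
        found |= {s for s, d in edge_list if d in seen}
        if found <= seen:
            return seen, passes
        seen |= found


adjacency = build_adjacency(edges)
graph_result = reachable(adjacency, "a")
table_result, table_passes = reachable_relational(edges, "a")
print(sorted(graph_result), sorted(table_result), table_passes)
```

Both versions find the same connected nodes, but the relational one needs one full pass over the table per hop, so the cost grows with both the table size and the depth of the connections.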
10. Introspective Systems
Dynamics
Dynamics: deterministic (the state of the system can be
predicted) or stochastic (the state of the system has a
probability distribution)
If the problem is stochastic, as most problems are,
statistical methods are often the better approach.
Unintended consequences can
happen when you approach a
stochastic problem with deterministic
methods.
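A minimal sketch of the distinction, assuming a made-up system whose level rises by a fixed net amount each step. The step functions, the inflow and outflow values, and the Gaussian noise are all hypothetical. The deterministic model yields one predicted state; the stochastic model yields a distribution that is better summarized with statistics.

```python
import random
import statistics


def deterministic_step(level, inflow=2.0, outflow=1.5):
    """Deterministic: the next state follows exactly from the current one."""
    return level + inflow - outflow


def stochastic_step(level, rng, inflow=2.0, outflow=1.5, noise=1.0):
    """Stochastic: the same rule plus random variation (Gaussian noise is an assumption)."""
    return level + inflow - outflow + rng.gauss(0.0, noise)


def simulate(steps, runs, seed=0):
    """Run the stochastic model many times and collect the final states."""
    rng = random.Random(seed)
    finals = []
    for _ in range(runs):
        level = 0.0
        for _ in range(steps):
            level = stochastic_step(level, rng)
        finals.append(level)
    return finals


# Deterministic prediction: a single number.
level = 0.0
for _ in range(10):
    level = deterministic_step(level)

# Stochastic outcome: a distribution, described by its mean and spread.
finals = simulate(steps=10, runs=1000)
mean_final = statistics.mean(finals)
spread = statistics.stdev(finals)
print(level, round(mean_final, 2), round(spread, 2))
```

The deterministic answer matches the average of the stochastic runs but says nothing about the spread around it, which is where planning against a single predicted value can go wrong.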
11. Introspective Systems
Questions
Before trying to solve a problem, approach the problem like a
data scientist by diving deep into the type of problem and
underlying data you have.
Is my data organized?
How complex is my data?
How is my data structured?
How dynamic is my data?