Lewis Tunstall | Data Scientist | lewis.tunstall@bfh.ch
Leandro von Werra | Data Scientist | leandro.vonwerra@bfh.ch
BTW2401 - Data Science
Lesson 1.1 - The Big Picture
Overall Course Goals
By the end of this course you will:
• Know how to approach business problems from a
data science perspective
• Understand the fundamental principles behind
extracting useful knowledge from data
• Gain hands-on experience with mining data for
insights
Skills Overview
In this course you are going to learn several skills:
• Python programming and core libraries for data
analysis, visualisation, and modelling
• Working with data: collecting, cleaning, transforming
• Creating and interpreting descriptive statistics
• Creating and interpreting data visualisations
• Creating statistical models for inference
• Practical machine learning
Course Materials
This course is largely based on the excellent textbook:
• Data Science for Business, F. Provost & T. Fawcett
(O'Reilly Media, Sebastopol, 2013).
Other useful references include:
• Hands-On Machine Learning with Scikit-Learn and
TensorFlow, A. Géron (O'Reilly Media, 2017)
• Introduction to Machine Learning for Coders, fast.ai
(http://course18.fast.ai/ml)
Timetable and Key Dates
Week No.  Date   Topic
8         20.02  Introduction to data science
9         27.02  Python for data analysis I
10        5.03   Python for data analysis II
11        12.03  Introduction to random forests
12        19.03  Random forest deep dive
13        26.03  Model interpretation
14        2.04   Classification
15        9.04   No class (Easter)
16        16.04  Midterm exam & define group projects
17        23.04  Cross-validation and model performance
18        30.04  Neighbours and clusters I
19        7.05   Neighbours and clusters II
20        14.05  Natural language processing I
21        21.05  No class (Ascension) & project submission
22        28.05  Natural language processing II
23        4.06   Project presentations
24        11.06  Deep learning
25        18.06  Exam Preparation
26        25.06  Exam Preparation
27        TBD    Final Exam
About Us: Backgrounds
Lewis
• PhD in theoretical physics from Adelaide, Australia
• Postdoctoral researcher in Switzerland (University of Bern)
• 2+ years working in industry
• Expertise in machine learning & mathematics
Leandro
• MSc in computational physics from ETH
• 2+ years working in industry
• Focus on application of machine learning to big data
About Us: What We Actually Do
Raw data, little value → Data exploration → Analysis, model building → Reporting, automation
This Lesson
We aim to answer the following 3 questions:
• What is data science?
• Why is it important?
• How is data science performed?
To answer these questions, we will
focus on a series of trends that are driving
the data science “revolution”:
• Big Data
• Machine Learning
We finish with
• A mini (ungraded!) quiz
• Onboarding of Python, Kaggle, and Paperspace
What is Data Science?
It’s a surprisingly hard definition to nail down:
• Superfluous?
• Buzzword?
You know it’s serious when your
field makes it onto Gartner’s
hype cycle¹
Figure labels: Python, Deep/Machine Learning, Predictive Analytics
¹ https://en.wikipedia.org/wiki/Hype_cycle
What is Data Science?
Despite the hype, useful definitions exist¹
Data science is about the extraction of useful
information and knowledge from large
volumes of data, in order to improve
business decision-making
Data science is an interdisciplinary subject with 3 key areas:
• Statistics
• Computer science
• Domain expertise
¹ Provost & Fawcett, Chapter 1
Why is Data Science Important?
In the past, data analysis was typically slow:
it needed teams of statisticians, analysts, etc. to
explore data manually
Today: volume, velocity, and variety make
manual analysis impossible …
Big Data: The Large Hadron Collider at CERN
• 150 million sensors delivering data
40 million times per second.
• There are nearly 600 million collisions per second.
• Only 100 collisions per second are of interest.
• Raw data production exceeds 500 exabytes per day
(1 EB = 1 million TB).
• Due to filtering, only 200 petabytes are generated
annually (1 PB = 1000 TB).
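To get a feel for these numbers, the filtering ratio can be worked out from the figures above. This is a rough back-of-the-envelope sketch: the variable names are illustrative, and it assumes data is produced 365 days a year, which overstates real running time.

```python
# Back-of-the-envelope arithmetic using only the figures quoted on the slide.
TB_PER_PB = 1_000          # 1 PB = 1000 TB
TB_PER_EB = 1_000_000      # 1 EB = 1 million TB

raw_eb_per_day = 500       # raw data production (lower bound from the slide)
stored_pb_per_year = 200   # data kept after filtering

# Assumption: production runs 365 days a year.
raw_tb_per_year = raw_eb_per_day * TB_PER_EB * 365
stored_tb_per_year = stored_pb_per_year * TB_PER_PB

reduction_factor = raw_tb_per_year / stored_tb_per_year
print(f"Roughly 1 byte in {reduction_factor:,.0f} is kept")

# The same idea for collisions: how many happen per collision of interest?
collisions_per_second = 600_000_000
interesting_per_second = 100
collisions_per_hit = collisions_per_second // interesting_per_second
print(f"1 collision in {collisions_per_hit:,} is of interest")
```

Either way, well under one part in a hundred thousand survives filtering, which is why the selection has to be automated.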
… but fast computers and good algorithms
allow much deeper analyses than before
⇒ data-driven decision making
⇒ base decisions on analysis of data,
not intuition
Cartoon from Provost & Fawcett, Chapter 1
The Data Science Process¹
How is data science performed?
Find a question → Collect the data → Prepare the data → Create a model → Evaluate the model → Deploy the model
(a cycle, with the data at its centre)
• Iterative process
• Non-sequential
• Early termination
• Established processes, e.g.
CRISP-DM (https://bit.ly/1tX6508)
¹ From Data Science: The Big Picture, M. Renze, Pluralsight
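The process can be sketched as a small loop. This is a toy illustration rather than a real project: the data, the threshold "model", the function names, and the 0.8 accuracy target are all made-up assumptions, and a real project would evaluate on held-out data instead of the training data.

```python
# Toy walk-through: collect -> prepare -> create model -> evaluate, repeated.
# All data and function names are hypothetical.

def collect():
    # (hours_studied, passed_exam); None marks a missing value
    return [(1, 0), (2, 0), (None, 1), (3, 0), (4, 1), (5, 1), (6, 1), (7, 1)]

def prepare(rows):
    # Drop incomplete rows
    return [(x, y) for x, y in rows if x is not None]

def create_model(threshold):
    # "Model": predict a pass when hours studied reach the threshold
    return lambda x: 1 if x >= threshold else 0

def evaluate(model, rows):
    # Accuracy: share of correct predictions
    return sum(model(x) == y for x, y in rows) / len(rows)

data = prepare(collect())
best_threshold, best_accuracy = None, 0.0
for threshold in range(1, 8):            # iterate over candidate models
    accuracy = evaluate(create_model(threshold), data)
    if accuracy > best_accuracy:
        best_threshold, best_accuracy = threshold, accuracy
    if best_accuracy >= 0.8:             # early termination: good enough
        break

print(best_threshold, best_accuracy)
```

The loop stops as soon as a model is good enough, even though a later candidate might score higher. That is the trade-off behind early termination.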
Example #1
Hurricane Frances was on its way, barreling
across the Caribbean, threatening a direct hit
on Florida's Atlantic coast … A week ahead of
the storm's landfall, Linda M. Dillman, Wal-Mart's
chief information officer, pressed her
staff to come up with forecasts based on what
had happened when Hurricane Charley struck
several weeks earlier.¹
Why might data-driven predictions be useful in
this scenario?
• 7x increase in sales of strawberry Pop-Tarts
before a hurricane
• Top-selling item was beer!
¹ What Wal-Mart Knows About Customer Habits, NYT (2004)
Example #2
“If we wanted to figure out if a
customer is pregnant, even if she
didn’t want us to know, can you
do that?”¹
Why might Target want to know when you’re
pregnant?
“My daughter got this in the mail!
She’s still in high school …
Are you trying to encourage her to
get pregnant?”
3 days later …
“It turns out there’s been some activities in my
house I haven’t been completely aware of.
She’s due in August. I owe you an apology.”
¹ How Companies Learn Your Secrets, NYT (2012)
Lesson 1.2 - Machine Learning
What is machine learning?
Blue faces seem to be important…
What is machine learning really?
1950s: creation of the first “intelligent”
algorithms and programs
1980s: statistical models and algorithms
that can learn from data
2010s: statistical models and algorithms
inspired by neurones that can learn from data
Machine Learning Branches
3 Main Branches:
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
Machine Learning Branches: Supervised Learning
In supervised learning the
training data consists of
input/output pairs, and we
train a function to map the
inputs to the outputs.
Input: values, vectors, words, images, etc.
• Classification: categorical variable (A, B, C; dogs/cats)
• Regression: continuous variables (price/cost, weight, lifetime)
Supervised Learning: Classification
Classification: Assign categorical labels from a fixed set of labels to data samples.
Input Data: [images]
Output/Label: “Broken Bike” or “Normal Bike”
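A minimal sketch of classification, using a 1-nearest-neighbour rule on made-up numbers. The two features (wheel wobble and chain slack) and all values are hypothetical stand-ins for whatever would be extracted from real bike data. This shows one simple classifier, not the method used on the slide.

```python
import math

# Hypothetical training set: (wheel_wobble_mm, chain_slack_mm) -> label
training = [
    ((0.5, 10.0), "Normal Bike"),
    ((0.8, 12.0), "Normal Bike"),
    ((4.0, 30.0), "Broken Bike"),
    ((5.5, 25.0), "Broken Bike"),
]

def classify(sample):
    """Return the label of the closest training example (1-nearest neighbour)."""
    _, nearest_label = min(
        training, key=lambda pair: math.dist(pair[0], sample)
    )
    return nearest_label

print(classify((0.6, 11.0)))
print(classify((4.8, 28.0)))
```

The output is always one of the labels seen in training. That is what "a fixed set of labels" means in practice.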
Supervised Learning: Regression
Regression: Find the relationship between one dependent variable and a series of
other changing variables.
Figure: concentration plotted against length of lecture.
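A minimal sketch of regression with a single explanatory variable, fitting a straight line by ordinary least squares. The measurements are invented for illustration and the linear form is an assumption. Real concentration data may well not be linear.

```python
# Hypothetical measurements: lecture length (minutes) vs. concentration (0-10)
lengths = [15, 30, 45, 60, 75, 90]
concentration = [9.0, 8.5, 7.0, 6.5, 5.0, 4.0]

n = len(lengths)
mean_x = sum(lengths) / n
mean_y = sum(concentration) / n

# Ordinary least squares for a straight line y = intercept + slope * x
covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, concentration))
variance_x = sum((x - mean_x) ** 2 for x in lengths)
slope = covariance / variance_x
intercept = mean_y - slope * mean_x

def predict(length_minutes):
    """Predicted concentration for a lecture of the given length."""
    return intercept + slope * length_minutes

print(f"slope = {slope:.4f} per minute, prediction at 120 min = {predict(120):.2f}")
```

Unlike classification, the prediction is a continuous value, and the fitted slope itself is an interpretable result: how much concentration changes per extra minute.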
Machine Learning Branches: Unsupervised Learning
In unsupervised learning there are
no labels available; insights are
gained without* prior knowledge.
Input Data:
• Dimensionality Reduction
• Clustering
• Outlier Detection
• Generative Models
* Usually some model parameters need to be set ahead of training.
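As an illustration of clustering, here is a minimal one-dimensional k-means sketch. The data points and starting centroids are made up. The number of clusters k is an example of a parameter that has to be set ahead of training.

```python
# Minimal 1-D k-means on made-up data. No labels are used anywhere.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.1, 8.8]
k = 3                              # chosen before training
centroids = [0.0, 4.0, 10.0]       # arbitrary starting guesses (assumption)

for _ in range(10):
    # Assignment step: attach each point to its nearest centroid
    clusters = [[] for _ in range(k)]
    for p in points:
        nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    # Update step: move each centroid to the mean of its cluster
    centroids = [
        sum(c) / len(c) if c else centroids[i]
        for i, c in enumerate(clusters)
    ]

print(clusters)
print(centroids)
```

Choosing a different k would give a different grouping of the same data. That is why the result always has to be interpreted with domain knowledge.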
Unsupervised Learning: Anomaly/Outlier detection
Anomaly Detection: The task of finding
samples in a dataset that raise suspicion.
Problem: Usually, what exactly you are
looking for is unknown.
Solution: Use statistics and characteristics
of the dataset to find outliers.
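One simple way to use the statistics of a dataset is to flag values that lie far from the mean. The sketch below uses invented response times and a threshold of 2 standard deviations. Both are assumptions chosen for illustration.

```python
import statistics

# Made-up response times in ms; one value is clearly unusual.
response_times = [102, 98, 101, 99, 100, 103, 97, 250, 101, 99]

mean = statistics.mean(response_times)
stdev = statistics.pstdev(response_times)

# Flag samples more than 2 standard deviations from the mean.
# The threshold is a parameter chosen up front, not learned from labels.
threshold = 2
outliers = [x for x in response_times if abs(x - mean) > threshold * stdev]
print(outliers)
```

Note that the outlier itself inflates the mean and standard deviation. With so few samples, a stricter threshold can therefore miss it, which is one reason more robust statistics such as percentiles are often preferred.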
Unsupervised Learning: Anomaly/Outlier detection
Anomaly Detection Use-Case: IT Infrastructure at Swiss Mobiliar
The goal is to detect disruptions in
the network before they affect users or
escalate.
This is an unsupervised task, since we
don’t know what a disruption looks like.
- 20’000 components communicating
- average latency and number of calls are logged
- ca. 2’000 events/second
Poster excerpt:
… integrate streaming business data within and outside company borders.
As part of our product development, we apply our expertise in event-processing
technologies to solve real-world problems for our customers. One such customer is
Swiss Mobiliar, whose employees rely on a very large network of applications to assist
their customers with insurance policies, quotes, etc. These applications are monitored,
where quantities such as workload and response time are recorded at frequent
intervals.
Key challenges:
- bottlenecks
- disruptions
- > 20,000 correlated time series
- > 2,000 events/s
Figure: Topology of Swiss Mobiliar’s application network, where each blue node corresponds to a
component of an application. Communication between components is denoted by grey lines. The right
panel shows an illustrative time series from a single node exhibiting positive and negative anomalies.
… univariate time series [2]. Surprise is defined as the … the actual value of a given metric. Calculating t… secondary, “surprise” time series. Examples of statio…
Figure (stationary signal): Calculation of surprise, where the expected
value is obtained by linear extrapolation of the past
time-steps.
Anomaly detection via surprise percentiles
From the surprise time series, percentiles
(10th and 90th) are tracked over time with a
sliding window. With a 3σ test for outliers,
the percentile time series are probed for
anomalies. Tracking both the upper and
lower percentiles enables the reliable
detection of negative (dips), as well as
positive (spikes) anomalies.
Validation Procedure
Numenta Anomaly Benchmark (NAB)
The NAB [1] consists of over 50 time series with an … evaluation profiles to measure precision and r…
Introducing MULDER:
Anomaly detector for positive and negative anomalies
in time-series data.
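The "surprise" idea from the poster excerpt can be sketched in a few lines. This is not the MULDER implementation. It is a simplified illustration that assumes surprise is the actual value minus a linear extrapolation of the two previous time-steps, and it applies a 3σ test directly to the surprise series over a sliding window instead of tracking the 10th and 90th percentiles. The signal, window size, and function names are made up.

```python
import statistics

def surprise_series(values):
    """Surprise = actual - expected, where the expected value is a linear
    extrapolation of the two previous time-steps (an assumption)."""
    return [
        values[t] - (2 * values[t - 1] - values[t - 2])
        for t in range(2, len(values))
    ]

def find_anomalies(values, window=8, n_sigma=3.0):
    """Return the time-steps whose surprise lies more than n_sigma standard
    deviations from the mean of the preceding window of surprises."""
    surprise = surprise_series(values)
    anomalies = []
    for i in range(window, len(surprise)):
        history = surprise[i - window:i]
        mu = statistics.mean(history)
        sigma = statistics.pstdev(history)
        if sigma > 0 and abs(surprise[i] - mu) > n_sigma * sigma:
            anomalies.append(i + 2)   # +2: the surprise series starts at t = 2
    return anomalies

# Made-up latency signal: a repeating pattern with one spike and one dip.
signal = [10 + (t % 3) for t in range(40)]
signal[12] = 30   # positive anomaly (spike)
signal[30] = 0    # negative anomaly (dip)

detected = find_anomalies(signal)
print(detected)
```

Because the extrapolation uses the anomalous value itself, the time-step right after a spike or dip tends to be flagged as well. Tracking percentiles of the surprise, as described in the poster, is one way to make the detection less sensitive to such single-point effects.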
Part II: Deep Learning
Why now? In recent years two things became available:
1. A lot of data
2. Necessary compute
What is deep learning?
Rosenblatt - 1961
What is new in deep learning?
What is new (among other things) is a learning
algorithm called backpropagation, which makes it
possible to train deep neural nets.
State-of-the-art networks can have over 200
layers!
GoogLeNet - 2014
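To make backpropagation concrete, here is a minimal sketch for the smallest possible "deep" network: one input, one hidden tanh unit, one output. The weights, input, target, and learning rate are arbitrary illustration values. The gradient obtained with the chain rule is checked against a numerical estimate and then used for gradient descent.

```python
import math

# Tiny network: x -> hidden h = tanh(w1 * x) -> output y = w2 * h
# Loss: 0.5 * (y - target)^2

def forward(w1, w2, x, target):
    h = math.tanh(w1 * x)
    y = w2 * h
    loss = 0.5 * (y - target) ** 2
    return h, y, loss

def backward(w1, w2, x, target):
    """Backpropagation: apply the chain rule from the loss back to each weight."""
    h, y, _ = forward(w1, w2, x, target)
    dloss_dy = y - target                      # loss w.r.t. the output
    dloss_dw2 = dloss_dy * h                   # output-layer weight
    dloss_dh = dloss_dy * w2                   # error passed back to the hidden unit
    dloss_dw1 = dloss_dh * (1 - h ** 2) * x    # tanh'(z) = 1 - tanh(z)^2
    return dloss_dw1, dloss_dw2

w1, w2, x, target = 0.5, -0.3, 1.5, 1.0
grad_w1, grad_w2 = backward(w1, w2, x, target)

# Sanity check against a numerical (finite-difference) gradient.
eps = 1e-6
numeric_w1 = (
    forward(w1 + eps, w2, x, target)[2] - forward(w1 - eps, w2, x, target)[2]
) / (2 * eps)

# Gradient descent using the backpropagated gradients.
initial_loss = forward(w1, w2, x, target)[2]
for _ in range(200):
    g1, g2 = backward(w1, w2, x, target)
    w1 -= 0.1 * g1
    w2 -= 0.1 * g2
final_loss = forward(w1, w2, x, target)[2]

print(initial_loss, final_loss)
```

In a real network the same backward pass is repeated layer by layer, which is what makes training hundreds of layers computationally feasible.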
Difference: classical ML vs. Deep Learning
Classical ML methods don’t handle high
dimensionality well.
⇒ dimensionality reduction & feature selection are needed
Deep neural nets learn compact representations
of data even in a high-dimensional/sparse
setting - no feature engineering required!
Unsupervised Learning: Generative Adversarial Nets (GAN)
None of these images were taken in the real world!
NVIDIA
Unsupervised Learning: Generative Adversarial Nets (GAN)
Which of these images was generated?
DeepMind - 2018
Unsupervised Learning: Language Generation
OpenAI, February 14 2019: Better Language Models and Their Implications
www.talktotransformer.com
So why not use Deep Learning for everything?
There are reasons why we don’t only use DL:
- Necessary data not available
- Computational power not available
- Harder to interpret results
- Deep networks can be fooled
