V like Velocity, Predicting in Real-Time with Azure ML
Barbara Fusinska
@BasiaFusinska
About me
Programmer
Machine Learning
Data Solutions Architect
@BasiaFusinska
Agenda
• Aircraft Predictive maintenance
• Use Case description
• Stream processing challenges
• Machine Learning Model
• Solution Architecture
• Azure Architecture
• Stream processing
• Applying intelligence to streams
Aerospace Predictive Maintenance
Engine Sensor Analysis
Will the device fail in 2 weeks, in 4 weeks, or not at all during this time?
Use Case
When will the device break?
Will the device fail in the next 2 weeks?
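The questions above can be turned into training labels. A minimal sketch, assuming each device has a known failure cycle and that the 2-week and 4-week horizons can be expressed as a number of cycles. The names label_cycle and binary_label and the horizons w1=14 and w2=28 are hypothetical, not taken from the talk.

```python
def label_cycle(cycle, failure_cycle, w1=14, w2=28):
    """Multi-class label for one cycle of one device.

    Returns 2 if the failure is at most w1 cycles away,
    1 if it is at most w2 cycles away, and 0 otherwise.
    w1 and w2 are assumed horizons standing in for "2 weeks" and "4 weeks".
    """
    remaining = failure_cycle - cycle
    if remaining < 0:
        raise ValueError("cycle is after the failure cycle")
    if remaining <= w1:
        return 2
    if remaining <= w2:
        return 1
    return 0


def binary_label(cycle, failure_cycle, w1=14):
    """Binary variant: 1 if the device fails within w1 cycles, else 0."""
    return 1 if label_cycle(cycle, failure_cycle, w1, w1) == 2 else 0
```

The multi-class label answers "2 weeks, 4 weeks or not at all"; the binary label answers "will it fail in the next 2 weeks?".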
Velocity
How is it different?
Big Data 4Vs
Volume
• Click stream
• Active/passive sensors
• Logs
• Events
• Speech
• Social media
• Traditional
Variety
• Unstructured
• Semi-structured
• Structured
Velocity
• Speed of generation
• Rate of analysis
Veracity
• Untrusted
• Uncleansed
Big Data & Azure
Data -> Intelligence -> Action
Data Sources
• Apps
• Sensors and devices
Information Management
• Data Factory
• Data Catalog
• Event Hub
Big Data Stores
• Data Lake Store
• SQL Data Warehouse
• Blob Store
Machine Learning and Analytics
• Machine Learning
• Data Lake Analytics
• HDInsight (Hadoop and Spark)
• Stream Analytics
Intelligence
• Cognitive Services
• Bot Framework
• Cortana
Dashboards & Visualizations
• Power BI
Action
• People
• Automated Systems
• Apps: Web, Mobile, Bots
Batch vs Real-time
Stream processing challenges
• Data Ingestion
• Stream Processing
• Applying intelligence to the stream
• Aggregations
• Data sink
Machine Learning
Training data -> Model Training -> Machine Learning Model
Machine Learning Model -> Publish model -> Published Machine Learning Model
Test stream -> Published Machine Learning Model (Prediction) -> Scores
Machine Learning workflow
Data preparation -> Clean data -> Data split
Training data -> Machine Learning algorithm -> Trained model
Test data + Trained model -> Score
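The workflow above (preparation, split, training, scoring) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a simple nearest-centroid classifier in place of whatever algorithm is chosen in ML Studio, the data is synthetic, and all function names are hypothetical.

```python
import random
from statistics import mean


def prepare(rows):
    """Data preparation: drop rows that contain missing values."""
    return [r for r in rows if None not in r]


def split(rows, test_fraction=0.3, seed=0):
    """Data split: shuffle with a fixed seed, then cut into training and test data."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = len(rows) - round(len(rows) * test_fraction)
    return rows[:cut], rows[cut:]


def train(rows):
    """Training: compute one mean feature vector (centroid) per label."""
    by_label = {}
    for *features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*vecs)]
            for label, vecs in by_label.items()}


def classify(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))


def score(model, rows):
    """Score: fraction of test rows whose label is predicted correctly."""
    hits = sum(classify(model, r[:-1]) == r[-1] for r in rows)
    return hits / len(rows)


# Synthetic rows of the form (feature1, feature2, label); one row has a missing value.
raw = ([(0.1 * i, 1.0, 0) for i in range(10)]
       + [(5 + 0.1 * i, 2.0, 1) for i in range(10)]
       + [(None, 1.0, 0)])
clean = prepare(raw)
training_data, test_data = split(clean)
model = train(training_data)
accuracy = score(model, test_data)
```

The score is computed on test data only, so it reflects rows the model did not see during training.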
Data -> Predictive model -> Operational web API in minutes
ML Studio -> API
Azure Machine Learning Studio
Income prediction
Demo
Will the device break in the next 2 weeks?
[Chart: sensor reads s1, s2, s3 for cycles 11 to 15]
Device 37
Cycle  s1      ...  s21      failed
1      518.67  ...  23.419   0
2      518.67  ...  23.4236  0
3      518.67  ...  23.3442  0
...
134    518.67  ...  23.1295  0
135    518.67  ...  23.4085  1
Features per sensor: raw reading, average, standard deviation (s, avg, stdev)
https://gallery.cortanaintelligence.com/SolutionTemplate/Predictive-Maintenance-for-Aerospace-1
https://gallery.cortanaintelligence.com/Collection/Predictive-Maintenance-Template-3
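The (s, avg, stdev) features can be derived from a device's sensor history. A minimal sketch, assuming the average and standard deviation are taken over the most recent readings of one sensor. The function name add_window_features and the window size are hypothetical; the talk computes these aggregates in Stream Analytics.

```python
from statistics import mean, stdev


def add_window_features(readings, window=5):
    """For each reading s, return (s, avg, sd) over the last `window` readings,
    including the current one. The sample standard deviation needs at least two
    values, so a single-value window gets 0.0 (an assumption of this sketch)."""
    out = []
    for i, s in enumerate(readings):
        recent = readings[max(0, i - window + 1): i + 1]
        sd = stdev(recent) if len(recent) > 1 else 0.0
        out.append((s, mean(recent), sd))
    return out


# s21 readings of device 37 for cycles 1 to 3, taken from the table above.
features = add_window_features([23.419, 23.4236, 23.3442], window=3)
```

Each tuple corresponds to one row of model input for that sensor: the raw value plus its two aggregates.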
Aircraft Predictive Maintenance
Use Case
Solution Architecture
• Data generator (sensors data simulation) -> Event Hub -> Stream Analytics -> DocumentDB -> Web App
• Stream Analytics <-> Machine Learning
• Training data -> ML Model Training -> Machine Learning
• Retrain
Event Hub configuration
Data Ingestion
DocumentDB configuration
Storing Maintenance Predictions
Stream Analytics configuration
Maintenance prediction
Stream Analytics – Source and Sink
SELECT
    DeviceId, result.ScoredLabels,
    cycle, setting1, …, s1, …, s21,
    a1, …, a21, sd1, …, sd21
INTO maintenance
FROM predict
Aggregation in Stream Analytics
SELECT avg(s1) AS a1
FROM CallStream TIMESTAMP BY T
GROUP BY id, TumblingWindow(minute, 1)
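To illustrate what a tumbling window does, here is a small sketch outside Stream Analytics. It assumes events are dicts with the keys id, t (seconds) and s1, and it assigns each event to the fixed, non-overlapping window containing its timestamp. The function name tumbling_avg is hypothetical, and window boundary handling in the real service may differ from this simplification.

```python
from collections import defaultdict
from statistics import mean


def tumbling_avg(events, window_seconds=60):
    """Average s1 per (id, window start).

    Windows have a fixed size, do not overlap, and every event
    falls into exactly one of them.
    """
    buckets = defaultdict(list)
    for e in events:
        window_start = (e["t"] // window_seconds) * window_seconds
        buckets[(e["id"], window_start)].append(e["s1"])
    return {key: mean(values) for key, values in buckets.items()}


events = [
    {"id": 1, "t": 0, "s1": 1.0},
    {"id": 1, "t": 30, "s1": 3.0},
    {"id": 1, "t": 60, "s1": 5.0},
    {"id": 2, "t": 10, "s1": 7.0},
]
averages = tumbling_avg(events)
```

Device 1 produces two results here because its third event falls into the next one-minute window.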
Stream Analytics – Aggregations
WITH aggregate AS (
    SELECT
        id,
        avg(s1) AS a1, …, avg(s21) AS a21,
        stdev(s1) AS sd1, …, stdev(s21) AS sd21
    FROM CallStream AS s
    GROUP BY id, TumblingWindow(minute, 1)
)
Join in Stream Analytics
SELECT c.s1, a.a1
FROM CallStream c
JOIN aggregate a
    ON c.id = a.id
    AND DATEDIFF(minute, c, a) BETWEEN 0 AND 1
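The join above pairs each raw event with the aggregate of the same device that arrives shortly after it. A rough sketch of that matching logic, assuming events and aggregates are dicts with id and t (seconds) and that "within one minute" can be approximated by elapsed seconds rather than minute boundaries. The function name join_with_aggregates is hypothetical.

```python
def join_with_aggregates(events, aggregates, max_lag_seconds=60):
    """Pair each event with aggregates of the same id whose timestamp is
    between 0 and max_lag_seconds after the event's timestamp."""
    by_id = {}
    for a in aggregates:
        by_id.setdefault(a["id"], []).append(a)

    joined = []
    for c in events:
        for a in by_id.get(c["id"], []):
            if 0 <= a["t"] - c["t"] <= max_lag_seconds:
                joined.append({"id": c["id"], "s1": c["s1"], "a1": a["a1"]})
    return joined


events = [{"id": 1, "t": 30, "s1": 3.0}, {"id": 2, "t": 10, "s1": 7.0}]
aggregates = [{"id": 1, "t": 60, "a1": 2.0}, {"id": 1, "t": 120, "a1": 5.0}]
joined = join_with_aggregates(events, aggregates)
```

The time bound is what keeps a streaming join finite: without it, every event would have to wait for all future aggregates of its device.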
Stream Analytics – Join aggregations
WITH predict AS (
    SELECT
        s.id, s.cycle, s.setting1, …, s.s1, …, s.s21,
        a.a1, …, a.a21, a.sd1, …, a.sd21,
        predmain(s.cycle, s.setting1, …, s.s1, …, s.s21,
                 a.a1, …, a.a21, a.sd1, …, a.sd21) AS result
    FROM CallStream AS s
    JOIN aggregate AS a
        ON s.id = a.id
        AND DATEDIFF(minute, s, a) BETWEEN 0 AND 1
)
Monitoring Dashboard
Reading Predictions from DocumentDB
Retraining scenario
Demo
Scaling web services
• Azure Classic Portal
• 20-200 concurrent requests
• Azure ML optimization
• Multiple endpoints for the same web service
• New web services
• Production web services
• Pricing Plans
• 1,000 – 50,000,000 requests per month
Keep in touch
BarbaraFusinska.com
@BasiaFusinska
