2. Pranav Prakash, Quartic.ai
Application and challenges of streaming analytics and machine learning on multi-variate time series data for smart manufacturing
#UnifiedDataAnalytics #SparkAISummit
3. Pranav Prakash
• Co-Founder, VP Engineering at Quartic.ai
• Ex-LinkedIn SlideShare
• Passionate about
– A.I., Computer Vision, 3D Printing
– Music, Caffeine
4. What you’ll learn in the next 40 mins
• A cool startup solving some real-life use cases
• Downtime Reduction use case of a critical asset in Pharma world
– And a “secret” to solve such problems
• Challenges in Industrial Stream Processing
• Spark-specific stuff that we learned
5. We enable Industry 4.0
• AI powered smart manufacturing platform
• Processing billions of sensor data points every day
• Work with top Pharma companies on multiple use cases
• Team of 22 techies including Engineers & Data Scientists + 4 Domain Veterans
7. We started by building solutions for pharmaceutical manufacturing and created a DIY platform
• Increased uptime of sterilization autoclave by 7 days
• Increased yield of protein from fermentation process
• Incubated egg harvester – increase uptime during critical flu season
• Cold-chain monitoring for pharma refrigeration – reduced downtime and waste
• Predictive health monitoring of air handlers for clean rooms in pharma
• Enable continuous validation of biologic production process
• Medical Device Assembly – reduce recalls caused by poor quality.
8. Case study – an Intelligent Asset Health Monitoring system for an Industrial Autoclave
• Mission - Improve the reliability of a complex asset.
• Details - 13 different modes (cycles)
• Runs 24/7
• Critical Asset
9. Equipment Reliability
• Capture process, condition data
• Establish baseline and measure deviations
• Forecast the future
• Classify errors early
• “Advisory Mode” AI
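The "establish baseline and measure deviations" step can be illustrated with a minimal sketch. It assumes a baseline is simply the mean and standard deviation of readings from known-good cycles, and that a reading is flagged when it sits more than a chosen number of standard deviations away. The sample temperatures, the threshold, and the function names are all hypothetical; the talk does not say which method Quartic.ai uses.

```python
from statistics import mean, stdev

def fit_baseline(readings):
    """Summarise known-good readings as (mean, sample standard deviation)."""
    return mean(readings), stdev(readings)

def deviation_score(value, baseline):
    """Number of standard deviations a reading sits from the baseline mean."""
    mu, sigma = baseline
    if sigma == 0:
        raise ValueError("baseline has zero spread; cannot score deviations")
    return abs(value - mu) / sigma

# Hypothetical chamber temperatures (deg C) from healthy sterilization cycles.
healthy = [121.0, 121.2, 120.9, 121.1, 120.8, 121.0]
baseline = fit_baseline(healthy)

THRESHOLD = 3.0  # assumed cut-off, in standard deviations

def advise(value):
    """Advisory mode: recommend a human look, never act on the equipment."""
    return "investigate" if deviation_score(value, baseline) > THRESHOLD else "ok"
```

With this baseline, `advise(121.1)` stays within the threshold while `advise(123.0)` is flagged. A real asset with 13 cycle modes would presumably need one baseline per mode.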
10. SCADA = Supervisory Control and Data Acquisition
PLC = Programmable Logic Controller
11. System Design Params
• Data
– Speed: 10 ms – 2 hours between readings
– Volume: A couple of 1,000s of sensors per asset; 10,000s of assets per enterprise
– Data Type: String, Numeric, Boolean, Array
– Time series, Discrete
13. System Design Params
• Use Cases
– Automatic Model Param Tuning, Model Training
– 1000s of ML Models Deployment
– Complex Event Processing (CEP)
– Statistical & Analytical Processing
• Rule Recommendation
• Near Real Time Stream Processing
14. Challenges
• ML
– Multiple granularities
– Late Data Arrival
– Model Deployment on a heterogeneous data stream
– Flash Flood of Data
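To make the late-data challenge concrete, here is a minimal plain-Python sketch of event-time windowing with a watermark. Spark Structured Streaming exposes a comparable idea through `withWatermark`, but the code below is only an illustration of the concept, not the talk's implementation. The class name, window size, and lateness bound are assumptions.

```python
from collections import defaultdict

class WindowedCounter:
    """Counts events per tumbling event-time window, tolerating bounded lateness."""

    def __init__(self, window_s, allowed_lateness_s):
        self.window_s = window_s
        self.allowed_lateness_s = allowed_lateness_s
        self.max_event_time = float("-inf")
        self.open_windows = defaultdict(int)  # window start -> event count
        self.dropped = 0

    def add(self, event_time_s):
        """Ingest one event; return the windows it finalizes as (start, count)."""
        self.max_event_time = max(self.max_event_time, event_time_s)
        # The watermark trails the newest event time by the allowed lateness.
        watermark = self.max_event_time - self.allowed_lateness_s
        start = (event_time_s // self.window_s) * self.window_s
        if start + self.window_s <= watermark:
            self.dropped += 1  # its window is already closed: too late
        else:
            self.open_windows[start] += 1
        closed = sorted(s for s in self.open_windows
                        if s + self.window_s <= watermark)
        return [(s, self.open_windows.pop(s)) for s in closed]

# Hypothetical event times in seconds; 8 arrives late but in time, 3 too late.
counter = WindowedCounter(window_s=10, allowed_lateness_s=5)
finalized = []
for t in [1, 4, 12, 8, 17, 3, 26]:
    finalized.extend(counter.add(t))
```

The trade-off is the usual one: a larger lateness bound keeps more stragglers but delays every result and holds more state in memory.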
15. Multiple Granularities
Poll Frequency = 1s
TS            Sensor A  Sensor B
12:03:01.198
12:03:02.283
12:03:03.316
12:03:04.572
12:03:05.283
12:03:06.342

Poll Frequency = 5s
TS            Sensor C  Sensor D
12:03:01.230
12:03:06.233
12:03:11.316
12:03:16.520
12:03:21.283

- Both belong to same “Asset”
- Target Feature – C/D or A/B
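One common way to bring streams polled at different rates onto a single timeline is an as-of join: each row of the faster stream picks up the most recent reading of the slower stream. The sketch below assumes that approach, which the slide does not confirm. The timestamps are the ones shown above, while the sensor values, the tolerance, and the function names are made up for illustration.

```python
from bisect import bisect_right

def to_ms(ts):
    """Convert 'HH:MM:SS.mmm' to milliseconds since midnight."""
    hms, millis = ts.split(".")
    h, m, s = (int(part) for part in hms.split(":"))
    return ((h * 60 + m) * 60 + s) * 1000 + int(millis)

def asof_join(fast, slow, tolerance_ms):
    """Attach to each fast-stream row the latest slow-stream value at or before it.

    fast, slow: lists of (timestamp_ms, value), sorted by timestamp.
    Rows with no slow reading within tolerance_ms get None.
    """
    slow_times = [t for t, _ in slow]
    joined = []
    for t, value in fast:
        i = bisect_right(slow_times, t) - 1  # index of last slow reading <= t
        if i >= 0 and t - slow_times[i] <= tolerance_ms:
            joined.append((t, value, slow[i][1]))
        else:
            joined.append((t, value, None))
    return joined

# Timestamps come from the slide; the values are hypothetical.
sensor_a = [(to_ms("12:03:01.198"), 10.0), (to_ms("12:03:02.283"), 10.4),
            (to_ms("12:03:03.316"), 10.1), (to_ms("12:03:04.572"), 9.8),
            (to_ms("12:03:05.283"), 10.2), (to_ms("12:03:06.342"), 10.3)]
sensor_c = [(to_ms("12:03:01.230"), 50.0), (to_ms("12:03:06.233"), 52.0)]

aligned = asof_join(sensor_a, sensor_c, tolerance_ms=5000)
```

Here the first Sensor A row gets `None` because no Sensor C reading exists yet, and the next four rows all reuse the 12:03:01.230 reading. That repetition is why the choice of granularity matters when a model consumes features from both groups.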