Jeeves Grows Up: An AI Chatbot for Performance and Quality
Shivnath Babu
CTO/Cofounder @ Unravel
Adjunct Professor @ Duke University
About the speaker
Shivnath Babu
• Cofounder/CTO at Unravel
• Adjunct Professor of Computer Science at Duke University
• Focusing on manageability of data pipelines and the modern data stack
• Recipient of US National Science Foundation CAREER Award, IBM Faculty Award, HP Labs Innovation Research Award
Unravel radically simplifies DataOps & has strong adoption across platforms & industries
• Brings together information about all your apps, clusters, resource utilization, users, & datasets in a single place
• Creates end-to-end view of data pipelines to easily track & understand issues
• Tracks & reports on usage across environments
• Checks for & alerts on anomalous behavior
• Uses AI/ML to troubleshoot & optimize apps to meet desired performance & cost needs
• Spots & fixes inefficient usage
• Ensures efficiency, quality, & performance of all apps in development & production
Chatbot
A program that conducts a conversation via text or voice
#UnifiedAnalytics #SparkAISummit
The happy Spark user
The UNhappy Spark user
“I have no clue which cloud instance type to pick for my workload”
“My cloud costs are getting out of control. Help!”
“I have no idea why my app is slow”
“My app failed and I don’t know why!”
Typical app failure in Spark
• Many levels of dependent stack traces
• Identifying the root cause is hard and time-consuming
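To make the "many levels of dependent stack traces" point concrete: JVM stack traces chain exceptions with `Caused by:` lines, and the last link in the chain is often a useful place to start looking. The following is a minimal sketch of that idea only, not a description of how Unravel finds root causes. The sample trace, the `com.example` class names, and the function names are made up for illustration.

```python
import re
from typing import Optional

# Matches the header line of each chained exception in a JVM stack trace,
# e.g. "Caused by: java.lang.OutOfMemoryError: Java heap space".
CAUSED_BY = re.compile(r"^\s*Caused by: (?P<cls>[\w.$]+)(?:: (?P<msg>.*))?$")


def cause_chain(trace: str) -> list[tuple[str, str]]:
    """Return (exception class, message) for every 'Caused by:' line, outermost first."""
    chain = []
    for line in trace.splitlines():
        match = CAUSED_BY.match(line)
        if match:
            chain.append((match.group("cls"), match.group("msg") or ""))
    return chain


def innermost_cause(trace: str) -> Optional[tuple[str, str]]:
    """Return the last link of the chain, or None if there is no 'Caused by:' line."""
    chain = cause_chain(trace)
    return chain[-1] if chain else None


# Hypothetical, shortened trace for illustration only.
SAMPLE_TRACE = """\
com.example.etl.JobFailedException: Nightly job failed
\tat com.example.etl.Driver.main(Driver.java)
Caused by: com.example.etl.StageException: Stage 3 failed
\tat com.example.etl.Stage.run(Stage.java)
Caused by: java.lang.OutOfMemoryError: Java heap space
\tat java.util.Arrays.copyOf(Arrays.java)
"""

print(innermost_cause(SAMPLE_TRACE))
```

In practice the innermost exception is only a candidate: the real cause may sit in another process's logs, which is part of why the slide calls the task hard.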
Spark User: “My app failed and I don’t know why!”
Chatbot: “I know that sucks! Let me take a look here …”
Chatbot: “I see the problem. Executors are running out of memory”
Chatbot: “Setting spark.executor.memory to 12g fixes the problem. I have verified it. See this run here”
Spark User: “Wow. Thanks. You are awesome!”
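The exchange above can be sketched as a single rule: if the logs show an out-of-memory error, suggest a larger `spark.executor.memory`. This is an illustrative sketch under stated assumptions, not the chatbot's actual logic. The starting value of `8g`, the 1.5x growth factor, and the function names are all assumptions chosen so that the suggestion comes out at the `12g` mentioned in the dialogue.

```python
import math


def bump_executor_memory(current: str, factor: float = 1.5) -> str:
    """Scale a memory string such as '8g' by `factor`, rounding up to whole gigabytes."""
    if not current.endswith("g"):
        raise ValueError("this sketch only handles values in gigabytes, e.g. '8g'")
    gigabytes = int(current[:-1])
    return f"{math.ceil(gigabytes * factor)}g"


def chatbot_reply(log_text: str, current_memory: str) -> str:
    """Return a reply for one known failure pattern; anything else is 'unknown'."""
    if "java.lang.OutOfMemoryError" in log_text:
        suggestion = bump_executor_memory(current_memory)
        return (
            "I see the problem. Executors are running out of memory. "
            f"Try: spark-submit --conf spark.executor.memory={suggestion} ..."
        )
    return "I could not find a known failure pattern in these logs."


# Assumed inputs: one log line and the app's current executor memory.
print(chatbot_reply("java.lang.OutOfMemoryError: Java heap space", "8g"))
```

The sketch stops at the suggestion. The "I have verified it" step in the dialogue implies re-running the app with the new setting, which is not modeled here.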
Let us see it in action
Fast forward to 2021
Now every company is a data company
Powered by Data, ML and AI
Data Pipelines: Most companies have 10+ mission-critical data pipelines
DATA PIPELINE (from DATA SOURCES to DATA PRODUCTS)
• CAPTURE: Batch Ingest, Stream Ingest
• STORE: Data Lake, Data Warehouse
• TRANSFORM: Batch Processing, Orchestrate Tasks, Machine Learning, Stream Processing
• PUBLISH: Real-time Store, Data Catalog, Feature Store
• CONSUME: Real-time Apps, BI, Advanced Analytics
Data Stack: The data stack for these pipelines is multi-system & complex
DataOps: 33% (and a growing number) of data teams follow a DataOps practice
We asked 200+ companies how they manage their data pipelines
• “SLA misses are creating problems”
• “We only detect the fire after it starts!”
• “Our pipeline schedules are all messed up!”
• “We need CI/CD for our pipelines”
• “Fixing problems takes weeks”
• “Users are always complaining”
• “I am wasting most of my time with bad data”
• “Do devs ever #!$ test their pipelines?”
• “Two failed attempts to migrate to cloud”
• “Cost reduction is our #1 priority”
Effective DataOps practice is required to solve these problems with data pipelines
We created Unravel’s Pipeline Observer to simplify DataOps
[Architecture diagram; components shown]
• Data collected: Logs, Metrics, Traces, Metadata, Conf, Events
• Storage and services: Real-time Store, Correlation Services, Baselining Service, Root Cause Analysis Service
• Interfaces: Pipeline Observer UI/API, Chatbot, SLA Tracking UI, Pipeline Capacity Planning, Proactive Alerting, Usage / Cost Chargeback UI
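The slide names a Baselining Service without describing it. As a rough intuition for what baselining can mean, the sketch below learns a "normal" run duration from past runs and flags a run that falls far outside it. This is a generic mean-plus-k-standard-deviations rule, swapped in for illustration and not Unravel's method. The history values, the threshold `k`, and the function names are assumptions.

```python
from statistics import mean, stdev


def baseline(durations_min: list[float]) -> tuple[float, float]:
    """Summarize past run durations as (mean, sample standard deviation)."""
    return mean(durations_min), stdev(durations_min)


def is_anomalous(current_min: float, history_min: list[float], k: float = 3.0) -> bool:
    """Flag a run that takes more than k standard deviations longer than the baseline."""
    mu, sigma = baseline(history_min)
    return current_min > mu + k * sigma


# Hypothetical durations (in minutes) of five earlier runs of the same pipeline.
history = [41.0, 39.0, 40.0, 42.0, 38.0]

print(is_anomalous(55.0, history))  # far above the usual ~40 minutes
print(is_anomalous(43.0, history))  # within normal variation
```

A real service would likely account for seasonality, input data size, and cluster load; a fixed threshold on duration is only the simplest possible starting point.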
Demo Stack
Modern Data Stack composed of:
1. Databricks (Advanced Analytics with Spark)
2. Azure Data Lake Storage (Data Lake)
3. Airflow (Orchestration)
4. dbt (Data Transformation)
5. Great Expectations (Data Quality/Validation)
6. Slack (Chatbot, Team Comm., & Alerting)
7. Unravel (End-to-end Observability)
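Since Slack is the alerting channel in this stack, here is a minimal sketch of what posting an alert could look like, assuming a Slack incoming webhook, which accepts a JSON body with a `text` field. The pipeline name, the message wording, and the placeholder URL are hypothetical, and the request is built but deliberately not sent so the example runs offline. How the demo actually integrates with Slack is not stated in the slides.

```python
import json
import urllib.request


def build_alert(pipeline: str, problem: str) -> dict:
    """Build the JSON payload for a simple text alert."""
    return {"text": f":warning: Pipeline `{pipeline}`: {problem}"}


def build_request(webhook_url: str, payload: dict) -> urllib.request.Request:
    """Wrap the payload in an HTTP POST request aimed at the webhook URL."""
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: a real incoming-webhook URL would come from Slack.
payload = build_alert("daily_sales", "in danger of missing its performance SLA")
request = build_request("https://example.invalid/slack-webhook", payload)

# urllib.request.urlopen(request) would send it; omitted so nothing leaves the machine.
print(request.get_method(), json.loads(request.data)["text"])
```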
Demo Scenarios
1. Pipeline in danger of missing performance SLA
2. Pipeline in danger of cost overrun
3. Pipeline in danger of breaking due to data quality problems
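The three scenarios share one pattern: compare a projection or measurement for a running pipeline against a limit, and warn before the limit is crossed. The sketch below illustrates that pattern with a linear extrapolation and a null-value check. It is a simplified illustration, not the logic used in the demo; every field name, threshold, and number is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PipelineRun:
    elapsed_min: float    # minutes since the run started
    fraction_done: float  # share of work finished, between 0 and 1
    cost_so_far: float    # spend so far, in any currency unit
    null_fraction: float  # share of null values in a key column


def projected_total(so_far: float, fraction_done: float) -> float:
    """Linear extrapolation: assume the remaining work proceeds at the same rate."""
    if fraction_done <= 0:
        raise ValueError("cannot project before any work has finished")
    return so_far / fraction_done


def risks(run: PipelineRun, sla_min: float, budget: float, max_null_fraction: float) -> list[str]:
    """Return the scenarios this run is in danger of, in the order the slide lists them."""
    found = []
    if projected_total(run.elapsed_min, run.fraction_done) > sla_min:
        found.append("performance SLA")
    if projected_total(run.cost_so_far, run.fraction_done) > budget:
        found.append("cost overrun")
    if run.null_fraction > max_null_fraction:
        found.append("data quality")
    return found


# Hypothetical run: 40% done after 30 minutes, so about 75 minutes are projected.
run = PipelineRun(elapsed_min=30, fraction_done=0.4, cost_so_far=12.0, null_fraction=0.02)
print(risks(run, sla_min=60, budget=40.0, max_null_fraction=0.05))
```

With these numbers only the SLA check fires: the projected cost of about 30 stays under the budget of 40, and the null fraction is below the allowed maximum.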
Let us see it in action
In summary
AI-driven DataOps to manage Data Pipelines for the New Data Stack
• Develop & manage data pipelines with ease
• Save time & money
Sign up for a free trial. We value your feedback!
https://unraveldata.com/saas-free-trial
We are hiring
shivnath@unraveldata.com