5. However…it is taking too long to develop and deploy…!
• 2-6 months depending on scope and size: Data Collection (20%), Data Cleaning (50%), Data Exploration (15%), Data Modeling (10%), Data Interpretation (5%)
• The time required to deploy a model is increasing year-on-year.
• Only 11% of organizations can put a model into production within a week, and 64% take a month or longer.
https://info.algorithmia.com/2021
6. Data scientists are spending more and more of their time deploying models…and more models mean more time spent in deployment…!
https://info.algorithmia.com/2021
7. …with alarmingly high failure rates!
• It was estimated that 85% of AI projects would fail and deliver erroneous outcomes through 2022.
• 70% of companies report minimal or no impact from AI.
• 87% of data science projects never make it into production.
https://research.aimultiple.com/ai-fail/
8. Low ROI and long payback periods!
• The ROI for AI projects varies greatly based on how much experience an organization has. Leaders showed an average ROI of 4.3% for their projects, compared to only 0.2% for beginning companies.
• Payback periods also varied, with leaders reporting a typical payback period of 1.2 years and beginners 1.6 years.
https://www2.deloitte.com/us/en/insights/industry/technology/artificial-intelligence-roi.html
9. How is AI different?
• Reasoning: Deductive (Traditional Software) vs. Inductive (AI Software)
• Inputs: Data + Program vs. Data + Output
• Logic: Manually pre-programmed to perform a specific task on a given dataset vs. programmed to automatically keep learning rules from a given dataset
• Output artifact: Program output vs. Models and Rules
• Learning: Learns one time from the programmer vs. learns constantly from the data
• Key resource: Code vs. Data
• Solutions: Deterministic vs. Probabilistic
• Output behavior: Consistently remains the same vs. can improve with usage (or degrade over time)
• Business model: One-time development effort, followed by multiple sales and a small (optional) maintenance effort vs. each project being one-off and mandatorily needing full lifecycle management
10. Elements of ML systems
https://www.ibm.com/cloud/blog/ai-model-lifecycle-management-overview
11. A typical lifecycle for an AI project
• Scoping and Data Acquisition
• Experimentation and Model Building
• Production: Deployment, Scaling and Operationalization
12. Data, data, data…!
• Industry reports indicate up to 80% of the effort goes into data wrangling!
• Up to a quarter of that goes into cleaning alone, and another quarter into labeling.
• Just 10% of the time is spent in model training!
https://medium.com/whattolabel/data-labeling-ais-human-bottleneck-24bd10136e52
13. Data trumps algorithms!
In the article “Datasets Over Algorithms”, Alexander Wissner-Gross showed that
the mean time between a new machine learning algorithm being published
and its use in an AI breakthrough was 18 years; however, the mean time
between the required datasets becoming available and those AI
breakthroughs was 3 years. Machine learning without the necessary data and
use cases is merely a pile of nuts and bolts waiting to be built into something
useful. Nonetheless, machine learning is about learning from data, not about
writing code, and that represents a fundamental difference from previous
software engineering practices.
- Agile AI, Carlo Appugliese, Paco Nathan, and William S. Roberts, O’Reilly
14. Data lifecycle
• While the CRISP-DM (Cross Industry Standard Process for Data Mining) lifecycle seems a bit dated (published in 1999) and is no longer actively maintained, it is still a good reference point for the key phases of the data lifecycle.
• The flow is not strictly sequential; phases move back and forth.
https://www.ibm.com/docs/en/spss-modeler/SaaS?topic=dm-crisp-help-overview
15. Generic tasks and outputs in a CRISP-DM Reference Model
https://www.ibm.com/docs/en/spss-modeler/SaaS?topic=dm-crisp-help-overview
16. CRISP-DM favored over agile methodologies?
https://www.datascience-pm.com/crisp-dm-still-most-popular/
17. Challenges with Scrum in Data Science projects
• One key challenge of using a sprint-based framework within a data science context is that task estimation is unreliable. If the team cannot accurately estimate task duration, the concept of a sprint, and of what can get done within a sprint, becomes problematic.
• Another key challenge is that Scrum's fixed-length sprints can be problematic. Even if a team could estimate how long a specific analysis might take, a fixed-length sprint might force the team to define an iteration to include unrelated work items, and to delay the feedback from an exploratory analysis that could help prioritize new work. In short, a sprint does not allow smaller (or longer) logical chunks of work to be completed and analyzed in a coherent fashion.
https://www.datascience-pm.com/data-driven-agile/
18. Challenges with traditional Kanban in Data Science projects
• In general, these challenges include the lack of organizational support and culture, lack of training, and misunderstanding of key concepts.
• Specifically, Kanban does not define project roles or any process specifics.
• The freedom Kanban provides (such as letting teams define their own process for prioritizing tasks) can be part of the challenge in implementing it. While this lack of process structure can be a strength (the absence of a specified process definition allows teams to implement Kanban within existing organizational practices), it also means every team could implement Kanban differently. In other words, a team that wants to use Kanban needs to figure out its own processes and artifacts.
https://www.datascience-pm.com/data-driven-agile/
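Kanban's core mechanics referenced above (a board of columns, work-in-progress limits, pull-based flow) can be sketched in a few lines. The column names and WIP limits below are illustrative assumptions, not prescribed by Kanban itself:

```python
# Minimal Kanban board sketch: columns with WIP limits and pull-based flow.
# Column names and the WIP limit of 2 are illustrative choices only.

class KanbanBoard:
    def __init__(self, columns):
        # columns: ordered {name: wip_limit}; None means no limit
        self.limits = dict(columns)
        self.cards = {name: [] for name in columns}

    def add(self, card, column="Backlog"):
        self._check_wip(column)
        self.cards[column].append(card)

    def pull(self, card, src, dst):
        # A card is *pulled* into dst only if dst has spare capacity.
        self._check_wip(dst)
        self.cards[src].remove(card)
        self.cards[dst].append(card)

    def _check_wip(self, column):
        limit = self.limits[column]
        if limit is not None and len(self.cards[column]) >= limit:
            raise RuntimeError(f"WIP limit reached in '{column}'")

board = KanbanBoard({"Backlog": None, "In Progress": 2, "Done": None})
board.add("clean data")
board.add("label data")
board.add("train model")
board.pull("clean data", "Backlog", "In Progress")
board.pull("label data", "Backlog", "In Progress")
# Pulling a third card into "In Progress" would exceed the WIP
# limit of 2 and raise RuntimeError.
```

Note how nothing above defines roles, meetings, or iteration boundaries — which is exactly the freedom (and the challenge) the slide describes.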
19. Data-Driven Scrum (DDS)
• The Data Science Process Alliance created an alternative framework called Data Driven Scrum, which is designed with data science in mind.
• Data Driven Scrum™ (DDS) is an agile framework specifically designed for data science teams. DDS provides a continuous-flow framework for agile data science by integrating the structure of Scrum with the continuous flow of Kanban.
https://www.datascience-pm.com/data-driven-scrum/
20. Leveraging Scrum and Kanban…
• DDS can be viewed as a specific instantiation of Scrum, with two notable exceptions:
• The most important exception is that the Scrum Guide requires all iterations (sprints) to be of equal length. In DDS, iterations vary in duration to allow a logical increment of work to be done in one iteration (rather than defining the amount of work that can be done in a specific unit of time).
• The other notable exception is that retrospectives and item reviews are not done at the end of every iteration, but rather on a frequency the team deems appropriate.
• DDS also adheres to Kanban principles (e.g., there is a Kanban board, teams need to limit WIP, and work items flow across the board). However, the framework provides more structure than Kanban defines, such as defined iterations and more defined roles and meetings. Having a more clearly defined process that leverages agile best practices enables teams to implement the process in a more consistent and repeatable manner.
https://www.datascience-pm.com/data-driven-scrum/
21. Key Tenets of DDS
• Agile is Iterative Experimentation
Agile is intended to be a sequence of iterative experimentation and adaptation cycles.
• Iterations are Capacity-Based
Teams work iteratively on a given set of items until they are done (no inflexible deadlines).
• Focus on Create, Observe, Analyze
Each iteration always follows three core steps: create something, observe its performance, and analyze the results.
• Easily Integrates with Scrum
DDS's interfaces can be seamlessly integrated within a traditional Scrum-based organization.
22. DDS vs Traditional Scrum: Similarities
• Similar Roles
Just like traditional Scrum, each DDS team is a group of up to about ten people, one of whom is the product owner and one of whom is the process expert.
• Similar Events
Just as in traditional Scrum, there is a daily stand-up, as well as Iteration and Retrospective Reviews.
• Similar Process to Create and Prioritize Items
Just like traditional Scrum, items are created, prioritized and viewed on a task board.
https://www.datascience-pm.com/data-driven-scrum/
23. DDS vs Traditional Scrum: Differences
• Functional Iterations
DDS iterations have unknown and varying lengths (as compared to traditional Scrum sprints, which have fixed durations). This enables iterations to be shorter or longer than average where that makes sense (e.g., an iteration might be shorter than normal because the team can learn from a quick experiment).
• Uncertain Task Duration
Unlike traditional Scrum (which requires accurate task estimations to know what can fit into a sprint), DDS naturally accommodates tasks that are difficult to estimate (and task estimation is often difficult within a data science context).
• Collective Analysis
The entire team focuses on creating, observing and then analyzing a hypothesis, analysis or feature (in traditional Scrum, this analysis is often done by the product owner outside of the codified process).
• Iteration-Independent Meetings
Retrospectives and item reviews are not done at the end of every iteration (as in traditional Scrum), but rather on a calendar-based frequency the team deems appropriate.
https://www.datascience-pm.com/data-driven-scrum/
24. Principles of DDS
• Allow capacity-based iterations – sometimes it makes sense to have an iteration that lasts one day, and other times an iteration that lasts three weeks (e.g., due to how long it takes to acquire/clean data or to run an exploratory analysis). The goal should be to allow logical chunks of work to be released in a coherent fashion.
• Decouple meetings from the iteration – since an iteration could be very short (e.g., one day for a specific exploratory analysis), meetings (such as a retrospective to improve the team's process) should be based on a logical time-based window, not linked to each iteration.
• Only require high-level item estimation – in many situations, defining an explicit timeline for an exploratory analysis is difficult, so one should not need to generate accurate, detailed task estimations in order to use the framework. However, high-level "T-shirt" estimates of effort can be helpful for prioritizing the potential tasks to be done.
https://www.datascience-pm.com/data-driven-agile/
25. DDS Framework
• Data Driven Scrum supports lean, iterative, exploratory data science analysis, and acknowledges that iterations will vary in length with the phase of the project (collecting data vs. creating a machine learning analysis).
• DDS defines an agile lean process framework that leverages key concepts of both Scrum and Kanban, but differently than Scrumban (which is more Kanban within a Scrum framework; Scrumban hence implements Scrum sprints, which, as previously noted, introduce several challenges for the project team).
• In short, DDS teams use a Kanban-like visual board and focus on working on a specific item or collection of items during an iteration, which is task-based, not time-boxed. Thus, an iteration more closely aligns with the lean concept of pulling tasks, in a prioritized manner, when the team has capacity. Each iteration can be viewed as validating or rejecting a specific lean hypothesis.
https://www.datascience-pm.com/data-driven-agile/
26. Steps in a DDS Iteration
Create: a thing or set of things to be created and put into use, with a hypothesis about what will happen.
Observe: a set of observable outcomes of that use to be measured (plus any work needed to facilitate that measurement).
Analyze: analyze those observables and create a plan for the next iteration.
https://www.datascience-pm.com/data-driven-agile/
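The three steps above can be sketched as a simple data structure and loop. The field names, the example hypothesis, and the 0.9 accuracy threshold are illustrative assumptions, not part of the DDS definition:

```python
# Sketch of one DDS iteration as a Create -> Observe -> Analyze cycle.
# Field names and the example hypothesis/threshold are illustrative only.

from dataclasses import dataclass, field

@dataclass
class Iteration:
    hypothesis: str                                    # what we expect to happen
    created: list = field(default_factory=list)        # artifacts produced (Create)
    observations: dict = field(default_factory=dict)   # measured outcomes (Observe)
    next_plan: list = field(default_factory=list)      # plan for next iteration (Analyze)

def run_iteration(it, create, observe, analyze):
    it.created = create()                      # Create: build the thing(s)
    it.observations = observe(it.created)      # Observe: measure their use
    it.next_plan = analyze(it.observations)    # Analyze: plan the next iteration
    return it

it = Iteration(hypothesis="Adding labeled data improves validation accuracy")
result = run_iteration(
    it,
    create=lambda: ["model_v2"],
    observe=lambda created: {"val_accuracy": 0.87},
    analyze=lambda obs: (["collect more labels"] if obs["val_accuracy"] < 0.9
                         else ["deploy model"]),
)
```

The point of the sketch is that the iteration ends when the analysis is done and the next plan exists — there is no fixed time box, matching the capacity-based principle above.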
28. Scaling DDS
The DDS framework is a single-team framework that is designed to be compatible with the Scrum@Scale scaling framework.
Each DDS team exposes the necessary interfaces to collaborate with other teams (each of which might be doing Scrum or DDS) via its roles and artifacts, while encapsulating its internal workflow.

Team touchpoint                | DDS                      | Scrum
Metascrum representation       | Product Owner            | Product Owner
Scrum of Scrums representation | Process Master           | Scrum Master
Product / release feedback     | Iteration Review         | Sprint Review
Metrics and transparency       | Item Backlog / Taskboard | Product Backlog / Sprint Backlog
29. Recap
• AI / DS / ML is an evolving field, with long development / deployment cycles, high failure rates and low ROI.
• It is still software, and yet not quite like traditional software in many ways!
• While agile principles are rather generic problem-solving methods, some of their ideas don't apply well to data science.
• Data-Driven Scrum offers an interesting perspective for delivering DS projects with agility.
• For deployment, AIOps / MLOps orchestration platforms are fast emerging to provide the necessary tool support.