Keyvan Azami
Enterprise AI, Google
ML for the Enterprise
UXDX Conference 2024
Applied AI research as enterprise competitive advantage
Our Mission
Google's IT landscape:
● Billions of users globally using products and services
● Millions of support inquiries raised each year
● Thousands of help articles
Self Service Automation goal:
Identify critical self-help offerings and make them intuitive, fast, accessible, and effective
Our goal with this project:
Surface relevant support articles directly to internal users when they open a ticket so that they can solve their problems without hands-on support
Case Study Problem Statement
Impact
Thousands of hours saved in Googler wait time
Many issues solved on first touch
Ideation and Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Data Acquisition
Gather or create data, verify usability and access
Data Exploration
Understand the data and domain, create features and response
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model Productionization
Deploy automated pipeline and A/B test model
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
ML Project Lifecycle
Surface relevant support articles directly to users when they open a ticket so that they can solve their problems without hands-on support
What's the business impact?
Tens of thousands of hours of efficiency gains
Is ML a good solution for the problem?
Yes. There is a lot of data, the decisions can be automated, the data is natural language, and the task is a retrieval (Q/A) problem
Can we measure success for solving the problem?
Yes, because we are able to tie model metrics to estimated business impact
Ideation & Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Project                              Impact    Complexity
Article Suggestion                   High      Medium
Tech Support Trends                  High      High
Anomaly Detection of Machine Logs    Medium    High
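One way to tie model metrics to estimated business impact is a back-of-the-envelope calculation like the sketch below. It is not the formula used in the talk: the function name, the inputs, and the assumption that every correctly surfaced article saves the full wait are all hypothetical, and the numbers are placeholders rather than Google figures.

```python
def estimated_hours_saved(tickets_per_year, share_with_article, recall, wait_hours):
    """Rough estimate of wait time avoided by surfacing the right article.

    Assumption (not from the talk): each ticket for which the model surfaces
    the correct article avoids the full wait for hands-on support.
    """
    solvable = tickets_per_year * share_with_article  # tickets an article could solve
    deflected = solvable * recall                     # of those, tickets the model catches
    return deflected * wait_hours


# Placeholder inputs, chosen only to show the arithmetic.
hours = estimated_hours_saved(
    tickets_per_year=10_000,
    share_with_article=0.10,
    recall=0.40,
    wait_hours=3.0,
)
print(round(hours))
```

A calculation of this shape lets a change in recall be read directly as a change in hours saved, which is what makes the project's success measurable.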
What data do we have?
● Millions of historical resolved support tickets (title, description, location)
● 10% of those tickets are resolved with a link to a help article
● Thousands of articles (title and description) in our knowledge base
What data would we want?
Validated responses from users on whether or not the linked article solved the ticket
Data Acquisition
Gather or create data, verify usability and access
Can't remote into my machine
I can't remote connect into my desktop, which is turned on, and I've checked the connection and it doesn't seem to be the problem,
Common SSH connection issues
My network connection is poor and ruining my SSH session
Changing the following settings may help. Run:
...
3 hour wait
Features
Ticket
● All tickets have titles and most have descriptions
● 10% of those tickets are resolved with a link
● Some of the tickets have templates
● Some of the tickets are machine generated
Article
● Articles have meaningful titles and descriptions
● There is a long tail of articles that solve tickets
Response
90% of tickets in our data don't have a self-service solution
Data Exploration
Understand the data and domain, create features and response
Does my model not perform well enough but I have significant room for experimentation?
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Ticket Article
Start with a simple model
TF-IDF for both ticket and article
Does my model not perform well enough because of my features?
Have I framed the problem incorrectly?
Precision 40% Recall 15%
Then move to embeddings and then contextual embeddings
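The simple starting point, TF-IDF for both ticket and article, can be sketched as a small retrieval baseline: encode both sides with TF-IDF weights and rank articles by cosine similarity to the ticket. This is a minimal standard-library sketch and not the team's implementation; the tokenizer, the smoothed IDF formula, and the function names are assumptions. The sample texts are taken from the example slide, with empty descriptions where the deck gives none.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def fit_idf(documents):
    # Smoothed inverse document frequency computed over the article corpus.
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    unseen_idf = math.log(1 + n_docs) + 1.0  # weight for terms absent from every article
    return idf, unseen_idf


def tfidf(text, idf, unseen_idf):
    counts = Counter(tokenize(text))
    return {t: tf * idf.get(t, unseen_idf) for t, tf in counts.items()}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_articles(ticket_text, articles):
    # articles maps an article title to its description; both are encoded together.
    texts = {title: f"{title} {desc}" for title, desc in articles.items()}
    idf, unseen_idf = fit_idf(list(texts.values()))
    ticket_vec = tfidf(ticket_text, idf, unseen_idf)
    scored = [
        (title, cosine(ticket_vec, tfidf(text, idf, unseen_idf)))
        for title, text in texts.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


articles = {
    "Common SSH connection issues": "My network connection is poor and ruining my SSH session",
    "Choosing the correct monitor cable": "",
    "Installing certificates": "",
}
ticket = (
    "Can't remote into my machine. I can't remote connect into my desktop, "
    "which is turned on, and I've checked the connection"
)
ranked = rank_articles(ticket, articles)
print(ranked[0][0])
```

Note that this baseline always returns a ranking, even when no article fits, so a real system would also need a score cutoff below which nothing is suggested.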
Wait...
I thought this was supposed to move linearly?
Ideation and Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Data Acquisition
Gather or create data, verify usability and access
Data Exploration
Understand the data and domain, create features and response
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model Productionization
Deploy automated pipeline and A/B test model
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
ML Project Lifecycle
Really focus on the impact we are trying to achieve
Ideation & Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Focus on returning the right article only when there's an article to return (precision)
Reframe as classification and reduce the number of classes that we predict
Can we simplify the problem?
Do we really need to suggest from that entire long tail of articles?
Does my model not perform well enough but I have significant room for experimentation?
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Then move to embeddings and then contextual embeddings
Does my model not perform well enough because of my features?
Precision 50% Recall 25%
[Diagram: Ticket embedding -> Article 1, ..., Article k, No Article]
Start with a simple model
TF-IDF for the tickets -- both summary and description
(Articles are now the classes, so no encoding)
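The reframing above, where articles become the classes and the long tail is dropped, could be prepared along the lines of the sketch below. It keeps the k most frequently linked articles as classes and maps everything else, including tickets with no linked article, to a single "No Article" class. The function names and the sample article identifiers are hypothetical.

```python
from collections import Counter

NO_ARTICLE = "No Article"


def top_k_articles(linked_articles, k):
    # Count how often each article resolved a ticket; None means no article was linked.
    counts = Counter(a for a in linked_articles if a is not None)
    return {article for article, _ in counts.most_common(k)}


def to_class_label(linked_article, kept):
    # Long-tail articles and unlinked tickets collapse into the same class.
    return linked_article if linked_article in kept else NO_ARTICLE


# Hypothetical resolution history: one entry per ticket.
history = ["ssh", "ssh", "vpn", None, None, "ssh", "vpn", "rare-printer-fix", None]
kept = top_k_articles(history, k=2)
labels = [to_class_label(a, kept) for a in history]
print(Counter(labels))
```

Shrinking the label space this way trades coverage of rare articles for a classifier that has enough examples per class to be precise.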
We're still short of our goals
Time to debug with a quick demo to our partner team
Ticket: Out of Office Notifier showing with exception
Model says: Download a new version
Data says: No article
OK, that seems like a real mistake...
I can see why the model would link this ticket to the article, but OK
Ticket: Help setting up monitors
Model says: Choosing the correct monitor cable
Data says: No article
Ticket: My machine certificate expired and I cannot connect google
Model says: Installing certificates [a very detailed article]
Data says: No article
This seems like an error in our data
We definitely have an under-labeling problem
Ideation and Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Data Acquisition
Gather or create data, verify usability and access
Data Exploration
Understand the data and domain, create features and response
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model Productionization
Deploy automated pipeline and A/B test model
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
ML Project Lifecycle
Data Acquisition
Gather or create data, verify usability and access
2,000 total tickets
Positives: 800 (mix of solved and reasonable suggestions with coverage across articles)
Negatives: 1,200
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model                                 Precision    Recall
Dual encoder, all articles            80%          21%
TF-IDF classification, top 100        80%          34%
Embeddings + CNN, top 100             80%          29%
LM no fine-tuning, top 100            80%          39%
LM with fine-tuning, top 100          80%          43%
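Every row in the comparison above reports recall at the same 80% precision, which suggests each model's confidence threshold was tuned to a fixed precision target. A minimal sketch of that kind of evaluation follows; it is an assumption about the procedure, not the talk's code. Each prediction is a (score, is_correct) pair for the model's top suggestion, total_positives is the number of golden-set tickets that have a solving article, and the sample scores are made up.

```python
def recall_at_precision(predictions, total_positives, target_precision):
    """Return (threshold, precision, recall) with the highest recall whose
    precision meets the target, or None if no threshold qualifies."""
    ordered = sorted(predictions, key=lambda p: p[0], reverse=True)
    best = None
    suggested = 0
    correct = 0
    for i, (score, is_correct) in enumerate(ordered):
        suggested += 1
        correct += int(is_correct)
        # Only evaluate once all predictions sharing this score are included.
        if i + 1 < len(ordered) and ordered[i + 1][0] == score:
            continue
        precision = correct / suggested
        if precision >= target_precision:
            # Recall never decreases as the threshold drops, so keep the latest hit.
            best = (score, precision, correct / total_positives)
    return best


# Made-up scores: suggest an article only when score >= threshold.
predictions = [
    (0.95, True), (0.90, True), (0.85, True), (0.80, False),
    (0.70, True), (0.60, False), (0.50, False),
]
result = recall_at_precision(predictions, total_positives=8, target_precision=0.8)
print(result)
```

Comparing models at a shared precision keeps the user-facing error rate constant, so differences in recall translate directly into differences in how many tickets get a useful suggestion.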
Considerations for model productionization
Human-in-the-loop evaluation for initial A/B testing
Do production results line up with prototyping validation?
Yes! Extensive validation and golden set definitely helped with making this smooth
Model Productionization
Deploy automated pipeline and A/B test model
Monitor data drift
Are the features changing? (do tickets look materially different over time?)
Are the class labels shifting? (are more articles being added or changed?)
Monitor model drift
How is retraining the model affecting the P/R curve?
Is performance degrading significantly?
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
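The class-label check above can be sketched by comparing the label distribution of a reference window with a recent one. The example below uses total variation distance and also reports labels that appear only in the recent window, such as newly added articles. The choice of metric, the function names, and the sample labels are assumptions, and both windows are assumed to be non-empty.

```python
from collections import Counter


def label_shares(labels):
    # Fraction of tickets assigned to each class label.
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def label_drift(reference_labels, current_labels):
    ref = label_shares(reference_labels)
    cur = label_shares(current_labels)
    all_labels = set(ref) | set(cur)
    # Total variation distance: 0 means identical distributions, 1 means disjoint.
    distance = 0.5 * sum(abs(ref.get(l, 0.0) - cur.get(l, 0.0)) for l in all_labels)
    new_labels = set(cur) - set(ref)
    return distance, new_labels


# Hypothetical label windows, e.g. last quarter versus this month.
reference = ["No Article"] * 6 + ["ssh"] * 3 + ["vpn"] * 1
current = ["No Article"] * 5 + ["ssh"] * 2 + ["vpn"] * 1 + ["new-article"] * 2
distance, new_labels = label_drift(reference, current)
print(distance, new_labels)
```

A distance above an agreed threshold, or any new label, would be a prompt to retrain and re-check the P/R curve.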
Takeaways and other considerations
However, always be asking yourself questions and be ready to iterate!
Have a structured approach to your machine learning projects
Takeaways
Leverage the right tools for the job in each stage -- notebooks -> experiment managers -> full production framework
Make sure to start with impactful problems that are good fits for machine learning
Thank You!

More Related Content

Similar to Strategic AI Integration in Engineering Teams

Using the power of OpenAI with your own data: what's possible and how to start?
Using the power of OpenAI with your own data: what's possible and how to start?Using the power of OpenAI with your own data: what's possible and how to start?
Using the power of OpenAI with your own data: what's possible and how to start?
Maxim Salnikov
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
Justin Basilico
 
Creating a successful continuous testing environment by Eran Kinsbruner
Creating a successful continuous testing environment by Eran KinsbrunerCreating a successful continuous testing environment by Eran Kinsbruner
Creating a successful continuous testing environment by Eran Kinsbruner
QA or the Highway
 
ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)
ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)
ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)
Leslie McFarlin
 
Analytics in the Cloud
Analytics in the CloudAnalytics in the Cloud
Analytics in the Cloud
Tejaswi (Tej) Redkar
 
AI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in MichelangeloAI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in Michelangelo
Alluxio, Inc.
 
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
gdgsurrey
 
Machine intelligence data science methodology 060420
Machine intelligence data science methodology 060420Machine intelligence data science methodology 060420
Machine intelligence data science methodology 060420
Jeremy Lehman
 
AI for Customer Service - How to Improve Contact Center Efficiency with Machi...
AI for Customer Service - How to Improve Contact Center Efficiency with Machi...AI for Customer Service - How to Improve Contact Center Efficiency with Machi...
AI for Customer Service - How to Improve Contact Center Efficiency with Machi...
Skyl.ai
 
Practical Explainable AI: How to build trustworthy, transparent and unbiased ...
Practical Explainable AI: How to build trustworthy, transparent and unbiased ...Practical Explainable AI: How to build trustworthy, transparent and unbiased ...
Practical Explainable AI: How to build trustworthy, transparent and unbiased ...
Raheel Ahmad
 
Managing machine learning
Managing machine learningManaging machine learning
Managing machine learning
David Murgatroyd
 
1 Ads
1 Ads1 Ads
1 Ads
lcbj
 
ML Framework for auto-responding to customer support queries
ML Framework for auto-responding to customer support queriesML Framework for auto-responding to customer support queries
ML Framework for auto-responding to customer support queries
Varun Nathan
 
Mit104 software engineering
Mit104  software engineeringMit104  software engineering
Mit104 software engineering
smumbahelp
 
UCD overview
UCD overviewUCD overview
UCD overview
Ravi Shyam
 
Maxdiff webinar_10_19_10
 Maxdiff webinar_10_19_10 Maxdiff webinar_10_19_10
Maxdiff webinar_10_19_10
QuestionPro
 
2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx
gdgsurrey
 
SwathiMabbu_BA
SwathiMabbu_BASwathiMabbu_BA
SwathiMabbu_BA
Swathi Mabbu
 
Keeping Product Backlog Healthy
Keeping Product Backlog HealthyKeeping Product Backlog Healthy
Keeping Product Backlog Healthy
Dhaval Panchal
 
Managing Data Science Projects
Managing Data Science ProjectsManaging Data Science Projects
Managing Data Science Projects
Danielle Dean
 

Similar to Strategic AI Integration in Engineering Teams (20)

Using the power of OpenAI with your own data: what's possible and how to start?
Using the power of OpenAI with your own data: what's possible and how to start?Using the power of OpenAI with your own data: what's possible and how to start?
Using the power of OpenAI with your own data: what's possible and how to start?
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
 
Creating a successful continuous testing environment by Eran Kinsbruner
Creating a successful continuous testing environment by Eran KinsbrunerCreating a successful continuous testing environment by Eran Kinsbruner
Creating a successful continuous testing environment by Eran Kinsbruner
 
ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)
ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)
ML Times: Mainframe Machine Learning Initiative- June newsletter (2018)
 
Analytics in the Cloud
Analytics in the CloudAnalytics in the Cloud
Analytics in the Cloud
 
AI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in MichelangeloAI/ML Infra Meetup | ML explainability in Michelangelo
AI/ML Infra Meetup | ML explainability in Michelangelo
 
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5Certification Study Group - NLP & Recommendation Systems on GCP Session 5
Certification Study Group - NLP & Recommendation Systems on GCP Session 5
 
Machine intelligence data science methodology 060420
Machine intelligence data science methodology 060420Machine intelligence data science methodology 060420
Machine intelligence data science methodology 060420
 
AI for Customer Service - How to Improve Contact Center Efficiency with Machi...
AI for Customer Service - How to Improve Contact Center Efficiency with Machi...AI for Customer Service - How to Improve Contact Center Efficiency with Machi...
AI for Customer Service - How to Improve Contact Center Efficiency with Machi...
 
Practical Explainable AI: How to build trustworthy, transparent and unbiased ...
Practical Explainable AI: How to build trustworthy, transparent and unbiased ...Practical Explainable AI: How to build trustworthy, transparent and unbiased ...
Practical Explainable AI: How to build trustworthy, transparent and unbiased ...
 
Managing machine learning
Managing machine learningManaging machine learning
Managing machine learning
 
1 Ads
1 Ads1 Ads
1 Ads
 
ML Framework for auto-responding to customer support queries
ML Framework for auto-responding to customer support queriesML Framework for auto-responding to customer support queries
ML Framework for auto-responding to customer support queries
 
Mit104 software engineering
Mit104  software engineeringMit104  software engineering
Mit104 software engineering
 
UCD overview
UCD overviewUCD overview
UCD overview
 
Maxdiff webinar_10_19_10
 Maxdiff webinar_10_19_10 Maxdiff webinar_10_19_10
Maxdiff webinar_10_19_10
 
2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx
 
SwathiMabbu_BA
SwathiMabbu_BASwathiMabbu_BA
SwathiMabbu_BA
 
Keeping Product Backlog Healthy
Keeping Product Backlog HealthyKeeping Product Backlog Healthy
Keeping Product Backlog Healthy
 
Managing Data Science Projects
Managing Data Science ProjectsManaging Data Science Projects
Managing Data Science Projects
 

More from UXDXConf

Building Design Systems that Work for Design and Development
Building Design Systems that Work for Design and DevelopmentBuilding Design Systems that Work for Design and Development
Building Design Systems that Work for Design and Development
UXDXConf
 
Design-Driven Leadership: Transforming Organizations through Creative Thinking
Design-Driven Leadership: Transforming Organizations through Creative ThinkingDesign-Driven Leadership: Transforming Organizations through Creative Thinking
Design-Driven Leadership: Transforming Organizations through Creative Thinking
UXDXConf
 
Improving Product Design with Futurism at ORACLE
Improving Product Design with Futurism at ORACLEImproving Product Design with Futurism at ORACLE
Improving Product Design with Futurism at ORACLE
UXDXConf
 
Motion for AI: Creating Empathy in Technology
Motion for AI: Creating Empathy in TechnologyMotion for AI: Creating Empathy in Technology
Motion for AI: Creating Empathy in Technology
UXDXConf
 
Transforming The New York Times: Empowering Evolution through UX
Transforming The New York Times: Empowering Evolution through UXTransforming The New York Times: Empowering Evolution through UX
Transforming The New York Times: Empowering Evolution through UX
UXDXConf
 
Connecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAKConnecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAK
UXDXConf
 
Server-Driven User Interface (SDUI) at Priceline
Server-Driven User Interface (SDUI) at PricelineServer-Driven User Interface (SDUI) at Priceline
Server-Driven User Interface (SDUI) at Priceline
UXDXConf
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System Strategy
UXDXConf
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
UXDXConf
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
UXDXConf
 
Improving UX Research Quality with Cross-Department Collaboration
Improving UX Research Quality with Cross-Department CollaborationImproving UX Research Quality with Cross-Department Collaboration
Improving UX Research Quality with Cross-Department Collaboration
UXDXConf
 
The UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, OcadoThe UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, Ocado
UXDXConf
 
We're Agile. So why haven't our outcomes improved?
We're Agile. So why haven't our outcomes improved?We're Agile. So why haven't our outcomes improved?
We're Agile. So why haven't our outcomes improved?
UXDXConf
 
Breaking Silos_The Shift from a Software Delivery to a Product Development Mi...
Breaking Silos_The Shift from a Software Delivery to a Product Development Mi...Breaking Silos_The Shift from a Software Delivery to a Product Development Mi...
Breaking Silos_The Shift from a Software Delivery to a Product Development Mi...
UXDXConf
 
How Intercom built ‘Fin’, a GPT-4 powered chatbot_Fergal Reid_UXDX_EMEA_2023
How Intercom built ‘Fin’, a GPT-4 powered chatbot_Fergal Reid_UXDX_EMEA_2023How Intercom built ‘Fin’, a GPT-4 powered chatbot_Fergal Reid_UXDX_EMEA_2023
How Intercom built ‘Fin’, a GPT-4 powered chatbot_Fergal Reid_UXDX_EMEA_2023
UXDXConf
 
Leveling Up Design Maturity in a Large-Scale Organisation_ Daniel Heaslip_ U...
Leveling Up  Design Maturity in a Large-Scale Organisation_ Daniel Heaslip_ U...Leveling Up  Design Maturity in a Large-Scale Organisation_ Daniel Heaslip_ U...
Leveling Up Design Maturity in a Large-Scale Organisation_ Daniel Heaslip_ U...
UXDXConf
 
Continuous-Research_Mike Brown_UXDX_ EMEA_2023
Continuous-Research_Mike Brown_UXDX_ EMEA_2023Continuous-Research_Mike Brown_UXDX_ EMEA_2023
Continuous-Research_Mike Brown_UXDX_ EMEA_2023
UXDXConf
 
Crafting Digital Products for Connected Appliances and Other Stories_ Alexis ...
Crafting Digital Products for Connected Appliances and Other Stories_ Alexis ...Crafting Digital Products for Connected Appliances and Other Stories_ Alexis ...
Crafting Digital Products for Connected Appliances and Other Stories_ Alexis ...
UXDXConf
 
Integrating AI _King's journey of Technology Transformation_Steven Collins_ U...
Integrating AI _King's journey of Technology Transformation_Steven Collins_ U...Integrating AI _King's journey of Technology Transformation_Steven Collins_ U...
Integrating AI _King's journey of Technology Transformation_Steven Collins_ U...
UXDXConf
 
Seamless UX: Invisible Transactions_Sudev Balakrishan_UXDX_EMEA_2023
Seamless UX: Invisible Transactions_Sudev Balakrishan_UXDX_EMEA_2023Seamless UX: Invisible Transactions_Sudev Balakrishan_UXDX_EMEA_2023
Seamless UX: Invisible Transactions_Sudev Balakrishan_UXDX_EMEA_2023
UXDXConf
 

More from UXDXConf (20)

Building Design Systems that Work for Design and Development
Building Design Systems that Work for Design and DevelopmentBuilding Design Systems that Work for Design and Development
Building Design Systems that Work for Design and Development
 
Design-Driven Leadership: Transforming Organizations through Creative Thinking
Design-Driven Leadership: Transforming Organizations through Creative ThinkingDesign-Driven Leadership: Transforming Organizations through Creative Thinking
Design-Driven Leadership: Transforming Organizations through Creative Thinking
 
Improving Product Design with Futurism at ORACLE
Improving Product Design with Futurism at ORACLEImproving Product Design with Futurism at ORACLE
Improving Product Design with Futurism at ORACLE
 
Motion for AI: Creating Empathy in Technology
Motion for AI: Creating Empathy in TechnologyMotion for AI: Creating Empathy in Technology
Motion for AI: Creating Empathy in Technology
 
Transforming The New York Times: Empowering Evolution through UX
Transforming The New York Times: Empowering Evolution through UXTransforming The New York Times: Empowering Evolution through UX
Transforming The New York Times: Empowering Evolution through UX
 
Connecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAKConnecting the Dots in Product Design at KAYAK
Connecting the Dots in Product Design at KAYAK
 
Server-Driven User Interface (SDUI) at Priceline
Server-Driven User Interface (SDUI) at PricelineServer-Driven User Interface (SDUI) at Priceline
Server-Driven User Interface (SDUI) at Priceline
 
A Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System StrategyA Business-Centric Approach to Design System Strategy
A Business-Centric Approach to Design System Strategy
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
 
Designing for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at ComcastDesigning for Hardware Accessibility at Comcast
Designing for Hardware Accessibility at Comcast
 
Improving UX Research Quality with Cross-Department Collaboration
Improving UX Research Quality with Cross-Department CollaborationImproving UX Research Quality with Cross-Department Collaboration
Improving UX Research Quality with Cross-Department Collaboration
 
The UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, OcadoThe UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, Ocado
 
We're Agile. So why haven't our outcomes improved?
We're Agile. So why haven't our outcomes improved?We're Agile. So why haven't our outcomes improved?
We're Agile. So why haven't our outcomes improved?
 
Breaking Silos_The Shift from a Software Delivery to a Product Development Mi...
Breaking Silos_The Shift from a Software Delivery to a Product Development Mi...Breaking Silos_The Shift from a Software Delivery to a Product Development Mi...
Breaking Silos_The Shift from a Software Delivery to a Product Development Mi...
 
How Intercom built ‘Fin’, a GPT-4 powered chatbot_Fergal Reid_UXDX_EMEA_2023
How Intercom built ‘Fin’, a GPT-4 powered chatbot_Fergal Reid_UXDX_EMEA_2023How Intercom built ‘Fin’, a GPT-4 powered chatbot_Fergal Reid_UXDX_EMEA_2023
How Intercom built ‘Fin’, a GPT-4 powered chatbot_Fergal Reid_UXDX_EMEA_2023
 
Leveling Up Design Maturity in a Large-Scale Organisation_ Daniel Heaslip_ U...
Leveling Up  Design Maturity in a Large-Scale Organisation_ Daniel Heaslip_ U...Leveling Up  Design Maturity in a Large-Scale Organisation_ Daniel Heaslip_ U...
Leveling Up Design Maturity in a Large-Scale Organisation_ Daniel Heaslip_ U...
 
Continuous-Research_Mike Brown_UXDX_ EMEA_2023
Continuous-Research_Mike Brown_UXDX_ EMEA_2023Continuous-Research_Mike Brown_UXDX_ EMEA_2023
Continuous-Research_Mike Brown_UXDX_ EMEA_2023
 
Crafting Digital Products for Connected Appliances and Other Stories_ Alexis ...
Crafting Digital Products for Connected Appliances and Other Stories_ Alexis ...Crafting Digital Products for Connected Appliances and Other Stories_ Alexis ...
Crafting Digital Products for Connected Appliances and Other Stories_ Alexis ...
 
Integrating AI _King's journey of Technology Transformation_Steven Collins_ U...
Integrating AI _King's journey of Technology Transformation_Steven Collins_ U...Integrating AI _King's journey of Technology Transformation_Steven Collins_ U...
Integrating AI _King's journey of Technology Transformation_Steven Collins_ U...
 
Seamless UX: Invisible Transactions_Sudev Balakrishan_UXDX_EMEA_2023
Seamless UX: Invisible Transactions_Sudev Balakrishan_UXDX_EMEA_2023Seamless UX: Invisible Transactions_Sudev Balakrishan_UXDX_EMEA_2023
Seamless UX: Invisible Transactions_Sudev Balakrishan_UXDX_EMEA_2023
 

Recently uploaded

Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 

Strategic AI Integration in Engineering Teams

  • 1. Keyvan Azami Enterprise AI, Google ML for the Enterprise UXDX Conference 2024
  • 2. Applied AI research as enterprise competitive advantage Our Mission
  • 3. Case Study Problem Statement. Google's IT landscape: billions of users globally using products and services; millions of support inquiries raised each year; thousands of help articles. Self Service Automation goal: identify critical self-help offerings and make them intuitive, fast, accessible, and effective. Our goal with this project: surface relevant support articles directly to internal users when they open a ticket so that they can solve their problems without hands-on support. Impact: thousands of hours saved in Googler wait time; many issues solved on first touch.
  • 4. ML Project Lifecycle. Ideation and Prioritization: identify problems, estimate impact, assess feasibility, prioritize. Data Acquisition: gather or create data, verify usability and access. Data Exploration: understand the data and domain, create features and response. Model Prototyping: iterate on hypotheses, validate results against model and business metrics. Model Productionization: deploy automated pipeline and A/B test model. Monitoring and Maintenance: monitor data and model drift, report performance and impact.
  • 5. Ideation & Prioritization: identify problems, estimate impact, assess feasibility, prioritize. Surface relevant support articles directly to users when they open a ticket so that they can solve their problems without hands-on support. What's the business impact? Tens of thousands of hours of efficiency gains. Is ML a good solution for the problem? Yes: lots of data, automated decisions, natural language data, and a retrieval problem (Q/A). Can we measure success for solving the problem? Yes, because we are able to tie model metrics to estimated business impact. Projects (impact / complexity): Article Suggestion (High / Medium); Tech Support Trends (High / High); Anomaly Detection of Machine Logs (Medium / High).
  • 6. Data Acquisition: gather or create data, verify usability and access. What data do we have? Millions of historical resolved support tickets (title, description, location); 10% of those tickets are resolved with a link to a help article; thousands of articles (title and description) in our knowledge base. What data would we want? Validated responses from users on whether or not the linked article solved the ticket. Example ticket: "Can't remote into my machine": "I can't remote connect into my desktop, which is turned on, and I've checked the connection and it doesn't seem to be the problem," Example article: "Common SSH connection issues": "My network connection is poor and ruining my SSH session. Changing the following settings may help. Run: ..." 3 hour wait.
  • 7. Data Exploration: understand the data and domain, create features and response. Features (ticket): all tickets have titles and most have descriptions; 10% of those tickets are resolved with a link; some of the tickets have templates; some of the tickets are machine generated. Features (article): articles have meaningful titles and descriptions; there is a long tail of articles that solve tickets. Response: 90% of tickets in our data don't have a self-service solution.
  • 8. Model Prototyping: iterate on hypotheses, validate results against model and business metrics. Start with a simple model: TF-IDF for both ticket and article. Then move to embeddings and then contextual embeddings. Precision 40%, Recall 15%. Questions to ask: Does my model not perform well enough but I have significant room for experimentation? Does my model not perform well enough because of my features? Have I framed the problem incorrectly?
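A minimal sketch of the "TF-IDF for both ticket and article" starting point on slide 8, written in plain Python so it runs without extra libraries. It is an illustration under assumptions, not the speaker's implementation: the tokenizer, the smoothed IDF formula, the cosine-similarity ranking, and the names `fit_idf`, `tfidf`, and `rank_articles` are all hypothetical choices. The sample texts are shortened from the examples quoted in the slides.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokenizer (an assumed, deliberately simple choice).
    return re.findall(r"[a-z0-9']+", text.lower())


def fit_idf(documents):
    # Smoothed inverse document frequency computed over the article corpus.
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(text, idf):
    # Sparse TF-IDF vector; terms unseen in the corpus are dropped.
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_articles(ticket, articles):
    # Returns (article_index, similarity) pairs, best match first.
    idf = fit_idf(articles)
    ticket_vec = tfidf(ticket, idf)
    scored = [(i, cosine(ticket_vec, tfidf(a, idf))) for i, a in enumerate(articles)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Texts shortened from the slide examples (title plus description).
ARTICLES = [
    "Common SSH connection issues My network connection is poor and ruining my SSH session",
    "Choosing the correct monitor cable",
    "Installing certificates",
]
TICKET = (
    "Can't remote into my machine I can't remote connect into my desktop, "
    "which is turned on, and I've checked the connection and it doesn't "
    "seem to be the problem"
)

for index, score in rank_articles(TICKET, ARTICLES):
    print(f"{score:.3f}  {ARTICLES[index][:40]}")
```

With no stop-word handling, common words such as "my" and "the" contribute to the score, which is one reason a baseline like this tends to be a first iteration rather than a final model.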
  • 9. Wait... I thought this was supposed to move linearly?
  • 10. ML Project Lifecycle. Ideation and Prioritization: identify problems, estimate impact, assess feasibility, prioritize. Data Acquisition: gather or create data, verify usability and access. Data Exploration: understand the data and domain, create features and response. Model Prototyping: iterate on hypotheses, validate results against model and business metrics. Model Productionization: deploy automated pipeline and A/B test model. Monitoring and Maintenance: monitor data and model drift, report performance and impact.
  • 11. Ideation & Prioritization: identify problems, estimate impact, assess feasibility, prioritize. Really focus on the impact we are trying to achieve. Can we simplify the problem? Do we really need to suggest from that entire long tail of articles? Focus on returning the right article only when there's an article to return (precision). Reframe as classification and reduce the number of classes that we predict.
  • 12. Model Prototyping: iterate on hypotheses, validate results against model and business metrics. Start with a simple model: TF-IDF for the tickets, both summary and description (articles are now the classes, so no encoding). Then move to embeddings and then contextual embeddings. Diagram: Ticket embedding -> Article 1 ... Article k, No Article. Precision 50%, Recall 25%. Questions to ask: Does my model not perform well enough but I have significant room for experimentation? Does my model not perform well enough because of my features?
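The reframing on slides 11 and 12 (articles become the classes, the class count is reduced, and a "No Article" class is added) can be sketched as a relabeling step. This is a hedged illustration only: keeping the k most frequently linked articles and folding every other ticket into the "No Article" class is an assumption about how the class count was reduced, and `NO_ARTICLE`, `build_label_space`, and `to_class_label` are hypothetical names.

```python
from collections import Counter

NO_ARTICLE = "NO_ARTICLE"  # hypothetical name for the slide's "No Article" class


def build_label_space(linked_articles, k):
    # Keep the k most frequently linked articles as classes.
    # Tickets resolved without an article link are represented as None.
    counts = Counter(a for a in linked_articles if a is not None)
    return {article for article, _ in counts.most_common(k)}


def to_class_label(linked_article, kept_articles):
    # Assumption: unlinked tickets and long-tail articles both map to NO_ARTICLE.
    return linked_article if linked_article in kept_articles else NO_ARTICLE


# Toy history of resolved tickets: the article each one was linked to, if any.
history = ["ssh", "ssh", "certs", None, "cable", "ssh", "certs", None]
kept = build_label_space(history, k=2)
labels = [to_class_label(article, kept) for article in history]
print(sorted(kept))
print(labels)
```

The trade-off in this sketch is that tickets solved by a long-tail article are treated as having no suggestion, which gives up some recall in exchange for a smaller, better-supported set of classes.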
  • 13. We're still short of our goals. Time to debug with a quick demo to our partner team.
  • 14. Ticket: "Out of Office Notifier showing with exception". Model says: "Download a new version". Data says: No article. OK, that seems like a real mistake...
  • 15. Ticket: "Help setting up monitors". Model says: "Choosing the correct monitor cable". Data says: No article. I can see why the model would link this ticket to the article, but OK.
  • 16. Ticket: "My machine certificate expired and I cannot connect google". Model says: "Installing certificates" [a very detailed article]. Data says: No article. This seems like an error in our data. We definitely have an under-labeling problem.
  • 17. ML Project Lifecycle. Ideation and Prioritization: identify problems, estimate impact, assess feasibility, prioritize. Data Acquisition: gather or create data, verify usability and access. Data Exploration: understand the data and domain, create features and response. Model Prototyping: iterate on hypotheses, validate results against model and business metrics. Model Productionization: deploy automated pipeline and A/B test model. Monitoring and Maintenance: monitor data and model drift, report performance and impact.
  • 18. Data Acquisition: gather or create data, verify usability and access. 2,000 total tickets. Positives: 800 (mix of solved and reasonable suggestions with coverage across articles). Negatives: 1,200.
  • 19. Model Prototyping: iterate on hypotheses, validate results against model and business metrics. Model results (precision / recall): Dual encoder, all articles: 80% / 21%; TF-IDF classification, top 100: 80% / 34%; Embeddings + CNN, top 100: 80% / 29%; LM no fine-tuning, top 100: 80% / 39%; LM with fine-tuning, top 100: 80% / 43%.
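Every row on slide 19 reports the same 80% precision with a different recall, which is consistent with comparing models at a fixed precision target. The slides do not say how the operating point was chosen, so the following is a sketch under that assumption: each ticket in a labeled validation set has a confidence for the model's top suggestion and a flag saying whether that suggestion was correct, and the model abstains below a threshold. The function name `recall_at_precision` and the toy numbers are hypothetical.

```python
def recall_at_precision(predictions, n_solvable, target_precision):
    """Pick the confidence threshold with the best recall at or above a precision target.

    predictions: list of (confidence, is_correct) for each ticket's top suggestion.
    n_solvable: number of tickets that truly have a solving article.
    Returns (threshold, precision, recall), or None if the target is unreachable.
    """
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    best = None
    correct = 0
    for shown, (confidence, is_correct) in enumerate(ranked, start=1):
        correct += int(is_correct)
        # Evaluate only after the last prediction sharing this confidence value,
        # since a threshold cannot separate tied scores.
        if shown < len(ranked) and ranked[shown][0] == confidence:
            continue
        precision = correct / shown        # correct suggestions / suggestions shown
        recall = correct / n_solvable      # correct suggestions / solvable tickets
        if precision >= target_precision and (best is None or recall > best[2]):
            best = (confidence, precision, recall)
    return best


# Toy validation set: (confidence, suggestion was correct).
toy = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, True), (0.4, False)]
print(recall_at_precision(toy, n_solvable=5, target_precision=0.8))
```

Fixing precision and comparing recall makes models directly comparable on how many tickets they can help with at a given level of suggestion quality.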
  • 20. Model Productionization: deploy automated pipeline and A/B test model. Considerations for model productionization: human-in-the-loop evaluation for initial A/B testing. Do production results line up with prototyping validation? Yes! Extensive validation and golden set definitely helped with making this smooth.
  • 21. Monitoring and Maintenance: monitor data and model drift, report performance and impact. Monitor data drift: Are the features changing (do tickets look materially different over time)? Are the class labels shifting (are more articles being added or changed)? Monitor model drift: How is retraining the model affecting the P/R curve? Is performance degrading significantly?
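One simple way to approach the "are the class labels shifting?" question on slide 21 is to compare the distribution of linked articles in a reference window against a recent window. The sketch below uses total variation distance for that comparison; the metric, the window contents, and the alert threshold are assumptions chosen for illustration, not something the slides specify.

```python
from collections import Counter


def distribution(items):
    # Relative frequency of each label in a window of tickets.
    if not items:
        raise ValueError("window must contain at least one item")
    counts = Counter(items)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def total_variation(reference, current):
    # Half the L1 distance between two label distributions, in [0, 1].
    p = distribution(reference)
    q = distribution(current)
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in labels)


ALERT_THRESHOLD = 0.2  # hypothetical value; tune against historical windows

reference_window = ["ssh"] * 5 + ["certs"] * 5
recent_window = ["ssh"] * 5 + ["vpn"] * 5  # a new article starts absorbing tickets

drift = total_variation(reference_window, recent_window)
print(f"label drift = {drift:.2f}, alert = {drift > ALERT_THRESHOLD}")
```

The same comparison could be applied to binned feature values (for example ticket length or template usage) to watch for feature drift.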
  • 22. Takeaways and other considerations
  • 23. Takeaways: Have a structured approach to your machine learning projects. However, always be asking yourself questions and be ready to iterate! Make sure to start with impactful problems that are good fits for machine learning. Leverage the right tools for the job in each stage: notebooks -> experiment managers -> full production framework.