The document discusses ML1, an AI-powered Jira plugin created by Exadel that uses machine learning to predict field values in Jira tickets. It describes how ML1 was developed to help Exadel's technical support team automate ticket assignment and reduce processing time. Rather than using a single machine learning model for all users, ML1 takes an approach of training and using a separate model for each individual user or group. This allows the models to better reflect the specific data and needs of each user. The document also discusses considerations for monitoring and improving a multi-model machine learning system over time.
3. Agenda
#exadelML1
1. About Me & Exadel
2. About ML1
3. Multi-User ML Solutions
4. ML Pipeline
5. Monitoring & Improvement
6. Implementation
4. About Me
Siamion Karasik
● An ML engineer at Exadel
● The leader of the Exadel Python community and an active member of the Exadel AI & DS communities
● Interested in NLP, problem-solving, and writing
6. Exadel is a software engineering company that delivers the digital platforms,
products, and applications our clients need to run and grow their businesses.
7. Exadel at a Glance
● Established in 1998
● ISO 27001 certified
● 23 offices in the USA, Europe, and Asia
● 25+ solutions
● 20+ open-source projects
● 1200+ engineers
8. Artificial Intelligence
#ML1
The Exadel AI Practice examines existing products and processes to discover how modern AI/ML solutions can add value, and then brings those solutions to life.
10. Technical Support at Exadel
[Diagram: ticket handling before and now, for a sample Jira ticket titled "GIT help" with the description "I have an issue with GIT"]
Before: Jira ticket → Support Engineer assigns the category (JC_Git_Management) → Support Engineer assigns the resolver
Now: Jira ticket → Auto-Assignment Plugin assigns the category (JC_Git_Management) and the resolver → Support Engineer
12. About ML1
ML1 is an AI-powered Jira plug-in that predicts field values in issues/tickets
● Predicting Values: users can select any field to predict with their ML model
● Training Schedule: training can be set to run on a schedule, so accuracy improves automatically
● Training Report: users can get up-to-date information on the success of their model training
13. ML1 at Exadel
Our own Technical Support department uses ML1 to simplify the process of creating and processing Jira tickets. Here are just a few of the benefits that we’ve seen so far:
● Greatly Reduced Assignment Time: ML1 decreased the time needed to assign a task from 10 minutes to 10 seconds
● Saved Time for Our Employees: with around 10,000 tasks per year, ML1 saved our Technical Support team approximately 500 man-hours
● Saved Money on Labor Costs: even when the number of technical support tasks increased by 15%, we didn’t have to hire new technical support staff
16. One-model-for-all ML Solution
[Diagram: a single ML algorithm trains one ML model (with its metrics) on shared training data. The model predicts for User 1, User 2, and User 3, and feedback data flows back through a feedback loop to retrain it.]
17. Sometimes One-for-All Doesn’t Work
● IoT
● Legal restrictions
● Each client has a custom ML problem, as in the case of ML1
18. A-model-for-each ML Solution
[Diagram: the same ML algorithm is trained separately on the training data of User 1, User 2, and User 3, producing one ML model per user, each with its own metrics. Each model predicts only for its own user, and each user's feedback data flows through a separate feedback loop to retrain that user's model.]
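A minimal sketch of the a-model-for-each idea, assuming models are kept in an in-memory dictionary keyed by user. The names ModelRegistry and MajorityClassModel are hypothetical and are not taken from ML1; the toy model only stands in for a real classifier.

```python
from collections import Counter
from typing import Callable, Dict, List


class MajorityClassModel:
    """Toy stand-in for a real classifier: always predicts the most common label."""

    def __init__(self) -> None:
        self.label = None

    def fit(self, texts: List[str], labels: List[str]) -> None:
        self.label = Counter(labels).most_common(1)[0][0]

    def predict(self, text: str) -> str:
        if self.label is None:
            raise RuntimeError("model has not been trained")
        return self.label


class ModelRegistry:
    """Hypothetical registry that trains and serves one model per user."""

    def __init__(self, model_factory: Callable[[], MajorityClassModel]) -> None:
        self._factory = model_factory
        self._models: Dict[str, MajorityClassModel] = {}

    def train(self, user_id: str, texts: List[str], labels: List[str]) -> None:
        # Train a fresh model on this user's data only, then swap it in.
        model = self._factory()
        model.fit(texts, labels)
        self._models[user_id] = model

    def predict(self, user_id: str, text: str) -> str:
        if user_id not in self._models:
            raise KeyError(f"no trained model for user {user_id!r}")
        return self._models[user_id].predict(text)


registry = ModelRegistry(MajorityClassModel)
registry.train(
    "user_1",
    ["GIT help", "VPN is down", "git push fails"],
    ["GIT Support", "VPN Support", "GIT Support"],
)
registry.train("user_2", ["invoice is missing"], ["Billing"])

print(registry.predict("user_1", "I have an issue with GIT"))
print(registry.predict("user_2", "where is my invoice"))
```

Retraining from feedback data is then just another call to train for the same user, which leaves every other user's model untouched.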
20. Choosing the ML Pipeline
The task: multiclass text classification with 42 unbalanced classes and ~2500 samples
Experimented with:
● TfidfVectorizer, Word2Vec, TruncatedSVD
● Linear models (Logistic Regression, SVM)
● Tree-based models (Random Forest, Boosting)
21. Choosing the ML Pipeline
In the end, this simple pipeline works best on our Jira data:
Jira ticket ("GIT help": "I have an issue with GIT") → Concatenate Title + Description → TfidfVectorizer → Logistic Regression → Predicted Category ("GIT Support")
● TfidfVectorizer learns user-specific words
● Logistic Regression does not require many samples
● 70% accuracy
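The pipeline above can be sketched with scikit-learn as follows. This is an illustration only: the sample tickets, the "VPN Support" category, the concat_ticket helper, and the default hyperparameters are assumptions, not the settings used in ML1.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def concat_ticket(title: str, description: str) -> str:
    """Concatenate Title + Description into a single text (hypothetical helper)."""
    return f"{title} {description}"


# Made-up training tickets: (title, description, category).
tickets = [
    ("GIT help", "I have an issue with GIT", "GIT Support"),
    ("Cannot push", "git push to the repository is rejected", "GIT Support"),
    ("Merge conflict", "git merge fails with conflict", "GIT Support"),
    ("VPN access", "VPN does not connect from home", "VPN Support"),
    ("VPN account", "please create vpn account", "VPN Support"),
    ("VPN timeout", "vpn connection drops after timeout", "VPN Support"),
]
texts = [concat_ticket(title, description) for title, description, _ in tickets]
labels = [category for _, _, category in tickets]

# TfidfVectorizer builds its vocabulary from the user's own tickets;
# Logistic Regression then classifies the resulting sparse vectors.
pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
pipeline.fit(texts, labels)

predicted = pipeline.predict([concat_ticket("GIT question", "git push fails")])[0]
print(predicted)
```

Because the vectorizer is fitted inside the pipeline, each user's model gets its own vocabulary, which is how the pipeline can pick up user-specific words.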
22. Training with Unknown Data
● With ML1, training data is provided by users at runtime
● We do not have control over training data set size and quality
● So the question is: will our pipeline work for others?
23. Walking in Someone Else’s Shoes
We tried another data set and experimented (GitHub)
● How much extra accuracy will we get with every 1k samples?
● How many features should we select?
24. Quantifying the “Shortage of Data”
[Figure: 4-fold validation (k=4). The data is split into four equal parts (0% to 100% in 25% steps); in each of Fold 1 to Fold 4 a different part serves as the testing set and the remaining parts form the training set.]
high std (cross-validation scores) ⇒ shortage of data
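One way to measure that spread is scikit-learn's cross_val_score with k=4. The sketch below assumes a TF-IDF plus Logistic Regression pipeline and a synthetic data set. The function name cv_score_spread is hypothetical, and what counts as a "high" standard deviation is left to the reader because the slides give no threshold.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline


def cv_score_spread(texts, labels, k=4):
    """Return (mean, std) of accuracy over k cross-validation folds."""
    pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    # Stratified folds keep the class proportions similar in every fold.
    cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    scores = cross_val_score(pipeline, texts, labels, cv=cv, scoring="accuracy")
    return float(scores.mean()), float(scores.std())


# Synthetic tickets for illustration only: 8 per class, 2 classes.
texts = [f"git issue number {i} with push and merge" for i in range(8)]
texts += [f"vpn issue number {i} with connection and timeout" for i in range(8)]
labels = ["GIT Support"] * 8 + ["VPN Support"] * 8

mean_acc, std_acc = cv_score_spread(texts, labels, k=4)
print(f"accuracy: {mean_acc:.2f} +/- {std_acc:.2f}")
```

A large standard deviation relative to the mean suggests that the score depends heavily on which samples land in the testing set, which is the symptom the slide associates with a shortage of data.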
25. Data Representation Score
What if there are many small classes?
● Rule of thumb in ML: there should be at least K samples per class
● representation_score = sum(k for k in Counter(y).values() if k >= K) / len(y)
● We can’t ensure a high-quality model if representation_score is low
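Example with K = 20: classes C1 (40 samples) and C2 (30 samples) reach the threshold, while C3, C4, and C5 (10 samples each) do not. So 70 of the 100 samples belong to well-represented classes and 30 do not: representation_score = 70%.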
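The formula from the slide can be wrapped in a small function and checked against the example. The function signature and the empty-input check are additions for illustration; the class sizes are the ones from the slide.

```python
from collections import Counter


def representation_score(y, K):
    """Share of samples that belong to classes with at least K samples."""
    if not y:
        raise ValueError("y must not be empty")
    return sum(k for k in Counter(y).values() if k >= K) / len(y)


# Class sizes from the slide: C1=40, C2=30, C3=C4=C5=10.
y = ["C1"] * 40 + ["C2"] * 30 + ["C3"] * 10 + ["C4"] * 10 + ["C5"] * 10

score = representation_score(y, K=20)
print(f"{score:.0%}")  # 70%
```

Only C1 and C2 pass the K = 20 threshold, so 70 of the 100 samples count toward the score.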
27. Monitoring & Improvement Questions
How do we monitor a multi-model ML solution?
● Accuracy
● Data drift
● Explainability
How do we improve the system overall?
● AutoML?
● Federated Learning?
29. How Does ML1 Work?
ML1 uses the historical data from any set of permissions to automatically predict the value of any field
30. ML1’s Server Under the Hood
A single Docker container
Solves multiclass and multilabel text classification
Accepts training data right from the client
Trains & serves a separate ML model for every target
32. Multimodel ML Server from Scratch
An article with code examples to help you write a multimodel server using software engineering best practices:
34. THANK YOU!
Want to know more about ML at Exadel? Connect to our Zoom session in 5 minutes (copy the link or scan the QR code):
https://tinyurl.com/ExadelML
CONTACT US: Siamion Karasik - ML Engineer - skarasik@exadel.com
36. How Do You Use ML1?
In just five simple steps, Jira administrators can have ML1 up and running:
Step 1: Install the ML1 Plugin for Jira in your organization
Step 2: Install the ML Server
Step 3: Enable field prediction in project settings and set configurations
Step 4: Train your model
Step 5: Autocomplete the selected Jira field