Automating Agile Relative Estimation in
Arabic Context
About Me
12+ Years of Experience in Software
Development
Development Manager & Co-Founder
@CodeZone
Blogger @ galaldev@blogspot.com
Interests
• SW Engineering
• Deep (Machine) Learning
• Data Science
• System Performance
The Estimation Story
Estimation Story
• The customer requests new requirements
• The whole team estimates each requirement in Story Points
• The Product Owner tells the customer the expected time
Estimation Story
Now the team is working hard to meet the
schedule
Then you get a lot of new requirements from
another customer
The customer asks: “Please give
me an initial estimation for these
requirements”
What can you do?
Estimation Story
Option 1: Gather the
team and do estimation.
• But your team has no
time to make a new estimation
Option 2: Hire a machine
to do the estimation
• It sounds good
Estimation Story
Input Data
• The text of each
requirement, written in
Arabic
Output
• Story points for each
requirement
The Model Architecture
Model Architecture
Text Preprocessing
Convert words to vectors
Train ConvNet
Evaluate The Model
Training Dataset: 828 User Stories
Testing Dataset: 100 User Stories
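The split between the two datasets can be sketched as a shuffled hold-out split. This is only an illustration under stated assumptions, not necessarily how the datasets in this talk were built: the `split_stories` helper, the fixed seed, and the placeholder backlog are all hypothetical.

```python
import random

def split_stories(stories, test_size=100, seed=42):
    """Shuffle labelled stories and hold out `test_size` of them for testing."""
    if test_size >= len(stories):
        raise ValueError("test_size must be smaller than the dataset")
    shuffled = list(stories)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    return shuffled[test_size:], shuffled[:test_size]

# Placeholder data: (story text, story points) pairs standing in for a real backlog.
backlog = [(f"story {i}", 1 + i % 5) for i in range(928)]
train_set, test_set = split_stories(backlog)
print(len(train_set), len(test_set))
```

Holding the testing stories out before training is what lets the later accuracy figure describe stories the model has never seen.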
Text Preprocessing
Unstructured text requires some
preprocessing
Preprocessing varies from English to Arabic
This can be achieved by several cleaning
methods, such as
• replacing special characters
• replacing punctuation marks
• removing diacritics
• removing duplicate characters
• removing stop-words
• removing numbers … etc.
• Stemming
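Several of the cleaning steps listed above can be sketched with regular expressions. This is a minimal sketch, not the pipeline used in the talk: the `clean_arabic` helper, the tiny stop-word list, and the exact normalization rules (for example mapping ta marbuta to ha) are assumptions, and stemming is left out.

```python
import re

# Hypothetical, deliberately tiny stop-word list ("في" and its variant "فى");
# a real list would be much longer.
STOP_WORDS = {"\u0641\u064A", "\u0641\u0649"}

HTML_TAGS = re.compile(r"<[^>]+>")
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")   # tashkeel marks and tatweel
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")      # punctuation, digits, Latin, symbols
REPEATS = re.compile(r"(.)\1{2,}")                  # one character repeated 3+ times

def clean_arabic(text):
    """Apply a few of the cleaning steps to one requirement text."""
    text = HTML_TAGS.sub(" ", text)       # strip HTML tags
    text = DIACRITICS.sub("", text)       # remove diacritics
    text = NON_ARABIC.sub(" ", text)      # drop special characters, punctuation, numbers
    text = REPEATS.sub(r"\1", text)       # collapse elongated characters
    text = text.replace("\u0629", "\u0647")           # ta marbuta -> ha (assumed rule)
    text = re.sub("[\u0623\u0625\u0622]", "\u0627", text)  # alef variants -> bare alef
    words = [w for w in text.split() if w not in STOP_WORDS]
    return " ".join(words)

print(clean_arabic("<p>تعديل باضافة العميل فى تعريف الاهداف 123</p>"))
```

The order matters: diacritics are removed before the broader character filter so that words are not split where a mark used to be.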
Text Preprocessing
Original Text
• تعديل باضافة العميل فى تعريف الاهداف <div dir=rtl><p>مطلوب اضافة تابة باسم العميل بحيث يتم اضافة عميل للاهداف</p></div>
Cleaned Text
• تعديل باضافه العميل تعريف الاهداف مطلوب اضافه تابه باسم العميل بحيث يتم اضافه عميل للاهداف
Words To Vectors
Text is unstructured data, so it must be
transformed from text space to vector space before it can
serve as input features for a classifier
Word2Vec is one popular method
It maps words with similar meanings to nearby vectors
Ex: “الموظف” (employee) is nearer to “العامل” (worker) than to “السيارة” (car)
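"Nearby" is commonly measured with cosine similarity. The sketch below illustrates the idea with made-up 3-dimensional vectors; real Word2Vec embeddings are learned from a corpus and have many more dimensions, and the numbers here are assumptions chosen only to show the comparison.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, NOT real Word2Vec output.
employee = [0.9, 0.8, 0.1]   # "الموظف"
worker = [0.8, 0.9, 0.2]     # "العامل"
car = [0.1, 0.2, 0.9]        # "السيارة"

print(cosine_similarity(employee, worker))
print(cosine_similarity(employee, car))
```

With these toy values the employee/worker pair scores much higher than the employee/car pair, which is the property the slide describes.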
Words To Vectors
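Each word is represented by a vector, and each User Story by a matrix of its word vectors. A minimal sketch of building that matrix follows; the `story_to_matrix` helper, the zero-padding to a fixed length, the zero vector for unknown words, and the toy embeddings are assumptions for illustration.

```python
def story_to_matrix(words, embeddings, max_len, dim):
    """Stack the word vectors of one story into a max_len x dim matrix."""
    zero = [0.0] * dim
    # Unknown words fall back to a zero vector (an assumed convention).
    rows = [list(embeddings.get(w, zero)) for w in words[:max_len]]
    # Pad short stories with zero rows so every story has the same shape.
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows

# Hypothetical 2-dimensional embeddings, not learned from any corpus.
embeddings = {"add": [0.1, 0.3], "customer": [0.7, 0.2]}
matrix = story_to_matrix(["add", "customer", "tab"], embeddings, max_len=4, dim=2)
print(matrix)
```

A fixed shape is convenient because the convolution layer that follows expects every input story to have the same dimensions.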
Train ConvNet
A Convolutional Neural Network (ConvNet) is a
neural network that can make use of the internal
structure of data, such as the 2D structure of an image
Some researchers apply ConvNets to text
classification, so that each unit in the convolution
layer responds to a small region of a document
There are four main steps in a ConvNet:
• Convolution
• Non-Linearity
• Pooling or Sub-Sampling
• Classification (Fully Connected Layer)
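The four steps can be sketched as a single forward pass over one story matrix. This is a simplified, untrained illustration in plain Python rather than the model from the talk: the function names, the hand-picked kernels and weights, and the use of ReLU, max pooling, and softmax are assumptions.

```python
import math

def conv1d(matrix, kernel, bias=0.0):
    """Convolution: slide a kernel spanning len(kernel) consecutive word vectors."""
    h = len(kernel)
    out = []
    for i in range(len(matrix) - h + 1):
        window = matrix[i:i + h]
        total = sum(w * k for row, krow in zip(window, kernel)
                    for w, k in zip(row, krow))
        out.append(total + bias)
    return out

def relu(values):
    """Non-linearity: keep positive activations, zero out the rest."""
    return [max(0.0, v) for v in values]

def max_pool(values):
    """Pooling: keep the strongest response over the whole story."""
    return max(values)

def softmax(scores):
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(matrix, kernels, weights):
    # One pooled feature per kernel.
    features = [max_pool(relu(conv1d(matrix, k))) for k in kernels]
    # Classification: a fully connected layer gives one score per class.
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return softmax(scores)

# Hypothetical inputs: a 4-word story with 2-dimensional word vectors,
# two kernels spanning 2 words each, and 3 story-point classes.
story = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernels = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, 0.0], [0.0, -1.0]]]
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
probs = forward(story, kernels, weights)
print(probs.index(max(probs)), probs)
```

In a trained model the kernels and weights are learned from the training dataset, and the class with the highest probability is taken as the predicted story points.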
Kim 2014 Model architecture
Model Evaluation
Test the model's accuracy
using a test dataset the model
has never seen before
We got 74% accuracy
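Accuracy here is the share of testing user stories whose predicted story points equal the team's original estimate, so 74% on 100 testing stories corresponds to 74 matching stories. A minimal sketch of the computation, with made-up labels that are not the talk's data:

```python
def accuracy(predicted, actual):
    """Fraction of stories whose predicted story points equal the team's estimate."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical story points for eight testing stories.
actual = [1, 2, 3, 5, 8, 3, 2, 1]
predicted = [1, 2, 3, 5, 5, 3, 1, 1]
print(accuracy(predicted, actual))  # 6 of 8 match -> 0.75
```

Exact-match accuracy treats a near miss (predicting 5 for an 8-point story) the same as a large one, which is worth keeping in mind when reading the figure.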
Model Evaluation

Automating Agile Relative Estimation in Arabic Context Using Deep Learning


Editor's Notes

  1. Welcome and good morning. In this talk, I will discuss a real problem we faced at CodeZone, and the solution we adopted using one of the most advanced methods of artificial intelligence, Deep Learning.
  2. I am Mohamed Galal. I have worked in Software Development for over 12 years, and have participated in the development of software serving clients in different fields. I also co-founded CodeZone 8 years ago, and am a Software Development Manager. I also created and maintain a blog that discusses technical topics. My blog is in Arabic, to enrich Arabic content.
  3. Now, let's start with the story of how the Estimation works.
  4. Initially, the client has some requirements and has asked our company to address and implement them. After receiving the requirements, the team will discuss them and estimate the effort by specifying Story Points for each requirement or user story. The Product Owner will then communicate to the client the anticipated deadline to implement these requirements.
  5. At that point, the team works hard to meet the deadline. However, during this period, a different client requests an initial estimate for their new requirements. What should we do?
  6. The first option is to gather all or some of the team to complete an estimation. However, this would require our team to stop working on the current sprint, which may lead to delay. The second option is to hire a machine to do estimation (sounds interesting).
  7. Now, we are going to develop a model that predicts the story points for new user stories from the text of those user stories.
  8. Now, let's talk about Model Architecture.
  9. To build any model, we have to train it on historical data, called the training set. CodeZone has a backlog containing about 1,000 User Stories. From it we built a Training Dataset of 828 User Stories and a Testing Dataset of 100 User Stories. The Training Dataset is a list of User Stories, each with the Story Points that the team previously estimated, so that the model can learn the relationship between a user story's text and its story points. The Testing Dataset also consists of User Stories with the Story Points that the team previously estimated, but the model doesn't "see" them during training. Instead, when evaluating the model, the story points it predicts are compared to the original story points to measure the model's quality. The stages are as follows. First, the text data is preprocessed. Then, the text is converted to vectors so it can be fed to the Convolutional Neural Network, one of the best-known neural networks used in Deep Learning. Finally, the last step is to evaluate the model.
  10. Usually, Text requires some preprocessing to be useful in the classification task. These methods differ from one language to another. For example, in Arabic, the Stopwords differ from the English language, and the methods of Stemming are different. These are some of the methods used to achieve this:
  11. Let's take, for example, a text before and after preprocessing. The original text contains HTML tags and the preposition "في", which were removed from the cleaned text.
  12. The model can understand numbers, but not text, so the text must be converted to numbers, or vectors. One of the most popular methods of doing this is Word2Vec. This method maps words that are close in meaning to vectors that are close together. For example, the vector of the word "employee" is closer to the vector of the word "worker" than to the vector of the word "car." This has a huge, positive impact on the classification process.
  13. Each word is represented by a Vector. Each User Story is represented by a matrix of its word Vectors, as follows:
  14. In this phase, we have the data ready for the training process. We chose to work with a type of neural network called a Convolutional Neural Network. This kind of network is used mostly with images, but some researchers have applied it to text classification, with good results.
  15. This is the architecture of the Convolutional Neural Network that we used, which was published by Kim in 2014.
  16. At this point, we have created the model. We now want to evaluate it on the Testing Dataset, which the model has never seen before. Testing the model resulted in a 74% accuracy rate. This means the model's prediction matched the original story points supplied by the team for 74% of the testing user stories.
  17. This graph shows the result of testing after each training iteration.