WCRE 2008,
Antwerp
Jane Huffman Hayes, Giuliano (Giulio) Antoniol, and Yann-Gaël Guéhéneuc
PREREQIR: Recovering Pre-Requirements via Cluster Analysis
Content
Problem Statement
PREREQIR Idea
PREREQIR Process
Technologies
Web Browser Requirements
Case Study Results
Conclusions
The Challenge
A few years after deployment, the requirements specification (RS) may no longer exist.
If it exists, it is almost surely outdated.
My customers may desire new functionalities or technologies that my system may or may not implement.
I poll my stakeholders:
    programmers, managers, testing team members, marketing personnel, and end users;
    find out what they believe the system should do.
PREREQIR in Essence
We need a pre-requirement document:
    what the competitor systems do;
    what our customer base needs.
Obtain and vet a list of requirements from diverse stakeholders.
Structure requirements by mapping them into a representation suitable for grouping via pattern recognition and similarity-based clustering.
Analyze clustered requirements to divide them into a set of essential and a set of optional requirements.
The PREREQIR Process
[Figure: process overview in four numbered steps. Textual requirements (r_i, r_p; for example "browser support zoom unzoom page detail" and "browser support print") go through splitting, stop-word removal, stemming, tokenization, and TF-IDF to become vectors (for example 101110, 001010) in the vector space; these are clustered with PAM/AGNES and the clusters are labelled, yielding the recovered PRI r_i and o_j.]
PREREQIR Technology
Standard information retrieval vector space model.
Indexing process:
    Stopper;
    Stemmer;
    Thesaurus (not vital but helps);
    TF-IDF indexing.
Clustering: PAM and AGNES.
Labeling: still an open question.
Step 1 – Collect Stakeholders' RS
By means of questionnaires, collect the stakeholders' requirements.
We favor a non-intrusive, lightweight approach such as a Web-based questionnaire.
Minimize the risk of influencing the stakeholders.
There is a risk that:
    a respondent did not really understand the task;
    the granularity and level differ widely between respondents;
    the respondent population is not heterogeneous enough;
    the sample size is small.
Step 2 – Vector Space Mapping
The goal is to group single requirements by different users into clusters representing the same functionality/concept.
By means of standard IR tools, map the collected requirements into a vector space.
Stopper, stemmer, and TF-IDF plus thesaurus expansions:
    certain stakeholders may use cryptic terms such as RFC or test/benchmark acronyms.
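
The mapping described on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' tooling: the stop-word list, the suffix-stripping `stem` function, and the single thesaurus entry are hypothetical stand-ins for a real stopper, stemmer, and thesaurus, and the weighting (raw term frequency times log(N/df)) is only one common TF-IDF variant.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny resources; a real setup would use a full stop list,
# a proper stemmer, and a domain thesaurus.
STOP_WORDS = {"the", "a", "an", "should", "must", "to", "of", "and", "it"}
THESAURUS = {"rfc": "standard"}  # assumed expansion of a cryptic term


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, expand via the thesaurus, drop stop words, and stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [THESAURUS.get(t, t) for t in tokens]
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    token_lists = [preprocess(d) for d in documents]
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append(
            {t: c * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Made-up requirements in the spirit of the Web browser example.
requirements = [
    "The browser should support zoom and unzoom of page details",
    "Browser supports printing",
    "It must be possible to zoom the page",
]
vectors = tfidf_vectors(requirements)
```

Requirements that share stems (here the first and the third, through "zoom" and "page") get a positive cosine similarity, while requirements with no stem in common get zero.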
Step 3 – Clustering
Transform similarity into a distance.
Apply robust partitioning around medoids (PAM).
Estimate the number of clusters (different requirements) with the silhouette:
    a(i) is the average distance to the other PRI (pre-requirements information) in the cluster;
    b(i) is the average distance to the PRI in the nearest cluster.
Take the inflection point close to the maximum value of the average silhouette.
$$ s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}} $$
> 0.70: very strong structure
0.50 … 0.70: reasonable structure
0.25 … 0.50: weak structure
< 0.25: no structure.
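
As a rough illustration of the silhouette, the sketch below computes s(i) and its average from a distance matrix and a given cluster assignment. It assumes the transform distance = 1 - similarity (the slides do not say which transform was used), takes the labels as given (for example from PAM), needs at least two clusters, and follows the usual convention of s(i) = 0 for singleton clusters. The similarity values are invented for the example.

```python
def silhouette_values(dist, labels):
    """Compute s(i) = (b(i) - a(i)) / max(a(i), b(i)) for every item.

    dist   -- symmetric matrix (list of lists) of pairwise distances
    labels -- cluster label of each item, e.g. as produced by PAM
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Convention: an item alone in its cluster gets s(i) = 0.
            values.append(0.0)
            continue
        # a(i): average distance to the other members of the same cluster.
        a = sum(dist[i][j] for j in own) / len(own)
        # b(i): smallest average distance to the members of another cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values


def average_silhouette(dist, labels):
    """Average of s(i) over all items."""
    values = silhouette_values(dist, labels)
    return sum(values) / len(values)


# Invented similarities; assumed transform: distance = 1 - similarity.
similarity = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
distance = [[1.0 - s for s in row] for row in similarity]
score = average_silhouette(distance, [0, 0, 1, 1])
```

Running `average_silhouette` for each candidate number of clusters and comparing the averages is one way to apply the selection rule on the slide.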
Step 3 Bis – Tree Structure
If there is a weak structure, check for a requirement tree organization.
Re-cluster with AGNES.
Compute the Agglomerative Coefficient (AC).
AC measures the strength of the hierarchical structure discovered.
AC > 0.9: a very strong hierarchical structure.
Impose a threshold on the average similarity to avoid grouping “too different” things.
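
The agglomerative coefficient can be sketched as follows. The code assumes average linkage (the slides do not state which linkage was used) and uses the usual definition: for each item, take the dissimilarity at which it is first merged, divide by the dissimilarity of the final merger, and average one minus that ratio. It is a naive illustration for small inputs, with a made-up distance matrix, and it does not apply the similarity threshold mentioned on the slide.

```python
def agglomerative_coefficient(dist):
    """Run average-linkage agglomerative clustering and return the AC.

    dist -- symmetric matrix (list of lists) of pairwise distances;
            assumes at least two items that are not all identical.
    """
    n = len(dist)
    if n < 2:
        raise ValueError("need at least two items")
    clusters = [[i] for i in range(n)]  # every item starts as a leaf
    first_merge = [None] * n            # height of each item's first merger
    height = 0.0

    def linkage(c1, c2):
        # Average pairwise distance between two clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        height, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        for cluster in (clusters[x], clusters[y]):
            if len(cluster) == 1:
                first_merge[cluster[0]] = height
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)

    # 'height' now holds the dissimilarity of the final merger.
    return sum(1.0 - m / height for m in first_merge) / n


# Made-up distances: two tight pairs that are far from each other.
dist = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.8, 0.9],
    [0.9, 0.8, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
ac = agglomerative_coefficient(dist)
```

Values of the coefficient close to 1 indicate that items join tight groups long before the final merger, which is the sense in which the slide reads AC > 0.9 as a very strong hierarchical structure.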
Step 4 – Label Clusters
Process each PRI of a cluster:
    stopping, stemming;
    build a cluster-specific dictionary;
    weight each word by its frequency in the cluster:
        if a word is in all the PRI of a cluster, its weight is 1.00; if a word appears in half of the PRI, its weight is 0.50.
For a given stemmed PRI, calculate a score:
    sum up the weights of the stems present in the cluster dictionary to obtain a positive weight;
    count the number of words in the cluster-specific dictionary that are absent from the current PRI to obtain a negative weight.
Assign a score to the PRI computed as:
    the ratio positive weight / negative weight.
Label the cluster:
    take the PRI with the highest score.
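
The labelling heuristic above can be sketched as follows. Two points are assumptions not settled by the slide: each stem is counted once per PRI (sets rather than token counts), and a PRI that contains every word of the dictionary, which would give a negative weight of zero, is given an infinite score. The example cluster is invented.

```python
from collections import Counter


def label_cluster(stemmed_pris):
    """Pick the PRI that best represents a cluster.

    stemmed_pris -- list of PRI, each a list of stems (already stopped
                    and stemmed)
    Returns (index of the labelling PRI, list of scores).
    """
    n = len(stemmed_pris)
    stem_sets = [set(pri) for pri in stemmed_pris]

    # Cluster-specific dictionary: stem -> fraction of PRI containing it.
    counts = Counter(stem for stems in stem_sets for stem in stems)
    dictionary = {stem: count / n for stem, count in counts.items()}

    scores = []
    for stems in stem_sets:
        # Positive weight: summed weights of the stems the PRI contains.
        positive = sum(dictionary[s] for s in stems)
        # Negative weight: number of dictionary words the PRI lacks.
        negative = sum(1 for s in dictionary if s not in stems)
        # Assumption: covering the whole dictionary gives an infinite score.
        scores.append(positive / negative if negative else float("inf"))

    best = max(range(n), key=lambda i: scores[i])
    return best, scores


# Invented cluster of three stemmed PRI.
cluster = [
    ["browser", "support", "zoom"],
    ["zoom", "page"],
    ["browser", "zoom", "page", "detail"],
]
best, scores = label_cluster(cluster)
```

In this example the third PRI is selected: it contains most of the frequent stems of the cluster and misses only one dictionary word.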
Case Study
Mimic the recovery process for a Web browser.
Poll a set of users (about 200) via e-mail.
25 answers, of which we kept 22, for an overall 433 user needs:
    mostly male (20); age varies, average 36, standard deviation 9.5;
    respondents: 10 researchers, five lecturers/professors, four students, one programmer, and two project managers.
PAM - AGNES
We did not find a strong or evident cluster structure:
    silhouette about 0.26;
    region between 167 and 170 clusters:
        say 170 clusters or fewer.
AGNES reports a strong structure:
    AC above 0.9.
Grouping via AGNES grows a tree starting from the leaves.
Outliers
Setting a cluster internal similarity threshold decides:
    top-level clusters;
    singleton clusters - outliers;
    inner nodes.
The “non kept” are also important:
    single user needs;
    more expert users may use acronyms:
        must comply with ACID2;
    “too generic” sentences:
        it should be fast.
AGNES Clusters
[Figure: counts of tops, intermediate, overall, outliers, and leaves (y-axis, 0 to 1000) plotted against the similarity threshold (x-axis, 0.035 to 0.915).]
Manual Verification
Two people reviewed the clusters and the cluster labels.
IR measures: precision and recall.
Precision measures the quality of the clusters.
A conservative approach:
    “Yes” was assigned if both authors said “Yes”;
    “No” was assigned if one of the authors said “No”;
    “Maybe” was assigned in the other cases.
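
The conservative rule above can be written as a small function. This sketch reads "one of the authors said No" as "at least one", which is an interpretation; the review pairs are invented for illustration.

```python
from collections import Counter


def combine(verdict_a, verdict_b):
    """Conservatively combine the verdicts of two reviewers."""
    if verdict_a == "Yes" and verdict_b == "Yes":
        return "Yes"
    if "No" in (verdict_a, verdict_b):  # read as: at least one "No"
        return "No"
    return "Maybe"


# Invented pairs of verdicts, one pair per reviewed cluster.
reviews = [("Yes", "Yes"), ("Yes", "Maybe"), ("No", "Yes"), ("Maybe", "Maybe")]
combined = [combine(a, b) for a, b in reviews]
tally = Counter(combined)
```

Under this rule a single "No" overrides a "Yes", so the resulting counts of "Yes" are a lower bound on what either reviewer accepted alone.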
Precision and Recall – 0.36
128 Common User Needs, 181 Outliers
[Figure: precision, recall, and percentage of outliers (y-axis, 0 to 1) plotted against the similarity threshold (x-axis, 0.225 to 0.885).]
Traceability Task
PRI for a Web browser provided by the Web site www.learnthenet.com (LtN).
There are 20 LtN PRI:
    textual PRI ranging from 5 to 73 words, with 23.5 words on average.
LtN10: “The toolbar should include a Reload or Refresh button to load the web page again.”
Trace via vector space retrieval with tf-idf.
Similarity threshold of 0.20.
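
A tracing step of this kind might look like the sketch below: each reference PRI is treated as a query and linked to every respondent PRI whose tf-idf cosine similarity reaches the threshold. The tokenizer is deliberately simplified (no stopping or stemming), the idf is computed on the respondent PRI only, and the three respondent PRI are invented; only the LtN10 text and the 0.20 threshold come from the slide.

```python
import math
import re
from collections import Counter

THRESHOLD = 0.20  # similarity threshold stated on the slide


def tokens(text):
    # Simplification: lower-cased words only; no stopping or stemming.
    return re.findall(r"[a-z]+", text.lower())


def trace(queries, corpus, threshold=THRESHOLD):
    """Map each query index to the corpus indices whose tf-idf cosine
    similarity with the query reaches the threshold."""
    docs = [Counter(tokens(d)) for d in corpus]
    df = Counter(t for d in docs for t in d)
    n = len(docs)

    def weigh(counts):
        # Terms unseen in the corpus are dropped (equivalent to weight 0).
        return {t: c * math.log(n / df[t]) for t, c in counts.items() if df[t]}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    doc_vectors = [weigh(d) for d in docs]
    links = {}
    for q_index, query in enumerate(queries):
        q_vector = weigh(Counter(tokens(query)))
        links[q_index] = [
            d_index
            for d_index, d_vector in enumerate(doc_vectors)
            if cosine(q_vector, d_vector) >= threshold
        ]
    return links


ltn = [
    "The toolbar should include a Reload or Refresh button "
    "to load the web page again."
]
# Invented respondent PRI, for illustration only.
respondent_pri = [
    "reload button to refresh the page",
    "print the current page",
    "bookmarks must be easy to organize",
]
links = trace(ltn, respondent_pri)
```

A reference PRI with an empty list of links would count as not traced; the links that are found still need manual review.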
Manual Evaluation by Two Authors
14 of the 20 LtN PRI are traced:
    the 14 PRI were all marked as “Yes” by both authors.
If we also include the two marked as “Maybe”, there are 16 LtN PRI out of 20 traced.
Overall, between 70% (“Yes” only) and 80% (“Yes” and “Maybe”) of the LtN PRI are also found in the PRI obtained from the respondents.
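
The two percentages follow directly from the counts on this slide, as this short check shows (the variable names are illustrative):

```python
total_ltn = 20   # LtN PRI in the reference set
yes_only = 14    # traced and marked "Yes" by both authors
maybe = 2        # additionally marked "Maybe"

lower = yes_only / total_ltn            # "Yes" only
upper = (yes_only + maybe) / total_ltn  # "Yes" and "Maybe"
summary = f"{lower:.0%} to {upper:.0%}"
```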
Threats to Validity
External validity: only one system and 22 answers out of 200; the impact of vocabulary is not known.
Construct validity: computation performed using widely adopted toolsets; other tools may produce different results.
Reliability validity: material will be made available.
Internal validity: subjectivity introduced by experts; “Yes” if and only if both agree.
Conclusion
AGNES clusters PRI with an accuracy of 70%.
At a similarity threshold of about 0.36, about 55% of the PRI were common to two or more stakeholders and 42% were outliers:
    128 common user needs and 181 outliers.
We automatically label the common and outlier PRI with 82% of the labels being correct.
The method achieves roughly 70% recall and 70% precision when compared to a ground truth.

Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 

Recovering Pre-Requirements for Web Browsers via Cluster Analysis

  • 1. 1/21 WCRE 2008, Antwerp. Jane Huffman Hayes, Giuliano (Giulio) Antoniol, and Yann-Gaël Guéhéneuc. PREREQIR: Recovering Pre-Requirements via Cluster Analysis
  • 2. Content: Problem Statement; PREREQIR Idea; PREREQIR Process; Technologies; Web Browser Requirements; Case Study Results; Conclusions.
  • 3. The Challenge. A few years after deployment, the requirements specification (RS) may no longer exist. If it exists, it will almost surely be outdated. My customers may desire new functionalities or technologies that my system may or may not implement. I poll my stakeholders (programmers, managers, testing team members, marketing personnel, and end users) to find out what they believe the system should do.
  • 4. PREREQIR in Essence. We need a pre-requirement document: what the competitor systems do; what our customer base needs. Obtain and vet a list of requirements from diverse stakeholders. Structure requirements by mapping them into a representation suitable for grouping via pattern-recognition and similarity-based clustering. Analyze clustered requirements to divide them into a set of essential and a set of optional requirements.
  • 5. The PREREQIR Process. [Diagram: textual requirements (examples: "Browser support print", "browser support zoom unzoom page detail") pass through splitting, tokenization, stop-word removal, and stemming, then TF-IDF mapping into a vector space, clustering with PAM/AGNES, and cluster labelling, producing the recovered PRI.]
  • 6. PREREQIR Technology. Standard information retrieval vector space model. Indexing process: stopper; stemmer; thesaurus (not vital but helps); TF-IDF indexing. Clustering: PAM and AGNES. Labeling: still an open question.
  • 7. Step 1 – Collect Stakeholders' RS. By means of questionnaires, collect stakeholders' requirements. We favor a non-intrusive, lightweight approach such as a Web-based questionnaire. Minimize the risk of influencing the stakeholder. There is a risk that: the respondent did not really understand the task; the granularity and level are very different between respondents; the respondent population is not heterogeneous enough; the sample size is small.
  • 8. Step 2 – Vector Space Mapping. The goal is to group single requirements by different users into clusters representing the same functionality/concept. By means of standard IR tools, map the collected requirements into a vector space. Stopper, stemmer, and TF-IDF plus thesaurus expansions: certain stakeholders may use cryptic terms such as RFC or test/benchmark acronyms.
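The mapping in Step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the toolset used in the study: the stop list is a tiny placeholder, `crude_stem` is a hypothetical suffix stripper rather than a real stemmer, the thesaurus expansion is omitted, and the weighting uses one common TF-IDF variant (relative term frequency times log(N/df)).

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (assumption); a real stopper uses a much longer one.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "should", "must", "be", "it"}

def crude_stem(word):
    # Hypothetical placeholder stemmer: strips a few suffixes (not Porter's algorithm).
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    # Tokenize, drop stop words, then stem what remains.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def tfidf_vectors(documents):
    # One sparse vector (dict: stem -> weight) per requirement.
    tokenized = [preprocess(d) for d in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

def cosine(u, v):
    # Cosine similarity between two sparse vectors; 0.0 if either is all zeros.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

With this sketch, two requirements phrased differently by two users ("The browser should support printing" and "Browser must support print") map to the same stems and therefore to the same direction in the vector space, which is what lets the clustering step group them.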
  • 9. Step 3 – Clustering. Transform similarity into a distance. Apply robust partitioning around medoids (PAM). Estimate the number of clusters (different requirements) with the silhouette $s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}$, where $a(i)$ is the average distance to the other PRI in the cluster and $b(i)$ is the average distance to PRI in the nearest cluster. Take the inflection point (flex) close to the maximum value of the average silhouette. Interpretation: > 0.70 very strong structure; 0.50 … 0.70 reasonable structure; 0.25 … 0.50 weak structure; < 0.25 no structure.
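As a rough illustration of the silhouette computation in Step 3, the sketch below takes a precomputed distance matrix and a cluster assignment and returns the per-item silhouette values. The function names are hypothetical, the zero value for singleton clusters is a common convention assumed here, and `interpret` simply encodes the thresholds listed on the slide.

```python
def silhouette_values(dist, labels):
    # dist: symmetric matrix of distances; labels: cluster id for each item.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")
    values = []
    for i, lab_i in enumerate(labels):
        own = [j for j in clusters[lab_i] if j != i]
        if not own:
            values.append(0.0)  # assumed convention for singleton clusters
            continue
        # a(i): average distance to the other items of the same cluster.
        a = sum(dist[i][j] for j in own) / len(own)
        # b(i): average distance to the items of the nearest other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for lab, members in clusters.items() if lab != lab_i
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values

def interpret(avg_silhouette):
    # Thresholds as listed on the slide.
    if avg_silhouette > 0.70:
        return "very strong structure"
    if avg_silhouette >= 0.50:
        return "reasonable structure"
    if avg_silhouette >= 0.25:
        return "weak structure"
    return "no structure"
```

To estimate the number of clusters, one would run the partitioning for a range of cluster counts, average `silhouette_values` for each, and look at where the average peaks.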
  • 10. Step 3 bis – Tree Structure. If there is a weak structure, check for a requirement tree organization. Re-cluster with AGNES. Compute the Agglomerative Coefficient (AC). AC measures the strength of the hierarchical structure discovered. AC > 0.9: a very strong hierarchical structure. Impose a threshold on the average similarity to avoid grouping "too different" things.
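The agglomerative coefficient can be sketched as follows, assuming average linkage and the usual definition: for each item, divide the height of its first merge by the height of the final merge, and average one minus that ratio. The function name and the naive quadratic search are illustrative choices, not the toolset used in the study, and the similarity threshold mentioned on the slide is left out.

```python
def agglomerative_coefficient(dist):
    # dist: symmetric matrix of distances between items.
    n = len(dist)
    if n < 2:
        raise ValueError("need at least two items")
    clusters = {i: [i] for i in range(n)}
    first_merge = {}  # item -> height at which it is first merged
    height = 0.0
    next_id = n

    def avg_link(a, b):
        # Average-linkage distance between two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        ids = list(clusters)
        # Pick the closest pair of clusters (naive search over all pairs).
        height, x, y = min(
            (avg_link(clusters[p], clusters[q]), p, q)
            for k, p in enumerate(ids) for q in ids[k + 1:]
        )
        for cid in (x, y):
            if len(clusters[cid]) == 1:
                first_merge[clusters[cid][0]] = height
        clusters[next_id] = clusters.pop(x) + clusters.pop(y)
        next_id += 1

    final = height  # height of the last merge
    if final == 0:
        return 0.0  # all items identical: no structure to measure
    return sum(1 - first_merge[i] / final for i in range(n)) / n
```

Values close to 1 mean that items join tight groups long before the tree is completed, which is the situation the slide describes as a very strong hierarchical structure.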
  • 11. Step 4 – Label Clusters. Process each PRI of a cluster: stopping, stemming; build a cluster-specific dictionary; weight each word by its frequency in the cluster: if a word is in all the PRI in a cluster, its weight is 1.00; if a word appears in half of the PRI, its weight is 0.50. For a given stemmed PRI, calculate a score: sum up the weights of the stems present in the cluster dictionary to obtain a positive weight; count the number of words in the cluster-specific dictionary that are absent in the current PRI to obtain a negative weight. Assign a score to the PRI computed as the ratio positive weight / negative weight. Label the cluster: take the PRI with the highest score.
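The labelling rule of Step 4 can be written down directly. The sketch below assumes each PRI has already been stopped and stemmed into a list of stems; the function name is hypothetical, and treating a PRI that contains every dictionary word (negative weight of zero) as the best possible label is an assumption, since the slide does not cover that case.

```python
def label_cluster(cluster):
    # cluster: non-empty list of PRI, each a list of stems.
    stem_sets = [set(pri) for pri in cluster]
    dictionary = set().union(*stem_sets)
    # Weight of a word: fraction of the cluster's PRI that contain it.
    weights = {
        word: sum(word in stems for stems in stem_sets) / len(stem_sets)
        for word in dictionary
    }

    def score(stems):
        positive = sum(weights[word] for word in stems)
        negative = len(dictionary - stems)  # dictionary words missing from this PRI
        # Assumption: a PRI missing nothing gets the highest possible score.
        return positive / negative if negative else float("inf")

    best = max(range(len(cluster)), key=lambda i: score(stem_sets[i]))
    return best, weights
```

For example, in a cluster of three PRI with stems `[browser, support, print]`, `[browser, print, page]`, and `[print, page, fast, browser]`, the third PRI scores 3.0 / 1 and becomes the label, because it covers all but one word of the cluster dictionary.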
  • 12. Case Study. Mimic the recovery process for a Web browser. Poll via e-mail sent to a set of users (about 200). 25 answers, out of which we kept 22; overall 433 user needs: mostly male (20), age varies, average 36, standard deviation 9.5; respondents: 10 researchers, five lecturers/professors, four students, one programmer, and two project managers.
  • 13. PAM - AGNES. We did not find a strong or evident cluster structure: silhouette about 0.26; region between 167 – 170 clusters: say 170 clusters or less. AGNES reports a strong structure: AC above 0.9. Grouping via AGNES grows a tree starting from the leaves.
  • 14. Outliers. Setting a cluster-internal similarity threshold decides: top-level clusters; singleton clusters (outliers); inner nodes. The "non-kept" are also important: single user needs; more expert users may use acronyms ("must comply with ACID2"); "too generic" sentences ("it should be fast").
  • 15. AGNES Clusters. [Chart: counts (0 to 1000) plotted against the threshold (0.035 to 0.915); series: Tops, Intermediate, Overall, Outliers, Leaves.]
  • 16. Manual Verification. Two people reviewed the clusters and cluster labeling. IR measures: precision and recall. Precision measures the quality of the clusters. A conservative approach: "Yes" was assigned if both authors said "Yes"; "No" was assigned if one of the authors said "No"; "Maybe" was assigned in the other cases.
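The conservative rule for combining the two reviewers' judgments is small enough to state as code. This is a minimal sketch; the function name and the string encoding of the votes are assumptions made for illustration.

```python
def combine_votes(first, second):
    # Conservative aggregation of two reviewers' judgments.
    if first == "Yes" and second == "Yes":
        return "Yes"    # both reviewers must agree for a "Yes"
    if "No" in (first, second):
        return "No"     # a single "No" is enough to reject
    return "Maybe"      # every remaining combination
```

Under this rule a "Yes" paired with a "Maybe" counts only as "Maybe", so reported "Yes" figures are a lower bound on what either reviewer alone would have accepted.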
  • 17. Precision Recall – 0.36. 128 common user needs, 181 outliers. [Chart: precision, recall, and percentage of outliers (0 to 1) plotted against the threshold (0.225 to 0.885).]
  • 18. Traceability Task. PRI for a Web browser provided by the Web site www.learnthenet.com. There are 20 LtN PRI: textual PRI ranging from 5 to 73 words, having on average 23.5 words. LtN10: "The toolbar should include a Reload or Refresh button to load the web page again." Trace via vector space retrieval with tf-idf. Similarity threshold of 0.20.
  • 19. Manual Evaluation by Two Authors. 14 of the 20 LtN PRI are traced: the 14 PRI were all marked as "Yes" by both authors. If we also include the two marked as "Maybe", there are 16 LtN PRI out of 20 traced. Overall, between 70% ("Yes" only) and 80% ("Yes" and "Maybe") of the LtN PRI are also found in the PRI obtained from the respondents.
  • 20. Threats to Validity. External validity: only one system and 22 answers out of 200; the impact of vocabulary is not known. Construct validity: computation performed using widely adopted toolsets; other tools can produce different results. Reliability validity: material will be made available. Internal validity: subjectivity introduced by experts; "Yes" if and only if both agree.
  • 21. Conclusion. AGNES clusters PRI with an accuracy of 70%. With a similarity threshold of about 0.36, about 55% of the PRI were common to two or more stakeholders and 42% were outliers (128 common user needs and 181 outliers). We automatically label the common and outlier PRI with 82% of the labels being correct. The method achieves roughly 70% recall and 70% precision when compared to a ground truth.