A Machine Learning Approach for Identifying Expert Stakeholders
Carlos Castro-Herrera, Jane Cleland-Huang

Outline
  • Introduction & Motivation
  • Case Study
  • Identifying Expert Stakeholders
      • Direct stakeholders
      • Indirect stakeholders
      • Inferred stakeholders
  • Conclusions and Future Work
9/2/2009 • RE09 • Carlos Castro-Herrera • DePaul University

Introduction & Motivation
  • Requirements Elicitation:
      • Representative group of stakeholders
      • Proactively engaged
      • Discovery and analysis of the requirements
  • Organizations adopting online collaborative tools
  • How to identify Subject Matter Experts?
  • Novel technique that automatically analyzes contributions and interests to identify relevant stakeholders
      • Machine learning techniques organize contributions into topics
      • Identifies three classes of stakeholders: Direct, Indirect and Inferred
      • Can be combined into a single ranking

Case Study
  • Student dataset
  • 36 Masters students: Requirements Engineering class
  • 366 feature requests
  • Amazon-like web portal
  • Purchasing and selling of school books

Direct Stakeholders
  • Made specific contributions to a topic
  • Topics can be identified:
      • Manually
      • Automatically
  • Automatically:
      • Contributions are pre-processed
      • Determine the ‘ideal’ number of topics: Can’s Cover Coefficient
      • Consensus Clustering using Spherical K-Means
      • Co-Association Matrix: captures the proximity of the needs
      • Hierarchical Agglomerative Clustering over the final matrix
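
The consensus step on this slide can be illustrated with a minimal sketch. It assumes the individual clustering runs (for example, Spherical K-Means with different seeds) have already produced one label list each; the function names `co_association` and `agglomerate`, the average-linkage rule, and the sample labels are all hypothetical choices for illustration, not details taken from the slides.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of clustering runs in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def agglomerate(matrix, k):
    """Average-linkage agglomerative clustering over a co-association matrix.

    Repeatedly merges the two clusters whose members co-occur most often,
    until only k clusters remain.
    """
    clusters = [[i] for i in range(len(matrix))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            links = [matrix[i][j] for i in clusters[a] for j in clusters[b]]
            score = sum(links) / len(links)
            if best is None or score > best[0]:
                best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical labels from three clustering runs over five feature requests.
runs = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
topics = agglomerate(co_association(runs), k=2)
```

Building the final topics from the co-association matrix rather than from any single run makes the result less sensitive to the random initialization of one particular clustering.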

Direct Stakeholders
  • Ex. Encryption topic:
  • 28 additional topics (purchases, used books, shopping cart)
  • Contribution Metrics:
      • Topic Contribution: % of requirements in the topic contributed by a stakeholder
      • Topic Specialization: inverse of the number of topics a stakeholder is associated with
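
The two contribution metrics can be computed directly from a list of (stakeholder, topic) pairs. The sketch below assumes each feature request belongs to exactly one topic; the function name `contribution_metrics` and the sample stakeholders are hypothetical.

```python
from collections import defaultdict


def contribution_metrics(requests):
    """Compute both metrics from (stakeholder, topic) pairs, one per request.

    Returns:
      contribution[(stakeholder, topic)]: share of the topic's requests
        written by that stakeholder (a fraction between 0 and 1).
      specialization[stakeholder]: 1 / number of topics the stakeholder
        contributed to.
    """
    per_topic = defaultdict(int)
    per_pair = defaultdict(int)
    topics_of = defaultdict(set)
    for stakeholder, topic in requests:
        per_topic[topic] += 1
        per_pair[(stakeholder, topic)] += 1
        topics_of[stakeholder].add(topic)
    contribution = {
        (s, t): count / per_topic[t] for (s, t), count in per_pair.items()
    }
    specialization = {s: 1 / len(ts) for s, ts in topics_of.items()}
    return contribution, specialization


# Hypothetical data: four encryption requests, plus two on other topics.
requests = [
    ("ann", "encryption"),
    ("ann", "encryption"),
    ("bob", "encryption"),
    ("cy", "encryption"),
    ("bob", "shopping cart"),
    ("bob", "used books"),
]
contribution, specialization = contribution_metrics(requests)
```

Here `ann` wrote half of the encryption requests and contributed to no other topic, so she scores higher than `bob` on both metrics for that topic.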

Indirect Stakeholders
  • Made contributions to a related topic
  • Similarity between topics:
      • Vector Space Model using tf-idf
      • Cosine similarity metric
  • Similarity scores for the Encryption topic:
  • Interest of stakeholders in the target topic can be weighted by the similarity of the related topics:
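
A small sketch of the topic-similarity idea follows. The tf-idf weighting (raw term count times the log of inverse document frequency) and the cosine metric are standard; the weighting function shown on the slide is not preserved in this transcript, so `indirect_scores` is only one plausible reading (similarity times topic contribution, summed over related topics) and should be treated as an assumption. All names and sample tokens are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(topic_tokens):
    """Map each topic to a sparse {term: tf * idf} vector."""
    n_topics = len(topic_tokens)
    doc_freq = Counter(
        term for tokens in topic_tokens.values() for term in set(tokens)
    )
    vectors = {}
    for topic, tokens in topic_tokens.items():
        counts = Counter(tokens)
        vectors[topic] = {
            t: c * math.log(n_topics / doc_freq[t]) for t, c in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def indirect_scores(target, vectors, contribution):
    """Assumed weighting: sum over related topics of similarity * contribution."""
    scores = {}
    for (stakeholder, topic), share in contribution.items():
        if topic == target:
            continue  # direct contributions are scored separately
        weight = cosine(vectors[target], vectors[topic])
        scores[stakeholder] = scores.get(stakeholder, 0.0) + weight * share
    return scores


# Hypothetical pre-processed tokens for three topics.
topic_tokens = {
    "encryption": ["encrypt", "password", "secure", "login"],
    "login": ["login", "password", "account"],
    "used books": ["used", "book", "price"],
}
vectors = tfidf_vectors(topic_tokens)
scores = indirect_scores(
    "encryption",
    vectors,
    {("bob", "login"): 0.5, ("cy", "used books"): 1.0, ("ann", "encryption"): 1.0},
)
```

In this toy data the login topic shares vocabulary with encryption, so `bob` receives a positive indirect score, while `cy`, who only wrote about used books, receives zero.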

Indirect Stakeholders
  • Visualize the relationships between topics:
  • Ex. Indirect Stakeholders for the Encryption topic:

Inferred Stakeholders
  • Direct and Indirect are based on content
  • Inferred stakeholders are based on behavior profiles
  • Collaborative Recommender Systems:
      • Infer interest based on the behavior of stakeholders with similar contribution patterns

Inferred Stakeholders
  • Neighbors are calculated using a similarity function:
  • A prediction score for a particular item is calculated using the function:
  • Ex. Top Recommendations for the Encryption topic (inferred stakeholders):
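
The similarity and prediction functions shown on this slide are not preserved in the transcript. As an illustration only, the sketch below uses a common user-based k-nearest-neighbor formulation: cosine similarity over binary topic profiles, and a similarity-weighted vote among the nearest neighbors. The function names, the choice of `k`, and the sample profiles are assumptions and may differ from the authors' actual functions.

```python
import math


def similarity(a, b):
    """Cosine similarity between two binary profiles given as sets of topics."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def predict(profiles, stakeholder, topic, k=2):
    """Similarity-weighted share of the k nearest neighbors who have the topic."""
    scored = [
        (similarity(profiles[stakeholder], profile), name)
        for name, profile in profiles.items()
        if name != stakeholder
    ]
    neighbors = sorted(scored, reverse=True)[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return 0.0  # no similar stakeholders, so no evidence of interest
    voted = sum(sim for sim, name in neighbors if topic in profiles[name])
    return voted / total


# Hypothetical profiles: the topics each stakeholder has contributed to.
profiles = {
    "ann": {"login", "shopping cart"},
    "bob": {"login", "shopping cart", "encryption"},
    "cy": {"used books"},
    "dee": {"login", "encryption"},
    "eve": {"login"},
}
score_ann = predict(profiles, "ann", "encryption", k=3)
```

`ann` has never contributed to encryption, yet two of her three nearest neighbors have, so she receives a positive prediction score and would be a candidate inferred stakeholder for that topic.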

Conclusions and Future Work
  • Presented a novel technique (proof of concept)
  • Uses data mining to:
      • Analyze stakeholders’ contributions
      • Identify potential experts (key stakeholders) for a topic
  • Potential uses: identify stakeholders that:
      • Should participate in a new feature
      • Can bring in new perspectives to a stagnant discussion
      • Will be impacted by a change
  • Future work:
      • More sophisticated model
      • More rigorous evaluation
