Team activity analysis / visualization



Published in: Education, Technology
  • A summary of what we have done since the beginning of the research project, six months ago.

    1. 1. Team Activity Analysis/Visualisation Project Members: Judy Kay, Nicolas Maisonneuve, Peter Reimann, Kalina Yacef
    2. 2. Objectives <ul><li>Goal: </li></ul><ul><ul><li>Build « a system that will support groups in learning how to work more effectively as groups » (Reimann et al. 2004) </li></ul></ul><ul><ul><ul><li>Visualisation of the collaboration (for teachers and students) </li></ul></ul></ul><ul><ul><ul><li>With different levels of feedback: mirroring, advice </li></ul></ul></ul><ul><ul><li>For self-managed learning groups (small teams of students) </li></ul></ul><ul><ul><li>Using logs generated by the collaboration tools. </li></ul></ul>
    3. 3. Case studies <ul><li>Activity logs from real cases: </li></ul><ul><ul><li>IT project: semester-long software development </li></ul></ul><ul><ul><li>Forum/discussion from the School of Public Health </li></ul></ul><ul><ul><li>Chat/whiteboard from the Faculty of Education </li></ul></ul><ul><li>Event definition </li></ul><ul><ul><li>Event = {Author, Time, Type Event, Resource} </li></ul></ul><ul><li>IT project: the Trac system. </li></ul><ul><ul><li>Ticket: task manager </li></ul></ul><ul><ul><li>SVN: source repository </li></ul></ul><ul><ul><li>Wiki: web page editor </li></ul></ul>
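The event definition above can be sketched as a small record type. This is a minimal illustration only: the field names, the type labels and the sample values are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One log entry: Event = {Author, Time, Type Event, Resource}."""
    author: str
    time: datetime
    event_type: str  # hypothetical labels: "wiki", "svn", "ticket"
    resource: str    # e.g. a wiki page name, a file path, a ticket id


def chronological(events):
    """Return the events ordered by time, oldest first."""
    return sorted(events, key=lambda e: e.time)


# Hypothetical sample log, for illustration only.
SAMPLE_LOG = [
    Event("ann", datetime(2006, 3, 2, 10, 0), "svn", "src/main.py"),
    Event("bob", datetime(2006, 3, 1, 9, 30), "wiki", "DesignNotes"),
    Event("ann", datetime(2006, 3, 1, 11, 15), "ticket", "ticket-12"),
]
```

Sorting by `time` first gives each analysis step a single ordered stream of events to work from.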
    4. 4. Related Works & Approach <ul><ul><li>Related works in team activity analysis: </li></ul></ul><ul><ul><ul><li>About 20 recent articles found in the following fields: </li></ul></ul></ul><ul><ul><ul><ul><li>Source repository mining </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Process mining </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Visualisation for team software development </li></ul></ul></ul></ul><ul><ul><ul><li>Very few address our concern: the collaboration aspect. </li></ul></ul></ul><ul><ul><ul><li>No satisfactory results. </li></ul></ul></ul><ul><ul><li>Core of our research </li></ul></ul><ul><ul><ul><li>Visualisation part: Build visualisations mirroring the group’s activity. </li></ul></ul></ul><ul><ul><ul><li>Datamining part: Find relevant/frequent sequences of events characterizing: </li></ul></ul></ul>
    5. 5. <ul><li>Part 1 – Visualisation </li></ul>
    6. 6. Visualisation <ul><li>Goal: Visualise the team’s activity to: </li></ul><ul><ul><li>Easily compare users / groups. </li></ul></ul><ul><ul><li>Study some teamwork theories (Big Five) </li></ul></ul><ul><li>3 types of visualisation: </li></ul><ul><ul><li>The users’ participation within a group. </li></ul></ul><ul><ul><li>An overview of the interaction between users. </li></ul></ul><ul><ul><li>A timeline representing the users’ activity. </li></ul></ul>
    7. 7. Activity Radar – Overview <ul><li>Properties: </li></ul><ul><ul><li>Participation for each data source. </li></ul></ul><ul><ul><li>Participation relative to the group / class. </li></ul></ul><ul><li>Calculation </li></ul><ul><li>The participation W for the source s: $W_s = \sum_{e \in E_s} w(e)$ </li></ul><ul><ul><li>$E_s$, set of all the events from s (wiki, svn, ...) </li></ul></ul><ul><ul><li>$w(e)$, weight of participation for the event e. </li></ul></ul><ul><ul><ul><li>Wiki and SVN event: $w(e)$ = added lines </li></ul></ul></ul><ul><ul><ul><li>Ticket: $w(e) = 1$ (not quantifiable) </li></ul></ul></ul><ul><ul><ul><li>Todo: Quantify a ticket event by the importance of its action: w(Open) > w(Assign) > w(Comment) </li></ul></ul></ul>Figure legend: high participation, average participation, low participation
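The participation calculation on this slide amounts to a weighted sum of a user's events per data source. Below is a minimal sketch, assuming each event is reduced to a hypothetical `(author, source, added_lines)` tuple; the function names and the sample log are illustrative, not taken from the project.

```python
from collections import defaultdict


def slide_weight(source, added_lines):
    """Weight of one event: added lines for wiki/svn, 1 for a ticket."""
    return 1 if source == "ticket" else added_lines


def participation(events, weight=slide_weight):
    """Sum the event weights per (author, source) pair."""
    totals = defaultdict(int)
    for author, source, added_lines in events:
        totals[(author, source)] += weight(source, added_lines)
    return dict(totals)


# Hypothetical log entries: (author, source, added_lines).
SAMPLE_EVENTS = [
    ("ann", "wiki", 10),
    ("ann", "wiki", 5),
    ("ann", "svn", 3),
    ("bob", "ticket", 0),
    ("bob", "ticket", 0),
]
```

Passing `weight` as a parameter leaves room for the slide's to-do item, where ticket actions such as Open, Assign and Comment would receive different weights.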
    8. 8. Activity Radar – Comparison with a classic histogram <ul><li>Main difference: the reading process: </li></ul><ul><ul><li>Activity radar: radial scan </li></ul></ul><ul><ul><li>Histogram: horizontal scan </li></ul></ul><ul><li>Different purposes: </li></ul><ul><ul><li>Activity radar: fast global reading, provided the shape is not too deformed (DG1 vs DG2); problematic with many users </li></ul></ul><ul><ul><li>Histogram: more precise graph, useful for a user's self-evaluation. </li></ul></ul><ul><li>Todo: study further the validity of this graph from an HCI point of view </li></ul>Figure panels: DG1, H1, DG2, H2
    9. 9. Interaction Network <ul><li>Goal </li></ul><ul><ul><li>Overview of interactions in a group. </li></ul></ul><ul><ul><li>Undirected graph </li></ul></ul><ul><ul><li>Some possible analyses: </li></ul></ul><ul><ul><ul><li>Group-centric: level of interaction in the group (many/few connections, global shape) </li></ul></ul></ul><ul><ul><ul><li>User-centric: some characteristic behaviours: leader, isolated member. </li></ul></ul></ul><ul><ul><li>Analysis based on 2 parameters: </li></ul></ul><ul><ul><ul><li>Number of connections: influence of the user </li></ul></ul></ul><ul><ul><ul><li>Connection width: quantity of interaction between 2 users. </li></ul></ul></ul>Figure panels: interaction for the wiki, interaction for the tickets
    10. 10. Interaction Network <ul><li>Calculation of the interaction I for the source s: $I_s(a,b) = \sum_{r \in R_s} w_r(a,b)$ </li></ul><ul><ul><li>$R_s$, the set of all the resources from s </li></ul></ul><ul><ul><li>$w_r(a,b)$, the weight of the interaction between 2 users a and b for the resource r. </li></ul></ul><ul><li>Calculation for $w_r(a,b)$ </li></ul><ul><ul><li>Global interaction (ex: source code): $w_r(a,b) = \min(n_r(a), n_r(b))$, with $n_r(x)$ the number of events by user x on r </li></ul></ul><ul><li>TODO: Feedback notion: I(a,b) ≠ I(b,a) </li></ul><ul><ul><li>Direct feedback (ex: chat) </li></ul></ul><ul><ul><li>Indirect feedback (ex: forum)… </li></ul></ul><ul><li>Example of calculation: </li></ul><ul><li>Sequence of events (authors) for a resource: </li></ul><ul><li><a,b,d,b,a,b,a,e,d,b,d,b,d> </li></ul><ul><li>Global interaction: </li></ul><ul><li>I(a,d) = I(d,a) = min(3,4) = 3 </li></ul><ul><li>Direct feedback: </li></ul><ul><li>I(a,d)=0, I(d,a)=0, </li></ul><ul><li>I(b,d)=3, I(d,b)=2 </li></ul><ul><li>Indirect feedback: Aa={{b,d,b,a},{b}} → I(a,d)=1 </li></ul><ul><li>Ad={{a,b},{b,a,e},{b},{b}} → I(d,a)=2 </li></ul>
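The global interaction in the worked example (min(3,4) = 3) can be reproduced with a short sketch. It assumes, as the example suggests, that the weight for one resource is the smaller of the two users' event counts and that the weights are summed over resources. The function names are hypothetical, and the direct and indirect feedback variants are not covered.

```python
from collections import Counter


def global_interaction(authors, a, b):
    """Interaction weight of users a and b on one resource.

    `authors` is the ordered list of event authors for that resource.
    """
    counts = Counter(authors)
    return min(counts[a], counts[b])


def source_interaction(resources, a, b):
    """Sum the per-resource weights over all resources of a source."""
    return sum(global_interaction(authors, a, b) for authors in resources)


# Author sequence from the slide's example: <a,b,d,b,a,b,a,e,d,b,d,b,d>
EXAMPLE = list("abdbabaedbdbd")
```

Because `min` is symmetric, this measure cannot tell who responded to whom, which is why the slide lists a feedback notion with I(a,b) ≠ I(b,a) as a to-do.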
    11. 11. Wattle Tree <ul><li>Goal: </li></ul><ul><ul><li>Overview of the group’s activity over the duration of the project. </li></ul></ul><ul><ul><li>Depends on the domain </li></ul></ul><ul><li>Explanation of the graphic </li></ul><ul><ul><li>Each vertical tree = a user’s activity </li></ul></ul><ul><ul><li>Each tree component = an event </li></ul></ul><ul><ul><ul><li>Red petal: add/modify a wiki page (WIKI) </li></ul></ul></ul><ul><ul><ul><li>Blue petal: add/modify source code (SVN) </li></ul></ul></ul><ul><ul><ul><li>Branch: open/close a ticket (TICKET) </li></ul></ul></ul>
    12. 12. <ul><li>Part 2 – Datamining </li></ul>
    13. 13. Datamining <ul><li>Goal: Find frequent patterns (sequence of events) characterizing an aspect of the Teamwork. </li></ul><ul><ul><li>good/bad practices (Performance Indicator) </li></ul></ul><ul><ul><li>the user’s role (e.g. the leader/ normal user) </li></ul></ul><ul><ul><li>the group/user’s interaction level </li></ul></ul><ul><li>Event definition </li></ul><ul><ul><li>Event = {Author, Time, Type Event, Resource} </li></ul></ul>
    14. 14. Frequent Sequence Pattern <ul><ul><li>Customer sequence: (ordered) list of items. </li></ul></ul><ul><ul><li>Item: an element of the alphabet (item collection called C). </li></ul></ul><ul><ul><li>Subsequence: <e1,e2,…,en> is a subsequence of a sequence S if it matches the pattern <e1,*,e2,*,e3,…,*,en> in S. </li></ul></ul><ul><ul><li>Ex: <a,b,c> is a subsequence of <a,e,b,a,c> </li></ul></ul><ul><ul><li>Support for a sequence S: the number of customer sequences of which S is a subsequence. </li></ul></ul><ul><ul><li>Problem: Find all the frequent subsequences with a support ≥ Smin. </li></ul></ul><ul><ul><li>The alphabet C={a,b,c,d,e,f} </li></ul></ul>Example: Find all the sequences with a minimum support = 3: Sup(<a,b,c>)=2, not frequent. Sup(<a,c>)=4 → <a,c> ok. Sup(<e,b,a>)=3 → <e,b,a> ok. <table><tr><th>CustID</th><th>Sequence</th></tr><tr><td>1</td><td><a,e,b,a,c></td></tr><tr><td>2</td><td><a,c,c,b,c></td></tr><tr><td>3</td><td><b,e,b,f,a,c></td></tr><tr><td>4</td><td><d,e,b,a,c></td></tr></table>
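The subsequence and support definitions above translate directly into a few lines of code. The sketch below uses the four customer sequences from the slide's table; the function names are illustrative.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same
    order, with any number of other items in between."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator past the first match,
    # so later items can only be matched further along the sequence.
    return all(item in remaining for item in pattern)


def support(pattern, sequences):
    """Number of customer sequences that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in sequences)


# Customer sequences 1 to 4 from the slide.
CUSTOMER_SEQUENCES = [
    list("aebac"),
    list("accbc"),
    list("bebfac"),
    list("debac"),
]
```

With a minimum support of 3, `support(list("ac"), CUSTOMER_SEQUENCES)` and `support(list("eba"), CUSTOMER_SEQUENCES)` reach the threshold while `support(list("abc"), CUSTOMER_SEQUENCES)` does not, matching the slide's example.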
    15. 15. 2 main problems: <ul><li>Preprocessing step: </li></ul><ul><li>Transform our sequences of events into a list of subsequences. </li></ul><ul><li>Mining step: </li></ul><ul><ul><li>Design an algorithm to find all the frequent subsequences in this list. </li></ul></ul>
    16. 16. Generation of the list of sequences <ul><li>Cutting by activity: </li></ul><ul><ul><li>Ideal situation: each event is linked to an activity. Not available yet. </li></ul></ul><ul><ul><li>Timeout cutting: create a new activity if $t_{i+1} - t_i > \alpha$, with $t_i$ the time of the event i (i.e. events separated by more than α are considered independent of each other) </li></ul></ul><ul><ul><li>Example: </li></ul></ul><ul><ul><ul><li>Cutting the sequence at the points where: </li></ul></ul></ul><ul><ul><li>$t_3 - t_2 > \alpha$, $t_7 - t_6 > \alpha$, $t_9 - t_8 > \alpha$ </li></ul></ul><ul><ul><ul><li>Result: 4 activities extracted </li></ul></ul></ul><table><tr><th>ActivityID</th><th>Sequence</th></tr><tr><td>1</td><td><w,w></td></tr><tr><td>2</td><td><t,s,s,w></td></tr><tr><td>3</td><td><t,t></td></tr><tr><td>4</td><td><s,t></td></tr></table><ul><li>Cutting by resource: </li></ul><ul><ul><li>A sequence for each resource </li></ul></ul><table><tr><th>ResID</th><th>Sequence</th></tr><tr><td>wikiPage1</td><td><w></td></tr><tr><td>wikiPage2</td><td><w,w,w></td></tr><tr><td>Ticket1</td><td><t,t></td></tr><tr><td>Ticket2</td><td><t></td></tr></table>
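Timeout cutting can be sketched as a single pass over the time-ordered events, opening a new activity whenever the gap to the previous event exceeds α. The timestamps and the threshold below are hypothetical, chosen so that the gaps fall after the 2nd, 6th and 8th events as in the slide's example.

```python
def cut_by_timeout(events, alpha):
    """Split time-ordered (time, label) events into activities.

    A new activity starts whenever the gap between two consecutive
    events is strictly greater than `alpha`.
    """
    activities = []
    previous_time = None
    for time, label in events:
        if previous_time is None or time - previous_time > alpha:
            activities.append([])
        activities[-1].append(label)
        previous_time = time
    return activities


# Hypothetical timestamps (arbitrary units) for the ten events of the
# example: w = wiki, t = ticket, s = svn.
EXAMPLE_EVENTS = [
    (0, "w"), (1, "w"),
    (10, "t"), (11, "s"), (12, "s"), (13, "w"),
    (30, "t"), (31, "t"),
    (50, "s"), (51, "t"),
]
```

The choice of α controls the granularity: a small value fragments one working session into several activities, while a large value merges unrelated sessions.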
    17. 17. Generation of the list of sequences <ul><li>Transformation of a sequence of events into a sequence of items. </li></ul><ul><ul><li>Step 1: Define an alphabet (collection of items) </li></ul></ul><ul><ul><li>Example: item with 3 properties: </li></ul></ul><ul><ul><ul><li>Type of event </li></ul></ul></ul><ul><ul><ul><li>Number of occurrences </li></ul></ul></ul><ul><ul><ul><li>Number of distinct authors </li></ul></ul></ul><ul><ul><li>Step 2: Transformation of the sequence of events → sequence of items </li></ul></ul><table><tr><th>ActivityID</th><th>Sequence of events</th></tr><tr><td>1</td><td><w1,w1,w2,t1></td></tr><tr><td>2</td><td><w1,t1,w2,w1></td></tr><tr><td>3</td><td><t1,s1,s1,s3,t2></td></tr><tr><td>4</td><td><w1,w2,t1,t2></td></tr></table><table><tr><th>ActivityID</th><th>Sequence of items</th></tr><tr><td>1</td><td><(3w2),(1t1)></td></tr><tr><td>2</td><td><(1w1),(1t1),(2w2)></td></tr><tr><td>3</td><td><(1t1),(2s1),(1t1)></td></tr><tr><td>4</td><td>…</td></tr></table>
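One way to read the transformation, consistent with activities 1 and 2 in the tables, is that each run of consecutive events of the same type becomes one item (occurrences, type, distinct authors). The sketch below implements that reading; the grouping rule and the `w1`-style parsing are assumptions drawn from the example, not a documented specification.

```python
from itertools import groupby


def parse_event(code):
    """Split a code such as 'w1' into (event_type, author_id)."""
    return code[0], code[1:]


def to_items(event_codes):
    """Collapse runs of same-type events into (count, type, n_authors)."""
    events = [parse_event(code) for code in event_codes]
    items = []
    for event_type, run in groupby(events, key=lambda e: e[0]):
        run = list(run)
        distinct_authors = {author for _, author in run}
        items.append((len(run), event_type, len(distinct_authors)))
    return items


ACTIVITY_1 = ["w1", "w1", "w2", "t1"]
ACTIVITY_2 = ["w1", "t1", "w2", "w1"]
```

Here `to_items(ACTIVITY_1)` gives `[(3, 'w', 2), (1, 't', 1)]`, which corresponds to the item sequence <(3w2),(1t1)>.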
    18. 18. GSP, Generalized Sequential Patterns. <ul><li>Based on the heuristic: </li></ul><ul><ul><li>« if a sequence is frequent, its subsequences are frequent too. » </li></ul></ul><ul><li>Algorithm: </li></ul><ul><ul><li>Step 1: Find all frequent items with a minimum support </li></ul></ul><ul><ul><li>Step k - Generation phase: Generate the candidate k-sequences from the frequent (k-1)-sequences. </li></ul></ul><ul><ul><li>Step k - Support test: </li></ul></ul><ul><ul><li>Test the support of every candidate (delete the non-frequent candidates) </li></ul></ul><ul><ul><li>Go to step k+1 </li></ul></ul>Ex: Minimum support = 3 <table><tr><th>ID</th><th>Sequence</th></tr><tr><td>1</td><td><a,b,c,f,d></td></tr><tr><td>2</td><td><a,b,f,c,d></td></tr><tr><td>3</td><td><a,b,f,c,e,d></td></tr><tr><td>4</td><td><a,b,c,d></td></tr></table> Step 1: Sup(a)=3, Sup(b)=4, Sup(e)=1, … Step k: Generation phase (join phase) <table><tr><th>Frequent 3-sequence</th><th>Candidate 4-sequence</th></tr><tr><td><a,b,c>, <b,c,d></td><td><a,b,c,d></td></tr><tr><td><a,b,f>, <b,f,d></td><td><a,b,f,d></td></tr></table>
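The join phase in the example can be sketched as follows: two frequent (k-1)-sequences are joined when the first one without its first item equals the second one without its last item. This is a simplified sketch for sequences of single items only; it omits the pruning of candidates that contain an infrequent subsequence, and the function name is hypothetical.

```python
def join_candidates(frequent):
    """Generate candidate k-sequences from frequent (k-1)-sequences.

    s1 and s2 are joined when s1 minus its first item equals s2 minus
    its last item; the candidate is s1 extended by the last item of s2.
    """
    frequent = [tuple(s) for s in frequent]
    candidates = set()
    for s1 in frequent:
        for s2 in frequent:
            if s1[1:] == s2[:-1]:
                candidates.add(s1 + (s2[-1],))
    return sorted(candidates)


# Frequent 3-sequences from the slide's example.
FREQUENT_3 = ["abc", "bcd", "abf", "bfd"]
```

Each candidate returned by `join_candidates(FREQUENT_3)` still has to pass the support test before it counts as a frequent 4-sequence.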
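    19. 19. Results - Resources. <ul><li>Possible analysis on: </li></ul><ul><ul><li>The number of authors who interacted on a resource </li></ul></ul><ul><ul><li>The number of times a resource is modified </li></ul></ul><ul><ul><li>The lifetime of a modification </li></ul></ul><table><tr><th>ResID</th><th>Sequence of events</th></tr><tr><td>wikipage1</td><td><w1,w1,w2></td></tr><tr><td>wikipage2</td><td><w1,w2></td></tr><tr><td>Ticket1</td><td><t1,t2></td></tr><tr><td>Ticket2</td><td><t1></td></tr></table><table><tr><th>ResID</th><th>Sequence of items</th></tr><tr><td>wikipage1</td><td><(3,w,2,12)></td></tr><tr><td>wikipage2</td><td><(2,w,2,12)></td></tr><tr><td>Ticket1</td><td><(1,t,1,5)></td></tr><tr><td>Ticket2</td><td>…</td></tr></table><table><tr><th></th><th></th><th></th><th></th><th></th><th>Item</th></tr><tr><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>(9w5,16)</td></tr><tr><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>(9t5,50)</td></tr><tr><td>208</td><td>755</td><td>654</td><td>160</td><td>99</td><td>(1s1,0)</td></tr><tr><td>1</td><td>5</td><td>10</td><td>54</td><td>11</td><td>(1w1,0)</td></tr><tr><td>13</td><td>95</td><td>22</td><td>49</td><td>0</td><td>(2s1,0)</td></tr></table>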
    20. 20. Future works <ul><li>Data </li></ul><ul><ul><li>Problems due to the data: </li></ul></ul><ul><ul><ul><li>Incompleteness of the data: no links among the events </li></ul></ul></ul><ul><ul><ul><li>Bad use of the tool (svn) </li></ul></ul></ul><ul><ul><li>Add metadata to link the events (e.g. ticketID in the SVN commit log) </li></ul></ul><ul><ul><li>Analysis with data from some professional projects. </li></ul></ul><ul><li>Research on an algorithm handling noise in the data: </li></ul><ul><ul><li>Add a notion of similarity: <a,b,c,d,e> is similar to <a,b,c,d,e,f> or <a,b,c,e,e> </li></ul></ul><ul><ul><li>Find a heuristic to avoid a combinatorial explosion. </li></ul></ul><ul><li>Classification of the user/group. </li></ul><ul><ul><li>Use the participation and interaction weights to cluster the users/groups: </li></ul></ul><ul><ul><li>e.g. User={Part(svn)=100, Part(wiki)=1, Part(ticket)=0} = developer role </li></ul></ul>
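The slides leave the notion of similarity open. One possible candidate, offered here purely as an illustration and not as the project's choice, is the Levenshtein edit distance: both example sequences are one edit away from <a,b,c,d,e> (one insertion, one substitution).

```python
def edit_distance(s, t):
    """Levenshtein distance between two sequences of items: the minimum
    number of insertions, deletions and substitutions turning s into t."""
    previous = list(range(len(t) + 1))
    for i, x in enumerate(s, start=1):
        current = [i]
        for j, y in enumerate(t, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute x by y (or match)
            ))
        previous = current
    return previous[-1]


def is_similar(s, t, max_edits=1):
    """Hypothetical similarity test: at most `max_edits` edits apart."""
    return edit_distance(s, t) <= max_edits


REFERENCE = list("abcde")
```

Counting support with such a tolerance enlarges the set of matching sequences, so the threshold `max_edits` would need to stay small to limit the combinatorial explosion the slide warns about.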