SCAM 2012 Keynote Slides on Cooperative Testing and Analysis by Tao Xie

Cooperative Testing and Analysis: Human-Tool, Tool-Tool, and Human-Human Cooperations to Get Work Done.

Published in: Technology

  1. 1. SCAM 2012. Human-Tool, Tool-Tool, and Human-Human Cooperations to Get Work Done. Tao Xie, North Carolina State University, Raleigh, NC, USA
  2. 2. Don’t forget humans: using your tools as end-to-end solutions; helping your tools. Don’t forget tool-human and human-human cooperation: humans can help your tools too; humans can work together to help your tools, e.g., crowdsourcing.
  3. 3.
  4. 4. SCAM 20xx will have a Human Papers track and invites human papers of six pages length. Please consider submitting a human paper outlining the details of your human and your experience in building such human. The human could be in any area under the umbrella of SCAM. … Some details are follows: • It aptly and accurately describes a human, • Motivates its existence well (and relates it to what came before), • Describes experience gained in developing the human. • Does not necessarily compare with other humans (this is not a research paper) …
  5. 5. Chinese Animal Zodiac. Image source: http://completechinese.blogspot.it/2012/02/ben-ming-nian-your-own-chinese-zodiac.html
  6. 6. What is changing? What is still the same?
  7. 7. http://worditout.com/word-cloud/113606
  8. 8. http://worditout.com/word-cloud/113605
  9. 9. http://worditout.com/word-cloud/113608
  10. 10. http://worditout.com/word-cloud/113609
  11. 11. Human (SCAM Chairs): cluster papers and label them with session titles. Tool: visualize the words in the session titles. Human (we): interpret the word popularity. What is the tool good/better at? What is the human good/better at?
  12. 12. "Completely Automated Public Turing test to tell Computers and Humans Apart"
  13. 13. Don’t forget humans: using your tools as end-to-end solutions; helping your tools. Don’t forget tool-human and human-human cooperation: humans can help your tools too; humans can work together to help your tools, e.g., crowdsourcing.
  14. 14. http://worditout.com/word-cloud/113609
  15. 15. http://worditout.com/word-cloud/113609
  16. 16. “During the past 21 years, over 75 papers and 9 Ph.D. theses have been published on pointer analysis. Given the tomes of work on this topic one may wonder, “Haven’t we solved this problem yet?” With input from many researchers in the field, this paper describes issues related to pointer analysis and remaining open problems.” Michael Hind. Pointer analysis: haven’t we solved this problem yet? In Proc. ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE 2001)
  17. 17. Section 4.3, “Designing an Analysis for a Client’s Needs”. Barbara Ryder expands on this topic: “… We can all write an unbounded number of papers that compare different pointer analysis approximations in the abstract. However, this does not accomplish the key goal, which is to design and engineer pointer analyses that are useful for solving real software problems for realistic programs.”
  18. 18. http://worditout.com/word-cloud/113609
  19. 19. Zhenmin Li, Shan Lu, Suvda Myagmar, and Yuanyuan Zhou. CP-Miner: a tool for finding copy-paste and related bugs in operating system code. In Proc. OSDI 2004.
  20. 20. Finding refactoring opportunities: available in Visual Studio 2012. Searching similar snippets for fixing a bug once: XIAO Code Clone Search service integrated into the workflow of Microsoft Security Response Center (MSRC). Microsoft Technet Blog about XIAO: “We wanted to be sure to address the vulnerable code wherever it appeared across the Microsoft code base. To that end, we have been working with Microsoft Research to develop a “Cloned Code Detection” system that we can run for every MSRC case to find any instance of the vulnerable code in any shipping product. This system is the one that found several of the copies of CVE-2011-3402 that we are now addressing with MS12-034.” Yingnong Dang, Dongmei Zhang, Song Ge, Yingjun Qiu, and Tao Xie. XIAO: Tuning Code Clones at Hands of Engineers in Practice. In Proc. Annual Computer Security Applications Conference (ACSAC 2012)
  21. 21. XIAO enables code clone analysis with high scalability, high compatibility, high tunability (what you tune is what you get), and high explorability. Figure callouts, first list: 1. Clone navigation based on source tree hierarchy; 2. Pivoting of folder level statistics; 3. Folder level statistics; 4. Clone function list in selected folder; 5. Clone function filters; 6. Sorting by bug or refactoring potential; 7. Tagging. Figure callouts, second list: 1. Block correspondence; 2. Block types; 3. Block navigation; 4. Copying; 5. Bug filing; 6. Tagging. Yingnong Dang, Dongmei Zhang, Song Ge, Yingjun Qiu, and Tao Xie. XIAO: Tuning Code Clones at Hands of Engineers in Practice. In Proc. Annual Computer Security Applications Conference (ACSAC 2012)
  22. 22. 50 years of automated debugging research; N papers; only 5 evaluated with actual programmers. Chris Parnin and Alessandro Orso. Are automated debugging techniques actually helping programmers? In Proc. ISSTA 2011
  23. 23. ICSM 2011 Keynote: Lionel Briand. “Useful Software Engineering Research: Leading a Double-Agent Life”. Andreas Zeller, Thomas Zimmermann, Christian Bird. Failure is a four-letter word: a parody in empirical research. … In Proc. PROMISE 2011
  24. 24. http://www.dagstuhl.de/programm/kalender/semhp/?semnr=1011 2010 Dagstuhl Seminar 10111: Practical Software Testing: Tool Automation and Human Factors
  25. 25. Human Factors. http://www.dagstuhl.de/programm/kalender/semhp/?semnr=1011 2010 Dagstuhl Seminar 10111: Practical Software Testing: Tool Automation and Human Factors
  26. 26.
  27. 27. Andy Ko and Brad Myers. Debugging Reinvented: Asking and Answering Why and Why Not Questions about Program Behavior. In Proc. ICSE 2008
  28. 28. Don’t forget humans: using your tools as end-to-end solutions; helping your tools. Don’t forget tool-human and human-human cooperation: humans can help your tools too; humans can work together to help your tools, e.g., crowdsourcing.
  29. 29. Motivation: architecture recovery is challenging (abstraction gap); humans typically have a high-level view in mind. Repeat: (1) Human: define/update high-level model of interest; (2) Tool: extract a source model; (3) Human: define/update declarative mapping between high-level model and source model; (4) Tool: compute a software reflexion model; (5) Human: interpret the software reflexion model. Until happy. Gail C. Murphy, David Notkin. Reengineering with Reflexion Models: A Case Study. IEEE Computer 1997
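The "Tool: compute a software reflexion model" step on slide 29 can be sketched as follows. This is a minimal illustration, not Murphy and Notkin's tooling: the class name `ReflexionSketch`, the edge representation, and the choice to ignore edges that stay inside one high-level entity are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (assumption): classifies edges the way a reflexion
// model does, given the three artifacts the human and the tool produce.
public static class ReflexionSketch
{
    public static Dictionary<(string From, string To), string> Compute(
        IEnumerable<(string From, string To)> highLevelModel,   // drawn by the human
        IEnumerable<(string From, string To)> sourceModel,      // extracted by the tool
        IReadOnlyDictionary<string, string> mapping)            // source entity -> high-level entity
    {
        var expected = new HashSet<(string From, string To)>(highLevelModel);
        var lifted = new HashSet<(string From, string To)>();

        foreach (var (src, dst) in sourceModel)
        {
            // Source entities the human has not mapped yet are skipped.
            // Edges inside a single high-level entity are ignored (simplification).
            if (mapping.TryGetValue(src, out var hSrc) &&
                mapping.TryGetValue(dst, out var hDst) &&
                hSrc != hDst)
            {
                lifted.Add((hSrc, hDst));
            }
        }

        var result = new Dictionary<(string From, string To), string>();
        foreach (var edge in expected)
        {
            // Expected and observed: convergence. Expected but not observed: absence.
            result[edge] = lifted.Contains(edge) ? "convergence" : "absence";
        }
        foreach (var edge in lifted)
        {
            // Observed but not expected: divergence.
            if (!expected.Contains(edge)) result[edge] = "divergence";
        }
        return result;
    }
}
```

The human then reads the divergences and absences, adjusts the high-level model or the mapping, and reruns the computation, which is the repeat-until-happy loop on the slide.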
  30. 30. Recent advanced technique: Dynamic Symbolic Execution/Concolic Testing. Instrument code to explore feasible paths. Example tool: Pex from Microsoft Research (for .NET programs). Patrice Godefroid, Nils Klarlund, and Koushik Sen. DART: directed automated random testing. In Proc. PLDI 2005. Koushik Sen, Darko Marinov, and Gul Agha. CUTE: a concolic unit testing engine for C. In Proc. ESEC/FSE 2005. Nikolai Tillmann and Jonathan de Halleux. Pex - White Box Test Generation for .NET. In Proc. TAP 2008
  31. 31. Loop: Choose next path → Solve → Execute&Monitor. Code to generate inputs for:
      void CoverMe(int[] a)
      {
          if (a == null) return;
          if (a.Length > 0)
              if (a[0] == 1234567890)
                  throw new Exception("bug");
      }
      Constraints to solve | Data | Observed constraints
      (initial run) | null | a==null
      a!=null | {} | a!=null && !(a.Length>0)
      a!=null && a.Length>0 | {0} | a!=null && a.Length>0 && a[0]!=1234567890
      a!=null && a.Length>0 && a[0]==1234567890 | {123…} | a!=null && a.Length>0 && a[0]==1234567890
      Negated condition: each constraint to solve negates the last condition observed on an earlier path. Branch tree: a==null (F/T), a.Length>0 (F/T), a[0]==123… (F/T).
      Done: There is no path left.
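The table on slide 31 can be replayed by hand. The sketch below does not implement dynamic symbolic execution or constraint solving; it only runs `CoverMe` on the four inputs listed in the Data column and reports which path each one takes. `PathOf` and `Throws` are hypothetical helpers added for the illustration.

```csharp
using System;

public static class CoverMeReplay
{
    // The method from the slide, with the same behavior.
    public static void CoverMe(int[] a)
    {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 1234567890)
                throw new Exception("bug");
    }

    // Hypothetical helper: names the path condition a concrete input satisfies,
    // mirroring the "Observed constraints" column.
    public static string PathOf(int[] a)
    {
        if (a == null) return "a==null";
        if (!(a.Length > 0)) return "a!=null && !(a.Length>0)";
        if (a[0] != 1234567890) return "a!=null && a.Length>0 && a[0]!=1234567890";
        return "a!=null && a.Length>0 && a[0]==1234567890";
    }

    // Hypothetical helper: true when the input reaches the throw statement.
    public static bool Throws(int[] a)
    {
        try { CoverMe(a); return false; }
        catch (Exception) { return true; }
    }
}
```

Only the fourth input, an array whose first element is 1234567890, reaches the exception. That value is unlikely to be hit by random inputs, which is why the slide has a solver derive it from the negated condition.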
  32. 32. @NCSU ASE. Method sequences: MSeqGen/Seeker [Thummalapenta et al. OOPSLA 11, ESEC/FSE 09], Covana [Xiao et al. ICSE 2011], OCAT [Jaygarl et al. ISSTA 10], Evacon [Inkumsah et al. ASE 08], Symclat [d’Amorim et al. ASE 06]. Environments (e.g., db, file systems, network, …): DBApp Testing [Taneja et al. ESEC/FSE 11], [Pan et al. ASE 11]; CloudApp Testing [Zhang et al. IEEE Soft 12]. Loops: Fitnex [Xie et al. DSN 09]. Code evolution: eXpress [Taneja et al. ISSTA 11]
  33. 33. Download counts (20 months, Feb. 2008 - Oct. 2009): Academic: 17,366; Devlabs: 13,022; Total: 30,388. http://research.microsoft.com/projects/pex/
  34. 34. http://pexase.codeplex.com/ Publications: http://research.microsoft.com/en-us/projects/pex/community.aspx#publications
  35. 35. Running Symbolic PathFinder ...
      …
      ====================================================== results
      no errors detected
      …
      ====================================================== statistics
      elapsed time: 0:00:02
      states: new=4, visited=0, backtracked=4, end=2
      search: maxDepth=3, constraints=0
      choice generators: thread=1, data=2
      heap: gc=3, new=271, free=22
      instructions: 2875
      max memory: 81MB
      loaded code: classes=71, methods=884
  36. 36. Example: Dynamic Symbolic Execution/Concolic Testing. Instrument code to explore feasible paths. Challenge: path explosion. Total block coverage achieved is 50%, lowest coverage 16%. Object-creation problems (OCP): 65%; external-method call problems (EMCP): 27%.
  37. 37. A graph example from the QuickGraph library [Thummalapenta et al. OOPSLA 11]. Includes two classes: Graph and DFSAlgorithm. Graph has AddVertex and AddEdge; AddEdge requires both vertices to be in the graph.
      00: class Graph : IVEListGraph { …
      03:   public void AddVertex (IVertex v) {
      04:     vertices.Add(v); // B1
            }
      06:   public Edge AddEdge (IVertex v1, IVertex v2) {
      07:     if (!vertices.Contains(v1))
      08:       throw new VNotFoundException("");
      09:     // B2
      10:     if (!vertices.Contains(v2))
      11:       throw new VNotFoundException("");
      12:     // B3
      14:     Edge e = new Edge(v1, v2);
      15:     edges.Add(e); } }
          //DFS:DepthFirstSearch
      18: class DFSAlgorithm { …
      23:   public void Compute (IVertex s) { ...
      24:     if (graph.GetEdges().Size() > 0) { // B4
      25:       isComputed = true;
      26:       foreach (Edge e in graph.GetEdges()) {
      27:         ... // B5
      28:       }
      29:     } } }
  38. 38. Same Graph and DFSAlgorithm code as on slide 37 [Thummalapenta et al. OOPSLA 11]. Test target: cover the true branch (B4) of Line 24. Desired object state: graph should include at least one edge. Target sequence:
      Graph ag = new Graph();
      Vertex v1 = new Vertex(0);
      Vertex v2 = new Vertex(1);
      ag.AddVertex(v1);
      ag.AddVertex(v2);
      ag.AddEdge(v1, v2);
      DFSAlgorithm algo = new DFSAlgorithm(ag);
      algo.Compute(v1);
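To make the target sequence on slide 38 concrete, here is a self-contained sketch with simplified stand-ins for the QuickGraph types. The stub classes, the `IsComputed` property, the `return e;` in `AddEdge`, and the use of `InvalidOperationException` in place of `VNotFoundException` are assumptions for illustration; the real library code is elided on the slide and is not reproduced here.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-ins (assumptions), not the QuickGraph implementation.
public sealed class Vertex
{
    public int Id { get; }
    public Vertex(int id) { Id = id; }
}

public sealed class Edge
{
    public Vertex Source { get; }
    public Vertex Target { get; }
    public Edge(Vertex source, Vertex target) { Source = source; Target = target; }
}

public sealed class Graph
{
    private readonly List<Vertex> vertices = new List<Vertex>();
    private readonly List<Edge> edges = new List<Edge>();

    public void AddVertex(Vertex v) { vertices.Add(v); }            // B1

    public Edge AddEdge(Vertex v1, Vertex v2)
    {
        if (!vertices.Contains(v1)) throw new InvalidOperationException("v1 not in graph");
        if (!vertices.Contains(v2)) throw new InvalidOperationException("v2 not in graph");
        var e = new Edge(v1, v2);
        edges.Add(e);
        return e;
    }

    public IReadOnlyList<Edge> GetEdges() { return edges; }
}

public sealed class DFSAlgorithm
{
    private readonly Graph graph;
    public bool IsComputed { get; private set; }
    public DFSAlgorithm(Graph graph) { this.graph = graph; }

    public void Compute(Vertex s)
    {
        if (graph.GetEdges().Count > 0)                              // B4
        {
            IsComputed = true;
        }
    }
}

public static class TargetSequence
{
    // The sequence from the slide: both vertices are added before the edge.
    public static bool CoversB4()
    {
        Graph ag = new Graph();
        Vertex v1 = new Vertex(0);
        Vertex v2 = new Vertex(1);
        ag.AddVertex(v1);
        ag.AddVertex(v2);
        ag.AddEdge(v1, v2);
        DFSAlgorithm algo = new DFSAlgorithm(ag);
        algo.Compute(v1);
        return algo.IsComputed;
    }

    // A shorter sequence that skips AddVertex fails inside AddEdge.
    public static bool ShortSequenceFails()
    {
        try { new Graph().AddEdge(new Vertex(0), new Vertex(1)); return false; }
        catch (InvalidOperationException) { return true; }
    }
}
```

The point of the example is ordering: B4 is reachable only after `AddVertex` has been called for both endpoints and `AddEdge` has succeeded, which is the kind of method sequence a test generator has to discover.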
  39. 39. Example: Dynamic Symbolic Execution/Concolic Testing (Pex). Instrument code to explore feasible paths. Challenge: path explosion. Total block coverage achieved is 50%, lowest coverage 16%. Object-creation problems (OCP): 65%; external-method call problems (EMCP): 27%.
  40. 40. Example 1: File.Exists has data dependencies on program input; a subsequent branch at Line 1 uses the return value of File.Exists. Example 2: Path.GetFullPath has data dependencies on program input; Path.GetFullPath throws exceptions. Example 3: String.Format does not cause any problem.
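The code that slide 40 annotated did not survive in this transcript. The sketch below is a hypothetical reconstruction of the three situations it names, written only to show the pattern; the class and method names are invented for the example.

```csharp
using System;
using System.IO;

// Hypothetical methods (assumptions) illustrating the three cases.
public static class ExternalCallExamples
{
    // Case 1: the branch depends on the return value of File.Exists, whose
    // argument comes from program input. A test generator that cannot see
    // inside File.Exists cannot steer this branch.
    public static string Describe(string path)
    {
        if (File.Exists(path)) return "found";
        return "missing";
    }

    // Case 2: Path.GetFullPath can throw for some inputs (for example an
    // empty string), which cuts the execution short.
    public static string Normalize(string path)
    {
        try { return Path.GetFullPath(path); }
        catch (ArgumentException) { return "invalid"; }
    }

    // Case 3: String.Format is also an external call, but no branch depends
    // on its result, so it does not hinder coverage.
    public static string Label(int id)
    {
        return String.Format("item-{0}", id);
    }
}
```

Only the first two cases are worth reporting to the user; the third is an external call that can be safely ignored.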
  41. 41. Tackle object-creation problems with Factory Methods
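A factory method gives the test generator a way to build a valid object from primitive arguments it can already solve for. The sketch below is plain C# with invented names (`MiniGraph`, `MiniGraphFactory`); it deliberately leaves out any Pex-specific attributes so that it compiles without the Pex libraries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class under test (assumption): edges are legal only between
// vertices that were added first, as in the graph example on slide 37.
public sealed class MiniGraph
{
    private readonly HashSet<int> vertices = new HashSet<int>();
    private readonly List<(int From, int To)> edges = new List<(int From, int To)>();

    public void AddVertex(int v) { vertices.Add(v); }

    public void AddEdge(int from, int to)
    {
        if (!vertices.Contains(from) || !vertices.Contains(to))
            throw new InvalidOperationException("both vertices must be in the graph");
        edges.Add((from, to));
    }

    public int EdgeCount { get { return edges.Count; } }
}

public static class MiniGraphFactory
{
    // Hypothetical factory: the caller picks two integers and always gets a
    // well-formed graph whose edges form the chain 0 -> 1 -> 2 -> ...
    public static MiniGraph Create(int vertexCount, int edgeCount)
    {
        int n = Math.Max(vertexCount, 0);
        var graph = new MiniGraph();
        for (int v = 0; v < n; v++) graph.AddVertex(v);

        // Clamp so every edge connects vertices that were added above.
        int maxEdges = Math.Max(0, n - 1);
        int m = Math.Min(Math.Max(edgeCount, 0), maxEdges);
        for (int i = 0; i < m; i++) graph.AddEdge(i, i + 1);
        return graph;
    }
}
```

With such a factory, reaching a state like "graph has at least one edge" reduces to choosing `vertexCount >= 2` and `edgeCount >= 1`, which is a constraint over integers rather than a search over method sequences.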
  42. 42. Tackle external-method call problems with Mock Methods or Method Instrumentation. Example: mocking System.IO.File.ReadAllText
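One way to picture mocking `System.IO.File.ReadAllText` is to route the call through a replaceable delegate. This is a sketch using ordinary delegate injection, which is a swapped-in technique and not necessarily the mechanism shown on the slide; `ConfigReader` and `CountLines` are hypothetical names.

```csharp
using System;
using System.IO;

public static class ConfigReader
{
    // The file read is passed in, so a test (or a test generator) can supply
    // the file contents directly instead of touching the file system.
    public static int CountLines(string path, Func<string, string> readAllText)
    {
        string text = readAllText(path);
        if (text.Length == 0) return 0;
        return text.Split('\n').Length;
    }

    // Production entry point: uses the real System.IO.File.ReadAllText.
    public static int CountLines(string path)
    {
        return CountLines(path, File.ReadAllText);
    }
}
```

A test can then call `ConfigReader.CountLines("ignored.txt", _ => "a\nb\nc")` and cover both branches without creating any files, because the mock turns the file contents into an ordinary input.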
  43. 43. Tools Typically Don’t Communicate Challenges Faced by Them to Enable Cooperation between Tools and Users
      Running Symbolic PathFinder ...
      …
      ====================================================== results
      no errors detected
      …
      ====================================================== statistics
      elapsed time: 0:00:02
      states: new=4, visited=0, backtracked=4, end=2
      search: maxDepth=3, constraints=0
      choice generators: thread=1, data=2
      heap: gc=3, new=271, free=22
      instructions: 2875
      max memory: 81MB
      loaded code: classes=71, methods=884
  44. 44. Machine is better at task set A: mechanical, tedious, repetitive tasks, … Ex. solving constraints along a long path. Human is better at task set B: intelligence, human intent, abstraction, domain knowledge, … Ex. local reasoning after a loop, recognizing naming semantics. = A U B
  45. 45. Human-Assisted Computing: driver: tool; helper: human; ex. Covana [Xiao et al. ICSE 2011]. Human-Centric Computing: driver: human; helper: tool; ex. Coding duels @Pex for Fun. Interfaces are important. Contents are important too!
  46. 46. Motivation: tools are often not powerful enough; humans are good at some aspects that tools are not. What difficulties does the tool face? How to communicate info to the user to get help? How does the user help the tool based on the info? Iterations to form a feedback loop.
  47. 47. Motivation: tools are often not powerful enough; humans are good at some aspects that tools are not. What difficulties does the tool face? How to communicate info to the user to get help? How does the user help the tool based on the info? Iterations to form a feedback loop.
  48. 48. External-method call problems (EMCP); object-creation problems (OCP)
  49. 49. Existing solution: identify all executed external-method calls; report all object types of program inputs and fields. Limitations: the number is often high; some identified problems are irrelevant for achieving higher structural coverage.
  50. 50. Reported EMCPs: 44, reported OCPs: 18 vs. real EMCPs: 0, real OCPs: 5
  51. 51. [Xiao et al. ICSE 11] Goal: precisely identify problems faced by tools when achieving structural coverage. Insight: partially-covered statements have data dependency on real problem candidates. Xusheng Xiao, Tao Xie, Nikolai Tillmann, and Jonathan de Halleux. Precise Identification of Problems for Structural Test Generation. In Proc. ICSE 2011
  52. 52. Figure: Covana overview. Inputs: program, generated test inputs. Components: forward symbolic execution, problem candidate identification, data dependence analysis. Intermediate data: runtime events, problem candidates, runtime coverage information. Output: identified problems.
  53. 53. External-method calls whose arguments have data dependencies on program inputs (figure label: Data Dependencies)
  54. 54. Symbolic expression: return(File.Exists) == true. Element of EMCP candidate: return(File.Exists). The branch statement at Line 1 has a data dependency on File.Exists at Line 1. Partially-covered branch statements have data dependencies on EMCP candidates for return values.
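The pruning idea on slides 51 to 54 (keep only the candidates that a partially-covered branch depends on) can be sketched with plain sets. This is a simplified illustration, not Covana's implementation: the dependency information is assumed to be given as a dictionary, and all names here are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CandidatePruning
{
    // candidates:         problem candidates, e.g. external-method calls whose
    //                     arguments depend on program inputs
    // branchDependencies: for each branch statement, the candidates its
    //                     condition has a data dependency on
    // partiallyCovered:   branch statements with only one side covered
    public static ISet<string> RealProblems(
        IEnumerable<string> candidates,
        IReadOnlyDictionary<string, IReadOnlyCollection<string>> branchDependencies,
        IEnumerable<string> partiallyCovered)
    {
        var relevant = new HashSet<string>();
        foreach (var branch in partiallyCovered)
        {
            if (branchDependencies.TryGetValue(branch, out var deps))
                relevant.UnionWith(deps);
        }
        // Anything that is not a known candidate is dropped.
        relevant.IntersectWith(candidates);
        return relevant;
    }
}
```

In the slide's example, the branch at Line 1 is partially covered and depends on the return value of `File.Exists`, so `File.Exists` survives pruning, while a call such as `String.Format`, on which no partially-covered branch depends, is discarded.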
  55. 55. Subjects: xUnit, a unit testing framework for .NET (223 classes and interfaces with 11.4 KLOC); QuickGraph, a C# graph library (165 classes and interfaces with 8.3 KLOC). Evaluation setup: apply Pex to generate tests for the program under test; feed the program and generated tests to Covana; compare the existing solution and Covana.
  56. 56. RQ1: How effective is Covana in identifying the two main types of problems, EMCPs and OCPs? RQ2: How effective is Covana in pruning irrelevant problem candidates of EMCPs and OCPs?
  57. 57. Covana identifies • 43 EMCPs with only 1 false positive and 2 false negatives • 155 OCPs with 20 false positives and 30 false negatives.
  58. 58. Covana prunes • 97% (1567 in 1610) EMCP candidates with 1 false positive and 2 false negatives • 66% (296 in 451) OCP candidates with 20 false positives and 30 false negatives
  59. 59. [Xiao et al. ICSE 2011] Task: what needs to be automated? Test-input generation. What difficulties does the tool face? It doesn’t know which methods to instrument and explore; it doesn’t know how to generate effective method sequences. How to communicate info to the user to get her help? Report encountered problems. How does the user help the tool based on the info? Instruct which external methods to instrument/write mock objects for; write factory methods for generating objects. Iterations to form feedback loop? Yes, till the user is happy with coverage or impatient.
  60. 60. Human-Assisted Computing: driver: tool; helper: human; ex. Covana [Xiao et al. ICSE 2011]. Human-Centric Computing: driver: human; helper: tool; ex. Coding duels @Pex for Fun. Interfaces are important. Contents are important too!
  61. 61. www.pexforfun.com: 986,904 clicked “Ask Pex!”. The contributed concept of Coding Duel games has been the major game type of Pex for Fun since Summer 2010.
  62. 62. Behavior: Secret Impl == Player Impl
      Secret Implementation:
      class Secret {
        public static int Puzzle(int x) {
          if (x <= 0) return 1;
          return x * Puzzle(x-1);
        }
      }
      Player Implementation:
      class Player {
        public static int Puzzle(int x) {
          return x;
        }
      }
      class Test {
        public static void Driver(int x) {
          if (Secret.Puzzle(x) != Player.Puzzle(x))
            throw new Exception("Mismatch");
        }
      }
  63. 63. Coding duels at http://www.pexforfun.com/. Task for human: write behavior-equivalent code.
      class Player {
        public static int Puzzle(int x) {
          return x;
        }
      }
      Human → Tool: Does my new code behave differently? How exactly? Tool → Human: Could you fix your code to handle failed/passed tests? Iterations to form feedback loop? Yes, till the tool generates no failed tests or the player is impatient.
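The duel loop on slides 62 and 63 can be imitated without Pex. In the sketch below, a fixed input range stands in for the inputs Pex would generate, which is a deliberate simplification; `Duel.Mismatches` is a hypothetical helper, and the iterative `Player.Puzzle` is just one possible winning answer.

```csharp
using System;
using System.Collections.Generic;

public static class Secret
{
    // The secret implementation from the slide.
    public static int Puzzle(int x)
    {
        if (x <= 0) return 1;
        return x * Puzzle(x - 1);
    }
}

public static class Player
{
    // One behavior-equivalent answer: the same product, computed iteratively.
    public static int Puzzle(int x)
    {
        int result = 1;
        for (int i = 2; i <= x; i++) result *= i;
        return result;
    }
}

public static class Duel
{
    // Stand-in for the tool's side of the loop (assumption): tries every
    // input in [from, to] and returns those on which the player's code
    // disagrees with the secret.
    public static List<int> Mismatches(Func<int, int> player, int from, int to)
    {
        var failing = new List<int>();
        for (int x = from; x <= to; x++)
        {
            if (Secret.Puzzle(x) != player(x)) failing.Add(x);
        }
        return failing;
    }
}
```

Starting from the initial player code `x => x`, the failing inputs in the range -2 to 3 are -2, -1, 0, and 3. Those mismatches are the feedback the player uses to generalize toward the factorial-like behavior, and the duel is won once the list is empty.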
  64. 64. Coding duels at http://www.pexforfun.com/. Brain exercising/learning while having fun. Fun: iterative, adaptive/personalized, with a win criterion. Brain exercising: abstraction/generalization, debugging, problem solving.
  65. 65. Especially valuable in Massive Open Online Courses (MOOC)
  66. 66. Everyone on the Internet can contribute: coding duels; duel solutions.
      class Secret {
        public static int Puzzle(int x) {
          if (x <= 0) return 1;
          return x * Puzzle(x-1);
        }
      }
  67. 67. Puzzle Games Made from Difficult Constraints or Object-Creation Problems (played over the Internet). Ning Chen and Sunghun Kim. Puzzle-based Automatic Testing: bringing humans into the loop by solving puzzles. In Proc. ASE 2012. Supported by MSR SEIF Award
  68. 68. StackMine [Han et al. ICSE 12]. Figure labels: Internet, Trace collection, Trace Storage, Trace analysis, Pattern Matching, Problematic Pattern Repository, Bug Database, Bug filing, Bug update. Shi Han, Yingnong Dang, Song Ge, Dongmei Zhang, and Tao Xie. Performance Debugging in the Large via Mining Millions of Stack Traces. In Proc. ICSE 2012
  69. 69. “We believe that the MSRA tool is highly valuable and much more efficient for mass trace (100+ traces) analysis. For 1000 traces, we believe the tool saves us 4-6 weeks of time to create new signatures, which is quite a significant productivity boost.” - from Development Manager in Windows. Highly effective new issue discovery on Windows mini-hang. Continuous impact on future Windows versions. Shi Han, Yingnong Dang, Song Ge, Dongmei Zhang, and Tao Xie. Performance Debugging in the Large via Mining Millions of Stack Traces. In Proc. ICSE 2012
  70. 70. Static analysis + dynamic analysis: static checking + test generation, … Dynamic analysis + static analysis: fix generation + fix validation, … Static analysis + static analysis: … Dynamic analysis + dynamic analysis: … Example: Xiaoyin Wang, Lu Zhang, Tao Xie, Yingfei Xiong, and Hong Mei. Automating Presentation Changes in Dynamic Web Applications via Collaborative Hybrid Analysis. In Proc. FSE 2012
  71. 71. Don’t forget humans: using your tools as end-to-end solutions; helping your tools. Don’t forget tool-human and human-human cooperation: humans can help your tools too; humans can work together to help your tools, e.g., crowdsourcing.
  72. 72. Human-Assisted Computing: Covana. Human-Centric Computing: Pex for Fun. Human-Human Cooperation: StackMine.
  73. 73. Wonderful current/former students @NCSU ASE. Collaborators, especially those from Microsoft Research Redmond/Asia, Peking University. Colleagues who gave feedback and inspired me. NSF grants CCF-0845272, CCF-0915400, CNS-0958235, ARO grant W911NF-08-1-0443, an NSA Science of Security Lablet grant, a NIST grant, a 2011 Microsoft Research SEIF Award
  74. 74. https://sites.google.com/site/slesesymposium/
  75. 75. Questions? https://sites.google.com/site/asergrp/
