Towards Mining Software Repositories Research that Matters

Towards Mining Software Repositories Research that Matters. Talk slides at Next Generation of Mining Software Repositories '14 (Pre-FSE 2014 Event), Nov 15–16. HKUST, Hong Kong http://ng2014.msrworld.org/

  1. Towards Mining Software Repositories Research that Matters. Tao Xie, Department of Computer Science, University of Illinois at Urbana-Champaign, USA. taoxie@illinois.edu
  2. Machine Learning that Matters
     “The basic argument in her paper is that machine learning might be in danger of losing its impact because the community as a whole has become quite self-referential. People are probably solving real-world problems using ML methods, but there is little sharing of these results within the community. Instead, people focus on existing benchmarks which might have originally had some connection to real-world problems which has been long forgotten, however.”
     “She proposes a number of tasks like $100M solved through ML based decision making or a human life saved through a diagnosis or an intervention recommended by an ML system to get ML back on track.”
     ICML’12 http://icml.cc/2012/papers/298.pdf
     http://blog.mikiobraun.de/2012/06/is-machine-learning-losing-impact.html
  3. Redwine and Riddle Study (1985)
     • From idea to “the point it can be popularized and disseminated to the technical community at large”
       – Worst case: 23 years
       – Best case: 11 years
       – Mean: 17 years
     • 7.5 years from developed technology to wide availability
     Source: © S. L. Pfleeger. Sam Redwine Jr., William Riddle: Software Technology Maturation. In Proc. ICSE 1985.
  4. Technology Maturation: Middleware
     15-20 years between first publication of an idea and widespread availability in products
     Source: © A. Wolf, http://www.sigsoft.org/impact/docs/ImpactWolfBCS2008.pdf
  5. Technology Maturation: Middleware
     15-20 years between first publication of an idea and widespread availability in products
     Source: © A. Wolf, http://www.sigsoft.org/impact/docs/ImpactWolfBCS2008.pdf
     Shall we just stay in our comfort zone and wait 15-20 years for our research to have (or not have) practice impact? How about the research that we did 15-20 years ago?
     [Caveat: don’t forget the need for long-term/blue-sky research!]
  6. 2012 NSF Workshop on Formal Methods
     • Goal: to identify the future directions in research in formal methods and its transition to industrial practice.
     • Success examples mentioned by the attendees
       – SLAM/SDV
       – ASTRÉE
       – SMT-based tools
       – …
     http://goto.ucsd.edu/~rjhala/NSFWorkshop/
  7. “What Happened to the Promise of Software Tools?” – Jim Larus
     http://www.srl.inf.ethz.ch/workshop2014/eth-larus.pdf
     https://www.youtube.com/watch?v=kO9OYnkeRTM
  8. Impacts, Impacts, Impacts, … Image source: http://engage.synecoretech.com/marketing-technology-for-growth/bid/155279/How-Online-Content-Impacts-Your-Social-Media-Marketing-Strategy
  9. Research Impacts: 99319, 22786, 32987
  10. Research Impacts: SIGSOFT Impact Paper Awards, ICSE MIP awards, …
  11. Practice Impacts: ACM Software System Awards, 31 Awardees. http://awards.acm.org/software_system/
  12. Practice Impacts: ACM Software System Awards
      • Development Environments/Tools
        – 2013: Coq
        – 2012: LLVM
        – 2011: Eclipse
        – 2007: Statemate
        – 2006: Eiffel
        – 2005: The Boyer-Moore Theorem Prover (ACL2)
        – 2003: MAKE
        – 2001: SPIN
        – 1992: Interlisp
      • Languages
        – 2002: Java
        – 1998: The S System (R statistical analysis)
        – 1997: Tcl/Tk
        – 1987: SMALLTALK
  13. 2012: LLVM, born at Illinois
      • The openness of the LLVM technology and the quality of its architecture and engineering design are key factors in understanding the success it has had both in academia and industry.
      Vikram Adve, Chris Lattner, Evan Cheng. http://llvm.org/
  14. Practice Impacts: commercialization/industrial adoption: SAGE, ASTRÉE, Statechart, SPIN, Moles, Microsoft Research, …
  15. Practice Impacts: research publications → industrial adoption done by others
      • ICSE 00 Daikon paper by Ernst et al. → Agitar Agitator
        – https://homes.cs.washington.edu/~mernst/pubs/invariants-relevance-icse2000.pdf
      • ASE 04 Rostra paper by Xie et al. → Parasoft Jtest improvement
        – http://web.engr.illinois.edu/~taoxie/publications/ase04.pdf
      • PLDI/FSE 05 DART/CUTE papers by Sen et al. → MSR SAGE, Pex
        – http://srl.cs.berkeley.edu/~ksen/papers/dart.pdf
        – http://srl.cs.berkeley.edu/~ksen/papers/C159-sen.pdf
  16. HOW???
      • Are these impact goals too far from you?
      • Can you plan for that?
      • What if you are in a university research group?
      • …
  17. (How) Can A University Group Do It?
      • Aim for research impacts more commonly
        – but sometimes/often may not be predicted well, e.g., Whyper [USENIX SEC 13] http://web.engr.illinois.edu/~taoxie/publications/usenixsec13-whyper.pdf
      • Start a startup
        – but desirable to have the right people (e.g., former students) to start
        – but software engineering tools may not sell crazily
      • Collaborate with industrial research labs
        – but many research lab projects may look like university projects
      • Collaborate with industrial product groups
        – but many problems faced by product groups may not be “researchy”
      • At least focus on problems that matter (now or in the future)!
  18. (How) Can A University Group Do It? (cont.)
      • Need to balance/unify producing great students vs./and great (high practice-impact) research
      http://www.cs.washington.edu/people/faculty/notkin/students
  19. Experience Reports on Successful Tool Transfer
      • Nikolai Tillmann, Jonathan de Halleux, and Tao Xie. Transferring an Automated Test Generation Tool to Practice: From Pex to Fakes and Code Digger. In Proceedings of ASE 2014, Experience Papers. http://web.engr.illinois.edu/~taoxie/publications/ase14-pexexperiences.pdf
      • Jian-Guang Lou, Qingwei Lin, Rui Ding, Qiang Fu, Dongmei Zhang, and Tao Xie. Software Analytics for Incident Management of Online Services: An Experience Report. In Proceedings of ASE 2013, Experience Papers. http://web.engr.illinois.edu/~taoxie/publications/ase13-sas.pdf
      • Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, Haidong Zhang, and Tao Xie. Software Analytics in Practice. IEEE Software, Special Issue on the Many Faces of Software Analytics, 2013. http://web.engr.illinois.edu/~taoxie/publications/ieeesoft13-softanalytics.pdf
      • Yingnong Dang, Dongmei Zhang, Song Ge, Chengyun Chu, Yingjun Qiu, and Tao Xie. XIAO: Tuning Code Clones at Hands of Engineers in Practice. In Proceedings of ACSAC 2012. http://web.engr.illinois.edu/~taoxie/publications/acsac12-xiao.pdf
  20. Q & A
      http://www.cs.illinois.edu/homes/taoxie/
      Contact: taoxie@illinois.edu
      Supported in part by a Microsoft Research Award, NSF grants CCF-1349666, CNS-1434582, CCF-1434596, CCF-1434590, CNS-1439481, and the USA National Security Agency (NSA) Science of Security Lablet.
      Discussion
  21. Discussion Topics: HOW???
      • Are these impact goals too far from you?
      • Can you plan for that?
      • What if you are in a university research group?
      • …
