Towards Mining Software Repositories Research that Matters
Towards Mining Software Repositories Research that Matters. Talk slides at Next Generation of Mining Software Repositories '14 (Pre-FSE 2014 Event), Nov 15–16. HKUST, Hong Kong http://ng2014.msrworld.org/
1. Towards Mining Software Repositories
Research that Matters
Tao Xie
Department of Computer Science
University of Illinois at Urbana-Champaign, USA
taoxie@illinois.edu
2. Machine Learning that Matters
“The basic argument in her paper is that machine learning
might be in danger of losing its impact because the
community as a whole has become quite self-referential.
People are probably solving real-world problems using ML
methods, but there is little sharing of these results within
the community. Instead, people focus on existing
benchmarks which might have originally had some
connection to real-world problems which has been long
forgotten, however.”
“She proposes a number of tasks like $100M solved
through ML based decision making or a human life saved
through a diagnosis or an intervention recommended by
an ML system to get ML back on track.”
ICML’12
http://icml.cc/2012/papers/298.pdf
http://blog.mikiobraun.de/2012/06/is-machine-learning-losing-impact.html
6. 2012 NSF Workshop on Formal Methods
• Goal: to identify the future directions in research in
formal methods and its transition to industrial
practice.
• Success examples mentioned by the attendees
– SLAM/SDV
– ASTREE
– SMT-based tools
– …
http://goto.ucsd.edu/~rjhala/NSFWorkshop/
7. “What Happened to the Promise
of Software Tools?” – Jim Larus
http://www.srl.inf.ethz.ch/workshop2014/eth-larus.pdf
https://www.youtube.com/watch?v=kO9OYnkeRTM
11. Practice Impacts: ACM Software System Awards
31 Awardees
http://awards.acm.org/software_system/
12. Practice Impacts: ACM Software System Awards
• Development Environments/Tools
– 2013: Coq
– 2012: LLVM
– 2011: Eclipse
– 2007: Statemate
– 2006: Eiffel
– 2005: The Boyer-Moore Theorem Prover (ACL2)
– 2003: MAKE
– 2001: SPIN
– 1992: Interlisp
• Languages
– 2002: Java
– 1998: The S System (R statistical analysis)
– 1997: Tcl/Tk
– 1987: SMALLTALK
13. 2012 LLVM born at Illinois
• The openness of the LLVM technology and the quality of its
architecture and engineering design are key factors in
understanding the success it has had both in academia and
industry
Vikram Adve, Chris Lattner, Evan Cheng
http://llvm.org/
15. Practice Impacts
research publications → industrial adoption done by others
…
• ICSE 00 Daikon paper by Ernst et al. → Agitar Agitator
– https://homes.cs.washington.edu/~mernst/pubs/invariants-relevance-icse2000.pdf
• ASE 04 Rostra paper by Xie et al. → Parasoft Jtest improvement
– http://web.engr.illinois.edu/~taoxie/publications/ase04.pdf
• PLDI/FSE 05 DART/CUTE papers by Sen et al. → MSR SAGE, Pex
– http://srl.cs.berkeley.edu/~ksen/papers/dart.pdf
– http://srl.cs.berkeley.edu/~ksen/papers/C159-sen.pdf
16. HOW???
• Are these impact goals too far from you?
• Can you plan for that?
• What if you are in a university research
group?
• …
17. (How) Can A University Group Do It?
• Aim for research impacts more commonly
– but sometimes/often may not be predicted well,
e.g., Whyper [USENIX SEC 13] http://web.engr.illinois.edu/~taoxie/publications/usenixsec13-whyper.pdf
• Start a startup
– but desirable to have right people (e.g., former students) to start
– but software engineering tools may not sell crazily
• Collaborate with industrial research labs
– but many research lab projects may look like univ. projects
• Collaborate with industrial product groups
– but many probs faced by product groups may not be “researchy”
• At least focus on problems that matter (now or future)!
18. (How) Can A University Group Do It? (cont.)
• Need to balance/unify producing great
students vs./and great (high practice-impact)
research
http://www.cs.washington.edu/people/faculty/notkin/students
19. Experience Reports on Successful Tool Transfer
• Nikolai Tillmann, Jonathan de Halleux, and Tao Xie. Transferring an Automated Test
Generation Tool to Practice: From Pex to Fakes and Code Digger. In Proceedings of ASE
2014, Experience Papers. http://web.engr.illinois.edu/~taoxie/publications/ase14-pexexperiences.pdf
• Jian-Guang Lou, Qingwei Lin, Rui Ding, Qiang Fu, Dongmei Zhang, and Tao Xie. Software
Analytics for Incident Management of Online Services: An Experience Report. In
Proceedings of ASE 2013, Experience Paper.
http://web.engr.illinois.edu/~taoxie/publications/ase13-sas.pdf
• Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, Haidong Zhang, and Tao Xie.
Software Analytics in Practice. IEEE Software, Special Issue on the Many Faces of Software
Analytics, 2013. http://web.engr.illinois.edu/~taoxie/publications/ieeesoft13-softanalytics.pdf
• Yingnong Dang, Dongmei Zhang, Song Ge, Chengyun Chu, Yingjun Qiu, and Tao Xie. XIAO:
Tuning Code Clones at Hands of Engineers in Practice. In Proceedings of ACSAC 2012.
http://web.engr.illinois.edu/~taoxie/publications/acsac12-xiao.pdf
20. Q & A
http://www.cs.illinois.edu/homes/taoxie/
Contact: taoxie@illinois.edu
Supported in part by a Microsoft Research Award, NSF grants CCF-1349666, CNS-1434582, CCF-1434596, CCF-1434590, CNS-1439481, and the USA National Security Agency (NSA) Science of Security Lablet.
Discussion
21. Discussion Topics: HOW???
• Are these impact goals too far from you?
• Can you plan for that?
• What if you are in a university research
group?
• …