ISEC'18 Tutorial: Research Methodology on Pursuing Impact-Driven Research
1. Research Methodology on
Pursuing Impact-Driven Research
Tao Xie
Department of Computer Science
University of Illinois at Urbana-Champaign
taoxie@illinois.edu
http://taoxie.cs.illinois.edu/
Innovations in Software Engineering Conference (ISEC 2018)
Feb 9-11, 2018, Hyderabad, India
2. Evolution of Research Assessment
• #Papers
• #International Venue Papers
• #SCI/EI Papers
• #CCF A (B/C) Category Papers
• ???
CRA 2015 Report:
“Hiring Recommendation. Evaluate candidates on the basis of the contributions in their top one or
two publications, …”
“Tenure and Promotion Recommendation. Evaluate candidates for tenure and promotion on the
basis of the contributions in their most important three to five publications (where systems and
other artifacts may be included).”
http://cra.org/resources/best-practice-memos/incentivizing-quality-and-impact-evaluating-scholarship-in-hiring-tenure-and-promotion/
3. Societal Impact
ACM Richard Tapia Celebration of Diversity in Computing
Join us at the next Tapia Conference in Orlando, FL on September 19-22, 2018!
http://tapiaconference.org/
Margaret Burnett: “Womenomics &
Gender-Inclusive Software”
“Because anybody who thinks that we’re just
here because we’re smart forgets that we’re also
privileged, and we have to extend that farther. So
we’ve got to educate and help every generation
and we all have to keep it up in lots of ways.”
– David Notkin, 1955-2013
Andy Ko: “Why the
Software Industry Needs
Computing Education
Research”
4. Impact on Research Communities Beyond SE
Representational State Transfer
(REST) as a key architectural
principle of WWW (2000)
Related to funding/head-count allocation, student recruitment, …
community growth
Roy Fielding Richard Taylor
…
Andreas Zeller
Delta debugging (1999)
Symbolic execution (1976), also by James King, William Howden, Karl Levitt, et al.
Lori Clarke
http://asegrp.blogspot.in/2016/07/outward-thinking-for-our-research.html
5. Practice Impact
• Diverse, balanced research styles can and should be embraced
• Our community already well appreciates impact on other researchers, e.g.,
SIGSOFT Impact Awards, ICSE MIP, paper citations
• But often there is insufficient effort on the last mile, or insufficient focus on real problems
• Strong need for an ecosystem to incentivize practice impact pursued by
researchers
• Top down:
• Bottom up:
• Conference PC for reviewing papers
• Impact counterpart of “highly novel ideas”?
• Impact counterpart of “artifact evaluation”?
• Promote and recognize practice impact
• Counterpart of ACM Software System Award? http://www.cs.umd.edu/hcil/newabcs/
http://cra.org/resources/best-practice-memos/incentivizing-quality-and-impact-evaluating-scholarship-in-hiring-tenure-and-promotion/
6. Practice-Impact Levels of Research
• Study/involve industrial data/subjects
• Indeed, insights sometimes may benefit practitioners
• Hit (with a tool) and run
• Authors hit and run (upon industrial data/subjects)
• Practitioners hit and run
• Continuous adoption by practitioners
• Importance of benefited domain/system (which can be just a single one)
• Ex. WeChat test generation tool; WeChat has > 900 million users
• Ex. MSRA SA on SAS; MS Online Service has hundreds of millions of users
• Ex. Beihang U. on CarStream; Shenzhou Rental has > 30,000 vehicles over 60 cities
• Scale of practitioner users
• Ex. MSR Pex Visual Studio 2015+ IntelliTest
• Ex. MSR Code Hunt with close to 6 million registered/anonymous/API accounts
• Ex. MSRA SA XIAO Visual Studio 2012+ Clone Analysis
Think about it: >90% of startups fail! It is
challenging to start from research and
then single-handedly bring it to
continuous adoption by target users;
academia-industry collaborations are
often desirable.
7. Practice-Impact Levels of Research
• If there is practice impact but no underlying (e.g., published) research,
then there is no practice-impactful research
• More like a startup’s or a big company’s product with business secrets
• Some industry-academia collaborations treat university researchers
(students) like cheap(er) engineering labor, yielding little or no research
8. Desirable Problems for Academia-Industry
Collaborations
• Not all industrial problems are worth effort investment from university
groups
• High business/industry value
• Allow research publications (not business secret) to advance the knowledge
• Challenging problem (does it require the intellectual strengths of university researchers?)
• Desirably, a real investment of effort from both sides
• My recent examples
• Tencent WeChat [FSE’16 Industry], [ICSE’17 SEIP]: Android app testing/analysis
• Exploring collaborations with Baidu, Alibaba, Huawei, etc.
• Exploring new collaborations with MSRA SA
9. Sustained Productive Academia-Industry
Collaborations
• Careful selection of target problems/projects
• Desirable to start with money-free collaborations(?)
• If the industry (lab) side is also purely curiosity-driven, watch out.
• Each collaboration party needs to bring in something important and unique –
a win-win situation
• High demand for abstraction/generalization skills from the academic collaborators to pursue
research on top of high-practice-impact work.
• Think more about the interests/benefits of the collaborating party
• (Long-term) relationship/trust building
• Mutual understanding of expected contributions to the collaborations
• Balancing research and “engineering”
• Focus, commitment, deliverables, funding, …
10. Optimizing “Research Return”:
Pick a Problem Best for You
• Your Passion (Interest/Passion)
• High Impact (Societal Needs/Purpose)
• Your Strength (Gifts/Potential)
The best problems for you lie at the intersection of these three.
Find your passion: If you didn’t have to work/study for money, what would you do?
Test of impact: If you were given $1M to fund a research project, what would you fund?
Find your strength/Avoid your weakness: What are you (not) good at?
Find what interests you, that you can do well, and that is needed by people.
Adapted from slides by ChengXiang Zhai and YY Zhou
11. Brief Desirable Characteristics of Your Paper/Project
• Two main elements
• Interesting idea(s) accompanying interesting claim(s)
• claim(s) well validated with evidence
• Then how to define “interesting”?
• Really depends on the readers’ taste, but there may be a general taste for a
community
• Ex: being the first in X, being non-trivial, contradicting conventional wisdom, …
• Can be along the problem or solution space; in SE, being the first to point out a
refreshing and practical problem would be much valued
• Uniqueness, elegance, significance?
D. Notkin: Software, Software Engineering and Software Engineering Research: Some Unconventional Thoughts. J. Comput.
Sci. Technol. 24(2): 189-197 (2009) https://link.springer.com/article/10.1007/s11390-009-9217-4
D. Notkin’s ICSM 2006 keynote talk.
12. Factors Affecting Choosing a Problem/Project
• What factors affect you (not) to choose a problem/project?
• Besides your supervisor/mentor asking you (not) to choose it
http://www.weizmann.ac.il/mcb/UriAlon/nurturing/HowToChooseGoodProblem.pdf
13. Big Picture and Vision
• Step back and think about what research problems will be most
important and most influential/significant to solve in the long term
• Long term could be the whole career
• People tend not to think about important, long-term problems
Richard Hamming, “You and Your Research”
http://www.cs.virginia.edu/~robins/YouAndYourResearch.html
Ivan Sutherland, “Technology and Courage”
http://labs.oracle.com/techrep/Perspectives/smli_ps-1.pdf
[2×2 quadrant: problems plotted by importance (less vs. more important) and time horizon (shorter vs. longer term)]
This slide was made based on
discussion with David Notkin
15. Big Picture and Vision –cont.
• If you were given 1 (4) million dollars to lead a team of 5 (10) team
members for 5 (10) years, what would you invest in?
16. Factors Affecting Choosing a Problem/Project
• Impact/significance: Is the problem/solution important? Are
there any significant challenges?
• Industrial impact, research impact, …
• DON’T work on a problem that you imagined but that is not a real problem
• E.g., determine this based on your own experience, observation of practice,
and feedback from others (e.g., colleagues, industrial collaborators)
• Novelty: Is the problem novel? Is the solution novel?
• If the space is well explored or crowded, watch out (how much
space/depth is left? how many people are in that space?)
17. Factors Affecting Choosing a Problem/Project II
• Risk: How likely is the research to fail?
• Can be reduced with significant feasibility studies and risk management in
the research development process
• E.g., manual “mining” of bugs
• Cost: How much effort investment would be needed?
• Can sometimes be reduced by using tools and
infrastructures available to us
• Need to consider evaluation cost (solutions to some problems may
be difficult to evaluate)
• But don’t shut down a direction simply due to cost
18. Factors Affecting Choosing a Problem/Project III
• Better than existing approaches (in important ways), besides being new:
engineering vs. science
• Competitive advantage
• “secret weapon”
• Why are you/your group the best one to pursue it?
• Ex. a specific tool/infrastructure, access to specific data, collaborators, an
insight,…
• Need to know your own strengths/weaknesses
• Underlying assumptions and principles - how do you (systematically) choose
what to pursue?
• core values that drive your research agenda in some broad way
This slide was made based on discussion with David Notkin
19. Example Principles – Problem Space
• Question core assumptions or conventional wisdom about SE
• Play with industrial tools to address their limitations
• Collaborate with industrial collaborators to decide on
problems of relevance to practice
• Investigate SE mining requirements and adapt or develop
mining algorithms to address them
(e.g., Suresh Thummalapenta [ICSE 09, ASE 09])
D. Notkin: Software, Software Engineering and Software Engineering Research: Some Unconventional Thoughts. J. Comput.
Sci. Technol. 24(2): 189-197 (2009) https://link.springer.com/article/10.1007/s11390-009-9217-4
D. Notkin’s ICSM 2006 keynote talk.
20. Example Principles – Solution Space
• Integration of static and dynamic analysis
• Using dynamic analysis to realize tasks originally realized by
static analysis
• Or the other way around
• Using compilers to realize tasks originally realized by
architectures
• Or the other way around
• …
21. Factors Affecting Choosing a Problem/Project IV
• Intellectual curiosity
• Other benefits (including option value)
• Emerging trends or space
• Funding opportunities, e.g., security
• Infrastructure used by later research
• …
• What you are interested in, enjoy, are passionate about, and believe in
• AND a matter of personal taste
• Tradeoff among different factors
22. Dijkstra’s Three Golden Rules for Successful
Scientific Research
1. “Internal”: Raise your quality standards as high as you can live
with, avoid wasting your time on routine problems, and always
try to work as closely as possible at the boundary of your
abilities. Do this, because it is the only way of discovering how
that boundary should be moved forward.
2. “External”: We all like our work to be socially relevant and
scientifically sound. If we can find a topic satisfying both
desires, we are lucky; if the two targets are in conflict with each
other, let the requirement of scientific soundness prevail.
http://www.cs.utexas.edu/~EWD/ewd06xx/EWD637.PDF
23. Dijkstra’s Three Golden Rules for Successful
Scientific Research cont.
3. “Internal/ External”: Never tackle a problem of which you can be
pretty sure that (now or in the near future) it will be tackled by
others who are, in relation to that problem, at least as competent
and well-equipped as you.
http://www.cs.utexas.edu/~EWD/ewd06xx/EWD637.PDF
24. Jim Gray’s Five Key Properties for a Long-Range Research Goal
• Understandable: simple to state.
• Challenging: not obvious how to do it.
• Useful: clear benefit.
• Testable: progress and the solution are testable.
• Incremental: can be broken into smaller steps
• So that you can see intermediate progress
http://arxiv.org/ftp/cs/papers/9911/9911005.pdf
http://research.microsoft.com/pubs/68743/gray_turing_fcrc.pdf
25. Tony Hoare’s Criteria for a Grand Challenge
• Fundamental
• Astonishing
• Testable
• Inspiring
• Understandable
• Useful
• Historical
http://vimeo.com/39256698
http://www.cs.yale.edu/homes/dachuan/Grand/HoareCC.pdf
The Verifying Compiler: A Grand Challenge for
Computing Research by Hoare, CACM 2003
26. Tony Hoare’s Criteria for a Grand Challenge
cont.
• International
• Revolutionary
• Research-directed
• Challenging
• Feasible
• Incremental
• Co-operative
27. Tony Hoare’s Criteria for a Grand Challenge
cont.
• Competitive
• Effective
• Risk-managed
28. Heilmeier's Catechism
Anyone proposing a research project or product development effort should be able to
answer:
• What are you trying to do? Articulate your objectives using absolutely
no jargon.
• How is it done today, and what are the limits of current practice?
• What's new in your approach and why do you think it will be
successful?
• Who cares?
• If you're successful, what difference will it make?
• What are the risks and the payoffs?
• How much will it cost?
• How long will it take?
• What are the midterm and final "exams" to check for success?
http://www9.georgetown.edu/faculty/yyt/bolts&nuts/TheHeilmeierCatechism.pdf
29. Ways of Coming Up with a Problem/Project
• Know and investigate the literature and the area
• Investigate assumptions, limitations, generality, practicality, validation
of existing work
• Address issues in your own development experiences or from other
developers’
• Explore what is “hot” (pros and cons)
• See where your “hammers” could hit or be extended
• Ask “why not” on your own work or others’ work
• Understand existing patterns of thinking
• http://people.engr.ncsu.edu/txie/adviceonresearch.html
• Think more and hard, and interact with others
• Brainstorming sessions, reading groups
• …
Some points were extracted from Barbara Ryder’s slides: http://cse.unl.edu/~grother/nsefs/05/research.pdf
30. Example Techniques on Producing Research Ideas
• Research Matrix (Charles Ling and Qiang Yang)
• Shallow/Deep Paper Categorization (Tao Xie)
• Paper Recommendation (Tao Xie)
• Students recommend/describe a paper (not read by the advisor
before) to the advisor and start brainstorming from there
• Research Generalization (Tao Xie)
• “balloon”/“donut” technique
37. More Advice Resources
• Advice on Writing Research Papers:
https://www.slideshare.net/taoxiease/how-to-write-research-papers-24172046
• Common Technical Writing Issues:
https://www.slideshare.net/taoxiease/common-technical-writing-issues-61264106
• More advice at http://taoxie.cs.illinois.edu/advice/
38. More Reading
• “On Impact in Software Engineering Research” by
Andreas Zeller
• “Doing Research in Software Analysis: Lessons and Tips”
by Zhendong Su
• “Some Research Paper Writing Recommendations” by
Arie van Deursen
• “Does Being Exceptional Require an Exceptional Amount
of Work?” by Cal Newport
• Book: Crafting Your Research Future: A Guide to
Successful Master's and Ph.D. Degrees in Science &
Engineering by Charles Ling and Qiang Yang
39. Experience Reports on Successful Tool Transfer
• Yingnong Dang, Dongmei Zhang, Song Ge, Ray Huang, Chengyun Chu, and Tao Xie. Transferring Code-
Clone Detection and Analysis to Practice. In Proceedings of ICSE 2017, SEIP.
http://taoxie.cs.illinois.edu/publications/icse17seip-xiao.pdf
• Nikolai Tillmann, Jonathan de Halleux, and Tao Xie. Transferring an Automated Test Generation Tool to
Practice: From Pex to Fakes and Code Digger. In Proceedings of ASE 2014, Experience Papers.
http://taoxie.cs.illinois.edu/publications/ase14-pexexperiences.pdf
• Jian-Guang Lou, Qingwei Lin, Rui Ding, Qiang Fu, Dongmei Zhang, and Tao Xie. Software Analytics for
Incident Management of Online Services: An Experience Report. In Proceedings of ASE 2013, Experience
Papers.
http://taoxie.cs.illinois.edu/publications/ase13-sas.pdf
• Dongmei Zhang, Shi Han, Yingnong Dang, Jian-Guang Lou, Haidong Zhang, and Tao Xie. Software
Analytics in Practice. IEEE Software, Special Issue on the Many Faces of Software Analytics, 2013.
http://taoxie.cs.illinois.edu/publications/ieeesoft13-softanalytics.pdf
• Yingnong Dang, Dongmei Zhang, Song Ge, Chengyun Chu, Yingjun Qiu, and Tao Xie. XIAO: Tuning Code
Clones at Hands of Engineers in Practice. In Proceedings of ACSAC 2012.
http://taoxie.cs.illinois.edu/publications/acsac12-xiao.pdf