Empirical se 2013-01-17



  1. 1. Systematic Literature Review: Challenges and Opportunities. Ivica Crnkovic, ivica.crnkovic@mdh.se
  2. 2. Empirical SE Questions?
       • The questions are similar to those an anthropologist might ask during first contact with a previously unknown culture.
           – How do people learn to program?
           – Can the future success of a programmer be predicted by personality tests?
           – Does the choice of programming language affect productivity?
           – Can the quality of code be measured?
           – Can data mining predict the location of software bugs?
           – …
       Source: Greg Wilson, Jorge Aranda, Empirical Software Engineering, American Scientist, https://www.americanscientist.org/issues/pub/empirical-software-engineering
  3. 3. Empirical Software Engineering
       • Evidence about a particular aspect of SE
           – Activities, processes, technologies
           – Best practices, lessons learned
           – Increased knowledge
           – A new perspective on a particular body of knowledge
           – …
  4. 4. Empirical Software Engineering Methods: case studies, surveys, literature reviews
  5. 5. Systematic Literature Review (SLR)
       • Finding evidence from (scientific) literature
           – Do it in a systematic way
               • State a question
               • Find the answer
       Based on: Barbara Kitchenham, Evidence-Based Software Engineering and Systematic Reviews, www.scm.keele.ac.uk/ease/ease05_bk.ppt
  6. 6. Systematic (Literature) Review
       • Questions
           – What are the current problems in a specific area?
           – For a specific problem, what are the reported solutions?
           – Which are the newest results in a particular area?
           – Which combinations of two or more areas exist?
       • Important!
           – The questions should be interesting from a research point of view
           – The questions should be attractive to readers
           – The questions should be general enough to lead to sufficiently general conclusions
           – The questions should be specific enough to yield sufficiently specific findings
  7. 7. Systematic Review Procedure
       • Supports the evidence-based paradigm
           – Step 1: Start from a well-defined question
           – Step 2: Define a repeatable strategy for searching the literature
           – Step 3: Critically assess relevant literature
           – Step 4: Synthesise literature
       Ref: Barbara Kitchenham, Evidence-Based Software Engineering and Systematic Reviews
  8. 8. Systematic Review Process
       • Plan Review
           – Develop Review Protocol
           – Validate Review Protocol
       • Conduct Review
           – Identify Relevant Research
           – Select Primary Studies
           – Assess Study Quality
           – Extract Required Data
           – Synthesise Data
       • Document Review
           – Write Review Report
           – Validate Report
  9. 9. Showing the SLR through an example: Example #1
       Example 1: A systematic review of software architecture evolution research. Hongyu Pei Breivold, Ivica Crnkovic, Magnus Larsson, Information & Software Technology 54(1): 16-40 (2012)
       • Software evolvability: “the ability of a system to accommodate changes in its requirements throughout the system’s lifespan with the least possible cost while maintaining architectural integrity”
       • Interest: evolvability through software architecture
  10. 10. Evolvability property model
       [Figure: model linking evolvability, evolvability subcharacteristics, measuring attributes, metrics, and QoS through “is refined to”, “measured by”, and “reason about” relations.]
       • Question: which are the subcharacteristics?
           – Analyzability
           – Architectural Integrity
           – Changeability
           – Portability
           – Extensibility
           – Testability
           – Domain-specific attributes
  11. 11. Showing the SLR through an example: Example #2
       Example 2: 15 Years of CBSE Symposium: Impact on the Research Community. Josip Maras, Luka Lednicki, Ivica Crnkovic, ACM/SigSoft Component-based Software Engineering Symposium 2012
       • Interest: What is the impact of CBSE Symposium publications?
  12. 12. CBSE events
       • 1998 – Tokyo
       • 1999 – Los Angeles
       • 2000 – Limerick
       • 2001 – Toronto
       • 2002 – Orlando
       • 2003 – Portland
       • 2004 – Edinburgh
       • 2005 – St. Louis
       • 2006 – Västerås
       • 2007 – Boston
       • 2008 – Karlsruhe
       • 2009 – E. Stroudsburg
       • 2010 – Prague
       • 2011 – Boulder
       • 2012 – Bertinoro
       [Timeline labels from the figure: Workshop@ICSE, Initiation, Focus, Symposium@ICSE, Broadening Scope, QoSA, Symposium!@ICSE, CompArch, WCOP, Collaboration phase, ISARCS, (WICSA)]
  13. 13. Systematic Review Process
       • Plan Review: Develop Review Protocol, Validate Review Protocol
       • Conduct Review: Identify Relevant Research, Select Primary Studies, Assess Study Quality, Extract Required Data, Synthesise Data
       • Document Review: Write Review Report, Validate Report
  14. 14. Developing the Protocol
       • Review protocol
           – Specifies the methods to be used for a systematic review
           – Predefined protocol
               • Reduces researcher bias by reducing the opportunity for
                   – Selection of papers driven by researcher expectations
                   – Changing the research question to fit the results of the searches
           – Good practice for any empirical study
  15. 15. Protocol Contents – 1/3
       • Background
           – Rationale for survey
       • Research question
           – Critical to define this before starting the research
           – Strategy used to search for primary sources
  16. 16. Protocol Contents – 2/3
       • Strategy to find primary studies
           – Search terms/keywords
           – Identify resources, databases, journals, conferences
           – Procedures for storing references
           – How publication bias will be handled
               • Grey literature
               • Direct approach to active researchers
           – How completeness will be determined
               • Useful to have a baseline paper to set the start date
       • Selection strategy
           – Inclusion/exclusion criteria
               • Handling multiple papers on one experiment
               • Quality assessment criteria
  17. 17. Protocol Contents – 3/3
       • Data extraction
           – What data will be extracted from each primary source
           – How to handle missing information
           – How data extraction reliability will be addressed
               • Usually multiple reviewers
           – Where data will be stored
       • Procedures for data synthesis
           – Formats for summarising data
           – Measures and analysis if meta-analysis is proposed
  18. 18. SLR workflow
       [Figure: Research questions → Search keywords and Resources/Databases → Search → Studies filtering (with Inclusion/Exclusion criteria) → Primary studies → Analysis → Statistical data → Synthesis → New findings. Legend: activity, artifact.]
  19. 19. Example 1 (Software Architecture Evolution): research questions
       1. What approaches have been reported regarding the analysis and achievement of software evolvability at the architectural level?
       2. What are the main research topics covered in the scientific literature regarding the analysis and achievement of evolvability-related quality attributes?
       3. …
       4. What is the impact of the studies on the research community and practice?
  20. 20. Example 1 (Software Architecture Evolution) (stage: research questions, search keywords, resources/databases)
       • Search keywords
           – S1: software architecture AND evolvability
           – S2: software architecture AND maintainability
           – S3: software architecture AND extensibility
           – S4: software architecture AND adaptability
           – S5: software architecture AND flexibility
           – S6: software architecture AND changeability
           – S7: software architecture AND modifiability
           – S8: software architecture AND analyzability
       • Databases & resources
           – ACM Digital Library
           – IEEE Xplore
           – ScienceDirect – Elsevier
           – SpringerLink
           – Wiley InterScience
           – ISI Web of Science
           – SCOPUS
           – (Google Scholar)
       • Keywords should reflect the questions and the underlying theory/model
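       The eight search strings on this slide follow one pattern (a base term AND one quality attribute), so they can be generated rather than typed per database. A minimal Python sketch; the helper name build_search_strings is hypothetical, and real databases use different query syntaxes, so the output would still need adapting for each one:

```python
# Hypothetical helper: generates the S1..S8 search strings listed on the slide.
BASE_TERM = "software architecture"
QUALITY_TERMS = [
    "evolvability", "maintainability", "extensibility", "adaptability",
    "flexibility", "changeability", "modifiability", "analyzability",
]


def build_search_strings(base, terms):
    """Return {label: query} with one 'base AND term' query per term."""
    return {
        f"S{i}": f"{base} AND {term}"
        for i, term in enumerate(terms, start=1)
    }


search_strings = build_search_strings(BASE_TERM, QUALITY_TERMS)
for label, query in search_strings.items():
    print(f"{label}: {query}")
```

       Keeping the terms in one list makes it easier to record the exact strings in the review protocol and to rerun the search later.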
  21. 21. Example 2 (CBSE publications): research questions
       • Impact
           – Number of publications: total, per year, geographical distribution
           – Citation index
           – Indirect impact: backward citations, impact of the authors
           – What is the maturity level of CBSE?
       • Topics of interest
           – Which research topics were most present at CBSE?
           – What kinds of research results were presented?
           – What types of validation did the publications have?
  22. 22. Example 2 (CBSE publications) (stage: research questions, search keywords, resources/databases)
       • Questions
           – Impact: publications, citation index, indirect impact
           – Topics of interest
       • Search keywords
           – No search keywords: all CBSE papers
       • Databases & resources
           – CBSE proceedings
           – SpringerLink
           – ACM Digital Library
           – Google Scholar
           – Web search
  23. 23. Example 1: Primary studies selection process (stage: search, studies filtering, primary studies)
       • Inclusion criteria
           – English peer-reviewed studies that provide answers to the research questions.
           – Studies that focus on software evolution.
           – Studies that focus on software architecture analysis and/or software quality analysis related to software evolvability.
           – Studies published up to and including the first two quarters of 2010.
       • Exclusion criteria
           – Studies that are not in English.
           – Studies that are not related to the research questions.
           – Studies whose claims are unjustified or ad hoc statements rather than based on evidence.
           – Duplicated studies.
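       Some of these criteria (language, peer review, publication date) can be checked mechanically once each candidate study is recorded with a few fields; relevance and evidence quality remain reviewer judgements. A minimal Python sketch, in which the Study record and all its field names are assumptions made for illustration (duplicates are not handled here):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """Hypothetical record for one candidate study."""
    title: str
    language: str
    peer_reviewed: bool
    year: int
    quarter: int              # 1..4, quarter of publication
    answers_questions: bool   # reviewer judgement: relevant to the research questions
    evidence_based: bool      # reviewer judgement: claims are backed by evidence


def is_included(study):
    """Apply the slide's criteria; the cutoff is the second quarter of 2010."""
    published_in_time = (study.year, study.quarter) <= (2010, 2)
    return (
        study.language == "English"
        and study.peer_reviewed
        and study.answers_questions
        and study.evidence_based
        and published_in_time
    )


candidates = [
    Study("Study A", "English", True, 2009, 4, True, True),
    Study("Study B", "English", True, 2010, 3, True, True),   # published too late
    Study("Study C", "German", True, 2008, 1, True, True),    # not in English
]
selected = [s for s in candidates if is_included(s)]
print([s.title for s in selected])
```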
  24. 24. Example 1: Primary studies selection process (stage: search, studies filtering, primary studies) [figure not preserved]
  25. 25. Example 1: Primary studies selection process (stage: search, studies filtering, primary studies)
       • Activities:
           – Run the search strings in the databases and export the results to EndNote
               • Tedious work: different query languages and different export functionality
           – Extract the information in a form suitable for reading and selecting, removing duplicates, etc.
       • Goal:
           – To get a reasonable number of studies (<500, >20)
               • May require refinement of the questions
           – Achieve reliability: select the most significant literature
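       Removing duplicates across database exports is one step that is easy to automate. A minimal Python sketch, assuming each exported record is a dict with a "title" key (the record layout and helper names are hypothetical); it matches on normalised titles only, so near-duplicates with different titles would still need manual checking:

```python
import re


def normalize_title(title):
    """Lower-case the title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def remove_duplicates(records):
    """Keep the first record seen for each normalised title."""
    seen = set()
    unique = []
    for record in records:
        key = normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def count_is_reasonable(count, low=20, high=500):
    """Check the slide's target range: more than 20 and fewer than 500 studies."""
    return low < count < high


exports = [
    {"title": "A Study of Architecture Evolution.", "source": "database 1"},
    {"title": "a study of architecture  evolution", "source": "database 2"},
    {"title": "Another Study", "source": "database 2"},
]
unique_records = remove_duplicates(exports)
print(len(unique_records), count_is_reasonable(len(unique_records)))
```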
  26. 26. Example 2 (CBSE publications) (stage: search, studies filtering, primary studies)
       • All publications are primary studies
           – 318 studies
       • Activities
           – Extract the publications and create a relational database
           – Populate the database
           – Provide a “Query and View” interactive web-based application for fast reading and publication classification
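       The slide does not show the database design. As a rough sketch of what a relational store for publications and their classifications could look like, the Python example below uses the standard sqlite3 module; the table names, column names, and the classification value are illustrative assumptions, not the authors' schema:

```python
import sqlite3

# In-memory database with a hypothetical two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publication (
    id    INTEGER PRIMARY KEY,
    ref   TEXT UNIQUE NOT NULL,
    title TEXT NOT NULL,
    year  INTEGER NOT NULL
);
CREATE TABLE classification (
    publication_id INTEGER NOT NULL REFERENCES publication(id),
    facet          TEXT NOT NULL,   -- e.g. research area, evaluation type
    value          TEXT NOT NULL
);
""")

# One publication from the slides; the classification value is made up.
cursor = conn.execute(
    "INSERT INTO publication (ref, title, year) VALUES (?, ?, ?)",
    ("S04-02", "An Open Component Model and its Support in Java", 2004),
)
conn.execute(
    "INSERT INTO classification (publication_id, facet, value) VALUES (?, ?, ?)",
    (cursor.lastrowid, "research_area", "Component models"),
)

# Query: every classification together with its publication reference.
rows = conn.execute("""
    SELECT p.ref, c.facet, c.value
    FROM publication AS p
    JOIN classification AS c ON c.publication_id = p.id
    ORDER BY p.ref
""").fetchall()
print(rows)
```

       Storing classifications as (facet, value) rows lets new classification dimensions be added without changing the schema.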
  27. 27. Analysis: statistical data
       • Data extracted from the studies (“objective data”)
           – Distribution of studies with respect to
               • Year of publication
               • Authors and research communities
               • Sources of publications
               • Citation distribution, the most cited studies
       • Analysis support
           – Manual work, writing your own software
           – Help from tools and portals
               • Google Scholar
               • Publish or Perish
               • Mendeley, …
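       Once each study is recorded with its year and citation count, distributions such as these take only a few lines to compute. A minimal Python sketch; the tuple layout and function names are assumptions, and the five sample rows reuse references, years, and citation counts from the Example 2 table in this deck:

```python
from collections import Counter

# (reference, publication year, citation count)
studies = [
    ("S04-02", 2004, 306),
    ("S99-1", 1999, 118),
    ("S98-18", 1998, 115),
    ("S03-3", 2003, 103),
    ("S02-0", 2002, 77),
]


def studies_per_year(records):
    """Count how many studies were published in each year."""
    return Counter(year for _, year, _ in records)


def most_cited(records, top_n=3):
    """Return the references of the top_n most cited studies."""
    ranked = sorted(records, key=lambda record: record[2], reverse=True)
    return [ref for ref, _, _ in ranked[:top_n]]


per_year = studies_per_year(studies)
top_two = most_cited(studies, top_n=2)
print(sorted(per_year.items()))
print(top_two)
```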
  28. 28. Example 1: statistical data (stage: analysis, statistical data) [figure not preserved]
  29. 29. Example 1: statistical data (stage: analysis, statistical data) [figure not preserved]
  30. 30. Example 2: statistical data (stage: analysis, statistical data)
       [Charts: number of submitted and published papers per event (events 1–15); number of citations per event (events 1–14), total: 3405 (measured 2012-02-12); number of citations per year (events 1–14).]
  31. 31. Example 2: statistical data (stage: analysis, statistical data)
       • Top 10 citations (Ref, study, #citations)
           – S04-02: Bruneton, Eric; Coupaye, Thierry; Leclercq, Matthieu; Quema, Vivien; Stefani, Jean-Bernard; An Open Component Model and its Support in Java, 2004. 306
           – S99-1: PORE: Procurement-Oriented Requirements Engineering Method for the Component-Based Systems Engineering Development Paradigm, 1999. 118
           – S98-18: Aoyama, Mikio; New Age of Software Development: How Component-Based Software Engineering Changes the Way of Software Development?, 1998. 115
           – S03-3: Cervantes, Humberto; Hall, Richard S; Automating Service Dependency Management in a Service-Oriented Component Model, 2003. 103
           – S02-0: Chen, Shiping; Liu, Yan; Gorton, Ian; Performance Prediction of Component-based Applications, 2002. 77
           – S05-13: Lau, Kung-kiu; Elizondo, Velasco, Perla; Wang, Zheng; Exogenous Connectors for Software Components, 2005. 68
           – S06-25: Sentilles, Severine; Vulgarakis, Aneta; Bures, Tomas; Carlson, Jan; Crnkovic, Ivica; A Component Model for Control-Intensive Embedded Systems, 2008. 65
           – S08-16: Seinturier, Lionel; Pessemier, Nicolas; Duchien, Laurence; Coupaye, Thierry; A Component Model Engineered with Components and Aspects, 2006. 65
           – S98-10: Kruchten, Philippe; Modeling Component Systems with the Unified Modeling Language, 1998. 63
       • Citations of papers that cited the top 10 papers
           – S04-2: 2294, S00-9: 1984, S03-1: 909, S04-9: 899, S99-1: 840, S04-26: 832, S03-3: 817, S02-0: 810, S04-19: 646, S06-25: 555, S98-18: 543, S02-08: 455, S04-5: 454, S06-13: 450, S05-13: 447
       • The most influential authors from CBSE: CBSE references outside CBSE events from CBSE authors (citations of the related work), #citations
           – C Szyperski, Component software: beyond object-oriented programming, 1998, 2002. 6594
           – GT. Heineman, WT. Councill, Component-based software engineering: putting the pieces together, 2001. 924
           – I Crnkovic, M Larsson, Building reliable component-based systems, 2002. 623
           – T Coupaye et al, The fractal component model and its support in Java, Software: Practice, 2006. 443
           – RH Reussner et al, Reliability prediction for component-based software architectures, Journal of Systems and Software 66 (3), 241-252. 189
  32. 32. Procedures for data synthesis (stage: synthesis, new findings)
       • Goal: synthesize the information into new knowledge
           – Based on a previously established theory
               • Validation of the theory
               • Description of some specific characteristics of the theory
           – Grounded theory
               • Build up a theory from the reading & analysis
                   – Manually
                   – Using tools: the most frequent words, Concordance
       • The most difficult part
           – Requires experience and knowledge of the subject
           – Requires some kind of validation/review
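       The “most frequent words” support mentioned above can be approximated with a short script. A minimal Python sketch, assuming titles or abstracts are available as plain strings; the stop-word list and the sample texts are made up for illustration, and a real analysis would need a fuller stop-word list and probably stemming:

```python
import re
from collections import Counter

# Deliberately tiny stop-word list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "is", "on"}


def most_frequent_words(texts, top_n=5):
    """Count words across all texts, ignoring case and stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts.most_common(top_n)


sample_texts = [
    "Evolvability of software architecture",
    "Software architecture analysis",
    "Architecture evolution",
]
top_words = most_frequent_words(sample_texts, top_n=2)
print(top_words)
```

       Frequent words only suggest candidate categories; building the classification itself still relies on reading and on knowledge of the subject, as the slide notes.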
  33. 33. Example 1 (Software Architecture Evolution) (stage: synthesis, new findings)
       • Classification of 82 studies
           – Quality Considerations during Design: 15 studies
               • Quality Attribute Requirement Focused: 7 studies
               • Quality Attribute Scenario Focused: 2 studies
               • Influencing Factor Focused: 6 studies
           – Quality Evaluation at Architectural Level: 22 studies
               • Experience Based: 5 studies
               • Scenario Based: 7 studies
               • Metric Based: 10 studies
           – Economic Valuation: 11 studies
           – Architectural Knowledge Management: 18 studies
           – Modeling Techniques: 16 studies
  34. 34. Example 1 (Software Architecture Evolution) (stage: synthesis, new findings)
       • Maturity classification:
           – Basic research
           – Concept formulation
           – Development and extension
           – Internal use
           – External use
           – Popularization
  35. 35. Example 2 (CBSE publications) (stage: synthesis, new findings)
       • Research area (pie chart): Component models, Component technologies, Extra-functional properties, Composition & predictability, Software Architecture, Lifecycle, Domains, Methodology
           – Slice labels as extracted, not matched to categories: 15%, 24%, 12%, 7%, 15%, 13%, 8%, 6%, 1%
       • Result characteristics (pie chart): Procedures or techniques, Qualitative/descriptive models, Analytic models, Notations or tools, Specific solutions, Answers or judgments, Reports, Empirical models
           – Slice labels as extracted, not matched to categories: 19%, 36%, 2%, 3%, 9%, 12%, 18%
  36. 36. Example 2 (CBSE publications) (stage: synthesis, new findings)
       • Evaluation type (pie chart): Not presented, Academic case study, Simple examples, Experiments, Industrial case study, Formal specification, Literature review/comparison
           – Slice labels as extracted, not matched to categories: 1%, 7%, 16%, 19%, 18%, 39%
       • Research maturity (stacked chart, share of publications per year, 1998–2012): External enhancement and exploration, Internal enhancement and exploration, Development and extension, Conceptual formulation
  37. 37. Validation issues
       0. Is your approach OK?
           – Do you have the right questions?
           – Is the procedure feasible?
       1. How can you ensure that you have selected the right studies?
       2. How can you ensure that your analysis and synthesis are right?
  38. 38. The right studies?
       1. Are the selected sources appropriate?
           – The selection of databases is important (fortunately there are not many)
           – Is Google/Google Scholar appropriate as a source?
       2. Have you missed some important studies? Do you have too many unimportant studies?
  39. 39. Studies selection
       • Several researchers involved in the process
       [Figure: from the studies selected using automatic queries, selections A, B, and C are produced; these go through comparison, discussion, and filtering to reach the final list.]
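       The comparison step can be supported by simple set operations on the researchers' selections. A minimal Python sketch, assuming each selection is a set of study identifiers; the identifiers and the function name are hypothetical, and treating only unanimous picks as agreed is one possible rule, not necessarily the one used in the examples:

```python
def compare_selections(selections):
    """Split study IDs into those every reviewer chose and those still disputed."""
    agreed = set.intersection(*selections)
    disputed = set.union(*selections) - agreed
    return agreed, disputed


selected_a = {"P1", "P2", "P3"}
selected_b = {"P2", "P3", "P4"}
selected_c = {"P2", "P3", "P5"}

agreed, disputed = compare_selections([selected_a, selected_b, selected_c])
print("agreed:", sorted(agreed))
print("to discuss:", sorted(disputed))
```

       The disputed set is what the discussion step would then resolve before the final filtering.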
  40. 40. Comparison
       • Agreement? → Fleiss’ kappa
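       Fleiss’ kappa measures agreement among a fixed number of raters beyond what chance would produce: kappa = (P − Pe) / (1 − Pe), where P is the mean observed agreement per subject and Pe is the agreement expected by chance. A minimal Python sketch of the standard formula, assuming every study was rated by the same number of reviewers; the input layout and the sample votes are illustrative:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for ratings[i][j] = number of raters who put subject i in category j."""
    n_subjects = len(ratings)
    n_raters = sum(ratings[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(sum(row) != n_raters for row in ratings):
        raise ValueError("every subject must be rated by the same number of raters")
    n_categories = len(ratings[0])

    # Share of all assignments that went to each category.
    p_category = [
        sum(row[j] for row in ratings) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each subject.
    p_subject = [
        (sum(count * count for count in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_observed = sum(p_subject) / n_subjects
    p_expected = sum(p * p for p in p_category)
    if p_expected == 1:
        raise ValueError("kappa is undefined when all ratings fall into one category")
    return (p_observed - p_expected) / (1 - p_expected)


# Three reviewers vote [include, exclude] on four studies.
votes = [[3, 0], [2, 1], [1, 2], [0, 3]]
kappa = fleiss_kappa(votes)
print(round(kappa, 3))
```

       A value of 1 means complete agreement, and values near 0 mean agreement no better than chance.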
  41. 41. Synthesis/Findings Validation
       a) Your analysis/synthesis is based on a theory/model
           – Existing classification, ontology
           – Previous research results
           – Extension/refinement of existing theories
       b) You build your theory/model from scratch
       • Iterative process: building & validation
       • Validation by a third person
       [Figure labels: Synthesis, Discussion]
  42. 42. Reporting results
       • Several levels of information
           – Raw source information
           – Extensive, detailed technical report
           – Research papers (journal, conference) with references to the source data and the technical report
  43. 43. Write an SLR paper 1/2
       • Intro
           – Motivation (the most important part)
               • Why the question is interesting
               • What the main question is
       • The overall method used
           – The questions, the search keywords, the sources of information
           – Selection process, data storage
       • Selected studies
           – Refer to the most important studies
           – Provide statistics and comment on them
  44. 44. Write an SLR paper 2/2
       • Synthesis (important)
           – The findings (short general description)
           – The findings related to the studies (classification/grouping of the studies)
       • Discussion
           – Additional findings, remarks, statistics from the studies related to the findings
       • Validation
           – Threats to validity
           – Validation procedures (these can be specified in the methods part)
       • Conclusion
       • List of primary studies
       • References
  45. 45. Some Research Databases
       • SCOPUS (http://www.scopus.com/home.url)
       • ACM Digital Library (http://portal.acm.org)
       • Compendex (http://www.engineeringvillage.com)
       • IEEE Xplore (http://www.ieee.org/web/publications/xplore/)
       • ScienceDirect – Elsevier (http://www.elsevier.com)
       • SpringerLink (http://www.springerlink.com)
       • Wiley InterScience (http://www3.interscience.wiley.com)
       • ISI Web of Science (http://www.isiknowledge.com)
  46. 46. References for the systematic review
       • Kitchenham, Barbara. Procedures for Performing Systematic Reviews, Joint Technical Report, Keele University TR/SE-0401 and NICTA 0400011T.1, July 2004.
       • Australian National Health and Medical Research Council. How to review the evidence: systematic identification and review of the scientific literature, 2000. ISBN 186-4960329.
       • Australian National Health and Medical Research Council. How to use the evidence: assessment and application of scientific evidence. February 2000, ISBN 0 642 43295 2.
       • Cochrane Collaboration. Cochrane Reviewers’ Handbook. Version 4.2.1. December 2003.
       • Glass, R.L., Vessey, I., Ramesh, V. Research in software engineering: an analysis of the literature. IST 44, 2002, pp. 491-506.
       • Magne Jørgensen and Kjetil Moløkken. How large are Software Cost Overruns? Critical Comments on the Standish Group’s CHAOS Reports, http://www.simula.no/publication_one.php?publication_id=711, 2004.
       • Magne Jørgensen. A Review of Studies on Expert Estimation of Software Development Effort. Journal of Systems and Software, Vol 70, Issues 1-2, 2004, pp. 37-60.
  47. 47. References for the systematic review
       • Khan, Khalid S., ter Riet, Gerben, Glanville, Julia, Sowden, Amanda J. and Kleijnen, Jo (eds). Undertaking Systematic Review of Research on Effectiveness. CRD’s Guidance for those Carrying Out or Commissioning Reviews. CRD Report Number 4 (2nd Edition), NHS Centre for Reviews and Dissemination, University of York, ISBN 1 900640 20 1, March 2001.
       • Pai, Madhukar, McCulloch, Michael, Gorman, Jennifer D., Pai, Nitika, Enanoria, Wayne, Kennedy, Gail, Tharyan, Prathap, Colford, John M. Jnr. Systematic reviews and meta-analysis: An illustrated, step-by-step guide. The National Medical Journal of India, 17(2) 2004, pp. 86-95.
       • Sackett, D.L., Straus, S.E., Richardson, W.S., Rosenberg, W., and Haynes, R.B. Evidence-Based Medicine: How to Practice and Teach EBM, Second Edition, Churchill Livingstone: Edinburgh, 2000.