Performing Large Scale Software Engineering Studies

Georgios Gousios
$whoami
SE = Empirical Science
Observation
Hypothes{ie}s
Models
Validate




Invalidate
[Figure: histogram of number of works (y-axis, 0 to 30) by sample size in number of projects (x-axis ticks: 0, 1, 2, 3, 4, 5, 6, 8, 10, 25, 50)]
Metric                  Gnome-VFS   Evolution   KDE
SCM History             10y 9m      12y 4m      14y
Num Revisions           5522        36835       >1200000
SCM Size                105 MB      1.4 GB      60 GB
Num Emails (Nov 2008)   20          240         6252
[Figure: bar chart of dataset size in GB (y-axis, 0 to 400) per dataset (x-axis: 700 projects, LSE (2003-2006), GenBank); legend: Others, KDE]
RCS, CVS, SVN, Darcs, Git, Hg
Postfix, MailMan, Marc
Bugzilla, SF.net Tracker, Jira, Gnats
Researcher's view (over time)
       1. Go to project.org
       2. Download SVN, Mail, Bug, ask for IRC logs
       3. .....?
       4. Publish research

(Some) project's view (over time)
       1. Hm, a new visitor
       2. Hey, she is mirroring our Bugzilla
       3. .....?
       4. Ban her!
How is empirical research done in mature disciplines?
Pre-processed data
Result Sharing
Replication
Research Platforms
Our research:
A platform for software engineering research
In our research
1. We examined the current situation
2. We proposed a platform for large-scale research
3. We validated its design with 2 case studies
Empirical Study = Model
                + Data
                + Metrics / Tools
                + Analysis Methods
                + Results analysis
Research methods
[Figure: bar chart of number of works (y-axis, 0 to 40) per research method used (x-axis: EXP, FCS, ECS, CCS, SUR)]
What sources of data are in use?
[Figure: bar chart of number of works (y-axis, 0 to 45) per data source (x-axis: BTS, SRC, SF, ECT, SCM)]
What is the examined data size (in number of projects)?
[Figure: histogram of number of works (y-axis, 0 to 30) by sample size in number of projects (x-axis ticks: 0, 1, 2, 3, 4, 5, 6, 8, 10, 25, 50)]
Findings
• Sample size very small
  • How can we extract generic results?
• No experiment replication
  • Do we believe in each other's work or just ignore it?
• We did not check the stats...
Only 20% of the tools and data reported in ICSE
papers could be retrieved a year after publication
We need better
empirical studies
Rigorous Evaluation
Freely Available Empirical Data
Tools and Results Sharing
Software Engineering Platform
Better Empirical Studies
A software engineering research platform
  Ready made tools
  Formalised data formats
  Easily extensible
  Pre-processed data
  Researcher community
  Large scale processing
Alitheia Core
Platform

Data      Tools   Processing
Data
  Raw Data: Mailing Lists, BTS, SCM
  Metadata: Processed raw data
  Tool Results
Raw Data
Mirroring Root (/)
  project.properties, Project 1, Project 2, Project 3
  Within a project: git, svn, mails, bugs
    git: standard Git format
    svn: standard SVN format
    mails: List 1, List 2
      within a list: tmp, cur, new
        cur: messageid.eml
    bugs: bug<id>.xml
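
A minimal sketch of how a script might take stock of such a mirror, assuming the folder names shown on the slide (git, svn, mails, bugs, with tmp/cur/new inside each mailing list). The function names `summarize_project` and `summarize_mirror` are hypothetical and are not part of Alitheia Core; the file patterns `*.eml` and `bug*.xml` are inferred from the example names `messageid.eml` and `bug<id>.xml`.

```python
from pathlib import Path


def summarize_project(project_dir: Path) -> dict:
    """Count the raw artefacts mirrored for one project (hypothetical helper)."""
    mails = project_dir / "mails"
    bugs = project_dir / "bugs"
    lists = sorted(p for p in mails.iterdir() if p.is_dir()) if mails.is_dir() else []
    # Assumption: each list is a Maildir-style folder. Delivered messages
    # sit in cur/ and new/; tmp/ only holds deliveries still in progress.
    messages = sum(
        1
        for mailing_list in lists
        for sub in ("cur", "new")
        if (mailing_list / sub).is_dir()
        for _ in (mailing_list / sub).glob("*.eml")
    )
    return {
        "scm": [name for name in ("git", "svn") if (project_dir / name).is_dir()],
        "mailing_lists": [p.name for p in lists],
        "messages": messages,
        "bugs": len(list(bugs.glob("bug*.xml"))) if bugs.is_dir() else 0,
    }


def summarize_mirror(root: Path) -> dict:
    """Map every project directory under the mirroring root to its summary."""
    return {p.name: summarize_project(p) for p in sorted(root.iterdir()) if p.is_dir()}
```

Calling `summarize_mirror(Path("/path/to/mirror"))` would return one dictionary per project directory; plain files at the root, such as project.properties, are skipped.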
Metadata
[Figure: metadata diagram; labels not recoverable from the extracted text]
                                                                                                                         ο€ο€ο€”ο€ƒο€‘ο€‰ο€¨ο€˜ο€Œο€ƒο€‰ο€‡ο€Œο€                                                                            ο€ο€ο€‰ο€¦ο€˜ο€‡ο€
                                                                                                                                                                                                    ο€ο€ο€†ο€…ο€‡ο€†ο€Šο€…ο€“ο€ο€
                                                                           ο€ο€ο€‘ο€‰ο€Šο€‹ο€‡ο€Œο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€                                                                                                                        
                                                                                                                                                                                                    ο€ο€ο€Œο€‡ο€‘ο€ο€‹ο€“ο€˜ο€‰ο€“ο€Šο€…ο€
                                                                         ο€šο€š                                                             ο€š                                                                                     ο€ο€ο€˜ο€”ο€’ο€„ο€“ο€…ο€
                                                                                  ο€š         ο€š                                                             ο€š                                     ο€š        ο€šο€š          ο€š
                                                                                                                          ο€…
                                              ο€…                                                              ο€…                       ο€…
                         ο€ο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€›ο€‡ο€‹ο€‘ο€“ο€Šο€…ο€     ο€…                                                                  
                                                                                        ο€…
                                                                                        ο€…
                         ο€ο€ο€˜ο€‹ο€Šο€Žο€‡ο€ο€‰ο€                                                                              ο€ο€ο€‘ο€‡ο€…ο€Œο€‡ο€‹ο€                       ο€…                          ο€…                                                                                                                  ο€…ο€…
                                                                                                                                                     ο€…                                  ο€…
                         ο€ο€ο€‹ο€‡ο€™ο€“ο€‘ο€“ο€Šο€…ο€œο€Œο€                            ο€…      ο€₯ο€‡ο€˜ο€Šο€‹ο€‰ο€ο€‡ο€‘ο€‘ο€ƒο€„ο€‡ο€                      
                                                                                                                                                   ο€ο€ο€ƒο€“ο€”ο€“ο€…ο€„ο€ ο€“ο€‘ο€‰ο€‚ο€Ÿο€‹ο€‡ο€ƒο€Œο€ο€‡ο€ƒο€‘ο€’ο€‹ο€‡ο€†ο€‡ο€…ο€‰ο€                                                                                        ο€ο€ˆο€‰ο€Šο€‹ο€‡ο€Œο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€ο€‡ο€ƒο€‘ο€’ο€‹ο€‡ο€†ο€‡ο€…ο€‰ο€
                         ο€ο€ο€‰ο€“ο€†ο€‡ο€‘ο€‰ο€ƒο€†ο€˜ο€                 ο€ο€‘ο€‡ο€™ο€‡ο€”ο€Šο€˜ο€‡ο€‹ο€ͺ                                           ο€ο€ο€†ο€‡ο€‘ο€‘ο€ƒο€„ο€‡ο€œο€Œο€                                                                                                                            ο€ο€‘ο€“ο€‹ο€‡ο€ο€‰ο€Šο€‹ο€¦ο€
                                                                         ο€ο€ο€žο€’ο€„ο€                                                                                                                                          ο€ο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€§ο€“ο€”ο€‡ο€ˆο€‰ο€ƒο€‰ο€‡ο€
                         ο€ο€ο€ο€Šο€†ο€†ο€“ο€‰ο€‰ο€‡ο€‹ο€                                                                            ο€ο€ο€‘ο€’ο€žο€Žο€‡ο€ο€‰ο€                              ο€ο€ο€‰ο€Ÿο€‹ο€‡ο€ƒο€Œο€                                                                                                              
                                                                 ο€ο€ο€‹ο€‡ο€˜ο€Šο€‹ο€‰ο€‡ο€‹ο€                                                                                                                                                                   ο€ο€ο€ˆο€•ο€ο€©ο€₯
                         ο€ο€ο€ο€Šο€†ο€†ο€“ο€‰ο€ο€‘ο€„ο€                                                                            ο€ο€ο€‘ο€‡ο€…ο€Œο€‘ο€ƒο€‰ο€‡ο€                                                                                                                                  
                                                      ο€ο€ο€Œο€‡ο€™ο€‡ο€”ο€Šο€˜ο€‡ο€‹ο€       ο€ο€ο€‰ο€“ο€†ο€‡ο€‘ο€‰ο€ƒο€†ο€˜ο€                                                                                                                                                                  ο€ο€ο€˜ο€ƒο€‰ο€Ÿο€
                                                                                                                   ο€ο€ο€žο€’ο€„ο€                                                                                                                                 ο€ο€ο€‘ο€‰ο€Šο€‹ο€‡ο€Œο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€
                                                                                                                                                                                                                            ο€š                     ο€š
                                                                                                          ο€ο€ο€‰ο€Ÿο€‹ο€‡ο€ƒο€Œο€
                         ο€ο€ο€žο€‹ο€ƒο€…ο€ο€Ÿο€‡ο€Œο€                                                                             ο€ο€ο€Œο€‡ο€˜ο€‰ο€Ÿο€
                         ο€ο€ο€†ο€‡ο€‹ο€„ο€‡ο€Œο€                                                                               ο€ο€ο€˜ο€ƒο€‹ο€‡ο€…ο€‰ο€
                     ο€š
                                             ο€š                                                                                      ο€š                                                                                                  ο€…
                               ο€š         ο€š                                                                                                                                                                                                         ο€…
                                                                                                                                                                                                                              ο€ο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€§ο€“ο€”ο€‡ο€
                                                                           ο€…                                                                                                                                         ο€… 
                ο€…                  ο€…              ο€…                                                                                                                                          ο€…
                                                                               ο€ο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€›ο€‡ο€‹ο€‘ο€“ο€Šο€…ο€ο€‡ο€ƒο€‘ο€’ο€‹ο€‡ο€†ο€‡ο€…ο€‰ο€               ο€…                                    ο€…                                ο€ο€ο€˜ο€‹ο€Šο€Žο€‡ο€ο€‰ο€›ο€‡ο€‹ο€‘ο€“ο€Šο€…ο€
ο€ο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€›ο€‡ο€‹ο€‘ο€“ο€Šο€…ο€ο€ƒο€‹ο€‡ο€…ο€‰ο€
                             ο€ο€€ο€‹ο€ƒο€…ο€ο€Ÿο€                                                                                                                                                                             
                                                                                                                                                       
ο€ο€ο€˜ο€ƒο€‹ο€‡ο€…ο€‰ο€                                                                                                                                                                                                              ο€ο€ο€“ο€‘ο€‘ο€“ο€‹ο€‡ο€ο€‰ο€Šο€‹ο€¦ο€
                                                                                                                                         
ο€ο€ο€ο€Ÿο€“ο€”ο€Œο€                                                                                                                                                                                                               ο€ο€ο€Œο€“ο€‹ο€
                                                                               ο€ο€ο€˜ο€‹ο€Šο€Žο€‡ο€ο€‰ο€›ο€‡ο€‹ο€‘ο€“ο€Šο€…ο€                                                                
                                                                                                                                                                                                                       ο€ο€ο€™ο€ƒο€”ο€“ο€Œο€§ο€‹ο€Šο€†ο€
                                                                                                                                                                                                                       ο€ο€ο€™ο€ƒο€”ο€“ο€Œο€¨ο€…ο€‰ο€“ο€”ο€     ο€š
                                                                                                                                                                                                                       ο€ο€ο€ο€Šο€˜ο€¦ο€§ο€‹ο€Šο€†ο€
                                                                                                                                                                                                                                                              ο€…              ο€…
                                                                                                                                                                                                                                                            ο€ο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€§ο€“ο€”ο€‡ο€ο€‡ο€ƒο€‘ο€’ο€‹ο€‡ο€†ο€‡ο€…ο€‰ο€

                                                                                                                                                                                                                           ο€ο€ο€‹ο€Šο€Žο€‡ο€ο€‰ο€‘ο€“ο€‹ο€‡ο€ο€‰ο€Šο€‹ο€¦ο€              
                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                           ο€ο€ο€˜ο€‹ο€Šο€Žο€‡ο€ο€‰ο€§ο€“ο€”ο€‡ο€
Tools
[Figure: platform architecture. Components: Logging, Job Scheduler, Metadata Updater, Web Services, Cluster Service, Metric Plug-ins, Metric Activator, DB Service, Messaging, Security, Web Admin, Plug-in Admin, Parser Service, Data Access. Three nodes labelled SQO-OSS.]
[Figure: project mirror and metadata storage. Project 1 contains svn, mails (List 1, List 2; tmp, cur, new) and bugs. Other labels: Project Mirror, Metadata Storage, PV.]
Tools
public interface AlitheiaPlugin {
    String getVersion();
    String getAuthor();
    Date getDateInstalled();
    String getName();
    String getDescription();
    List<Result> getResultIfAlreadyCalculated(DAObject o, List<Metric> l);
    List<Result> getResult(DAObject o, List<Metric> l);
    List<Metric> getAllSupportedMetrics();
    List<Metric> getSupportedMetrics(Class<? extends DAObject> activationType);
    void run(DAObject o);
    boolean update();
    boolean install();
    boolean remove();
    boolean cleanup(DAObject sp);
    String getUniqueKey();
    Set<Class<? extends DAObject>> getActivationTypes();
    List<Class<? extends DAObject>> getMetricActivationTypes (Metric m);
    Set<PluginConfiguration> getConfigurationSchema();
    Set<String> getDependencies();
    Map<MetricType.Type, SortedSet<Long>> getObjectIdsToSync(StoredProject sp, Metric m);
}
Tools
@MetricDeclarations(metrics = {
    @MetricDecl(mnemonic="MNOF", activators={ProjectDirectory.class},
               descr="Number of Source Code Files in Module"),
    @MetricDecl(mnemonic="MNOL", activators={ProjectDirectory.class},
               descr="Number of lines in module", dependencies={"Wc.loc"}),
    @MetricDecl(mnemonic="AMS", activators={ProjectVersion.class},
               descr="Average Module Size"),
    @MetricDecl(mnemonic="ISSRCMOD", activators={ProjectDirectory.class},
               descr="Mark for modules containing source files")
})
public class ModuleMetricsImplementation extends AbstractMetric {

    public void run(ProjectFile pf) throws AlreadyProcessingException {[...]}
    public void run(ProjectVersion pv) throws AlreadyProcessingException {[...]}

    public List<Result> getResult(ProjectFile pf, Metric m) {
          return getResult(pf, ProjectFileMeasurement.class, m, Result.ResultType.INTEGER);
    }

    public List<Result> getResult(ProjectVersion pv, Metric m) {
          return getResult(pv, ProjectVersionMeasurement.class, m, Result.ResultType.FLOAT);
    }

}
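The annotations on the class above keep each metric's mnemonic, activators and description next to the code that computes it. A minimal sketch of how such declarations could be read back at run time through reflection follows. The annotation types, `ProjectDirectory` and `DemoMetrics` are simplified stand-ins defined only for this example; they are assumptions, not the platform's actual classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

final class MetricDiscovery {
    // Returns the mnemonics declared on a plug-in class, in declaration order.
    // A class without the annotation yields an empty list.
    static List<String> declaredMnemonics(Class<?> plugin) {
        List<String> result = new ArrayList<>();
        MetricDeclarations decls = plugin.getAnnotation(MetricDeclarations.class);
        if (decls != null) {
            for (MetricDecl d : decls.metrics()) {
                result.add(d.mnemonic());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(declaredMnemonics(DemoMetrics.class));
    }
}

// Hypothetical stand-ins for the annotations shown on the slide.
// RUNTIME retention is what makes them visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@interface MetricDecl {
    String mnemonic();
    String descr();
    Class<?>[] activators() default {};
    String[] dependencies() default {};
}

@Retention(RetentionPolicy.RUNTIME)
@interface MetricDeclarations {
    MetricDecl[] metrics();
}

// Hypothetical activator type and plug-in class, for illustration only.
final class ProjectDirectory { }

@MetricDeclarations(metrics = {
    @MetricDecl(mnemonic = "MNOF", activators = {ProjectDirectory.class},
                descr = "Number of Source Code Files in Module"),
    @MetricDecl(mnemonic = "MNOL", activators = {ProjectDirectory.class},
                descr = "Number of lines in module", dependencies = {"Wc.loc"})
})
final class DemoMetrics { }
```

Declaring metrics this way lets a host discover what a plug-in offers without instantiating it, which is one plausible reason for the annotation-based style.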
Tools
public void run(ProjectFile pf) {
    // We do not support directories
    if (pf.getIsDirectory()) {
        return;
    }

    // Create an input stream from the project file's content
    InputStream in = fds.getFileContents(pf);
    if (in == null) {
        return;
    }

    // Measure the number of lines in the project file.
    // try-with-resources closes the reader even if reading fails.
    int lines = 0;
    try (LineNumberReader lnr =
            new LineNumberReader(new InputStreamReader(in))) {
        while (lnr.readLine() != null) {
            lines++;
        }
    } catch (IOException e) {
        log.error(this.getClass().getName() + " IO Error <" + e
                + "> while measuring: " + pf.getFileName());
        return;
    }

    // Store the results
    Metric metric = Metric.getMetricByMnemonic("LOC");
    ProjectFileMeasurement locm = new ProjectFileMeasurement();
    locm.setMetric(metric);
    locm.setProjectFile(pf);
    locm.setWhenRun(new Timestamp(System.currentTimeMillis()));
    locm.setResult(String.valueOf(lines));

    db.addRecord(locm);
    markEvaluation(metric, pf.getProjectVersion().getProject());
}
Processing - The scmmap algorithm
[Figure: CPU]
Processing - The idmap algorithm
[Figure: CPU]
Processing - in clusters
Line counting speed
[Figure: bar chart of line-counting time in minutes (axis 0 to 60) for a naive implementation and for Alitheia Core]
New cluster node connects
Case Studies
Do intense conversations affect short-term project development?
How do we identify intense discussions?
[Figure: two histograms. Left, "Number of messages": occurrences (0 to 100000) against number of messages per thread (0 to 25). Right, "Levels of thread depth": occurrences (0 to 120000) against thread depth (0 to 25).]
Hypotheses
• H1: Number of messages and thread depth are dependent variables
• H2: We can identify intense discussions by identifying threads in the top depth and messages-per-thread quartiles
• H3: Intense discussions affect the repository's source line intake
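H2 amounts to a filter on two per-thread variables. A minimal sketch of that selection follows, under two assumptions the slides do not state: the cutoff is the nearest-rank third quartile, and a thread counts as "in the top quartile" when its value is at or above that cutoff. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IntenseThreads {
    // Nearest-rank third quartile: the smallest observed value that has
    // at least 75% of the observations less than or equal to it.
    static int thirdQuartile(int[] values) {
        int[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(0.75 * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    // Returns the indexes of threads at or above the cutoff on BOTH variables.
    static List<Integer> select(int[] depth, int[] messages) {
        if (depth.length == 0 || depth.length != messages.length) {
            throw new IllegalArgumentException(
                "need two non-empty arrays of equal length");
        }
        int depthCutoff = thirdQuartile(depth);
        int messageCutoff = thirdQuartile(messages);
        List<Integer> intense = new ArrayList<>();
        for (int i = 0; i < depth.length; i++) {
            if (depth[i] >= depthCutoff && messages[i] >= messageCutoff) {
                intense.add(i);
            }
        }
        return intense;
    }

    public static void main(String[] args) {
        // Made-up values for eight threads, purely for illustration.
        int[] depth = {1, 1, 2, 2, 3, 3, 8, 9};
        int[] messages = {1, 2, 2, 3, 3, 4, 20, 5};
        System.out.println(select(depth, messages));
    }
}
```

With heavily tied data, as thread depths tend to be, the choice between "at or above" and "strictly above" the cutoff changes how many threads are selected, so it is worth stating explicitly in a study.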
Method
• Import projects in Alitheia Core
• Develop a metric plug-in to count the variables we are interested in
  • 3 metrics
• Emails from 60 projects: ~1.2 * 10^6 emails, 679427 threads
• Plug-in size: 270 lines of code
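The two variables counted per thread can be derived from reply links alone. The following is a minimal sketch, assuming each message records the id of the message it replies to (null for the thread root) and that the root sits at depth 1. This representation is hypothetical and is not the plug-in's data model.

```java
import java.util.HashMap;
import java.util.Map;

final class ThreadStats {
    // parentOf maps a message id to the id it replies to;
    // the thread root maps to null. The links are assumed to be acyclic.
    static int messageCount(Map<String, String> parentOf) {
        return parentOf.size();
    }

    // Depth of the thread: the largest number of levels from the root
    // down to any message, counting the root as level 1.
    static int depth(Map<String, String> parentOf) {
        Map<String, Integer> memo = new HashMap<>();
        int max = 0;
        for (String id : parentOf.keySet()) {
            max = Math.max(max, level(id, parentOf, memo));
        }
        return max;
    }

    // Level of one message, memoised so each message is resolved once.
    // A parent id that is missing from the map is treated as a root.
    private static int level(String id, Map<String, String> parentOf,
                             Map<String, Integer> memo) {
        Integer known = memo.get(id);
        if (known != null) {
            return known;
        }
        String parent = parentOf.get(id);
        int lvl = (parent == null) ? 1 : 1 + level(parent, parentOf, memo);
        memo.put(id, lvl);
        return lvl;
    }

    public static void main(String[] args) {
        Map<String, String> thread = new HashMap<>();
        thread.put("a", null); // root
        thread.put("b", "a");  // reply to a
        thread.put("c", "a");  // reply to a
        thread.put("d", "c");  // reply to c
        System.out.println(messageCount(thread) + " messages, depth "
                + depth(thread));
    }
}
```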
[Figure: scatter plot of number of messages (0 to 100) against thread depth (0 to 50), with fit function 0.55 + 1.59x]
H1: R^2 = 0.70
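A fit function of the form 0.55 + 1.59x together with an R^2 value is what a simple linear regression reports. The slides do not state the fitting method, so the sketch below assumes ordinary least squares with an intercept, for which R^2 equals the squared correlation between x and y. The class name and the sample data are made up.

```java
final class LinearFit {
    final double intercept;
    final double slope;
    final double rSquared;

    private LinearFit(double intercept, double slope, double rSquared) {
        this.intercept = intercept;
        this.slope = slope;
        this.rSquared = rSquared;
    }

    // Ordinary least-squares fit of y = intercept + slope * x.
    static LinearFit of(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException(
                "need two arrays of equal length with at least 2 points");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of squares and cross-products around the means.
        double sxx = 0.0;
        double sxy = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        if (sxx == 0.0 || syy == 0.0) {
            throw new IllegalArgumentException("x and y must both vary");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        double rSquared = (sxy * sxy) / (sxx * syy);
        return new LinearFit(intercept, slope, rSquared);
    }

    public static void main(String[] args) {
        // Illustrative data only: x plays the role of thread depth,
        // y the role of number of messages.
        LinearFit fit = LinearFit.of(new double[]{1, 2, 3, 4},
                                     new double[]{3, 5, 7, 9});
        System.out.printf("y = %.2f + %.2fx, R^2 = %.2f%n",
                fit.intercept, fit.slope, fit.rSquared);
    }
}
```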
• H2: 99 discussion threads
• H3:
  Project            HotEffect
  Avogadro                 622
  Deskbar-Applet          -103
  FreeBSD                  -34
  Gnome-Network           -106
  Gnome-Utils             -263
  GTK+                     183
  Sabayon                 -244
  Vala                    -356
  LSR                     1150
  Meld                      27
Does the number of programmers affect code maintainability?
Hypotheses
• H1: Number of programmers affects code maintainability at the project level
• H2: Number of programmers affects code maintainability at the directory level
Method
• Per language
  • C & Java (risky)
• Plug-ins that calculate
  • Number of developers per period of time
  • Halstead & McCabe (700 lines)
  • Oman's Maintainability Index (240 lines)
• Data from 213 projects
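The slides name Halstead, McCabe and the Maintainability Index without saying which variant the plug-ins compute. As a rough sketch, the widely cited three-metric form is MI = 171 - 5.2 ln(V) - 0.23 G - 16.2 ln(LOC), where V is the average Halstead volume, G the average cyclomatic complexity and LOC the average lines of code per module. The class below is an illustration of that formula under this assumption, not the plug-in's code.

```java
final class Maintainability {
    // Halstead volume: program length times log2 of the vocabulary size.
    static double halsteadVolume(int distinctOperators, int distinctOperands,
                                 int totalOperators, int totalOperands) {
        int vocabulary = distinctOperators + distinctOperands;
        int length = totalOperators + totalOperands;
        return length * (Math.log(vocabulary) / Math.log(2.0));
    }

    // Three-metric Maintainability Index (assumed variant, see lead-in).
    // All inputs are per-module averages; volume and LOC must be positive.
    static double index(double avgVolume, double avgCyclomatic, double avgLoc) {
        if (avgVolume <= 0.0 || avgLoc <= 0.0) {
            throw new IllegalArgumentException(
                "volume and LOC must be positive");
        }
        return 171.0
                - 5.2 * Math.log(avgVolume)
                - 0.23 * avgCyclomatic
                - 16.2 * Math.log(avgLoc);
    }

    public static void main(String[] args) {
        // Made-up inputs for a single small module.
        double volume = halsteadVolume(10, 15, 60, 80);
        System.out.printf("V = %.1f, MI = %.1f%n",
                volume, index(volume, 4.0, 40.0));
    }
}
```

Higher values indicate code that is easier to maintain under this model; larger volume, complexity or size all push the index down.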
Maintainability at the project level
           R^2 = 0.04
Maintainability at the module level
            R^2 = 0.05
Per language

 C: R^2 = 0.08
Java: R^2 = 0.07
Why do we need large scale research?
Project                HotEffect
Banshee                    -1121
Deskbar-Applet              -103
FreeBSD                      -34
Gnome-Power-Manager         -773
Gnome-Utils                 -263
GTranslator                 -230
Sabayon                     -244
Vala                        -356
[Figure: R^2 at the module level for all projects (y-axis: correlation coefficient, 0 to 1)]
TODO
Replication of published studies
Platform Data validation
Repositories for tools, data and results
The SoftEng cloud project
Thank you!


http://www.sqo-oss.org
   Georgios Gousios
   gousiosg@aueb.gr

θ©Ήε‰‘ι”‹οΌšBig databenchβ€”benchmarking big data systemsθ©Ήε‰‘ι”‹οΌšBig databenchβ€”benchmarking big data systems
θ©Ήε‰‘ι”‹οΌšBig databenchβ€”benchmarking big data systems
Β 
EKON 23 Code_review_checklist
EKON 23 Code_review_checklistEKON 23 Code_review_checklist
EKON 23 Code_review_checklist
Β 
SlideShare.pptx
SlideShare.pptxSlideShare.pptx
SlideShare.pptx
Β 
SeCold - A Linked Data Platform for Mining Software Repositories
SeCold - A Linked Data Platform for  Mining Software RepositoriesSeCold - A Linked Data Platform for  Mining Software Repositories
SeCold - A Linked Data Platform for Mining Software Repositories
Β 
When develpment met test(shift left testing)
When develpment met test(shift left testing)When develpment met test(shift left testing)
When develpment met test(shift left testing)
Β 
housing price prediction ppt in artificial
housing price prediction ppt in artificialhousing price prediction ppt in artificial
housing price prediction ppt in artificial
Β 
Ross Tredinnick - Rebecca J. Holz Research Data Management Talk 4/16/2013
Ross Tredinnick - Rebecca J. Holz Research Data Management Talk 4/16/2013Ross Tredinnick - Rebecca J. Holz Research Data Management Talk 4/16/2013
Ross Tredinnick - Rebecca J. Holz Research Data Management Talk 4/16/2013
Β 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Β 
A Journey into Databricks' Pipelines: Journey and Lessons Learned
A Journey into Databricks' Pipelines: Journey and Lessons LearnedA Journey into Databricks' Pipelines: Journey and Lessons Learned
A Journey into Databricks' Pipelines: Journey and Lessons Learned
Β 
Reproducible research concepts and tools
Reproducible research concepts and toolsReproducible research concepts and tools
Reproducible research concepts and tools
Β 
Resilience Engineering: A field of study, a community, and some perspective s...
Resilience Engineering: A field of study, a community, and some perspective s...Resilience Engineering: A field of study, a community, and some perspective s...
Resilience Engineering: A field of study, a community, and some perspective s...
Β 

Performing Large Scale Software Engineering Studies: A Platform for Empirical Research

  • 1. Performing Large Scale Software Engineering Studies Georgios Gousios
  • 3. SE = Empirical Science
  • 4.
  • 5.
  • 6.
  • 11.
  • 12.
  • 13. [Chart: number of works by sample size (number of projects)]
  • 14.
      Metric                  Gnome-VFS   Evolution   KDE
      SCM History             10y 9m      12y 4m      14y
      Num Revisions           5522        36835       >1200000
      SCM Size                105 MB      1.4 GB      60 GB
      Num Emails (Nov 2008)   20          240         6252
  • 15. [Chart: size in GB per dataset: 700 projects (KDE, others), LSE (2003-2006), GenBank]
  • 16. RCS, CVS, SVN, Darcs, Git, Hg; Postfix, MailMan, Marc; Bugzilla, SF.net Tracker, Jira, Gnats
  • 17. Researcher’s view (over time): 1. Go to project.org 2. Download SVN, Mail, Bug, ask for IRC logs 3. .....? 4. Publish research
  • 18. Researcher’s view (over time): 1. Go to project.org 2. Download SVN, Mail, Bug, ask for IRC logs 3. .....? 4. Publish research. (Some) project’s view: 1. Hm, a new visitor 2. Hey, she is mirroring our bugzilla 3. .....? 4. Ban her!
  • 20. How is empirical research done in mature disciplines?
  • 25.
  • 26.
  • 27. Our research: A platform for software engineering research
  • 28. In our research 1. We examined the current situation 2. We propose a platform for large scale research 3. We validated its design with 2 case studies
  • 29. Empirical Study = Model + Data + Metrics / Tools + Analysis Methods + Results analysis
  • 33. Research methods [Chart: number of works per research method used (EXP, FCS, ECS, CCS, SUR)]
  • 34. What sources of data are in use? [Chart: number of works per data source (BTS, SRC, SF, ECT, SCM)]
  • 35. What is the examined data size (in number of projects)? [Chart: number of works by sample size (number of projects)]
  • 36. Findings
      • Sample size very small
      • How can we extract generic results?
      • No experiment replication
      • Do we believe in each other’s work or just ignore it?
      • We did not check the stats...
  • 37. Only 20% of the tools and data reported in ICSE papers could be retrieved a year after publication
  • 39. Better Empirical Studies: Rigorous Evaluation; Freely Available Empirical Data; Tools and Results Sharing; Software Engineering Platform
  • 40. A software engineering research platform
  • 41. A software engineering research platform: Ready made tools; Formalised data formats; Easily extensible; Researcher community; Large scale processing; Pre processed data
  • 42.
  • 46. Platform Data Tools
  • 47. Platform Data Tools Processing
  • 48.
  • 49. Data: Raw Data (Mailing Lists, BTS, SCM); Metadata (Processed raw data); Tool Results
  • 50. Mirroring (raw data)
      Root/
        Project 1/
          project.properties
          git/      (standard GIT format)
          svn/      (standard SVN format)
          mails/
            List 1/   (tmp/, cur/, new/ holding messageid.eml files)
            List 2/
          bugs/
            bug<id>.xml
        Project 2/
        Project 3/
  • 52.
  • 53. Metadata [Diagram; its text is not recoverable from the extraction]
  • 54. Tools
  • 55. Tools [Architecture diagram: metric plug-ins on top of the SQO-OSS core services (Job Scheduler, Metadata Updater, Web services, Cluster Service, Metric Activator, Logging, DB Service, Messaging, Web Admin, Plug-in Admin, Parser Service, Data Access, Security), which sit on the Project Mirror (svn, mails, bugs) and the Metadata Storage]
  • 56. Tools
      public interface AlitheiaPlugin {
          String getVersion();
          String getAuthor();
          Date getDateInstalled();
          String getName();
          String getDescription();
          List<Result> getResultIfAlreadyCalculated(DAObject o, List<Metric> l);
          List<Result> getResult(DAObject o, List<Metric> l);
          List<Metric> getAllSupportedMetrics();
          List<Metric> getSupportedMetrics(Class<? extends DAObject> activationType);
          void run(DAObject o);
          boolean update();
          boolean install();
          boolean remove();
          boolean cleanup(DAObject sp);
          String getUniqueKey();
          Set<Class<? extends DAObject>> getActivationTypes();
          List<Class<? extends DAObject>> getMetricActivationTypes(Metric m);
          Set<PluginConfiguration> getConfigurationSchema();
          Set<String> getDependencies();
          Map<MetricType.Type, SortedSet<Long>> getObjectIdsToSync(StoredProject sp, Metric m);
      }
  • 58. Tools
      @MetricDeclarations(metrics = {
          @MetricDecl(mnemonic="MNOF", activators={ProjectDirectory.class},
              descr="Number of Source Code Files in Module"),
          @MetricDecl(mnemonic="MNOL", activators={ProjectDirectory.class},
              descr="Number of lines in module", dependencies={"Wc.loc"}),
          @MetricDecl(mnemonic="AMS", activators={ProjectVersion.class},
              descr="Average Module Size"),
          @MetricDecl(mnemonic="ISSRCMOD", activators={ProjectDirectory.class},
              descr="Mark for modules containing source files")
      })
      public class ModuleMetricsImplementation extends AbstractMetric {

          public void run(ProjectFile pf) throws AlreadyProcessingException {[...]}

          public void run(ProjectVersion pv) throws AlreadyProcessingException {[...]}

          public List<Result> getResult(ProjectFile pf, Metric m) {
              return getResult(pf, ProjectFileMeasurement.class, m, Result.ResultType.INTEGER);
          }

          public List<Result> getResult(ProjectVersion pv, Metric m) {
              return getResult(pv, ProjectVersionMeasurement.class, m, Result.ResultType.FLOAT);
          }
      }
  • 59. Tools
      public void run(ProjectFile pf) {
          // We do not support directories
          if (pf.getIsDirectory()) {
              return;
          }

          // Create an input stream from the project file's content
          InputStream in = fds.getFileContents(pf);
          if (in == null) {
              return;
          }

          try {
              // Measure the number of lines in the project file
              LineNumberReader lnr = new LineNumberReader(new InputStreamReader(in));
              int lines = 0;
              while (lnr.readLine() != null) {
                  lines++;
              }
              lnr.close();

              // Store the results
              Metric metric = Metric.getMetricByMnemonic("LOC");
              ProjectFileMeasurement locm = new ProjectFileMeasurement();
              locm.setMetric(metric);
              locm.setProjectFile(pf);
              locm.setWhenRun(new Timestamp(System.currentTimeMillis()));
              locm.setResult(String.valueOf(lines));
              db.addRecord(locm);
              markEvaluation(metric, pf.getProjectVersion().getProject());
          } catch (IOException e) {
              log.error(this.getClass().getName() + " IO Error <" + e
                  + "> while measuring: " + pf.getFileName());
          }
      }
  • 60. Processing - The scmmap algorithm CPU
  • 61. Processing - the idmap algorithm CPU
  • 62. Processing - in clusters
  • 63. Line counting speed [Chart: minutes taken, naive implementation vs. Alitheia Core]
  • 64. New cluster node connects
  • 66. Do intense conversations affect short-term project development?
  • 67. How do we identify intense discussions? [Charts: "Number of messages" (occurrences by number of messages per thread) and "Levels of thread depth" (occurrences by thread depth)]
  • 68. Hypotheses
      • H1: Number of messages and thread depth are dependent variables
      • H2: We can identify intense discussions by identifying threads in top depth and msg/thread quartiles
      • H3: Intense discussions affect the repository’s source line intake
  • 69. Method
      • Import projects in Alitheia Core
      • Develop metric plug-in to count the variables we are interested in
      • 3 metrics
      • Emails from 60 projects, ~1.2 * 10^6 emails, 679427 threads
      • Plug-in LOC: 270 lines
  • 70. H1: R^2 = 0.70 [Chart: number of messages vs. thread depth, with fit function 0.55 + 1.59x]
  • 72. H2: 99 discussion threads. H3:
      Project          HotEffect
      Avogadro         622
      Deskbar-Applet   -103
      FreeBSD          -34
      Gnome-Network    -106
      Gnome-Utils      -263
      GTK+             183
      Sabayon          -244
      Vala             -356
      LSR              1150
      Meld             27
  • 75. Does the number of programmers affect code maintainability?
  • 76. Hypotheses
      • H1: Number of programmers affects code maintainability at the project level
      • H2: Number of programmers affects code maintainability at the directory level
  • 77. Method
      • Per language
      • C & Java (risky)
      • Plug-ins that calculate
      • Number of developers per period of time
      • Halstead & McCabe (700 lines)
      • Oman’s Maintainability Index (240 lines)
      • Data from 213 projects
  • 78. Maintainability at the project level R^2 = 0.04
  • 80. Maintainability at the module level R^2 = 0.05
  • 82. Per language C: R^2 = 0.08 Java: R^2 = 0.07
  • 84. Why do we need large scale research?
  • 85.
      Project               HotEffect
      Banshee               -1121
      Deskbar-Applet        -103
      FreeBSD               -34
      Gnome-Power-Manager   -773
      Gnome-Utils           -263
      GTranslator           -230
      Sabayon               -244
      Vala                  -356
  • 86. [Chart: R^2 at the module level for all projects; y-axis: correlation coefficient, 0 to 1]
  • 87. TODO
  • 89. Platform Data validation
  • 90. Repositories for tools, data and results
  • 91. The SoftEng cloud project
  • 92. Thank you! http://www.sqo-oss.org Georgios Gousios gousiosg@aueb.gr
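
Slide 70 tests H1 with a linear fit of the number of messages against thread depth (fit function 0.55 + 1.59x, R^2 = 0.70). The slides do not show how the fit was computed, so what follows is only a minimal sketch of an ordinary least-squares fit in plain Java, assuming paired (depth, messages) samples. The class name `ThreadDepthFit` and the sample values are made up for illustration; they are not data from the study.

```java
import java.util.Locale;

/** Sketch: least-squares fit of y = a + b*x and its R^2 (hypothetical helper, not part of Alitheia Core). */
final class ThreadDepthFit {

    /** Returns {intercept, slope, rSquared} for the paired samples x[i], y[i]. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired samples");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of squared deviations and of cross products
        double sxx = 0, sxy = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        // For a one-variable linear fit, R^2 = sxy^2 / (sxx * syy);
        // a constant y is fitted exactly, so R^2 is reported as 1 in that case.
        double rSquared = (syy == 0) ? 1.0 : (sxy * sxy) / (sxx * syy);
        return new double[] {intercept, slope, rSquared};
    }

    public static void main(String[] args) {
        // Made-up (thread depth, number of messages) pairs
        double[] depth = {1, 2, 3, 4, 5};
        double[] messages = {2, 4, 5, 4, 5};
        double[] r = fit(depth, messages);
        System.out.printf(Locale.ROOT, "fit: %.2f + %.2fx, R^2 = %.2f%n", r[0], r[1], r[2]);
    }
}
```

In a setting like the one on slide 69, the two arrays would be filled from the per-thread results stored by the metric plug-in rather than typed in by hand.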

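H2 on slide 68 proposes finding intense discussions by picking the threads that fall in the top quartile for both thread depth and messages per thread. The slides do not say how the quartiles were computed, so the sketch below makes two assumptions of its own: the third quartile is taken with the nearest-rank method, and a thread counts as intense only when it is strictly above that cut-off on both measures. `IntenseThreads` and `MailThread` are hypothetical names, and the sample threads are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: select threads above the third quartile in both message count and depth. */
final class IntenseThreads {

    /** Hypothetical summary of one mailing-list thread. */
    static final class MailThread {
        final String id;
        final int messages;
        final int depth;

        MailThread(String id, int messages, int depth) {
            this.id = id;
            this.messages = messages;
            this.depth = depth;
        }
    }

    /** Third quartile by the nearest-rank method (an assumption, see the text above). */
    static int upperQuartile(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        int[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(0.75 * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    /** Returns the threads that exceed both cut-offs, in their original order. */
    static List<MailThread> select(List<MailThread> threads) {
        int messageCut = upperQuartile(threads.stream().mapToInt(t -> t.messages).toArray());
        int depthCut = upperQuartile(threads.stream().mapToInt(t -> t.depth).toArray());
        List<MailThread> intense = new ArrayList<>();
        for (MailThread t : threads) {
            if (t.messages > messageCut && t.depth > depthCut) {
                intense.add(t);
            }
        }
        return intense;
    }

    public static void main(String[] args) {
        List<MailThread> threads = List.of(
            new MailThread("t1", 1, 1), new MailThread("t2", 2, 1),
            new MailThread("t3", 2, 2), new MailThread("t4", 3, 2),
            new MailThread("t5", 3, 2), new MailThread("t6", 4, 3),
            new MailThread("t7", 30, 2),   // long but flat: many messages, shallow depth
            new MailThread("t8", 25, 12)); // long and deep
        for (MailThread t : select(threads)) {
            System.out.println(t.id);
        }
    }
}
```

Requiring both conditions is what keeps a long but flat thread (such as an announcement with many direct replies) out of the result; relaxing it to either condition would be a different, looser definition of intensity.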
Editor's Notes

  3. Software engineering is an empirical science, as it tries to explain phenomena that occur in software development using data that result from it.
  4. This statement is not mine; it belongs to Vic Basili and to the other people who said or implied it before him, worked with him, or followed his recommendation and started a new and exciting research area, MSR. Being an empirical science, software engineering has to follow the steps of the scientific method.
  6. As required by the scientific method, we observe the behaviour or the data...
  7. We formulate hypotheses...
  8. And we build models...
  9. We validate or invalidate those hypotheses by running them against data from the real world. As an empirical science, we are in constant need of data.
  10. While in other empirical fields it is difficult and expensive to get empirical data (consider medicine, for example), in software engineering we have the OSS movement, which produces vast quantities of it. So the question that comes to mind is “can we use all that free data to do research with?”
  11. This may sound like a rhetorical question, but data shows that it is not.
  12. From a systematic review we have done, most software engineering papers (even those in good journals and conferences) validate their hypotheses with data from just a couple of projects.
  13. Trying to explain why, let’s have a look at some project numbers first. The largest project repository that one can work with is KDE, at ~50GB of data. In a talk, Audris Mockus said that he collected more than 1TB of data.
  14. If we compare to other empirical sciences
  15. The other great problem is that of original data disparity. Most research so far was done with CVS and Bugzilla. In reality, there are a lot of tools that store process data.
  16. Then there is the problem of non-cooperating projects, which is understandable: projects bear the cost of running and maintaining the infrastructure, while researchers almost never give back to the community.
  19. So let’s see how research is done in more mature disciplines.
  20. They have large data sets that are pre-processed. Most of the time, researchers do not have to download new data and massage them before they start conducting experiments. The FLOSSmole project has shown the way.
  21. Researchers share their results in the form of workshops and conferences, competitions and, more significantly, the tools that produce them.
  22. They also do not take everything published for granted, as they replicate the studies and the findings. We can give the example of medicine here.
  23. And most important of all, most other empirical research disciplines have research platforms: shared tools that permit researchers to conduct research using standardised input and output formats on standardised datasets.
  24. But most of them are already using machines like this (and get funding) to do their experiments on shared research infrastructures.
  25. Or request huge amounts of funding for building machines like this.
  28. By reading more than 300 papers and conducting a systematic review of more than 70 randomly selected ones.
  37. Our results are reinforced by a similar study that Carlo Ghezzi did for his ICSE 2009 keynote.
  39. Our proposal for better empirical studies is based on 4 principles: rigorous verification of ideas; freely available experimental data; easier circulation of results and tools.
  40. When we say the word platform, several requirements appear in front of us.
  46. All those reasons led our research group to the SQO-OSS project, which produced the Alitheia Core tool. The project’s aim was to produce software quality analysis tools, but the original targets strayed towards creating infrastructures rather than the tools themselves.
  67. This is the interface each metric plug-in must satisfy.
  68. This is what a metric plug-in looks like.
  69. This is the implementation of the line counting plug-in I’ve presented to you earlier on, minus some bureaucracy (constructors, imports etc). Just about 20 LOC to retrieve a file from a revision, read its lines and store the results in a database. This is comparable to a shell or Python script, but it is much faster. The abstractions Alitheia Core offers are very high level and cross-platform, and this is why the overhead it adds to algorithm implementation is minimal.
  72. This is our cluster. Left: a 3-thread processing node + project server. Middle: a 6-thread processing core. Bottom: mirroring and storage of raw data, file server, database and 8 processing threads to keep the CPUs busy during I/O. Right: web server + 16 slow processing threads. To scale, we just need more processing nodes.
  74. Another example of the cluster on its knees, as a display of linear scalability. This is the screen of the database server, running just the database, while other nodes start connecting: the increase in queries per second is almost linear, and the load equals the number of processor cores -> the machine is saturated.
  88. The same results
  89. Now the interesting bit
  90. Selected results from the first case study
  91. Now imagine that I wanted to publish a paper or present results to get money from a grant: what would I do?
  92. I hope I’ve persuaded you about the usefulness of such studies; let’s see what we have to do from now on.
  93. Having such a huge database, I think that we can refute several existing works :-)
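
Note 69 compares the line counting plug-in to a shell or Python script. For readers without an Alitheia Core installation, the counting step on its own can be sketched in plain Java as below. This is a stand-in under stated assumptions, not the plug-in itself: `LineCounter` and `countLines` are hypothetical names, decoding the bytes as UTF-8 is an arbitrary choice, and none of the Alitheia Core services used on slide 59 (`fds`, `db`, `markEvaluation`) appear here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Sketch: count the lines of a stream, closing the reader even if reading fails. */
final class LineCounter {

    /** Counts lines the way readLine() sees them; a last line without a newline still counts. */
    static int countLines(InputStream in) {
        // try-with-resources closes the reader (and the wrapped stream) on every path
        try (LineNumberReader reader =
                 new LineNumberReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch's signature simple
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "int a;\nint b;\n\nreturn a + b;".getBytes(StandardCharsets.UTF_8);
        System.out.println(countLines(new ByteArrayInputStream(content)));
    }
}
```

Compared with the slide's version, the only deliberate difference in the counting loop is the try-with-resources block, which closes the reader when `readLine()` throws as well as on the normal path.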