Software Systems as Cities: a Controlled Experiment
The slides for the presentation I gave at ICSE 2011, in Honolulu, Hawaii.

  • Hi, I’m Richard Wettel, and I am here to present Software Systems as Cities: A Controlled Experiment. This is work I did while I was a PhD student at the University of Lugano, in collaboration with my advisor, Michele Lanza, and with Romain Robbes, from the University of Chile.
  • Before diving into the controlled experiment, I’d like to give you a brief description of the approach evaluated in this work. Our software visualization approach mostly addresses object-oriented software and is based on the following metaphor...
  • The system is a city, its packages are the city’s districts, and its classes are the city’s buildings. The visible properties of the city artifacts reflect a set of software metrics. One of the configurations we often use is the following: the nesting level of a package is mapped to the color of its district (from dark to light gray), while for classes, the number of methods is mapped to the building’s height, the number of attributes to its base size, and the number of lines of code to the building’s color (from dark gray to intense blue). Since software has so many facets, it was important for us to have a versatile metaphor that can be employed in different contexts. We applied it in the following three:
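As an aside, the class-level mapping just described can be sketched in a few lines of Python. This is only an illustration of the idea, not CodeCity's implementation: the names `ClassMetrics`, `Building`, `lerp_color`, and `to_building`, the RGB values, and the linear scaling of lines of code against a maximum are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical endpoint colors for the lines-of-code scale (RGB).
DARK_GRAY = (64, 64, 64)
INTENSE_BLUE = (0, 0, 255)


@dataclass
class ClassMetrics:
    name: str
    methods: int     # number of methods
    attributes: int  # number of attributes
    loc: int         # lines of code


@dataclass
class Building:
    name: str
    height: int   # driven by the number of methods
    base: int     # driven by the number of attributes
    color: tuple  # driven by the lines of code


def lerp_color(start, end, t):
    """Linearly interpolate between two RGB colors, with t clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def to_building(metrics, max_loc):
    """Map one class to a building; max_loc is the largest LOC in the system."""
    t = metrics.loc / max_loc if max_loc else 0.0
    return Building(
        name=metrics.name,
        height=metrics.methods,
        base=metrics.attributes,
        color=lerp_color(DARK_GRAY, INTENSE_BLUE, t),
    )
```

A class with no code at all stays dark gray, while the class with the most lines of code in the system turns intense blue.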
  • Program comprehension. This is a code city of ArgoUML, a Java system of about 140 thousand lines of code. The visualization gives us a structural overview of the system and reveals several patterns in the form of building archetypes.
  • Antenna-like skyscrapers, representing classes with many methods and few attributes,
  • Office buildings for classes with many methods and many attributes,
  • Parking lots, for classes with few methods and a lot of attributes, as in the case of this huge Java interface (the color of the parking lot shows that it does not contain any code, just a whole bunch of constants)
  • Or houses, for small classes with few attributes and methods.
  • This was the first context, program comprehension.
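The four building archetypes above boil down to two questions: does the class have many methods, and does it have many attributes? A minimal sketch follows, assuming fixed cut-offs. The function name `archetype` and the threshold values are hypothetical; in the visualization itself the archetypes are recognized by eye, not by a rule like this.

```python
def archetype(methods, attributes, many_methods=20, many_attributes=10):
    """Name the building archetype for a class.

    The two thresholds are illustrative assumptions, not values from the talk.
    """
    tall = methods >= many_methods        # many methods -> tall building
    wide = attributes >= many_attributes  # many attributes -> large base
    if tall and wide:
        return "office building"
    if tall:
        return "skyscraper"
    if wide:
        return "parking lot"
    return "house"
```

For example, `archetype(50, 2)` gives `"skyscraper"` and `archetype(1, 300)` gives `"parking lot"`.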
  • The second context is assessing the quality of software design. For this we use code smells, which are violations of the rules of good object-oriented design. For example, brain class and god class are design problems related to excessive complexity and a lack of collaboration, which is essential in an object-oriented system. A data class is the opposite: a bare container of data, which does not have any behavior. In our approach, we assign vivid colors to classes affected by such design problems, while the unaffected ones are gray and thus appear muted. This visualization, called a disharmony map, enables us to focus on the problems in the context of the entire system.
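The coloring rule of the disharmony map can be sketched as a simple lookup. The specific colors, the priority order among the smells, and the names `SMELL_COLORS` and `disharmony_color` are assumptions for illustration. The only thing taken from the description above is that affected classes get a vivid color and all others stay gray.

```python
MUTED_GRAY = (128, 128, 128)

# Hypothetical vivid colors, listed in the (assumed) priority order used
# when a class is affected by more than one design problem.
SMELL_COLORS = {
    "god class": (255, 0, 0),
    "brain class": (255, 165, 0),
    "data class": (0, 200, 0),
}


def disharmony_color(smells):
    """Return the display color for a class, given the set of smells it has."""
    for smell, color in SMELL_COLORS.items():
        if smell in smells:
            return color
    return MUTED_GRAY  # unaffected classes stay in the background
```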
  • The third application context of our approach is system evolution analysis. One of the techniques we developed in this context is time travel, which allows us to watch the evolution of the city and thus of the system it represents. I won’t get into details here because, for reasons I’ll explain later, we did not evaluate our approach in this context, in spite of its potential.
  • The software systems as cities approach is implemented in a freely available tool called CodeCity.
  • (Pause.) And yet, what does this all mean? We needed to take a pragmatic stance and ask whether this approach could be useful to practitioners, too.
  • This is what our controlled experiment is aimed at finding out. From this experiment, I’d like to leave you with not only the results, but also the various decisions we made on the way to obtaining these results.
  • A controlled experiment will be at most as good as its design.
  • We first performed an extensive study of the works related to empirical evaluation of information visualization in general and of software visualization in particular.
  • To synthesize the lessons learned from the existing body of knowledge, we built a list of design desiderata, which we used as guidelines for the design of our experiment.
  • Here are the ones where we could improve over the existing experiments...
  • The first thing we had to consider was the baseline. Ideally, we’d have a single tool that supports all three contexts that our approach supports. Unfortunately, we could not find one. Therefore, we built the baseline from several tools.
  • After we knew what we were evaluating, we started designing the set of tasks. We have 6 program comprehension tasks (pause) and 4 design quality assessment tasks.
  • Another classification of our tasks is: 9 quantitative and 1 qualitative (the latter is not considered in our quantitative test).
  • With this task set, we wanted to find out whether the use of our approach brings any benefits over the use of the baseline in terms of the correctness of the solutions (that is, a grade for the solution, like the ones you give your students for an assignment). We were also interested in potential improvements in completion time.
  • Based on the questions, we have the following variables for our experiment. Dependent variables are the ones we measure: correctness and completion time. Independent variables are the ones whose effect we want to measure. Here we find the tool used to solve the tasks (CodeCity vs baseline). Moreover, we wanted to see whether the advantages of using our approach scale with the object system size, which is the second independent variable. This variable has two levels: medium and large and for these we chose two Java systems of different magnitudes and also different application domain (FindBugs is a bug detection tool, while Azureus is a peer-to-peer client). Finally, we wanted to eliminate the effect of background and experience level on the outcome of the experiment, and therefore me made them controlled variables.\n
  • Based on the questions, we have the following variables for our experiment. Dependent variables are the ones we measure: correctness and completion time. Independent variables are the ones whose effect we want to measure. Here we find the tool used to solve the tasks (CodeCity vs baseline). Moreover, we wanted to see whether the advantages of using our approach scale with the object system size, which is the second independent variable. This variable has two levels: medium and large and for these we chose two Java systems of different magnitudes and also different application domain (FindBugs is a bug detection tool, while Azureus is a peer-to-peer client). Finally, we wanted to eliminate the effect of background and experience level on the outcome of the experiment, and therefore me made them controlled variables.\n
  • Based on the questions, we have the following variables for our experiment. Dependent variables are the ones we measure: correctness and completion time. Independent variables are the ones whose effect we want to measure. Here we find the tool used to solve the tasks (CodeCity vs baseline). Moreover, we wanted to see whether the advantages of using our approach scale with the object system size, which is the second independent variable. This variable has two levels: medium and large and for these we chose two Java systems of different magnitudes and also different application domain (FindBugs is a bug detection tool, while Azureus is a peer-to-peer client). Finally, we wanted to eliminate the effect of background and experience level on the outcome of the experiment, and therefore me made them controlled variables.\n
  • Based on the questions, we have the following variables for our experiment. Dependent variables are the ones we measure: correctness and completion time. Independent variables are the ones whose effect we want to measure. Here we find the tool used to solve the tasks (CodeCity vs baseline). Moreover, we wanted to see whether the advantages of using our approach scale with the object system size, which is the second independent variable. This variable has two levels: medium and large and for these we chose two Java systems of different magnitudes and also different application domain (FindBugs is a bug detection tool, while Azureus is a peer-to-peer client). Finally, we wanted to eliminate the effect of background and experience level on the outcome of the experiment, and therefore me made them controlled variables.\n
  • Based on the questions, we have the following variables for our experiment. Dependent variables are the ones we measure: correctness and completion time. Independent variables are the ones whose effect we want to measure. Here we find the tool used to solve the tasks (CodeCity vs baseline). Moreover, we wanted to see whether the advantages of using our approach scale with the object system size, which is the second independent variable. This variable has two levels: medium and large and for these we chose two Java systems of different magnitudes and also different application domain (FindBugs is a bug detection tool, while Azureus is a peer-to-peer client). Finally, we wanted to eliminate the effect of background and experience level on the outcome of the experiment, and therefore me made them controlled variables.\n
  • And here we have these variables at play. On the one hand, we have two controlled variables, whose combination results in four blocks. On the other hand, the combination of the two independent variables results in four treatments (two experimental and two control). Our design is between-subjects: every subject receives either a control treatment or an experimental one. It is also randomized-blocks, in that we assign a random combination of treatments to each of the four blocks separately. All this preparation allowed us to conduct our experiment with confidence.
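The randomized-blocks assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the block labels, the subject identifiers, the `assign_treatments` function and the rotation scheme are all hypothetical. The only parts taken from the talk are the two independent variables (tool and system size) and the idea of randomizing within each block separately.

```python
import random
from itertools import product

# The two independent variables named in the talk.
TOOLS = ["CodeCity", "baseline"]
SIZES = ["medium", "large"]

# Their combination gives four treatments (two experimental, two control).
TREATMENTS = list(product(TOOLS, SIZES))


def assign_treatments(subjects_by_block, seed=0):
    """Assign one treatment to each subject, randomizing per block.

    subjects_by_block maps a block label (hypothetical here) to the list
    of subjects in that block. Subjects are shuffled within their block
    and then given treatments in rotation, so each block stays as
    balanced as its size allows.
    """
    rng = random.Random(seed)
    assignment = {}
    for block, subjects in subjects_by_block.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)
        for i, subject in enumerate(shuffled):
            # Between-subjects: every subject receives exactly one treatment.
            assignment[subject] = TREATMENTS[i % len(TREATMENTS)]
    return assignment


if __name__ == "__main__":
    # Hypothetical blocks built from background and experience level.
    blocks = {
        ("academia", "beginner"): ["s1", "s2", "s3", "s4"],
        ("industry", "advanced"): ["s5", "s6"],
    }
    for subject, treatment in sorted(assign_treatments(blocks, seed=1).items()):
        print(subject, treatment)
```

Shuffling inside each block, rather than across all subjects at once, is what keeps background and experience level from being confounded with the treatment.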
  • We planned to conduct the experiment as follows. For each group of subjects we would start with a training session: a one-hour presentation of the approach, concluding with a demonstration of CodeCity, which would prepare the participants for a potential experimental treatment. After the training session, or in the following days, we would run a number of experiment sessions. In this diagram, the number in a blue square is the number of data points obtained with an experimental treatment, and the number in a gray square is the number of data points obtained with a control treatment. The arrows show the training session used to train the subjects who received the experimental treatments.
  • The first step was to organize a pilot study with 7 master's and 2 PhD students, which allowed us to improve our questionnaire and to resolve the most common problems that would appear.
  • After this we started the experiment, which spanned 4 cities in Switzerland, Italy and Belgium and took around 6 months. We managed to collect 41 valid data points, 20 of which came from industry practitioners.
  • Here is the final distribution of treatments to subjects across the four groups.\n
  • To collect the data from our subjects, we used questionnaires in which pages containing the tasks, which give us the raw data for measuring correctness, alternate with pages dedicated to entering the time, which allow us to compute the completion time for each task.
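The step from time entries to per-task completion times can be sketched as follows. This is a minimal illustration only: it assumes each time page holds a clock reading in `HH:MM:SS` form and that a reading is taken before the first task and after every task. The function name `completion_times` and the sample readings are hypothetical, not taken from the experiment.

```python
from datetime import datetime, timedelta


def completion_times(readings: list[str]) -> list[timedelta]:
    """Turn N+1 consecutive clock readings into N per-task durations."""
    parsed = [datetime.strptime(r, "%H:%M:%S") for r in readings]
    # Each task's duration is the gap between two neighbouring readings.
    return [later - earlier for earlier, later in zip(parsed, parsed[1:])]


# Hypothetical readings: session start, then one entry after each of 3 tasks.
readings = ["14:00:00", "14:06:30", "14:16:30", "14:21:15"]
minutes = [d.total_seconds() / 60 for d in completion_times(readings)]
print(minutes)
```

Keeping the raw readings and deriving durations afterwards means a mistyped entry can be spotted (for example, a negative duration) before it reaches the analysis.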
  • We implemented a web app for controlling the time during the experiment. It provides a common time and shows, for each subject, the current task and the remaining time for that task. We allotted a maximum of 10 minutes to each task.
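The remaining-time display described above boils down to a small calculation. The sketch below is not the web app itself; it only assumes that the app knows when a subject started the current task and what the common time is. The names `TASK_LIMIT` and `remaining_time` are hypothetical.

```python
from datetime import datetime, timedelta

# Maximum time per task, as stated in the talk.
TASK_LIMIT = timedelta(minutes=10)


def remaining_time(task_started: datetime, now: datetime) -> timedelta:
    """Time left for the current task, never below zero."""
    elapsed = now - task_started
    return max(TASK_LIMIT - elapsed, timedelta(0))


# Hypothetical subject who started a task at 14:00 on the common clock.
start = datetime(2010, 1, 28, 14, 0, 0)
print(remaining_time(start, start + timedelta(minutes=7, seconds=30)))
```

Clamping at zero keeps the display meaningful for a subject who has run past the limit.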
  • While measuring time was straightforward, measuring correctness was not. For this we built one oracle per treatment and used it to grade the participants’ solutions. Moreover, two of us graded blind: while correcting, they did not know whether a solution had been obtained with our approach or with the baseline.
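Grading against an oracle with partial credit can be sketched as below. The point values are those shown on the oracle slide for task B1.2 of treatment T2 (class `MainFrame` 0.8 pts, package 0.1 pts, method count 0.1 pts). The dictionary layout and the `grade` function are assumptions made for illustration, not the graders' actual procedure.

```python
# Oracle entry for task B1.2 (FindBugs analyzed with CodeCity), as on the slide:
# each answer field maps to (expected value, points awarded if it matches).
ORACLE_B1_2 = {
    "class": ("MainFrame", 0.8),
    "package": ("edu.umd.cs.findbugs.gui2", 0.1),
    "methods": (119, 0.1),
}


def grade(answer: dict, oracle: dict) -> float:
    """Sum the points of every field that matches the oracle exactly."""
    score = sum(
        points
        for field, (expected, points) in oracle.items()
        if answer.get(field) == expected
    )
    return round(score, 2)  # avoid floating-point noise in the total


# Hypothetical solution: right class and package, wrong method count.
answer = {"class": "MainFrame", "package": "edu.umd.cs.findbugs.gui2", "methods": 118}
print(grade(answer, ORACLE_B1_2))
```

Because the grader sees only the answer fields and the oracle, this kind of scoring is compatible with blind correction: nothing in the input reveals which tool produced the solution.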
  • What did we find from our experiment?
  • For the two dependent variables (correctness and completion time), we performed a two-way analysis of variance using SPSS, at a 95% confidence level.
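For readers unfamiliar with the test, here is a minimal sketch of the F-ratios of a two-way ANOVA with two factors (tool and object system size). It is not the SPSS analysis from the talk: it assumes a balanced design with the same number of observations in every cell, whereas the experiment's groups were of unequal size, and the scores below are invented. Deciding significance would additionally require comparing each ratio with the F distribution at the 0.05 level.

```python
from itertools import product
from statistics import mean


def two_way_anova(cells: dict[tuple[str, str], list[float]]) -> dict[str, float]:
    """F-ratios for a balanced two-factor design; keys are (tool, size)."""
    tools = sorted({t for t, _ in cells})
    sizes = sorted({s for _, s in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    assert all(len(v) == n for v in cells.values()), "design must be balanced"

    grand = mean(x for v in cells.values() for x in v)
    mean_tool = {t: mean(x for s in sizes for x in cells[t, s]) for t in tools}
    mean_size = {s: mean(x for t in tools for x in cells[t, s]) for s in sizes}
    mean_cell = {k: mean(v) for k, v in cells.items()}

    # Sums of squares for both main effects, the interaction, and the error.
    ss_tool = len(sizes) * n * sum((m - grand) ** 2 for m in mean_tool.values())
    ss_size = len(tools) * n * sum((m - grand) ** 2 for m in mean_size.values())
    ss_inter = n * sum(
        (mean_cell[t, s] - mean_tool[t] - mean_size[s] + grand) ** 2
        for t, s in product(tools, sizes)
    )
    ss_err = sum((x - mean_cell[k]) ** 2 for k, v in cells.items() for x in v)

    df_tool, df_size = len(tools) - 1, len(sizes) - 1
    ms_err = ss_err / (len(tools) * len(sizes) * (n - 1))
    return {
        "F_tool": (ss_tool / df_tool) / ms_err,
        "F_size": (ss_size / df_size) / ms_err,
        "F_interaction": (ss_inter / (df_tool * df_size)) / ms_err,
    }


# Invented correctness scores, two subjects per treatment.
scores = {
    ("CodeCity", "medium"): [6.0, 8.0],
    ("CodeCity", "large"): [5.0, 7.0],
    ("Ecl+Excl", "medium"): [4.0, 6.0],
    ("Ecl+Excl", "large"): [3.0, 5.0],
}
ratios = two_way_anova(scores)
print(ratios)
```

The interaction term is what the phrase "regardless of the object system size" refers to: a tool effect that holds across sizes shows up as a main effect of the tool without a notable tool-by-size interaction.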
  • Using our approach, the subjects obtained more correct results on average than when using Eclipse and Excel, regardless of the object system size: 24 percent more correct, a statistically significant improvement. According to Cohen’s d, this is a large effect size.
  • In these boxplots you can see the distribution of the data points (the box covers the 50% of the data around the median). The lone data point is an outlier, representing an exceptionally good result for the control group with a large object system.
  • What about completion time? Again, our approach performed better than the baseline, again regardless of the object system size. Our approach enabled an average reduction of 12 percent in task completion time. This is a statistically significant result, and its effect size is moderate according to Cohen’s d.
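The two kinds of numbers quoted above, a relative improvement in percent and an effect size, can be computed as in the following sketch. It uses the common pooled-standard-deviation form of Cohen's d and Cohen's conventional thresholds (0.2, 0.5, 0.8); whether the authors used exactly this variant is an assumption, and the sample scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment: list[float], control: list[float]) -> float:
    """Difference of means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


def relative_improvement(treatment: list[float], control: list[float]) -> float:
    """Change of the treatment mean relative to the control mean, in percent."""
    return (mean(treatment) - mean(control)) / mean(control) * 100


def effect_label(d: float) -> str:
    """Cohen's conventional labels (his 'medium' is called moderate here)."""
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "moderate"
    if d >= 0.2:
        return "small"
    return "negligible"


# Invented correctness scores for illustration only.
with_tool = [6.0, 7.0, 8.0]
baseline = [4.0, 5.0, 6.0]
print(cohens_d(with_tool, baseline), relative_improvement(with_tool, baseline))
# The effect sizes shown on the result slides fall into these bands:
print(effect_label(0.89), effect_label(0.63))
```

For completion time the sign flips, since a lower mean is better; taking the absolute value in `effect_label` keeps the label independent of the direction of the difference.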
  • Here are the boxplots showing the distribution of the data points for the completion time dependent variable.
  • Although it sounds like a total eclipse for those of you who use Eclipse on a regular basis, the good news is that we know what to do to make it better...
  • Our experiment is easily replicable. Our technical report provides every detail needed; if you are interested, please feel free to replicate it.
  • Although these are only a subset of the stories around the experiment, I need to wrap up here.
  • The main points are: we designed our experiment based on a list of desiderata extracted from the body of literature. We then conducted the experiment over a period of 6 months in 4 locations spanning three countries and managed to engage 41 subjects, about half of whom are industry practitioners. The main results of our experiment show that our approach improved the performance of our experimental participants over the control participants in both correctness and completion time.
  • The point I’d like you to take away from this presentation is that our approach is at least a viable alternative to the current state of the practice: non-visual approaches to software exploration.
  • An experiment of such scale is not possible without the support and participation of many people. We’d like to thank them all. And I’d like to thank you for your time and attention.
  • Transcript

    • 1. Software Systems as Cities: A Controlled Experiment. Richard Wettel and Michele Lanza (REVEAL @ Faculty of Informatics, University of Lugano, Switzerland); Romain Robbes (PLEIAD @ DCC, University of Chile, Chile)
    • 2. Software Systems as Cities
    • 3. City Metaphor VISSOFT 2007
    • 4. City Metaphor: class = building, package = district (VISSOFT 2007)
    • 5. City Metaphor: class = building, package = district (VISSOFT 2007)
    • 6. City Metaphor: class = building, package = district, nesting level = color (VISSOFT 2007)
    • 7. City Metaphor: number of methods (NOM) = height, number of attributes (NOA) = base size, number of lines of code (LOC) = color; class = building, package = district, nesting level = color (VISSOFT 2007)
    • 8. Program Comprehension ArgoUML LOC 136,325 ICPC 2007
    • 9. Program Comprehension
    • 10. FacadeMDRImpl (skyscraper): NOA 3, NOM 349, LOC 3,413
    • 11. CPPParser (office building): NOA 85, NOM 204, LOC 9,111
    • 12. JavaTokenTypes (parking lot): NOA 173, NOM 0, LOC 0
    • 13. PropPanelEvent (house): NOA 2, NOM 3, LOC 37
    • 14. Program Comprehension ICPC 2007
    • 15. Design Quality Assessment: disharmony map. ArgoUML classes: brain class 8, god class 30, god + brain 6, data class 17, unaffected 1,715 (SoftVis 2008)
    • 16. System Evolution Analysis: time traveling (WCRE 2008)
    • 17. System Evolution Analysis: time traveling. ArgoUML: 8 major releases, 6 years (WCRE 2008)
    • 18. System Evolution Analysis: time traveling. ArgoUML: 8 major releases, 6 years (WCRE 2008)
    • 19. http://codecity.inf.usi.ch implemented in Smalltalk ICSE 2008 tool demo
    • 20. Is it useful?
    • 21. A Controlled Experiment
    • 22. Design
    • 23. State of the art? (technical report 2010)
    • 24. Design desiderata: (1) Avoid comparing using a technique against not using it. (2) Involve participants from the industry. (3) Provide a not-so-short tutorial of the experimental tool to the participants. (4) Avoid, whenever possible, giving the tutorial right before the experiment. (5) Use the tutorial to cover both the research behind the approach and the tool. (6) Find a set of relevant tasks. (7) Choose real object systems that are relevant for the tasks. (8) Include more than one object system in the design. (9) Provide the same data to all participants. (10) Limit the amount of time allowed for solving each task. (11) Provide all the details needed to make the experiment replicable. (12) Report results on individual tasks. (13) Include tasks on which the expected result is not always to the advantage of the tool being evaluated. (14) Take into account the possible wide range of experience level of the participants.
    • 25. Design desiderata: (1) Avoid comparing using a technique against not using it. (2) Involve participants from the industry. (3) Provide a not-so-short tutorial of the experimental tool to the participants. (4) Avoid, whenever possible, giving the tutorial right before the experiment. (5) Use the tutorial to cover both the research behind the approach and the tool. (6) Find a set of relevant tasks. (7) Choose real object systems that are relevant for the tasks. (8) Include more than one object system in the design. (9) Provide the same data to all participants. (10) Limit the amount of time allowed for solving each task. (11) Provide all the details needed to make the experiment replicable. (12) Report results on individual tasks. (13) Include tasks on which the expected result is not always to the advantage of the tool being evaluated. (14) Take into account the possible wide range of experience level of the participants.
    • 26. Finding a baseline: 1. program comprehension, 2. design quality assessment, 3. system evolution analysis
    • 27. Finding a baseline: 1. program comprehension, 2. design quality assessment, 3. system evolution analysis
    • 28. Finding a baseline: 1. program comprehension, 2. design quality assessment, 3. system evolution analysis
    • 29. Finding a baseline: 1. program comprehension, 2. design quality assessment
    • 30. Tasks
    • 31. Tasks. Program comprehension (6): A1 Identify the convention used in the system to organize unit tests. A2.1 & A2.2 What is the spread of term T in the names of the classes, their attributes, and methods? A3 Evaluate the change impact of class C, in terms of intensity and dispersion. A4.1 Find the three classes with the highest number of methods. A4.2 Find the three classes with the highest average number of lines of code per method.
    • 32. Tasks. Program comprehension (6): A1 Identify the convention used in the system to organize unit tests. A2.1 & A2.2 What is the spread of term T in the names of the classes, their attributes, and methods? A3 Evaluate the change impact of class C, in terms of intensity and dispersion. A4.1 Find the three classes with the highest number of methods. A4.2 Find the three classes with the highest average number of lines of code per method. Design quality assessment (4): B1.1 Identify the package with the highest percentage of god classes. B1.2 Identify the god class with the largest number of methods. B2.1 Identify the dominant (affecting the highest number of classes) class-level design problem. B2.2 Write an overview of the class-level design problems in the system.
    • 33. Tasks. Program comprehension (6 → 5): A1 Identify the convention used in the system to organize unit tests. A2.1 & A2.2 What is the spread of term T in the names of the classes, their attributes, and methods? A3 Evaluate the change impact of class C, in terms of intensity and dispersion. A4.1 Find the three classes with the highest number of methods. A4.2 Find the three classes with the highest average number of lines of code per method. Design quality assessment (4): B1.1 Identify the package with the highest percentage of god classes. B1.2 Identify the god class with the largest number of methods. B2.1 Identify the dominant (affecting the highest number of classes) class-level design problem. B2.2 Write an overview of the class-level design problems in the system.
    • 34. Tasks. Quantitative (9 → 8): A1 Identify the convention used in the system to organize unit tests. A2.1 & A2.2 What is the spread of term T in the names of the classes, their attributes, and methods? A3 Evaluate the change impact of class C, in terms of intensity and dispersion. A4.1 Find the three classes with the highest number of methods. A4.2 Find the three classes with the highest average number of lines of code per method. B1.1 Identify the package with the highest percentage of god classes. B1.2 Identify the god class with the largest number of methods. B2.1 Identify the dominant (affecting the highest number of classes) class-level design problem. Qualitative (1): B2.2 Write an overview of the class-level design problems in the system.
    • 35. Main research questions
    • 36. Main research questions 1 Does the use of CodeCity increase the correctness of the solutions to program comprehension tasks, compared to non-visual exploration tools, regardless of the object system size?
    • 37. Main research questions 1 Does the use of CodeCity increase the correctness of the solutions to program comprehension tasks, compared to non-visual exploration tools, regardless of the object system size? 2 Does the use of CodeCity reduce the time needed to solve program comprehension tasks, compared to non-visual exploration tools, regardless of the object system size?
    • 38. Variables of the experiment
    • 39. Variables of the experiment. Dependent: correctness, completion time. Independent: tool (CodeCity, Eclipse + Excel), object system size (medium, large). Controlled: experience level (beginner, advanced), background (academia, industry).
    • 40. Variables of the experiment. Dependent: correctness, completion time. Independent: tool (CodeCity, Eclipse + Excel), object system size (medium, large). Controlled: experience level (beginner, advanced), background (academia, industry).
    • 41. Variables of the experiment. Dependent: correctness, completion time. Independent: tool (CodeCity, Eclipse + Excel), object system size (medium, large). Controlled: experience level (beginner, advanced), background (academia, industry).
    • 42. Variables of the experiment. Dependent: correctness, completion time. Independent: tool (CodeCity, Eclipse + Excel), object system size (medium: FindBugs, 1,320 classes, 93,310 LOC; large: Azureus, 4,656 classes, 454,387 LOC). Controlled: experience level (beginner, advanced), background (academia, industry).
    • 43. Variables of the experiment. Dependent: correctness, completion time. Independent: tool (CodeCity, Eclipse + Excel), object system size (medium, large). Controlled: experience level (beginner, advanced), background (academia, industry).
    • 44. Variables of the experiment. Dependent: correctness, completion time. Independent: tool (CodeCity, Eclipse + Excel), object system size (medium, large). Controlled: experience level (beginner, advanced), background (academia, industry).
    • 45. The experiment’s design: between-subjects, randomized-block
    • 46. The experiment’s design: between-subjects, randomized-block. Treatments (tool, size): T1 CodeCity, large; T2 CodeCity, medium; T3 Ecl+Excl, large; T4 Ecl+Excl, medium.
    • 47. The experiment’s design: between-subjects, randomized-block. Blocks (background, experience): B1 academia, beginner; B2 academia, advanced; B3 industry, beginner; B4 industry, advanced. Treatments (tool, size): T1 CodeCity, large; T2 CodeCity, medium; T3 Ecl+Excl, large; T4 Ecl+Excl, medium.
    • 48. The experiment’s design: between-subjects, randomized-block. Blocks (background, experience): B1 academia, beginner; B2 academia, advanced; B3 industry, beginner; B4 industry, advanced. Treatments (tool, size): T1 CodeCity, large; T2 CodeCity, medium; T3 Ecl+Excl, large; T4 Ecl+Excl, medium.
    • 49. Execution
    • 50. Experimental runs
    • 51. Experimental runs (timeline). Day 1: training session (1 hour)
    • 52. Experimental runs (timeline). Day 1: training session (1 hour). Experiment sessions (2 hours): e1, c1
    • 53. Experimental runs (timeline). Day 1, day 2: training session (1 hour). Experiment sessions (2 hours): e1, e2, c1, c2
    • 54. Experimental runs (timeline). Day 1, day 2, day 3: training session (1 hour). Experiment sessions (2 hours): e1, e2, e3, c1, c2
    • 55. Experimental runs (timeline). Day 1, day 2, day 3, day 4: training session (1 hour). Experiment sessions (2 hours): e1, e2, e3, c1, c2, c4
    • 56. Testing the waters 2009 November December 18 24 25 2 9 Lugano 1 3 1 1 1 1 1
    • 57. Timeline of the experiment 2009 2010 November December January February ... April 18 24 25 2 9 21 28 5 8 14 28 18 22 24 25 14 Lugano 1 3 1 1 1 1 1 1 1 1 3 Bologna 2 1 6 1 1 1 1 Antwerp 5 6 Bern 4 1 6
    • 58. Timeline of the experiment 2009 2010 November December January February ... April 18 24 25 2 9 21 28 5 8 14 28 18 22 24 25 14 Lugano 1 3 1 1 1 1 1 1 1 1 3 remote sessions Bologna 2 1 6 1 1 1 1 Antwerp 5 6 remote session Bern 4 1 6
    • 59. Treatments and subjects (academia beginner / academia advanced / industry advanced / total). CodeCity, large: 2 / 2 / 6 / 10. CodeCity, medium: 3 / 2 / 7 / 12. Ecl+Excl, large: 2 / 3 / 3 / 8. Ecl+Excl, medium: 2 / 5 / 4 / 11. Total: 9 / 12 / 20 / 41.
    • 60. Collecting raw data
    • 61. Collecting raw data solution
    • 62. Collecting raw data completion time
    • 63. Controlling time
    • 64. Controlling time: common time
    • 65. Controlling time: common time; info on subjects (Name (Task): Remaining time)
    • 66. Assessing correctness: oracles. Oracle for T2 (FindBugs, analyzed with CodeCity). A1: Dispersed [1pt]. A2.1: Localized [0.5pts] in package edu.umd.cs.findbugs.detect [0.5pts]. A2.2: Dispersed in the following (max. 5) packages [0.2pts for each]: edu.umd.cs.findbugs, edu.umd.cs.findbugs.anttask, edu.umd.cs.findbugs.ba, edu.umd.cs.findbugs.ba.deref, edu.umd.cs.findbugs.ba.jsr305, edu.umd.cs.findbugs.ba.npe, edu.umd.cs.findbugs.ba.vna, edu.umd.cs.findbugs.bcel, edu.umd.cs.findbugs.classfile, edu.umd.cs.findbugs.classfile.analysis, edu.umd.cs.findbugs.classfile.engine, edu.umd.cs.findbugs.classfile.impl, edu.umd.cs.findbugs.cloud, edu.umd.cs.findbugs.cloud.db, edu.umd.cs.findbugs.detect, edu.umd.cs.findbugs.gui, edu.umd.cs.findbugs.gui2, edu.umd.cs.findbugs.jaif, edu.umd.cs.findbugs.model, edu.umd.cs.findbugs.visitclass, edu.umd.cs.findbugs.workflow. A3 (Impact Analysis): Multiple locations. There are 40/41 [0.5pts] classes defined in the following 3 packages [1/6pts for each]: edu.umd.cs.findbugs, edu.umd.cs.findbugs.bcel, edu.umd.cs.findbugs.detect. A4.1: The 3 classes with the highest number of methods are [1/3 pts each correctly placed and 1/6 pts each misplaced]: 1. class AbstractFrameModelingVisitor, defined in package edu.umd.cs.findbugs.ba, contains 195 methods; 2. class MainFrame, defined in package edu.umd.cs.findbugs.gui2, contains 119 methods; 3. class BugInstance, defined in package edu.umd.cs.findbugs, contains 118 methods, or class TypeFrameModelingVisitor, defined in package edu.umd.cs.findbugs.ba.type, contains 118 methods. A4.2: The 3 classes with the highest average number of lines of code per method are [1/3 pts each correctly placed and 1/6 pts each misplaced]: 1. class DefaultNullnessAnnotations, defined in package edu.umd.cs.findbugs.ba, has an average of 124 lines of code per method; 2. class DBCloud.PopulateBugs, defined in package edu.umd.cs.findbugs.cloud.db, has an average of 114.5 lines of code per method; 3. class BytecodeScanner, defined in package edu.umd.cs.findbugs.ba, has an average of 80.75 lines of code per method. B1.1: The package with the highest percentage of god classes in the system is edu.umd.cs.findbugs.ba.deref [0.8pts], which contains 1 [0.1pts] god classes out of a total of 3 [0.1pts] classes. B1.2: The god class containing the largest number of methods in the system is class MainFrame [0.8pts], defined in package edu.umd.cs.findbugs.gui2 [0.1pts], which contains 119 [0.1pts] methods. B2.1: The dominant class-level design problem is DataClass [0.5pts], which affects a number of 67 [0.5pts] classes.
    • 67. Assessing correctness: oracles, blinding (the slide shows the same oracle for T2: FindBugs, analyzed with CodeCity)
    • 68. Results
    • 69. Statistical test: two-way analysis of variance (ANOVA), 95% confidence level
    • 70. Correctness (boxplots: Ecl+Excl vs. CodeCity, medium and large systems). Does the use of CodeCity increase the correctness of the solutions to program comprehension tasks, compared to non-visual exploration tools, regardless of the object system size?
    • 71. Correctness (boxplots: Ecl+Excl vs. CodeCity, medium and large systems). Does the use of CodeCity increase the correctness of the solutions to program comprehension tasks, compared to non-visual exploration tools, regardless of the object system size? 24.26% more correct with CodeCity; large effect size (d=0.89)
    • 72. Correctness (boxplots: Ecl+Excl vs. CodeCity, medium and large systems). Does the use of CodeCity increase the correctness of the solutions to program comprehension tasks, compared to non-visual exploration tools, regardless of the object system size? 24.26% more correct with CodeCity; large effect size (d=0.89)
    • 73. Completion time (boxplots: Ecl+Excl vs. CodeCity, medium and large systems). Does the use of CodeCity reduce the time needed to solve program comprehension tasks, compared to non-visual exploration tools, regardless of the object system size?
    • 74. Completion time (boxplots: Ecl+Excl vs. CodeCity, medium and large systems). Does the use of CodeCity reduce the time needed to solve program comprehension tasks, compared to non-visual exploration tools, regardless of the object system size? 12.01% less time with CodeCity; moderate effect size (d=0.63)
    • 75. Completion time (boxplots: Ecl+Excl vs. CodeCity, medium and large systems). Does the use of CodeCity reduce the time needed to solve program comprehension tasks, compared to non-visual exploration tools, regardless of the object system size? 12.01% less time with CodeCity; moderate effect size (d=0.63)
    • 76. After the first round: CodeCity vs. Ecl+Excl: +24% correctness, -12% completion time
    • 77. Replicability (figure: pages of the technical report with the subjects’ treatment assignments, blocking criteria, and questionnaire data)
Th 1 3 ’p 1– 6 sim ple 4– 3 fic sim rme o A1 sim ficult in d rme d 10 c kn dif w plee dd te sim ple 7– –3 le 1– 3 4– 6 d dif ad tevrm edia Etep kn gin nce r k g A 7– –6 w sim ple ple no m inn ee ia x te 4– 0 inte be ow nce d able Table ad ow ne d ad pe ne d able ab k egin tece dge le able < 3 7– 6 sim ple jeno sim ple 1 4 in d rm dia 4– 0 dif ial dia triv ial sim ple 1 6 sim ple 1 ad intermncedia te de le now ne rmediaable 4– 1 1– 0 triv ficult iate sim ial ad van led r ad imw ficled r 4 6 kn va ce ge inte ple te triv ple b inan poss ge 1 inte ple inte v rme dsi ten dg ia te te 7– –3 triv rme < 6 Co sim 1– 3 ow rm e d v le lt IA0 1 va rme de te 7– –6 ed ple dia ib triv ial d 10 b IA0 u ble a 1– 1 IA0 2 3 sim ple diate dif rme sim ial va le e d iate adv mp d ia a 7– 10 triv inte ple IA0 3 eles 4– 6 sim rme Ja be gin led d 1 3 a be gin ne gea d 7– 10 4– AB 4 4– 0 ce d b AB 01 inte 4 6 1 1 3 – 7– 10 kn pe in cmple edia te Tha sim rme o sim ficult a be ne.5.led r dif ial v – 7 02 4– 6 ge te 7– –6 kn pe led d simble l diatex dva si rt term edia > –3 > 10 ex gin ne r ex gin tencple d le AA 5 7– 6 ple te p in nc rm g iate be sim nce AA 01 1– 0 inte ta IA0 1– 0 ble 4 0 sim ple sim ple be pe ne r a pein nermd ad an te ded iate Da le 1 1 7– 10 IA0 02 te a in 1 e Osi ple 1– 3 sim ple ial dia adv ansimed ed te inte ple diate adv inrt term edab be van letrivia edia A.5 Data 1– 3 7– –6 dia ex va te led ed ad va ne gea be gin rt r ad gin led r sim r – 0 kn inOP ple IA0 6 triv ple ete a si ce ple iate be ow ne te c rm ia A.5 4– 3 e e 7 1 noA 1– 3 a inte rme iate ad in wterm 4 0 7 > 10 ial dia triv ial kn gin 1 ad ow rt term m 4– 6 IA0 8 3 7– –6 inte ple diate ad ow ne d leble ed sim ial kn gin ne r ad va nce r IA0 4 0 be gin ne AB 9 inte ple kn gin cesimgea l triv rme 4– 6 le 1 IA1 03 kn va nce d 1– 0 ex ow ne r 7– –6 n inte ple diate o sim ple ex per t in mple edia ab 1 4– 6 p IA1 0 ad ow led ge ble kn ow nce d 1– 3 sim rme