Web Macros

Published in: Technology

  1. Characterizing Reusability of End-User Web Macro Scripts. Chris Scaffidi (1), Chris Bogart (2), Margaret Burnett (2), Allen Cypher (3), Brad Myers (1), Mary Shaw (1). (1) Carnegie Mellon University, (2) Oregon State University, (3) IBM-Almaden
  2. End-user programming <ul><li>Programming as a tool, not a finished product </li></ul><ul><li>Examples of end-user programming: </li></ul><ul><ul><li>creating a spreadsheet to compute loan interest </li></ul></ul><ul><ul><li>creating a web macro to pay a phone bill </li></ul></ul><ul><li>Example programs </li></ul><ul><ul><li>spreadsheets, scripts, web macros </li></ul></ul><ul><li>By 2012, over 55 million people will regularly play the role of end-user programmer [3] </li></ul>Problem  Traits of reusability  Conclusion
  3. Typical activities toward achieving end-user programming goals <ul><li>Create a new end-user program from scratch </li></ul><ul><li>Clone or copy-paste from an existing end-user program </li></ul><ul><li>Tweak code </li></ul><ul><li>Programmatically call existing end-user code (rare) </li></ul><ul><li>Manually run a series of existing end-user programs </li></ul><ul><li>Or any combination of the above. </li></ul><ul><li>Note: 4 activities reuse or operate on existing code. </li></ul>
  4. Systems help users share, find, evaluate existing code <ul><li>CoScripter web macro “wiki” </li></ul><ul><ul><li>Provided by IBM as a repository of automated “how-to” knowledge, e.g., how to pay a phone bill from a Citi checking account </li></ul></ul><ul><ul><li>The largest web macro repository to date </li></ul></ul><ul><ul><ul><li>> 6000 users, > 10000 scripts (incl. > 3000 “public” scripts) </li></ul></ul></ul><ul><ul><ul><li>In service since August 2007 </li></ul></ul></ul><ul><li>The wiki’s recommendation features: </li></ul><ul><ul><li>Keyword-based search </li></ul></ul><ul><ul><li>Can also recommend a script based on URL (rarely used) </li></ul></ul><ul><ul><li>Quality indicators: download counters, ratings, reviews </li></ul></ul>
  5. Needed: a better model for indicating quality <ul><li>Existing popularity-based quality indicators </li></ul><ul><ul><li>Examples: download counters, ratings, reviews </li></ul></ul><ul><ul><li>To get a rating, somebody has to download and try the macro </li></ul></ul><ul><ul><ul><li>(And users’ interests are so diverse that this may take a while) </li></ul></ul></ul><ul><ul><li>Sorting by downloads makes it hard for good new code to be “discovered” </li></ul></ul><ul><li>By analyzing code directly, can we predict reusability? </li></ul><ul><ul><li>Already done successfully for object-oriented code [1] </li></ul></ul><ul><ul><li>What are the traits of reusable web macros? </li></ul></ul>
  6. What are the traits of reusable web macros? <ul><li>Approach: </li></ul><ul><ul><li>Consider the steps required for reusing code </li></ul></ul><ul><ul><li>Identify macro traits that might support reusing code </li></ul></ul><ul><ul><li>Empirically test whether code with these traits is more likely to be reused </li></ul></ul>
  7. What are the traits of reusable web macros? <ul><li>Four fundamental steps of reuse in general [2]: </li></ul><ul><ul><li>Finding code </li></ul></ul><ul><ul><li>Understanding it </li></ul></ul><ul><ul><li>Modifying it </li></ul></ul><ul><ul><li>Composing it </li></ul></ul><ul><li>We expect that code is more reusable if it does not need modification to be reused. </li></ul><ul><li>Users rarely combine CoScripter web macros. </li></ul><ul><li>Traits should support finding, understanding, and not needing to modify. </li></ul>
  8. We identified 35 candidate traits in 8 categories <ul><li>Mass appeal – e.g., popular keywords (F) </li></ul><ul><li>Language – e.g., data values are in English (U) </li></ul><ul><li>Annotations – e.g., comments (U) </li></ul><ul><li>Flexibility – e.g., parameterization (variables) (M) </li></ul><ul><li>Length – e.g., small number of distinct lines of code (U, M) </li></ul><ul><li>Author information – e.g., at an IBM IP address (M) </li></ul><ul><li>Advanced syntax – e.g., “control-click” keyword (U, M) </li></ul><ul><li>No preconditions – e.g., no cookies needed (M) </li></ul><ul><li>F = findability, U = understandability, M = not needing modification </li></ul><ul><li>Note: our paper organized traits into slightly different groups than the ones shown above </li></ul>
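To make a few of these trait categories concrete, here is a minimal sketch that computes annotation, length, and flexibility traits from a macro's text. The script format is hypothetical and is not CoScripter's real syntax: it assumes `#` starts a comment, `{name}` marks a parameter, and double-quoted strings are hard-coded literals. The function name `extract_traits` is likewise an illustration only.

```python
import re

def extract_traits(script_text):
    """Compute a few candidate traits from a macro in a HYPOTHETICAL text format.

    Assumptions (not CoScripter's real syntax): '#' starts a comment line,
    '{name}' is a parameter, and "..." is a hard-coded literal.
    """
    lines = [ln.strip() for ln in script_text.splitlines() if ln.strip()]
    comments = [ln for ln in lines if ln.startswith("#")]
    code = [ln for ln in lines if not ln.startswith("#")]
    return {
        "num_comments": len(comments),            # annotation trait
        "num_distinct_lines": len(set(code)),     # length trait
        "num_parameters": sum(len(re.findall(r"\{\w+\}", ln)) for ln in code),
        "num_literals": sum(len(re.findall(r'"[^"]*"', ln)) for ln in code),
    }

# A made-up macro, used only to exercise the function.
example = '''
# pay the monthly phone bill
go to "https://example.com/billing"
enter {account_number} into the "Account" textbox
click the "Pay" button
click the "Pay" button
'''

if __name__ == "__main__":
    print(extract_traits(example))
```

Counting distinct lines rather than total lines follows the wording of the length trait on the slide; the other counts are simple stand-ins for whatever the study actually measured.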
  9. We empirically tested whether each trait corresponded to reuse <ul><li>Extracted 6 months of IBM wiki data </li></ul><ul><ul><li>Source code and usage logs for 937 public scripts </li></ul></ul><ul><ul><li>Four (binary) measures of reuse </li></ul></ul><ul><ul><ul><li>Execution by author > 24 hours after initial creation </li></ul></ul></ul><ul><ul><ul><li>Execution by any other user </li></ul></ul></ul><ul><ul><ul><li>Editing by any other user </li></ul></ul></ul><ul><ul><ul><li>Clone/copy-paste by any other user </li></ul></ul></ul><ul><li>Testing for correspondence </li></ul><ul><ul><li>For each candidate trait, divide scripts into two groups </li></ul></ul><ul><ul><ul><li>For boolean traits, based on true/false </li></ul></ul></ul><ul><ul><ul><li>For numerical traits, based on above/below mean </li></ul></ul></ul><ul><ul><li>Performed z-test of proportions: </li></ul></ul><ul><ul><ul><li>Does the trait correspond to higher likelihood of reuse? </li></ul></ul></ul>
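The testing procedure on this slide can be sketched as follows: split scripts into two groups by a trait, then compare the proportion of reused scripts in each group with a pooled two-proportion z-test. This is a minimal sketch under assumptions. The talk does not say whether the test was one-sided or two-sided (a two-sided p-value is computed here), and the function names and the dictionary layout of a script record are hypothetical.

```python
import math

def split_by_mean(scripts, trait):
    """Divide scripts into (above mean, at or below mean) for a numerical trait."""
    mean = sum(s[trait] for s in scripts) / len(scripts)
    above = [s for s in scripts if s[trait] > mean]
    below = [s for s in scripts if s[trait] <= mean]
    return above, below

def two_proportion_z_test(reused_a, n_a, reused_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = reused_a / n_a
    p_b = reused_b / n_b
    pooled = (reused_a + reused_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every script (or no script) was reused: the groups cannot differ.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

if __name__ == "__main__":
    # Made-up counts, not data from the study.
    z, p = two_proportion_z_test(reused_a=60, n_a=100, reused_b=40, n_b=100)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

For a boolean trait the split is simply true versus false; `split_by_mean` covers the numerical case described on the slide.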
  10. We found many traits that empirically corresponded to reuse. <ul><li>Traits significant at p<0.00036 with respect to at least one reuse measure </li></ul><ul><ul><li>If websites hit by the macro contain certain keywords </li></ul></ul><ul><ul><li>If the macro was intended by IBM as a “tutorial” script </li></ul></ul><ul><ul><li>Number of comments in the macro’s code </li></ul></ul><ul><ul><li>If the macro has a title </li></ul></ul><ul><ul><li>Number of parameters in the macro </li></ul></ul><ul><ul><li>Number of literals hard-coded in the macro </li></ul></ul><ul><ul><li>Number of distinct lines of code in the macro </li></ul></ul><ul><ul><li>ID number of the macro author (indicates early adopter) </li></ul></ul><ul><ul><li>ID number of the script (generally lower for early adopters) </li></ul></ul><ul><ul><li>If the author was at an IBM IP address </li></ul></ul><ul><ul><li>Number of author’s previous scripts that had been reused </li></ul></ul><ul><ul><li>If the macro used ordinal advanced syntax </li></ul></ul><ul><ul><li>If the macro used “control-click”/“control-select” syntax </li></ul></ul><ul><ul><li>If the macro required user to be at a certain URL prior to run </li></ul></ul><ul><ul><li>If the macro hits a lot of different websites </li></ul></ul>Trait groups labeled on the slide: mass appeal traits; annotation traits; length traits; traits hinting at higher author expertise; use of advanced syntax
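The talk does not say where the 0.00036 threshold comes from. One plausible reading, offered here as an assumption rather than the authors' stated method, is a Bonferroni correction: a conventional family-wise level of 0.05 divided across 35 candidate traits times 4 reuse measures. The arithmetic can be checked with a short sketch (the function name is hypothetical):

```python
def bonferroni_threshold(family_wise_alpha, num_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return family_wise_alpha / num_tests

if __name__ == "__main__":
    candidate_traits = 35   # from the talk
    reuse_measures = 4      # from the talk
    alpha = 0.05            # ASSUMPTION: conventional level, not stated in the talk
    num_tests = candidate_traits * reuse_measures
    print(num_tests, round(bonferroni_threshold(alpha, num_tests), 5))
    # prints: 140 0.00036
```

The match with the slide's threshold is suggestive but does not confirm that this correction was the one used.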
  11. Implications and future work <ul><li>These traits are “raw materials” for a predictive model. </li></ul><ul><ul><li>Different traits corresponded to different reuse measures </li></ul></ul><ul><ul><ul><li>The predictive model should not be trainable based on a given reuse measure. </li></ul></ul></ul><ul><ul><li>Length (a bit surprisingly) corresponded to higher reuse. </li></ul></ul><ul><ul><ul><li>Value of functional size (not just find/understand/modify) </li></ul></ul></ul><ul><li>We have made good progress on next steps: </li></ul><ul><ul><li>Create a model </li></ul></ul><ul><ul><li>Train and evaluate it on the test data (“10-fold cross-validation”) </li></ul></ul><ul><ul><li>Compare to alternate models </li></ul></ul><ul><ul><li>Long-term: generalize to other kinds of end-user programs </li></ul></ul>
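The 10-fold evaluation step mentioned above can be sketched generically. The talk does not describe the model itself, so the sketch below takes `train` and `predict` as caller-supplied functions and uses a trivial majority-class baseline as a stand-in; every name here is hypothetical. It assumes there are at least as many scripts as folds.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and deal them into k disjoint folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(features, labels, train, predict, k=10, seed=0):
    """Mean held-out accuracy over k folds (requires len(labels) >= k)."""
    accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        model = train([features[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

# Stand-in model: always predict the majority class seen in training.
def train_majority(features, labels):
    return sum(labels) * 2 >= len(labels)

def predict_majority(model, feature_row):
    return model

if __name__ == "__main__":
    # Made-up data: one numeric trait per script, and a binary reuse label.
    feats = [[n] for n in range(30)]
    reused = [n % 3 != 0 for n in range(30)]
    print(cross_validate(feats, reused, train_majority, predict_majority))
```

Swapping in a real classifier only requires replacing the two stand-in functions; the fold logic stays the same, and a separate run per reuse measure would use a different `labels` list.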
  12. Thank You <ul><li>To the RSSE’08 committee for this opportunity </li></ul><ul><li>To the EUSES Consortium for feedback </li></ul><ul><li>To NSF for funding </li></ul>
  13. References mentioned in this talk <ul><li>[1] V. Basili, L. Briand, and W. Melo. A Validation of Object-Oriented Design Metrics as Quality Indicators. IEEE Trans. Software Eng. 22(10), 1996, 751-761. </li></ul><ul><li>[2] T. Biggerstaff and C. Richter. Reusability Framework, Assessment, and Directions. IEEE Software 4(2), March 1987, 41-49. </li></ul><ul><li>[3] C. Scaffidi, M. Shaw, and B. Myers. Estimating the Numbers of End Users and End User Programmers. 2005 IEEE Symp. Visual Lang. and Human-Centric Computing, 2005, 207-214. </li></ul><ul><li>See the paper for more references. </li></ul>
