Web Macros

Transcript

  • 1. Characterizing Reusability of End-User Web Macro Scripts
    • Chris Scaffidi (1), Chris Bogart (2), Margaret Burnett (2), Allen Cypher (3), Brad Myers (1), Mary Shaw (1)
    • (1) Carnegie Mellon University, (2) Oregon State University, (3) IBM-Almaden
  • 2. End-user programming
    • Programming as a tool, not a finished product
    • Examples of end-user programming:
      • creating a spreadsheet to compute loan interest
      • creating a web macro to pay a phone bill
    • Example programs
      • spreadsheets, scripts, web macros
    • By 2012, over 55 million people will regularly play the role of end-user programmer [3]
    Problem → Traits of reusability → Conclusion
  • 3. Typical activities toward achieving end-user programming goals
    • Create a new end-user program from scratch
    • Clone or copy-paste from existing end-user program
    • Tweak code
    • Programmatically call existing end-user code (rare)
    • Manually run a series of existing end-user programs
    • Or any combination of the above.
    • Note: four of these activities reuse or operate on existing code.
  • 4. Systems help users share, find, evaluate existing code
    • CoScripter web macro “wiki”
      • Provided by IBM as a repository of automated “how-to” knowledge, e.g. how to pay a phone bill from a Citi checking account
      • The largest web macro repository to date
        • > 6000 users, > 10000 scripts (incl. > 3000 “public” scripts)
        • In service since August 2007
    • The wiki’s recommendation features:
      • Keyword-based search
      • Can also recommend script based on URL (rarely used)
      • Quality indicators: download counters, ratings, reviews
  • 5. Needed: a better model for indicating quality
    • Existing popularity-based quality indicators
      • Examples: download counters, ratings, reviews
      • To get a rating, somebody has to download and try the macro
        • (And users’ interests are so diverse that this may take a while)
      • Sorting by downloads makes it hard for good new code to be “discovered”
    • By analyzing code directly, can we predict reusability?
      • Already done successfully for object-oriented code [1]
      • What are the traits of reusable web macros?
  • 6. What are the traits of reusable web macros?
    • Approach:
      • Consider the steps required for reusing code
      • Identify macro traits that might support reusing code
      • Empirically test whether code with these traits is more likely to be reused
  • 7. What are the traits of reusable web macros?
    • Four fundamental steps of reuse in general [2]:
      • Finding code
      • Understanding it
      • Modifying it
      • Composing it
    • We expect that code is more reusable if it does not need modification to be reused.
    • Users rarely combine CoScripter web macros.
    • Traits should support finding, understanding, and not needing to modify.
  • 8. We identified 35 candidate traits in 8 categories
    • Mass appeal, e.g. popular keywords (F)
    • Language, e.g. data values are in English (U)
    • Annotations, e.g. comments (U)
    • Flexibility, e.g. parameterization with variables (M)
    • Length, e.g. small number of distinct lines of code (U, M)
    • Author information, e.g. author at an IBM IP address (M)
    • Advanced syntax, e.g. “control-click” keyword (U, M)
    • No preconditions, e.g. no cookies needed (M)
    • Key: F = findability, U = understandability, M = not needing modification
    • Note: our paper organized traits into slightly different groups than the ones shown above
  • 9. We empirically tested whether each trait corresponded to reuse
    • Extracted 6 months of IBM wiki data
      • Source code & usage logs for 937 public scripts
      • Four (binary) measures of reuse
        • Execution by author > 24 hours after initial creation
        • Execution by any other user
        • Editing by any other user
        • Clone/copy-paste by any other user
    • Testing for correspondence
      • For each candidate trait, divide scripts into two groups
        • For boolean traits, based on true/false
        • For numerical traits, based on above/below mean
      • Perform a z-test of proportions:
        • Does the trait correspond to higher likelihood of reuse?
  • 10. We found many traits that empirically corresponded to reuse.
    • Traits significant at p<0.00036 with respect to at least one reuse measure
      • If websites hit by the macro contain certain keywords
      • If the macro was intended by IBM as a “tutorial” script
      • Number of comments in the macro’s code
      • If the macro has a title
      • Number of parameters in the macro
      • Number of literals hard-coded in the macro
      • Number of distinct lines of code in the macro
      • ID number of the macro author (indicates early adopter)
      • ID number of the script (generally lower for early adopters)
      • If the author was at an IBM IP address
      • Number of author’s previous scripts that had been reused
      • If the macro used ordinal advanced syntax
      • If the macro used “control-click”/“control-select” syntax
      • If the macro required the user to be at a certain URL before running it
      • If the macro hits a lot of different websites
    • Trait groups highlighted on this slide: mass appeal traits, annotation traits, length traits, traits hinting at higher author expertise, use of advanced syntax
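The slides do not say how the p<0.00036 threshold was chosen. One reading that matches the numbers on the earlier slides is a Bonferroni correction of a conventional 0.05 level across every trait and reuse measure tested. The 0.05 level and the correction itself are assumptions here; only the counts of 35 traits and 4 measures come from the talk.

```python
ALPHA = 0.05        # conventional family-wise significance level (assumption)
NUM_TRAITS = 35     # candidate traits, from slide 8
NUM_MEASURES = 4    # binary reuse measures, from slide 9

# Bonferroni correction: divide the family-wise level by the number of tests.
num_tests = NUM_TRAITS * NUM_MEASURES
per_test_threshold = ALPHA / num_tests

print(f"{num_tests} tests, per-test threshold {per_test_threshold:.5f}")
```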
  • 11. Implications and future work
    • These traits are “raw materials” for a predictive model.
      • Different traits corresponded to different reuse measures
        • → The predictive model should be trainable based on a given reuse measure.
      • Length (a bit surprisingly) corresponded to higher reuse.
        • → Functional size has value in its own right (reuse is not just about finding, understanding, and not modifying)
    • We have made good progress on next steps:
      • Create a model
      • Train and evaluate it on the data using 10-fold cross-validation
      • Compare to alternate models
      • Long-term: generalize to other kinds of end-user programs
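The 10-fold evaluation step named above can be sketched as follows. The talk does not describe the model or the evaluation code, so `train_fn`, `predict_fn`, the accuracy metric, and the fixed shuffle seed are hypothetical stand-ins chosen for illustration.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Indices are shuffled once, then dealt round-robin into k folds so that
    fold sizes differ by at most one.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(items, labels, train_fn, predict_fn, k=10):
    """Return overall accuracy, with every item predicted by a model that
    was trained without it."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(items), k):
        model = train_fn([items[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct += sum(predict_fn(model, items[i]) == labels[i]
                       for i in test_idx)
    return correct / len(items)
```

Because different traits corresponded to different reuse measures, one option is to run `cross_validate` once per measure, passing that measure's boolean values as `labels`.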
  • 12. Thank You
    • To the RSSE’08 committee for this opportunity
    • To the EUSES Consortium for feedback
    • To NSF for funding
  • 13. References mentioned in this talk
    • [1] V. Basili, L. Briand, and W. Melo. A Validation of Object-Oriented Design Metrics as Quality Indicators. IEEE Trans. Software Eng. 22(10), 1996, 751-761.
    • [2] T. Biggerstaff and C. Richter. Reusability Framework, Assessment, and Directions. IEEE Software 4(2), March 1987, 41-49.
    • [3] C. Scaffidi, M. Shaw, and B. Myers. Estimating the Numbers of End Users and End User Programmers. 2005 IEEE Symp. Visual Lang. and Human-Centric Computing, 2005, 207-214.
    • See the paper for more references.
