Maximizing Correctness with Minimal User Effort to Learn Data Transformations

  1. Maximizing Correctness with Minimal User Effort to Learn Data Transformations. Bo Wu and Craig Knoblock, University of Southern California, Department of Computer Science
  2. Art website Buyer
  3. Dimension of artworks
  4. Programming by Example. Video is from Excel YouTube official channel (
  5. Too Many Records
  6. Overconfident Users: Users are often too confident to examine the results thoroughly
  7. Variations
  8. Problem: Enable the users of PBE systems to achieve maximal correctness with minimal effort on large datasets. Help users identify at least one of the incorrect records in every iteration with minimal effort on large datasets
  9. Approach Overview
     • Entire dataset (Raw → Transformed): 10" H x 8" W → 10; H: 58 x W:25" → 58; 12"H x 9"W → 12; 11"H x 6" → 11; …; 30 x 46" → 30 x 46
     • Random sampling, sampled records: 10" H x 8" W → 10; 11"H x 6" → 11; …; 30 x 46" → 30 x 46
     • Verifying records: 11"H x 6" → 11; 30 x 46" → 30 x 46; …
     • Sorting and color-coding: 30 x 46" → 30 x 46; 11"H x 6" → 11; …
  10. Learning from users’ feedback
  11. Verifying Records
     • First, recommend records causing runtime errors
       – Records that cause the program to exit abnormally. Ex. input: 2008 Mitsubishi Galant ES $7500 (Sylmar CA) pic
     • Second, recommend potentially incorrect records
       – Learn a binary meta-classifier. Ex. (Raw → Transformed): 11"H x 6" → 11; 30 x 46" → 30 x 46; …
  12. Learning the Meta-classifier
     • Similarity classifiers: cs1, cs2, cs3, cs4, …
     • Program agreement classifiers: cp1, cp2, cp3, cp4, …
     • Format ambiguity classifiers: cf1, cf2, cf3, cf4, …
     • Meta-classifier: cs3, cs4, cp2, cf1, … with weights w1, w2, w3, w4, …
  13. Evaluation • The recommendation contains incorrect records
  14. Evaluation • The recommendation can place incorrect records on top
  15. User study. Experiment setup: • 5 scenarios with 4000 records per scenario • 10 graduate students divided into two groups
  16. Summary and Future Work • Summary – Sample records – Identify incorrect/questionable records – Allow user to refine the recommendation – Color-code the results • Future work – Show histograms of the data – Translate the program to readable natural text
  17. Questions? Data and system available at
  18. Type of Classifiers • Classifier based on distance • Classifier based on agreement of programs • Classifier based on format ambiguity
  19. Learning from various past results
     • Raw → Transformed: 26" H x 24" W x 12.5 → 26; Framed at 21.75" H x 24.25" W → 21; 12" H x 9" → 12; …
     • Raw → Transformed: Ravage 2099#24 (November, 1994) → November, 1994; Gambit III#1 (September, 1997) → September, 1997; (comic) Spidey Super Stories#12/2 (September, 1975) → comic; …
     • Legend: Examples, Incorrect records, Correct records
  20. Sorting Records
     • Checking transformed records: runtime errors?
       – Yes: rank records using #failed_subprograms
       – No: rank records using meta-classifier output
     • Record → #failed_subprograms: 2008 Mitsubishi Galant ES $7500 (Sylmar CA) pic → 3; 1998 Honda Civic 12k miles s. Auto. - $3800 (Arcadia) → 2
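
The ranking rule on slides 11, 12 and 20 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Record` class, the `meta_score` and `sort_for_review` functions, the weights, and the classifier votes are all hypothetical, and the meta-classifier is assumed here to be a simple weighted vote over binary classifiers. The raw strings are taken from the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    raw: str
    failed_subprograms: int = 0  # > 0 means the record caused runtime errors
    meta_score: float = 0.0      # assumed: higher means more likely incorrect


def meta_score(votes: List[int], weights: List[float]) -> float:
    """Assumed form of the meta-classifier: weighted sum of 0/1 votes."""
    return sum(w * v for w, v in zip(weights, votes))


def sort_for_review(records: List[Record]) -> List[Record]:
    """Runtime-error records first (most failed subprograms on top),
    then the remaining records by descending meta-classifier score."""
    errors = [r for r in records if r.failed_subprograms > 0]
    others = [r for r in records if r.failed_subprograms == 0]
    errors.sort(key=lambda r: r.failed_subprograms, reverse=True)
    others.sort(key=lambda r: r.meta_score, reverse=True)
    return errors + others


WEIGHTS = [0.4, 0.3, 0.2, 0.1]  # hypothetical w1..w4

records = [
    Record('11"H x 6"', meta_score=meta_score([0, 0, 1, 0], WEIGHTS)),
    Record("1998 Honda Civic 12k miles s. Auto. - $3800 (Arcadia)",
           failed_subprograms=2),
    Record('30 x 46"', meta_score=meta_score([1, 1, 0, 0], WEIGHTS)),
    Record("2008 Mitsubishi Galant ES $7500 (Sylmar CA) pic",
           failed_subprograms=3),
]

ranked = sort_for_review(records)
for r in ranked:
    print(r.failed_subprograms, round(r.meta_score, 2), r.raw)
```

Keeping the two groups separate means a record that crashed the program is always shown before one that is only suspected to be wrong, which matches the "first ... second ..." order on slide 11.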

Editor's Notes

  1. Ashley wants to buy a painting for the space over her sofa. She has strict space limits. Ex: the painting should be about 60" wide and 40" high.
  2. Ashley got a spreadsheet of artworks on sale. The size information she got is a long list of entries, each holding the height, width and even depth together. She has to split them into three columns and remove extra text such as “H:”, “in.”, etc., so that she can filter the artworks by the size of each dimension. The dataset has so many records that she would have to write programs to solve the problem. Problem: this skill has a long learning curve. The time should be spent decorating her house instead.
  3. Programming by example doesn’t require users to write code anymore.
  4. The list can have thousands of records. It is really hard to notice some records in the middle that are transformed incorrectly.
  5. According to previous research, users often believe that they have carefully examined all the records. They stop checking the results while a large percentage of the records in the dataset are still incorrect.
  6. To identify the Cannot rely on single rule or
  7. Random sampling addresses the too-many-records problem. Verifying records can capture incorrect records in various scenarios. Sorting and color-coding address the overconfident-user problem. The system can also learn from the user's interaction in the current iteration to refine the recommendation.
  8. Learn from the users' feedback to refine the recommendation.
  9. First, describe correctness. Second, iteration time. Third, total time. Explain why certain scenarios have longer total time. Why does beta have twice the iteration time of our approach in s5 and s3? Why does the iteration time in beta vary much more than the times in our approach?
  10. Summary vs Conclusion
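
The random-sampling and verification steps described in note 7 can be sketched as below. This is only an illustration under stated assumptions: `first_number` is a hypothetical stand-in for a learned transformation program, `review_sample` and its parameters are invented names, and the record "size unknown" is a made-up input added to trigger a runtime error. The other raw strings come from slide 9.

```python
import random
import re


def first_number(raw):
    """Hypothetical stand-in for a learned program:
    return the first integer found in the raw text."""
    match = re.search(r"\d+", raw)
    if match is None:
        raise ValueError(f"no number found in {raw!r}")
    return match.group()


def review_sample(records, transform, sample_size, seed=0):
    """Draw a random sample, apply the transformation, and put
    records that raised a runtime error at the top of the result."""
    rng = random.Random(seed)
    sampled = rng.sample(records, min(sample_size, len(records)))
    results = []
    for raw in sampled:
        try:
            results.append((raw, transform(raw), False))
        except Exception:
            # The program exited abnormally on this record.
            results.append((raw, None, True))
    # False sorts before True, so error records come first.
    results.sort(key=lambda item: not item[2])
    return results


records = [
    '10" H x 8" W',
    'H: 58 x W:25"',
    '12"H x 9"W',
    '11"H x 6"',
    "size unknown",  # made-up record that has no number
]

for raw, transformed, failed in review_sample(records, first_number, 10):
    print("ERROR" if failed else "ok   ", raw, "->", transformed)
```

Sampling keeps the number of records the user has to look at small, while the sort makes sure that any record the program could not handle is seen first.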