Be the first to like this
Data transformation often requires users to write many trivial and task-dependent programs to transform thousands of records. Recently, programming-by-example (PBE) approaches enable users to transform data without coding. A key challenge of these PBE approaches is to deliver correctly transformed results on large datasets, since these transformation programs are likely to be generated by non-expert users. To address this challenge, existing approaches aim to identify a small set of potentially incorrect records and ask users to examine these records instead of the entire dataset. However, because the transformation scenarios are highly task-dependent, existing approaches cannot capture the incorrect records for various scenarios. \ We present a approach that learns from past transformation scenarios to generate a meta-classifier to identify the incorrect records. Our approach color-codes these transformed records and then presents them for users to examine. The method allows users to either enter an example for a record transformed incorrectly or confirm the correctness of a transformed record. And our approach can learn from the users' labels to refine the meta-classifier to accurately identify the incorrect records. Simulation results and a user study show that our method can identify the incorrectly transformed records and reduce the user efforts in examining the results.