Programming by examples (PBE) is a new frontier in AI that enables users to create scripts from input-output examples. A killer application is in the space of data wrangling to automate tasks like string/number/date transformations (e.g., converting “FirstName LastName” to “LastName, FirstName”), column splitting, table extraction from log-files, webpages, and PDFs, normalizing semi-structured spreadsheets into structured tables, transforming JSON from one format to another, etc. This presentation will educate the audience about this new PBE-based programming paradigm: its applications, form factors inside different products, and the science behind it.
50% spreadsheets are semi-structured.
KPMG, Deloitte budget millions of dollars for normalization.
“FlashRelate: Extracting Relational Data from Semi-Structured Spreadsheets Using Examples”
[PLDI 2015] Dan Barowy, Sumit Gulwani, Ted Hart, Ben Zorn
Bureau of I.A.
Regional Dir. Numbers
Niles C. Tel: (800)645-8397
Jean H. Tel: (918)781-4600
Frank K. Tel: (615)564-6500
Niles C. (800)645-8397 (907)586-7252
Jean H. (918)781-4600 (918)781-4604
Frank K. (615)564-6500 (615)564-6701
of rows in
Program in D
“Programming by Examples: PL meets ML”
[APLAS 2017] Sumit Gulwani, Prateek Jain
• Logical Deduction: [OOPSLA '15] FlashMeta: A framework for inductive program synthesis
• Machine Learning: [ICLR '18] Neural-guided deductive search for real-time program synthesis from examples
• Program Features: [CAV '15] Predicting a correct program in programming by example
• Output Features: [IJCAI '17] Learning to learn programs from examples: going beyond program structure
• Distinguishing Inputs: [UIST '15] User Interaction Models for Disambiguation in Programming by Example
• Clustering: [OOPSLA '18] FlashProfile: A Framework for Synthesizing Data Profiles
Synthesis of intended programs from just the input.
• Tabular data extraction, Sort, Join
Synthesis of readable/modifiable code
Synthesis in target language of choice.
• Scala, R, PySpark
Code-first experience in existing workflows.
• IDE, Notebook
66“Automated Data Extraction using Predictive Program Synthesis”
[AAAI 2017] Mohammad Raza, Sumit Gulwani
Code Transformations by Examples
• Code refactoring consumes 40% time in migration.
– Old version to new version
– On-prem to cloud
– One framework to another
• Custom formatting
• Performance enhancements
• Repetitive bug fixes
– Feedback generation for programming education
73“Learning syntactic program transformations from examples”
[ICSE 2017] Reudismam Rolim, Gustavo Soares, et.al.
Programming by examples is a new frontier in AI.
• 10-100x productivity increase in some domains.
– Data Wrangling: Data scientists spend 80% time.
– Code Refactoring: Developers spend 40% time in migration.
• 99% of end users are non-programmers.
Next-generational AI techniques under the hood
• Logical Reasoning + Machine Learning
The Future: Multi-modal programming with Examples and NL
Questions/Feedback: Contact me at firstname.lastname@example.org
74Microsoft PROSE (PROgram Synthesis by Examples) Framework
Available for non-commercial use : https://microsoft.github.io/prose/