# Sumit Gulwani at AI Frontiers : Programming by Examples

AI Frontiers
Nov. 13, 2018
1 of 74

### Sumit Gulwani at AI Frontiers : Programming by Examples

• 1. Sumit Gulwani Microsoft Programming by Examples AI Frontiers Conference Nov 2018
• 2. =MID(B1,5,2) Example-based help-forum interaction 2 300_w5_aniSh_c1_b  w5 300_w30_aniSh_c1_b  w30 =MID(B1,5,2) =MID(B1,FIND(“_”,\$B:\$B)+1, FIND(“_”,REPLACE(\$B:\$B,1,FIND(“_”,\$B:\$B),””))-1)
• 3. Flash Fill (Excel feature) 3“Automating string processing in spreadsheets using input-output examples” [POPL 2011] Sumit Gulwani
• 4. 4
• 5. 5
• 6. 6
• 7. 7
• 8. 8
• 9. 9
• 10. 10
• 11. 11
• 12. 12
• 13. 13
• 14. 14
• 15. 15
• 16. 16
• 17. 17
• 18. 18
• 19. 19
• 20. 20
• 21. 21
• 22. 22
• 23. 23
• 24. 24
• 25. 25
• 26. 26
• 27. 27
• 28. 28
• 29. 29
• 30. 30
• 31. 31
• 32. 32
• 33. 33
• 34. 34
• 35. 35
• 36. 36
• 37. 37
• 38. 38
• 39. Number, DateTime Transformations 39 Input Output (round to 2 decimal places) 123.4567 123.46 123.4 123.40 78.234 78.23 Excel/C#: Python/C: Java: #.00 .2f #.## Input Output (3-hour weekday bucket) CEDAR AVE & COTTAGE AVE; HORSHAM; 2015-12-11 @ 13:34:52; Fri, 12PM - 3PM RT202 PKWY; MONTGOMERY; 2016-01-13 @ 09:05:41-Station:STA18; Wed, 9AM - 12PM ; UPPER GWYNEDD; 2015-12-11 @ 21:11:18; Fri, 9PM - 12AM “Synthesizing Number Transformations from Input-Output Examples” [CAV 2012] Rishabh Singh, Sumit Gulwani
• 40. Table Extraction 40“FlashExtract: A Framework for data extraction by examples” [PLDI 2014] Vu Le, Sumit Gulwani
• 41. 41
• 42. 42
• 43. 43
• 44. 44
• 45. 45
• 46. 46
• 47. 47
• 48. 48
• 49. 49
• 50. 50
• 51. Table Reshaping 51 50% spreadsheets are semi-structured. KPMG, Deloitte budget millions of dollars for normalization. “FlashRelate: Extracting Relational Data from Semi-Structured Spreadsheets Using Examples” [PLDI 2015] Dan Barowy, Sumit Gulwani, Ted Hart, Ben Zorn Bureau of I.A. Regional Dir. Numbers Niles C. Tel: (800)645-8397 Fax: (907)586-7252 Jean H. Tel: (918)781-4600 Fax: (918)781-4604 Frank K. Tel: (615)564-6500 Fax: (615)564-6701 Tel Fax Niles C. (800)645-8397 (907)586-7252 Jean H. (918)781-4600 (918)781-4604 Frank K. (615)564-6500 (615)564-6701 FlashRelate From few examples of rows in output table
• 52. 52
• 53. 53
• 54. 54
• 55. 55
• 56. 56
• 57. 57
• 58. 58
• 59. 59
• 60. 60
• 61. 61
• 62. 62
• 63. 63
• 64. 64
• 65. Disambiguator More Examples Intended Program in D PBE Architecture 65 Examples Program set Test inputs Ranked Program set DSL D Program Ranker “Programming by Examples: PL meets ML” [APLAS 2017] Sumit Gulwani, Prateek Jain Search Engine Search • Logical Deduction: [OOPSLA '15] FlashMeta: A framework for inductive program synthesis • Machine Learning: [ICLR '18] Neural-guided deductive search for real-time program synthesis from examples Ranking • Program Features: [CAV '15] Predicting a correct program in programming by example • Output Features: [IJCAI '17] Learning to learn programs from examples: going beyond program structure Disambiguation • Distinguishing Inputs: [UIST '15] User Interaction Models for Disambiguation in Programming by Example • Clustering: [OOPSLA '18] FlashProfile: A Framework for Synthesizing Data Profiles
• 66. New Frontiers Predictive Synthesis Synthesis of intended programs from just the input. • Tabular data extraction, Sort, Join Synthesis of readable/modifiable code Synthesis in target language of choice. • Scala, R, PySpark Code-first experience in existing workflows. • IDE, Notebook 66“Automated Data Extraction using Predictive Program Synthesis” [AAAI 2017] Mohammad Raza, Sumit Gulwani
• 67. 67
• 68. 68
• 69. 69
• 70. 70
• 71. 71
• 72. 72
• 73. Code Transformations by Examples • Code refactoring consumes 40% time in migration. – Old version to new version – On-prem to cloud – One framework to another • Custom formatting • Performance enhancements • Repetitive bug fixes – Feedback generation for programming education 73“Learning syntactic program transformations from examples” [ICSE 2017] Reudismam Rolim, Gustavo Soares, et.al.
• 74. Programming by examples is a new frontier in AI. • 10-100x productivity increase in some domains. – Data Wrangling: Data scientists spend 80% time. – Code Refactoring: Developers spend 40% time in migration. • 99% of end users are non-programmers. Next-generational AI techniques under the hood • Logical Reasoning + Machine Learning The Future: Multi-modal programming with Examples and NL Questions/Feedback: Contact me at sumitg@microsoft.com Conclusion 74Microsoft PROSE (PROgram Synthesis by Examples) Framework Available for non-commercial use : https://microsoft.github.io/prose/