Andrey Zakharevich for MLST, 9.03.2022
Program Synthesis, DreamCoder and ARC
About me
• 2002: first Hello World 🤯
• 2010: dropped out of university
• 2012-2021: various software engineering jobs, mostly backend
• 2013: first neural network, in Ruby (sic)
• 2018: immigrated to Israel
• March 2020: started working on an ARC solution
• July 2021: learned about DreamCoder and switched to using it as a base for an ARC solution attempt
Outline
• What is program synthesis
• Top-level overview of DreamCoder
• What is ARC and why it is important
• My own insights from working on all this and possible directions
Program Synthesis
• Task definition
• ????
• PROGRAM!
Program synthesis: task definition
• Input-output pairs (FlashMeta, DreamCoder)
• Text prompt (Codex, Copilot)
• Logical constraints (Coq)
• High-level code (compilers)
Program synthesis: main approaches
• Incremental search over a tree of possible programs
  • Tree is exponentially big
  • Hard to evaluate incomplete programs
• Full-text generation by a language model
  • Hard to ensure syntax, type, and memory correctness
  • Prompts usually don’t include tests
• Genetic programming
  • Needs mutation and crossover operations that preserve code correctness
  • Hard to define which intermediate programs are more “fit”
DreamCoder
1. Takes a set of tasks
2. Starts with a library (or grammar) of primitive functions
3. (Enumeration) Tries to solve tasks with the current grammar
4. Generates more possible programs from the current grammar (dreams)
5. (Recognition) Trains a neural network on the found solutions and dreams; it predicts the probability of each function from a string describing the task
6. (Enumeration) Tries to solve tasks again, this time using probabilities from the NN
7. (Compression) Looks for repeated patterns in the found solutions and adds them to the grammar
8. Goes to step 3
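The eight steps can be read as one loop. The sketch below is only a schematic under stated assumptions: it is not DreamCoder's actual API, the function name wake_sleep and all four callables (enumerate_fn, dream_fn, train_fn, compress_fn) are hypothetical stand-ins, and the iteration count is arbitrary.

```python
def wake_sleep(tasks, grammar, enumerate_fn, dream_fn, train_fn, compress_fn,
               iterations=3):
    """Schematic DreamCoder-style loop; every callable is a hypothetical stand-in.

    enumerate_fn(tasks, grammar, model) -> dict mapping solved tasks to programs
    dream_fn(grammar)                   -> programs sampled from the grammar
    train_fn(solutions, dreams)         -> recognition model
    compress_fn(grammar, solutions)     -> grammar extended with new functions
    """
    solutions = {}
    for _ in range(iterations):
        # Step 3: enumeration guided by the grammar alone (no model yet).
        solutions.update(enumerate_fn(tasks, grammar, None))
        # Step 4: sample "dreams" from the current grammar.
        dreams = dream_fn(grammar)
        # Step 5: train the recognition model on solutions and dreams.
        model = train_fn(solutions, dreams)
        # Step 6: retry the unsolved tasks, now guided by the model.
        unsolved = [t for t in tasks if t not in solutions]
        solutions.update(enumerate_fn(unsolved, grammar, model))
        # Step 7: compression extends the grammar with repeated patterns.
        grammar = compress_fn(grammar, solutions)
        # Step 8: loop back to enumeration with the new grammar.
    return grammar, solutions
```

Passing the phases in as callables keeps the control flow visible without committing to how any single phase is implemented.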
DreamCoder: Enumeration
• Takes a set of tasks of the same type and a grammar
• Programs are expressions of typed lambda calculus with De Bruijn indexing
• Search starts from a single hole ( ?? ) of the expected type
• Every primitive in the grammar is checked to see whether it can unify with the hole type
• All possibilities are weighted according to the grammar; partial solutions are stored in a priority queue
• When all holes are filled, the program is checked against all unsolved tasks

Search tree example:
?? [list(int) -> list(int)]
  (lambda ?? [list(int)])
    (lambda empty)
    (lambda $0)
    (lambda (cons ?? [int] ?? [list(int)]))
      (lambda (cons 0 ?? [list(int)]))
      …
    …
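A minimal sketch of this kind of best-first enumeration, for the body of a single lambda of type list(int) -> list(int). It is a toy, not DreamCoder's enumerator: the grammar, its costs, the two monomorphic types ("int" and "list") and all function names are assumptions made for illustration. The real system unifies polymorphic types and handles nested lambdas.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Hole:
    type: str


# Hypothetical toy grammar: name -> (argument types, result type, cost).
# Costs stand in for negative log-probabilities taken from the grammar.
GRAMMAR = {
    "$0": ((), "list", 1.0),
    "empty": ((), "list", 1.0),
    "0": ((), "int", 1.0),
    "1": ((), "int", 1.0),
    "cons": (("int", "list"), "list", 1.0),
    "car": (("list",), "int", 2.0),
    "cdr": (("list",), "list", 2.0),
}


def first_hole(term):
    """Return the leftmost hole in a term, or None if the term is complete."""
    if isinstance(term, Hole):
        return term
    if isinstance(term, tuple):
        for sub in term[1:]:
            found = first_hole(sub)
            if found is not None:
                return found
    return None


def fill_first_hole(term, replacement):
    """Return (new_term, True) with the leftmost hole replaced, else (term, False)."""
    if isinstance(term, Hole):
        return replacement, True
    if isinstance(term, tuple):
        for i, sub in enumerate(term[1:], start=1):
            new_sub, done = fill_first_hole(sub, replacement)
            if done:
                return term[:i] + (new_sub,) + term[i + 1:], True
    return term, False


def evaluate(term, x):
    """Run a complete term on input list x; may raise IndexError."""
    if term == "$0":
        return x
    if term == "empty":
        return []
    if term in ("0", "1"):
        return int(term)
    op, *args = term
    vals = [evaluate(a, x) for a in args]
    if op == "cons":
        return [vals[0]] + vals[1]
    if op == "car":
        return vals[0][0]
    if op == "cdr":
        if not vals[0]:
            raise IndexError("cdr of empty list")
        return vals[0][1:]
    raise ValueError(f"unknown primitive: {op!r}")


def consistent(term, examples):
    try:
        return all(evaluate(term, x) == y for x, y in examples)
    except IndexError:
        return False


def enumerate_programs(examples, max_visited=10000):
    """Best-first search from a single hole; returns the first consistent program."""
    counter = itertools.count()  # tie-breaker so terms are never compared
    queue = [(0.0, next(counter), Hole("list"))]
    for _ in range(max_visited):
        if not queue:
            break
        cost, _tie, term = heapq.heappop(queue)
        hole = first_hole(term)
        if hole is None:
            if consistent(term, examples):
                return term
            continue
        for name, (arg_types, result_type, prim_cost) in GRAMMAR.items():
            if result_type != hole.type:  # stand-in for type unification
                continue
            filler = (name, *map(Hole, arg_types)) if arg_types else name
            new_term, _filled = fill_first_hole(term, filler)
            heapq.heappush(queue, (cost + prim_cost, next(counter), new_term))
    return None


def show(term):
    """Print a term in the notation used on the slide."""
    if isinstance(term, Hole):
        return f"??[{term.type}]"
    if isinstance(term, tuple):
        return "(" + " ".join(show(t) for t in term) + ")"
    return term
```

Given the examples [([1, 2], [0, 1, 2]), ([], [0])], enumerate_programs finds the term that show renders as (cons 0 $0), which is the body of (lambda (cons 0 $0)).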
DreamCoder: Enumeration (continued)
• Type-correctness requires that we go from output to input
• We can’t check partial programs for runtime correctness: every possible completion of (cons (car empty) ??) will be explored (within priorities and the time limit)
• Can generate infinite loops, so it requires timeouts and interruption management
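Both problems can be seen in a toy evaluator. The sketch below is an illustration under assumptions, not DreamCoder code: it swaps wall-clock timeouts for a step budget ("fuel"), and the "forever" primitive is a made-up stand-in for a non-terminating recursion.

```python
class OutOfFuel(Exception):
    """Raised when a program exceeds its step budget."""


def evaluate(term, x, fuel):
    """Evaluate a toy list program; fuel is a one-element list used as a counter."""
    fuel[0] -= 1
    if fuel[0] < 0:
        raise OutOfFuel()
    if term == "$0":
        return x
    if term == "empty":
        return []
    if term == "0":
        return 0
    op, *args = term
    if op == "car":
        return evaluate(args[0], x, fuel)[0]  # IndexError on an empty list
    if op == "cons":
        head = evaluate(args[0], x, fuel)
        return [head] + evaluate(args[1], x, fuel)
    if op == "forever":  # hypothetical primitive that never terminates
        while True:
            evaluate(args[0], x, fuel)
    raise ValueError(f"unknown term: {term!r}")


def run_safely(term, x, max_steps=1000):
    """Return ("ok", value) or ("error", exception name) instead of crashing."""
    try:
        return "ok", evaluate(term, x, [max_steps])
    except (IndexError, OutOfFuel) as err:
        return "error", type(err).__name__
```

Whatever replaces the hole in (cons (car empty) ??), for instance empty, $0 or (cons 0 empty), run_safely reports an IndexError, yet a purely type-directed enumerator still has to generate and test each completion.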
DreamCoder: Compression
Generates many transformations of the found programs and looks for repeated subprograms such that adding them to the library reduces the combined length of all found solutions and the library itself
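The objective can be illustrated with a much simpler procedure than the one DreamCoder uses. The sketch below is an assumption-laden toy: it only considers closed subterms that occur verbatim, performs no refactoring, and measures length as a plain symbol count. All names are hypothetical.

```python
def size(term):
    """Number of symbols in a term (nested tuples of strings)."""
    if isinstance(term, tuple):
        return sum(size(t) for t in term)
    return 1


def subterms(term):
    """Yield every compound subterm, including the term itself."""
    if isinstance(term, tuple):
        yield term
        for t in term[1:]:
            yield from subterms(t)


def rewrite(term, pattern, name):
    """Replace every occurrence of pattern in term by the symbol name."""
    if term == pattern:
        return name
    if isinstance(term, tuple):
        return tuple(rewrite(t, pattern, name) for t in term)
    return term


def best_abstraction(programs, name="f0"):
    """Pick the subterm whose extraction minimises solutions plus library size."""
    # dict.fromkeys keeps first-seen order, so ties are broken deterministically.
    candidates = list(dict.fromkeys(s for p in programs for s in subterms(p)))
    best, best_total = None, sum(size(p) for p in programs)
    for pattern in candidates:
        rewritten_size = sum(size(rewrite(p, pattern, name)) for p in programs)
        total = rewritten_size + size(pattern)  # library pays for the new body
        if total < best_total:
            best, best_total = pattern, total
    return best, best_total
```

For three solutions that all contain (+ 1 (car $0)), extracting that subterm as a new library function shrinks the combined size from 19 symbols to 14, while extracting only (car $0) would reach 18.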
DreamCoder: Recognition
• RNN with LSTM layers
• Input is the task definition
• Different domains can have different features (like a couple of CNN layers for image domains)
• For a grammar with n functions, the NN has n+1 outputs
• Each output is the probability that the corresponding function is used in the solution to the task
• The last output is the probability of a free variable term
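A sketch of only the output side of such a model, assuming the encoder (the LSTM and any domain-specific layers) has already produced a feature vector. The single linear layer, the independent sigmoid outputs, the hand-set weights and all names are illustrative assumptions, not the actual DreamCoder architecture.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def recognition_head(features, weights, biases):
    """Map a task feature vector to n+1 independent probabilities."""
    return [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(weights, biases)
    ]


def to_costs(names, probabilities):
    """Turn probabilities into additive search costs (negative log-probabilities)."""
    return {name: -math.log(p) for name, p in zip(names, probabilities)}


# n = 2 library functions plus one extra output for the variable term.
names = ["map", "reduce", "variable"]
features = [1.0, 0.0]                      # stand-in for the encoder output
weights = [[-2.0, 0.0], [2.0, 0.0], [0.0, 0.0]]
biases = [0.0, 0.0, 0.0]

probabilities = recognition_head(features, weights, biases)
costs = to_costs(names, probabilities)
```

With these made-up weights, reduce gets a lower cost than map, so an enumerator that uses the costs as priorities would try programs containing reduce first.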
DreamCoder: Overall
• Gradually expands its library of available functions, thus learning new discrete concepts without human guidance
• The NN model can be thought of as the intuition part: “This task looks like I should totally use reduce and not map in the solution”
• No support for dependent types means that we can’t propagate constraints through holes; see the (car empty) example
• A single probability per function may not be enough for complex problems with long solutions that use a big portion of the library. There is the context grammar extension, but it’s still fairly limited
• Lambda calculus may be quite limiting for efficient algorithms
Abstraction and Reasoning Corpus
• Introduced by François Chollet in “On the Measure of Intelligence”
• Solvable by humans but not machines
• Targets the ability to operate with complex combinations of abstract patterns without knowledge about the real world, except for Core Knowledge
• Has parallels with skill acquisition
• Private test set is sufficiently different from the public train and test data
• Tests developer-aware generalization
Intermission
How do I solve these tasks?
Abstractors
• A.k.a. reversible functions
• Somewhat akin to witness functions from FlashMeta
• A combination of to_abstract and from_abstract operations
• Preserve information, but present it in a different, possibly more efficient way
• to_abstract can have several outputs
• to_abstract can output several possible options
• Examples: grid_size, extract_background, extract_objects, group_similar_items, group_objects_by_color, vert_symmetry
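A minimal sketch of one such pair, loosely modelled on the extract_background example. The signatures and the choice of "most frequent colour" as the background are assumptions for illustration; the talk does not specify how the real abstractors are implemented.

```python
from collections import Counter


def to_abstract(grid):
    """Split a grid into several outputs: size, background colour, other cells."""
    height, width = len(grid), len(grid[0])
    # Assumption: the background is simply the most frequent colour.
    background = Counter(c for row in grid for c in row).most_common(1)[0][0]
    cells = {
        (r, c): value
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value != background
    }
    return height, width, background, cells


def from_abstract(height, width, background, cells):
    """Rebuild the original grid from the abstract representation."""
    return [
        [cells.get((r, c), background) for c in range(width)]
        for r in range(height)
    ]


grid = [
    [0, 0, 0],
    [0, 3, 0],
    [0, 0, 5],
]
abstract = to_abstract(grid)
restored = from_abstract(*abstract)
```

The round trip restores the grid exactly, which is the "preserve information" property: nothing is lost, but the background colour and the non-background cells are now separate values that later steps can reason about.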
How to evaluate representations?
A good evaluation function should:
• Work on different data types
• Probably not be Monte Carlo: if it returns a non-zero result, we already have a solution

My current solution is weighted Kolmogorov complexity.
• Each type has a certain weight per item
• Items of complex types use the sum of the weights of all their subitems plus the weight of the type itself
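The two rules translate into a short recursive function. The weights below are made up for the example, and Python's built-in types stand in for the types of the actual system.

```python
# Hypothetical per-type weights, chosen only for this example.
TYPE_WEIGHTS = {int: 1.0, tuple: 0.5, list: 1.0, dict: 1.5}


def complexity(value, weights=TYPE_WEIGHTS):
    """Weight of the value's own type plus the weights of all its subitems."""
    own = weights[type(value)]
    if isinstance(value, dict):
        return own + sum(
            complexity(k, weights) + complexity(v, weights)
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        return own + sum(complexity(item, weights) for item in value)
    return own


# A 4x4 grid with one coloured cell, as raw rows and as an abstract description
# (height, width, background colour, non-background cells).
raw_grid = [
    [0, 0, 0, 0],
    [0, 0, 7, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
abstract_grid = (4, 4, 0, {(1, 2): 7})

raw_score = complexity(raw_grid)
abstract_score = complexity(abstract_grid)
```

Under these weights the raw grid scores 21.0 and the abstract description 8.5, so a search that prefers lower scores would favour the abstract representation.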
Intermediate results
• Solved 34/400 training tasks with a threshold of 500 visited partial solutions
• Abstractor library was quite limited
• I had to write all abstractors by hand
• I had to manually pick weights for different abstractors and types
Moving to DreamCoder: Why?
• It can learn new functions from primitives on its own
• It can learn weights for functions on its own
Moving to DreamCoder: Obstacles
• Written in OCaml: no type information at runtime, hard to experiment with, not so easy to read
• Creating programs from output to input means that I don’t have any intermediate representations to evaluate during the search
Moving to DreamCoder: Path to solution
• No runtime type information in OCaml and absolute type strictness (you can either have a unit ref and no idea what’s inside, or manually specify all the possible options) meant that I couldn’t manipulate any intermediate representations at all. The solution is to rewrite it in another, more dynamic language; I chose Julia
• Introduce named variables to generated programs, as in let $x = … in …
• Make the search bidirectional: go for simpler representations of both input and output while checking whether new representations can help in explaining the output
• Add a special class of reversible functions and specify how they can be combined, so that the compression step will be able to learn new abstractors without losing their reversible nature
• Measure intermediate data complexity, and learn type weights alongside function probabilities
Moving to DreamCoder: Questions
• What is the best way to evaluate a program given a set of type and function weights? If we make decisions based on the qualities of intermediate representations, it’s no longer an admissible search problem
• Should we run the NN model not only at the beginning of an attempt to solve a task, but also on some intermediate representations? We are no longer constrained by OCaml here, but our model should be able to deal with various data types on its own, without additional feature engineering from us
• Should we add dependent types support and learn aliases for them? A rectangle is still an object, but it supports a very specific set of operations
References
• Ellis, K., Morales, L., Sablé-Meyer, M., Solar-Lezama, A., Tenenbaum, J.B.: Library learning for neurally-guided Bayesian program induction (2018)
• Ellis, K., Wong, C., Nye, M., Sablé-Meyer, M., Cary, L., Morales, L., Hewitt, L., Solar-Lezama, A., Tenenbaum, J.B.: DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning (2020)
• Chollet, F.: On the measure of intelligence (2019)
• Polozov, O., Gulwani, S.: FlashMeta: a framework for inductive program synthesis. In: Aldrich, J., Eugster, P. (eds.) OOPSLA. pp. 107–126. ACM (2015), http://dblp.uni-trier.de/db/conf/oopsla/oopsla2015.html#PolozovG15
• Alford, S., Gandhi, A., Rangamani, A., Banburski, A., Wang, T., Dandekar, S., ... & Chin, P.: Neural-guided, bidirectional program search for abstraction and reasoning. In: International Conference on Complex Networks and Their Applications. pp. 657–668. Springer, Cham (November 2021)
That’s all!
• I’m open to collaboration and discussions
• I’m also open to employment, especially on something related
• https://github.com/andreyz4k/ec/tree/julia_enumerator
• https://www.linkedin.com/in/andreyzakharevich/
• Or @andreyz4k on most social media

More Related Content

What's hot

Lecture 01 introduction to compiler
Lecture 01 introduction to compilerLecture 01 introduction to compiler
Lecture 01 introduction to compilerIffat Anjum
 
Bootstrapping in Compiler
Bootstrapping in CompilerBootstrapping in Compiler
Bootstrapping in CompilerAkhil Kaushik
 
Best Great Ideas on Java Research Papers
Best Great Ideas on Java Research PapersBest Great Ideas on Java Research Papers
Best Great Ideas on Java Research Paperssuzanneriverabme
 
compiler and their types
compiler and their typescompiler and their types
compiler and their typespatchamounika7
 
Lec 02 logical eq (Discrete Mathematics)
Lec 02   logical eq (Discrete Mathematics)Lec 02   logical eq (Discrete Mathematics)
Lec 02 logical eq (Discrete Mathematics)Naosher Md. Zakariyar
 
Conditional statement c++
Conditional statement c++Conditional statement c++
Conditional statement c++amber chaudary
 
Interpolation and-its-application
Interpolation and-its-applicationInterpolation and-its-application
Interpolation and-its-applicationApurbo Datta
 
2_2Specification of Tokens.ppt
2_2Specification of Tokens.ppt2_2Specification of Tokens.ppt
2_2Specification of Tokens.pptRatnakar Mikkili
 
Propositional logic & inference
Propositional logic & inferencePropositional logic & inference
Propositional logic & inferenceSlideshare
 
Lexical analysis - Compiler Design
Lexical analysis - Compiler DesignLexical analysis - Compiler Design
Lexical analysis - Compiler DesignKuppusamy P
 
Techniques & applications of Compiler
Techniques & applications of CompilerTechniques & applications of Compiler
Techniques & applications of CompilerPreethi AKNR
 
Nlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniquesNlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniquesankit_ppt
 
Full Python in 20 slides
Full Python in 20 slidesFull Python in 20 slides
Full Python in 20 slidesrfojdar
 
Compiler vs interpreter
Compiler vs interpreterCompiler vs interpreter
Compiler vs interpreterKamal Tamang
 
Recursion - Algorithms and Data Structures
Recursion - Algorithms and Data StructuresRecursion - Algorithms and Data Structures
Recursion - Algorithms and Data StructuresPriyanka Rana
 
Lab report for Prolog program in artificial intelligence.
Lab report for Prolog program in artificial intelligence.Lab report for Prolog program in artificial intelligence.
Lab report for Prolog program in artificial intelligence.Alamgir Hossain
 
Introduction to Problem Solving Techniques- Python
Introduction to Problem Solving Techniques- PythonIntroduction to Problem Solving Techniques- Python
Introduction to Problem Solving Techniques- PythonPriyankaC44
 

What's hot (20)

Lecture 01 introduction to compiler
Lecture 01 introduction to compilerLecture 01 introduction to compiler
Lecture 01 introduction to compiler
 
Bootstrapping in Compiler
Bootstrapping in CompilerBootstrapping in Compiler
Bootstrapping in Compiler
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Best Great Ideas on Java Research Papers
Best Great Ideas on Java Research PapersBest Great Ideas on Java Research Papers
Best Great Ideas on Java Research Papers
 
compiler and their types
compiler and their typescompiler and their types
compiler and their types
 
Information Extraction
Information ExtractionInformation Extraction
Information Extraction
 
Lec 02 logical eq (Discrete Mathematics)
Lec 02   logical eq (Discrete Mathematics)Lec 02   logical eq (Discrete Mathematics)
Lec 02 logical eq (Discrete Mathematics)
 
Conditional statement c++
Conditional statement c++Conditional statement c++
Conditional statement c++
 
Interpolation and-its-application
Interpolation and-its-applicationInterpolation and-its-application
Interpolation and-its-application
 
2_2Specification of Tokens.ppt
2_2Specification of Tokens.ppt2_2Specification of Tokens.ppt
2_2Specification of Tokens.ppt
 
Propositional logic & inference
Propositional logic & inferencePropositional logic & inference
Propositional logic & inference
 
Lexical analysis - Compiler Design
Lexical analysis - Compiler DesignLexical analysis - Compiler Design
Lexical analysis - Compiler Design
 
3b. LMD & RMD.pdf
3b. LMD & RMD.pdf3b. LMD & RMD.pdf
3b. LMD & RMD.pdf
 
Techniques & applications of Compiler
Techniques & applications of CompilerTechniques & applications of Compiler
Techniques & applications of Compiler
 
Nlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniquesNlp toolkits and_preprocessing_techniques
Nlp toolkits and_preprocessing_techniques
 
Full Python in 20 slides
Full Python in 20 slidesFull Python in 20 slides
Full Python in 20 slides
 
Compiler vs interpreter
Compiler vs interpreterCompiler vs interpreter
Compiler vs interpreter
 
Recursion - Algorithms and Data Structures
Recursion - Algorithms and Data StructuresRecursion - Algorithms and Data Structures
Recursion - Algorithms and Data Structures
 
Lab report for Prolog program in artificial intelligence.
Lab report for Prolog program in artificial intelligence.Lab report for Prolog program in artificial intelligence.
Lab report for Prolog program in artificial intelligence.
 
Introduction to Problem Solving Techniques- Python
Introduction to Problem Solving Techniques- PythonIntroduction to Problem Solving Techniques- Python
Introduction to Problem Solving Techniques- Python
 

Similar to Program Synthesis, DreamCoder, and ARC

Oop(object oriented programming)
Oop(object oriented programming)Oop(object oriented programming)
Oop(object oriented programming)geetika goyal
 
SE-IT JAVA LAB OOP CONCEPT
SE-IT JAVA LAB OOP CONCEPTSE-IT JAVA LAB OOP CONCEPT
SE-IT JAVA LAB OOP CONCEPTnikshaikh786
 
CPP16 - Object Design
CPP16 - Object DesignCPP16 - Object Design
CPP16 - Object DesignMichael Heron
 
History of Object Orientation in OOP.ppt
History of Object Orientation in OOP.pptHistory of Object Orientation in OOP.ppt
History of Object Orientation in OOP.pptathar549116
 
History of Object Orientation in OOP.ppt
History of Object Orientation in OOP.pptHistory of Object Orientation in OOP.ppt
History of Object Orientation in OOP.pptMuhammad Athar
 
Are High Level Programming Languages for Multicore and Safety Critical Conver...
Are High Level Programming Languages for Multicore and Safety Critical Conver...Are High Level Programming Languages for Multicore and Safety Critical Conver...
Are High Level Programming Languages for Multicore and Safety Critical Conver...InfinIT - Innovationsnetværket for it
 
CPP02 - The Structure of a Program
CPP02 - The Structure of a ProgramCPP02 - The Structure of a Program
CPP02 - The Structure of a ProgramMichael Heron
 
SKILLWISE - OOPS CONCEPT
SKILLWISE - OOPS CONCEPTSKILLWISE - OOPS CONCEPT
SKILLWISE - OOPS CONCEPTSkillwise Group
 
Metaprogramming in Ruby
Metaprogramming in RubyMetaprogramming in Ruby
Metaprogramming in RubyVolodymyr Byno
 
Introduction to Software - Coder Forge - John Mulhall
Introduction to Software - Coder Forge - John MulhallIntroduction to Software - Coder Forge - John Mulhall
Introduction to Software - Coder Forge - John MulhallJohn Mulhall
 
Online TechTalk  "Patterns in Embedded SW Design"
Online TechTalk  "Patterns in Embedded SW Design"Online TechTalk  "Patterns in Embedded SW Design"
Online TechTalk  "Patterns in Embedded SW Design"GlobalLogic Ukraine
 
Programming language paradigms
Programming language paradigmsProgramming language paradigms
Programming language paradigmsAshok Raj
 
Single Responsibility Principle
Single Responsibility PrincipleSingle Responsibility Principle
Single Responsibility PrincipleBADR
 

Similar to Program Synthesis, DreamCoder, and ARC (20)

2CPP19 - Summation
2CPP19 - Summation2CPP19 - Summation
2CPP19 - Summation
 
Problem solving
Problem solvingProblem solving
Problem solving
 
[OOP - Lec 01] Introduction to OOP
[OOP - Lec 01] Introduction to OOP[OOP - Lec 01] Introduction to OOP
[OOP - Lec 01] Introduction to OOP
 
Oop(object oriented programming)
Oop(object oriented programming)Oop(object oriented programming)
Oop(object oriented programming)
 
SE-IT JAVA LAB OOP CONCEPT
SE-IT JAVA LAB OOP CONCEPTSE-IT JAVA LAB OOP CONCEPT
SE-IT JAVA LAB OOP CONCEPT
 
CPP16 - Object Design
CPP16 - Object DesignCPP16 - Object Design
CPP16 - Object Design
 
History of Object Orientation in OOP.ppt
History of Object Orientation in OOP.pptHistory of Object Orientation in OOP.ppt
History of Object Orientation in OOP.ppt
 
History of Object Orientation in OOP.ppt
History of Object Orientation in OOP.pptHistory of Object Orientation in OOP.ppt
History of Object Orientation in OOP.ppt
 
Are High Level Programming Languages for Multicore and Safety Critical Conver...
Are High Level Programming Languages for Multicore and Safety Critical Conver...Are High Level Programming Languages for Multicore and Safety Critical Conver...
Are High Level Programming Languages for Multicore and Safety Critical Conver...
 
The Big Picture
The Big PictureThe Big Picture
The Big Picture
 
tensorflow.pptx
tensorflow.pptxtensorflow.pptx
tensorflow.pptx
 
CPP02 - The Structure of a Program
CPP02 - The Structure of a ProgramCPP02 - The Structure of a Program
CPP02 - The Structure of a Program
 
[OOP - Lec 02] Why do we need OOP
[OOP - Lec 02] Why do we need OOP[OOP - Lec 02] Why do we need OOP
[OOP - Lec 02] Why do we need OOP
 
SKILLWISE - OOPS CONCEPT
SKILLWISE - OOPS CONCEPTSKILLWISE - OOPS CONCEPT
SKILLWISE - OOPS CONCEPT
 
Metaprogramming in Ruby
Metaprogramming in RubyMetaprogramming in Ruby
Metaprogramming in Ruby
 
Chapter 1
Chapter 1Chapter 1
Chapter 1
 
Introduction to Software - Coder Forge - John Mulhall
Introduction to Software - Coder Forge - John MulhallIntroduction to Software - Coder Forge - John Mulhall
Introduction to Software - Coder Forge - John Mulhall
 
Online TechTalk  "Patterns in Embedded SW Design"
Online TechTalk  "Patterns in Embedded SW Design"Online TechTalk  "Patterns in Embedded SW Design"
Online TechTalk  "Patterns in Embedded SW Design"
 
Programming language paradigms
Programming language paradigmsProgramming language paradigms
Programming language paradigms
 
Single Responsibility Principle
Single Responsibility PrincipleSingle Responsibility Principle
Single Responsibility Principle
 

Recently uploaded

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Recently uploaded (20)

AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 

Program Synthesis, DreamCoder, and ARC

  • 1. Andrey Z a kh a revich for MLST, 9.03.2022 Program Synthesis, DreamCoder and ARC
  • 2. About me • 2002 — fi rst Hello World 🤯 • 2010 — dropped out from the university • 2012-2021 — various software engineering jobs, mostly backend • 2013 — fi rst neural network, in Ruby (sic) • 2018 — immigrated to Israel • March 2020 — started working on an ARC solution • July 2021 — learned about DreamCoder and switched to using it as a base for an ARC solution attempt
  • 3. Outline • What is program synthesis • Top-level overview of DreamCoder • What is ARC and why it is important • My own insights from working on all this and possible directions
  • 4. Program Synthesis • Task de fi nition • ???? • PROGRAM!
  • 5. Program synthesis • Input-output pairs (FlashMeta, DreamCoder) • Text prompt (Codex, Copilot) • Logical constraints (Coq) • High-level code (compilers) Task definition
  • 6. Program synthesis • Incremental search over a tree of possible programs • Tree is exponentially big • Hard to evaluate incomplete programs • Full-text generation by language model • Hard to ensure syntax, type, and memory correctness • Prompts usually don’t include tests • Genetic programming • Needs mutation and crossover operations that preserve code correctness • Hard to de fi ne which intermediate programs are more “ fi t” Main approaches
  • 7. DreamCoder 1. Takes a set of tasks 2. Starts with a library (or grammar) of primitive functions 3. (Enumeration) Tries to solve tasks with the current grammar 4. Generates more possible programs from the current grammar (dreams) 5. (Recognition) Trains a neural network on found solutions and dreams that predicts probabilities of each function from a string of task description 6. (Enumeration) Tries to solve tasks again, this time using probabilities from NN 7. (Compression) Looks for repeated patterns in found solutions, adds them to grammar 8. Goes to step 3
  • 8. DreamCoder • Takes a set of tasks of the same type and a grammar • Programs are expressions of typed lambda calculus with De Bruijn indexing • Search starts from a single Hole ( ?? ) of expected type • All primitives from the grammar are checked if they can unify with the hole type • All possibilities are weighted according to the grammar, partial solutions are stored in a priority queue • When all holes are fi lled, the program is checked against all unsolved tasks Enumeration ( ? ? [list(int) - > list(int)] ) (lambda ? ? [list(int)] ) (lambda empty) (lambda $0) (lambda (cons ? ? [int] ? ? [list(int)] ) … (lambda (cons 0 ? ? [list(int)] ) …
  • 9. DreamCoder • Type-correctness requires that we go from output to input • We can’t check partial programs for runtime correctness, all the possible solutions of (cons (car empty) ? ? ) will be explored (within priorities and time limit) • Can generate in fi nite loops, requires timeouts and interruption management Enumeration
  • 10. DreamCoder Generates many transformations of found programs and looks for repeating subprograms such that adding them to library reduces combined length of all found solutions and the library itself Compression
  • 11. DreamCoder • RNN with LSTM layers • Input is task de fi nition • Di ff erent domains can have di ff erent features (like a couple CNN layers for image domains) • For a grammar with n functions NN will have n+1 outputs • Each output is a probability that the corresponding function is used in the solution to the task • The last output is the probability of a free variable term Recognition
  • 12. DreamCoder: Overall • Gradually expands its library of available functions, thus learning new discrete concepts without human guidance • The NN model can be thought of as the intuition part: “This task looks like I should totally use reduce and not map in the solution” • No support for dependent types means that we can’t propagate constraints through holes, see the (car empty) example • A single probability per function may not be enough for complex problems with long solutions that use a big portion of the library. There is the context grammar extension, but it’s still fairly limited • Lambda calculus may be quite limiting for efficient algorithms
  • 13. Abstraction and Reasoning Corpus • Introduced by François Chollet in “On the Measure of Intelligence” • Solvable by humans but not machines • Targets the ability to operate with complex combinations of abstract patterns without knowledge about the real world, except for Core Knowledge • Has parallels with skill acquisition • The private test set is sufficiently different from the public train and test data • Tests developer-aware generalization
  • 15-17. How do I solve these tasks?
  • 18. Abstractors • A.k.a. reversible functions • Somewhat akin to witness functions from FlashMeta • A combination of to_abstract and from_abstract operations • Preserve information, but present it in a different, possibly more efficient way • to_abstract can have several outputs • to_abstract can output several possible options • Examples: grid_size, extract_background, extract_objects, group_similar_items, group_objects_by_color, vert_symmetry
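A minimal sketch of what such an abstractor could look like, using extract_background as the example. Only the names to_abstract, from_abstract and extract_background come from the slide; the `Abstractor` class, the "most frequent color is the background" rule, and the use of `None` for removed cells are assumptions for illustration, not the author's implementation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Abstractor:
    """A reversible function: from_abstract(*to_abstract(x)) == x."""
    name: str
    to_abstract: Callable[[Any], tuple]
    from_abstract: Callable[..., Any]

def _background_to_abstract(grid):
    # Assumption: the most frequent color is treated as the background.
    counts = Counter(cell for row in grid for cell in row)
    background = counts.most_common(1)[0][0]
    foreground = [
        [None if cell == background else cell for cell in row] for row in grid
    ]
    return background, foreground  # several outputs, no information lost

def _background_from_abstract(background, foreground):
    return [
        [background if cell is None else cell for cell in row]
        for row in foreground
    ]

extract_background = Abstractor(
    "extract_background", _background_to_abstract, _background_from_abstract
)

grid = [[0, 0, 0], [0, 3, 0], [0, 0, 3]]
background, foreground = extract_background.to_abstract(grid)
restored = extract_background.from_abstract(background, foreground)
```

The round trip restores the original grid, which is the property that makes the representation safe to use on both the input and the output side of a task.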
  • 19. How to evaluate representations? A good evaluation function should: • Work on different data types • Probably not be Monte Carlo based: if it returns a non-zero result, we already have a solution. My current solution is weighted Kolmogorov complexity: • Each type has a certain weight per item • Items of complex types use the sum of the weights of all their subitems plus the weight of the type itself
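The weighting rule above can be written as a short recursive function. The weight values and the type names in `TYPE_WEIGHTS` are made up for illustration (the slides do not give the actual numbers), and the abstracted value is just one possible background-plus-foreground encoding of the same grid.

```python
# Hypothetical per-type weights; the real values are not given in the slides.
TYPE_WEIGHTS = {"int": 1.0, "none": 0.1, "list": 0.5, "tuple": 0.5}

def complexity(value):
    """Weight of the value's own type plus the weights of all its subitems."""
    if value is None:
        return TYPE_WEIGHTS["none"]
    if isinstance(value, int):
        return TYPE_WEIGHTS["int"]
    if isinstance(value, list):
        return TYPE_WEIGHTS["list"] + sum(complexity(v) for v in value)
    if isinstance(value, tuple):
        return TYPE_WEIGHTS["tuple"] + sum(complexity(v) for v in value)
    raise TypeError(f"no weight defined for {type(value).__name__}")

# The same 3x3 grid, raw and as (background color, foreground cells).
raw = [[0, 0, 0], [0, 3, 0], [0, 0, 3]]
abstracted = (0, [[None, None, None], [None, 3, None], [None, None, 3]])

# A lower score marks the representation the search should prefer.
prefers_abstracted = complexity(abstracted) < complexity(raw)
```

With these made-up weights the abstracted form scores lower than the raw grid because empty cells are much cheaper than colored ones, so a search guided by this score would move toward it.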
  • 20. Intermediate results • Solved 34/400 training tasks with a threshold of 500 visited partial solutions • Abstractor library was quite limited • I had to write all abstractors by hand • I had to manually pick weights for different abstractors and types
  • 21. Moving to DreamCoder: Why? • It can learn new functions from primitives on its own • It can learn weights for functions on its own
  • 22. Moving to DreamCoder: Obstacles • Written in OCaml: no type information at runtime, hard to experiment with, not so easy to read • Creating programs from output to input means that I don’t have any intermediate representations to evaluate during the search
  • 23. Moving to DreamCoder: Path to solution • No runtime type information in OCaml and absolute type strictness (you can either have a unit ref and no idea what’s inside, or manually specify all the possible options) meant that I couldn’t manipulate any intermediate representations at all. The solution is to rewrite it in a more dynamic language; I chose Julia • Introduce named variables to generated programs, as in let $x = … in … • Make the search bidirectional: go for simpler representations of both input and output while checking whether new representations can help explain the output • Add a special class of reversible functions and specify how they can be combined, so that the compression step can learn new abstractors without losing their reversible nature • Measure intermediate data complexity, and learn type weights alongside function probabilities
  • 24. Moving to DreamCoder: Questions • What is the best way to evaluate a program given a set of type and function weights? If we make decisions based on the qualities of intermediate representations, it’s no longer an admissible search problem • Should we run the NN model not only at the beginning of an attempt to solve a task, but also on some intermediate representations? We are no longer constrained by OCaml here, but our model should be able to deal with various data types on its own, without additional feature engineering from us • Should we add dependent types support and learn aliases for them? A rectangle is still an object, but it supports a very specific set of operations
  • 25. References • Ellis, K., Morales, L., Sablé-Meyer, M., Solar-Lezama, A., Tenenbaum, J.: Library learning for neurally-guided Bayesian program induction (2018) • Ellis, K., Wong, C., Nye, M., Sablé-Meyer, M., Cary, L., Morales, L., Hewitt, L., Solar-Lezama, A., Tenenbaum, J.B.: DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning (2020) • Chollet, F.: On the measure of intelligence (2019) • Polozov, O., Gulwani, S.: FlashMeta: a framework for inductive program synthesis. In: Aldrich, J., Eugster, P. (eds.) OOPSLA. pp. 107–126. ACM (2015), http://dblp.uni-trier.de/db/conf/oopsla/oopsla2015.html#PolozovG15 • Alford, S., Gandhi, A., Rangamani, A., Banburski, A., Wang, T., Dandekar, S., ... & Chin, P.: Neural-Guided, Bidirectional Program Search for Abstraction and Reasoning. In: International Conference on Complex Networks and Their Applications. pp. 657–668. Springer, Cham (2021)
  • 26. That’s all! • I’m open to collaboration and discussions • I’m also open to employment, especially on something related • https://github.com/andreyz4k/ec/tree/julia_enumerator • https://www.linkedin.com/in/andreyzakharevich/ • Or @andreyz4k on most social media