SlideShare a Scribd company logo
1 of 18
A Syntactic Neural Model
for General-Purpose Code
Generation
Pengcheng Yin, Graham Neubig
ACL2017
M1 Tomoya Ogata
1
Abstract
• Input
• natural language descriptions
• Output
• source code written in a general-purpose programming
language
• Existing data-driven methods treat this problem as a
language generation task without considering the
underlying syntax of the target programming language
• propose a novel neural architecture powered by a
grammar model to explicitly capture the target syntax
as prior knowledge
2
Given an NL description x, our task is to generate the
code snippet c in a modern PL based on the intent of x.
We define a probabilistic grammar model of generating
an AST y given x
𝑦 is then deterministically converted to the
corresponding surface code
an AST is generated by applying several production rules
composed of a head node and multiple child nodes
The Code Generation Problem
3
Grammar Model
• APPLYRULE Actions
• GENTOKEN Actions
4
APPLYRULE Actions
• APPLYRULE chooses a rule from the subset that has
a head matching the type of 𝑛 𝑓𝑡
• uses r to expand 𝑛 𝑓𝑡
by appending all child nodes
specified by the selected production
• When a variable terminal node is added to the
derivation and becomes the frontier node, the
grammar model then switches to GENTOKEN
actions to populate the variable terminal with
tokens
5
GENTOKEN Actions
Once we reach a frontier node 𝑛 𝑓𝑡
that corresponds
to a variable type, GENTOKEN actions are used to fill
this node with values.
At each time step, GENTOKEN appends one terminal
token to the current frontier variable node
6
Tracking Generation States
Action Embedding: at
Context Vector: ct
Parent Feeding: pt
7
Parent Feeding
• parent information 𝑝𝑡 from two sources
• (1) the hidden state of parent action 𝑠 𝑝 𝑡
• (2) the embedding of parent action 𝑎 𝑝 𝑡
• The parent feeding schema enables the model to
utilize the information of parent code segments to
make more confident predictions
8
Calculating Action Probabilities
g(st): tanh(W・st+b)
e(r): one-hot vector for rule r
p(gen|・), p(copy|・): tanh(W・st+b)
9
Training
• Given a dataset of pairs of NL descriptions 𝑥𝑖 and
code 𝑐𝑖 snippets
• we parse 𝑐𝑖 into its AST 𝑦𝑖
• decompose 𝑦𝑖 into a sequence of oracle actions
• The model is then optimized by maximizing the log-
likelihood of the oracle action sequence
10
Experimental Evaluation
• Datasets
• HEARTHSTONE (HS) dataset
• a collection of Python classes that implement cards for the card
game HearthStone
• DJANGO dataset
• a collection of lines of code from the Django web framework, each
with a manually annotated NL description
• IFTTT dataset
• a domain- specific benchmark that provides an interesting side
comparison
• Metrics
• accuracy
• BLUE-4
11
Experimental Evaluation
• Preprocessing
• Input are tokenized using NLTK
• replacing quoted strings in the inputs with place holders
• extract unary closures whose frequency is larger than a
threshold
• Configuration
• node type embeddings: 64
• All other embeddings: 128
• RNN states: 256
• hidden layers: 50
12
Model
• Latent Predictor Network (LPN), a state-of-the-art
sequence- to-sequence code generation model
• SEQ2TREE, a neural semantic parsing model
• NMT system using a standard encoder-decoder
architecture with attention and unknown word
replacement
13
Result (HS, DJANG0)
14
Output Example
15
Result (IFTTT)
16
Error Analysis
• randomly sampled and labeled 100 and 50 failed
examples from DJANGO and HS, respectively
• using different parameter names when defining a function
• omitting (or adding) default values of parameters in function
calls
• DJANGO
• 30%: the pointer network failed
• 25%: the generated code only partially implemented the
required functionality
• 10%: malformed English inputs
• 5%: preprocessing errors
• 30%: could not be easily categorized into the above
• HS
• partial implementation errors
17
Conclusion
• This paper proposes a syntax-driven neural code
generation approach that generates an abstract
syntax tree by sequentially applying actions from a
grammar model
18

More Related Content

What's hot

[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You NeedDaiki Tanaka
 
CNIT 126: 13: Data Encoding
CNIT 126: 13: Data EncodingCNIT 126: 13: Data Encoding
CNIT 126: 13: Data EncodingSam Bowne
 
Decision Making & Loops
Decision Making & LoopsDecision Making & Loops
Decision Making & LoopsAkhil Kaushik
 
Applications of the Reverse Engineering Language REIL
Applications of the Reverse Engineering Language REILApplications of the Reverse Engineering Language REIL
Applications of the Reverse Engineering Language REILzynamics GmbH
 
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...Mateus S. H. Cruz
 
Cilk - An Efficient Multithreaded Runtime System
Cilk - An Efficient Multithreaded Runtime SystemCilk - An Efficient Multithreaded Runtime System
Cilk - An Efficient Multithreaded Runtime SystemShareek Ahamed
 
Capturing Monotonic Components from Input Patterns
Capturing Monotonic Components from Input PatternsCapturing Monotonic Components from Input Patterns
Capturing Monotonic Components from Input Patternsijeukens
 
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRANDeep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRANTAUS - The Language Data Network
 
CNIT 141: 10. RSA
CNIT 141: 10. RSACNIT 141: 10. RSA
CNIT 141: 10. RSASam Bowne
 
TensorFlow Technology
TensorFlow TechnologyTensorFlow Technology
TensorFlow Technologynarayan dudhe
 
Moving to neural machine translation at google - gopro-meetup
Moving to neural machine translation at google  - gopro-meetupMoving to neural machine translation at google  - gopro-meetup
Moving to neural machine translation at google - gopro-meetupChester Chen
 
QVT Traceability: What does it really mean?
QVT Traceability: What does it really mean?QVT Traceability: What does it really mean?
QVT Traceability: What does it really mean?Edward Willink
 
2017:12:06 acl読み会"Learning attention for historical text normalization by lea...
2017:12:06 acl読み会"Learning attention for historical text normalization by lea...2017:12:06 acl読み会"Learning attention for historical text normalization by lea...
2017:12:06 acl読み会"Learning attention for historical text normalization by lea...ayaha osaki
 
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...Mateus S. H. Cruz
 
Erlang For Five Nines
Erlang For Five NinesErlang For Five Nines
Erlang For Five NinesBarcamp Cork
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingMinh Pham
 
Introduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKennaIntroduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKennaopenseesdays
 

What's hot (20)

[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need
 
CNIT 126: 13: Data Encoding
CNIT 126: 13: Data EncodingCNIT 126: 13: Data Encoding
CNIT 126: 13: Data Encoding
 
Decision Making & Loops
Decision Making & LoopsDecision Making & Loops
Decision Making & Loops
 
Applications of the Reverse Engineering Language REIL
Applications of the Reverse Engineering Language REILApplications of the Reverse Engineering Language REIL
Applications of the Reverse Engineering Language REIL
 
Hd2
Hd2Hd2
Hd2
 
Analysis of a Modified RC4
Analysis of a Modified RC4 Analysis of a Modified RC4
Analysis of a Modified RC4
 
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
Fast, Private and Verifiable: Server-aided Approximate Similarity Computation...
 
Cilk - An Efficient Multithreaded Runtime System
Cilk - An Efficient Multithreaded Runtime SystemCilk - An Efficient Multithreaded Runtime System
Cilk - An Efficient Multithreaded Runtime System
 
R intro
R introR intro
R intro
 
Capturing Monotonic Components from Input Patterns
Capturing Monotonic Components from Input PatternsCapturing Monotonic Components from Input Patterns
Capturing Monotonic Components from Input Patterns
 
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRANDeep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
Deep Learning for Machine Translation, by Satoshi Enoue, SYSTRAN
 
CNIT 141: 10. RSA
CNIT 141: 10. RSACNIT 141: 10. RSA
CNIT 141: 10. RSA
 
TensorFlow Technology
TensorFlow TechnologyTensorFlow Technology
TensorFlow Technology
 
Moving to neural machine translation at google - gopro-meetup
Moving to neural machine translation at google  - gopro-meetupMoving to neural machine translation at google  - gopro-meetup
Moving to neural machine translation at google - gopro-meetup
 
QVT Traceability: What does it really mean?
QVT Traceability: What does it really mean?QVT Traceability: What does it really mean?
QVT Traceability: What does it really mean?
 
2017:12:06 acl読み会"Learning attention for historical text normalization by lea...
2017:12:06 acl読み会"Learning attention for historical text normalization by lea...2017:12:06 acl読み会"Learning attention for historical text normalization by lea...
2017:12:06 acl読み会"Learning attention for historical text normalization by lea...
 
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
Realizing Fine-Grained and Flexible Access Control to Outsourced Data with At...
 
Erlang For Five Nines
Erlang For Five NinesErlang For Five Nines
Erlang For Five Nines
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 
Introduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKennaIntroduction to OpenSees by Frank McKenna
Introduction to OpenSees by Frank McKenna
 

Similar to [論文紹介]A syntactic neural model for general purpose code generation

Natural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine TranslationNatural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine Translationivaderivader
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMfnothaft
 
Natural Language Interface to Knowledge Graph
Natural Language Interface to Knowledge GraphNatural Language Interface to Knowledge Graph
Natural Language Interface to Knowledge GraphVaticle
 
4-BlockCipher-DES-CEN451-BSE-Spring2022-17042022-104521am.pdf
4-BlockCipher-DES-CEN451-BSE-Spring2022-17042022-104521am.pdf4-BlockCipher-DES-CEN451-BSE-Spring2022-17042022-104521am.pdf
4-BlockCipher-DES-CEN451-BSE-Spring2022-17042022-104521am.pdfNAWAZURREHMANAWAN
 
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft..."Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...Dataconomy Media
 
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetFrom Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetEric Haibin Lin
 
Probabilistic data structures. Part 2. Cardinality
Probabilistic data structures. Part 2. CardinalityProbabilistic data structures. Part 2. Cardinality
Probabilistic data structures. Part 2. CardinalityAndrii Gakhov
 
Presentation Slides - Genetic algorithm based key generation for fully homomo...
Presentation Slides - Genetic algorithm based key generation for fully homomo...Presentation Slides - Genetic algorithm based key generation for fully homomo...
Presentation Slides - Genetic algorithm based key generation for fully homomo...MajedahAlkharji
 
5_RNN_LSTM.pdf
5_RNN_LSTM.pdf5_RNN_LSTM.pdf
5_RNN_LSTM.pdfFEG
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRUananth
 
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...Anne Nicolas
 
RNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential DataRNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential DataYao-Chieh Hu
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia VoulibasiISSEL
 
Scalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAMScalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAMfnothaft
 
Parallel Distributed Deep Learning on HPCC Systems
Parallel Distributed Deep Learning on HPCC SystemsParallel Distributed Deep Learning on HPCC Systems
Parallel Distributed Deep Learning on HPCC SystemsHPCC Systems
 

Similar to [論文紹介]A syntactic neural model for general purpose code generation (20)

Trans coder
Trans coderTrans coder
Trans coder
 
Natural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine TranslationNatural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine Translation
 
Scaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAMScaling up genomic analysis with ADAM
Scaling up genomic analysis with ADAM
 
Natural Language Interface to Knowledge Graph
Natural Language Interface to Knowledge GraphNatural Language Interface to Knowledge Graph
Natural Language Interface to Knowledge Graph
 
4-BlockCipher-DES-CEN451-BSE-Spring2022-17042022-104521am.pdf
4-BlockCipher-DES-CEN451-BSE-Spring2022-17042022-104521am.pdf4-BlockCipher-DES-CEN451-BSE-Spring2022-17042022-104521am.pdf
4-BlockCipher-DES-CEN451-BSE-Spring2022-17042022-104521am.pdf
 
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft..."Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
 
Chord
ChordChord
Chord
 
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetFrom Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
 
Probabilistic data structures. Part 2. Cardinality
Probabilistic data structures. Part 2. CardinalityProbabilistic data structures. Part 2. Cardinality
Probabilistic data structures. Part 2. Cardinality
 
Presentation Slides - Genetic algorithm based key generation for fully homomo...
Presentation Slides - Genetic algorithm based key generation for fully homomo...Presentation Slides - Genetic algorithm based key generation for fully homomo...
Presentation Slides - Genetic algorithm based key generation for fully homomo...
 
5_RNN_LSTM.pdf
5_RNN_LSTM.pdf5_RNN_LSTM.pdf
5_RNN_LSTM.pdf
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
Kernel Recipes 2018 - 10 years of automated evolution in the Linux kernel - J...
 
RNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential DataRNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential Data
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia Voulibasi
 
Scalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAMScalable up genomic analysis with ADAM
Scalable up genomic analysis with ADAM
 
Introduction
IntroductionIntroduction
Introduction
 
PD_Tcl_Examples
PD_Tcl_ExamplesPD_Tcl_Examples
PD_Tcl_Examples
 
ICSE20_Tao_slides.pptx
ICSE20_Tao_slides.pptxICSE20_Tao_slides.pptx
ICSE20_Tao_slides.pptx
 
Parallel Distributed Deep Learning on HPCC Systems
Parallel Distributed Deep Learning on HPCC SystemsParallel Distributed Deep Learning on HPCC Systems
Parallel Distributed Deep Learning on HPCC Systems
 

Recently uploaded

Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxyaramohamed343013
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfWadeK3
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 

Recently uploaded (20)

Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Scheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docxScheme-of-Work-Science-Stage-4 cambridge science.docx
Scheme-of-Work-Science-Stage-4 cambridge science.docx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 

[論文紹介]A syntactic neural model for general purpose code generation

  • 1. A Syntactic Neural Model for General-Purpose Code Generation Pengcheng Yin, Graham Neubig ACL2017 M1 Tomoya Ogata 1
  • 2. Abstract • Input • natural language descriptions • Output • source code written in a general-purpose programming language • Existing data-driven methods treat this problem as a language generation task without considering the underlying syntax of the target programming language • propose a novel neural architecture powered by a grammar model to explicitly capture the target syntax as prior knowledge 2
  • 3. Given an NL description x, our task is to generate the code snippet c in a modern PL based on the intent of x. We define a probabilistic grammar model of generating an AST y given x 𝑦 is then deterministically converted to the corresponding surface code an AST is generated by applying several production rules composed of a head node and multiple child nodes The Code Generation Problem 3
  • 4. Grammar Model • APPLYRULE Actions • GENTOKEN Actions 4
  • 5. APPLYRULE Actions • APPLYRULE chooses a rule from the subset that has a head matching the type of 𝑛 𝑓𝑡 • uses r to expand 𝑛 𝑓𝑡 by appending all child nodes specified by the selected production • When a variable terminal node is added to the derivation and becomes the frontier node, the grammar model then switches to GENTOKEN actions to populate the variable terminal with tokens 5
  • 6. GENTOKEN Actions Once we reach a frontier node 𝑛 𝑓𝑡 that corresponds to a variable type, GENTOKEN actions are used to fill this node with values. At each time step, GENTOKEN appends one terminal token to the current frontier variable node 6
  • 7. Tracking Generation States Action Embedding: at Context Vector: ct Parent Feeding: pt 7
  • 8. Parent Feeding • parent information 𝑝𝑡 from two sources • (1) the hidden state of parent action 𝑠 𝑝 𝑡 • (2) the embedding of parent action 𝑎 𝑝 𝑡 • The parent feeding schema enables the model to utilize the information of parent code segments to make more confident predictions 8
  • 9. Calculating Action Probabilities g(st): tanh(W・st+b) e(r): one-hot vector for rule r p(gen|・), p(copy|・): tanh(W・st+b) 9
  • 10. Training • Given a dataset of pairs of NL descriptions 𝑥𝑖 and code 𝑐𝑖 snippets • we parse 𝑐𝑖 into its AST 𝑦𝑖 • decompose 𝑦𝑖 into a sequence of oracle actions • The model is then optimized by maximizing the log- likelihood of the oracle action sequence 10
  • 11. Experimental Evaluation • Datasets • HEARTHSTONE (HS) dataset • a collection of Python classes that implement cards for the card game HearthStone • DJANGO dataset • a collection of lines of code from the Django web framework, each with a manually annotated NL description • IFTTT dataset • a domain- specific benchmark that provides an interesting side comparison • Metrics • accuracy • BLUE-4 11
  • 12. Experimental Evaluation • Preprocessing • Input are tokenized using NLTK • replacing quoted strings in the inputs with place holders • extract unary closures whose frequency is larger than a threshold • Configuration • node type embeddings: 64 • All other embeddings: 128 • RNN states: 256 • hidden layers: 50 12
  • 13. Model • Latent Predictor Network (LPN), a state-of-the-art sequence- to-sequence code generation model • SEQ2TREE, a neural semantic parsing model • NMT system using a standard encoder-decoder architecture with attention and unknown word replacement 13
  • 17. Error Analysis • randomly sampled and labeled 100 and 50 failed examples from DJANGO and HS, respectively • using different parameter names when defining a function • omitting (or adding) default values of parameters in function calls • DJANGO • 30%: the pointer network failed • 25%: the generated code only partially implemented the required functionality • 10%: malformed English inputs • 5%: preprocessing errors • 30%: could not be easily categorized into the above • HS • partial implementation errors 17
  • 18. Conclusion • This paper proposes a syntax-driven neural code generation approach that generates an abstract syntax tree by sequentially applying actions from a grammar model 18