A Comprehensive Review of
Large Language Models for
Code Generation
Presented By: Sai Pragna Kancheti
INTRODUCTION:
 ChatGPT-like chatbots have become popular in recent times. These chatbots are
general-purpose natural language processing tools that use artificial
intelligence to generate text after a user enters a prompt.
 Although these chatbots are built for general use, they are also good at
generating code from user prompts using Large Language Models.
 In this presentation, we systematically review Large Language
Models for code generation based on user prompts.
 At the end, based on the results, we present some insights for further
research in this direction.
What are LLMs?
 A large language model is a more advanced kind of language model that is
trained on vast volumes of text data using deep learning techniques.
 These models can generate human-like text and perform a variety of natural
language processing tasks.
 The complexity of a language model can range from simple n-gram models to
more complex neural network models.
 Examples: GPT-3 (Generative Pretrained Transformer 3), BERT (Bidirectional
Encoder Representations from Transformers), RoBERTa (Robustly Optimized
BERT Approach), etc.
LLMs for code generation
 Recent models excel at tasks like code completion and code synthesis
from natural language descriptions.
 One such promising model, described by Austin et al. (2021), has
demonstrated significant progress toward AI-based programming
assistance.
 One of the largest of these models, Codex (Chen et al., 2021), has been
deployed as an in-IDE developer assistant that automatically generates code
based on the user's context in the real-world production tool GitHub Copilot.
 Despite the enormous success of large language models of code, the most
powerful models are not publicly accessible.
LLMs for code generation
Some existing models of code, their sizes, and their availability
(open source or not) are shown in the figure.
Challenges with the available LLMs for code generation
 Although these models perform well at generating code from user prompts,
the following challenges need to be addressed for further development
in this area:
 There was no large open-source language model trained almost exclusively on
code from multiple programming languages.
 Powerful models are not publicly accessible.
 The models' internals are not available for inspection.
 This restricts how these models can be applied to code generation tasks and
inhibits research in this field for low-resource organizations.
PRETRAINING
METHODS
Types of Pretraining Methods
Left-to-Right Language Models
 Auto-regressive, left-to-right language models predict the probability of a
token given the sequence of tokens that came before it.
 Their sequential, left-to-right operation is especially useful for
program generation tasks such as code auto-completion.
 However, because code is often not written in a single left-to-right pass,
using context that appears "after" the point of generation is difficult.
 Examples: CodeParrot, GPT-Neo, GPT-J (6B), Codex (12B), GPT-NeoX (20B),
and Google's 137B model (Austin et al., 2021).
 Models of this type are the ones considered in this review.
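The left-to-right factorization described above can be sketched with a toy model. This is a minimal illustration only: the `BIGRAM_PROBS` table and both function names are hypothetical, and a real model such as Codex or GPT-Neo conditions on the whole prefix with a neural network rather than on a single previous token.

```python
import math

# Hypothetical toy table of P(next token | previous token).
# A real left-to-right LM conditions on the entire prefix instead.
BIGRAM_PROBS = {
    "<s>": {"def": 0.6, "return": 0.4},
    "def": {"add": 0.7, "main": 0.3},
    "add": {"(": 1.0},
    "(": {")": 0.6, "x": 0.4},
}


def sequence_log_prob(tokens):
    """Chain rule: sum of log P(token_i | previous token), starting at <s>."""
    log_prob = 0.0
    previous = "<s>"
    for token in tokens:
        log_prob += math.log(BIGRAM_PROBS[previous][token])
        previous = token
    return log_prob


def greedy_complete(prefix, max_new_tokens=3):
    """Auto-complete by repeatedly appending the most likely next token."""
    tokens = list(prefix)
    for _ in range(max_new_tokens):
        choices = BIGRAM_PROBS.get(tokens[-1] if tokens else "<s>")
        if not choices:
            break  # the toy table has no continuation for this token
        tokens.append(max(choices, key=choices.get))
    return tokens


print(math.exp(sequence_log_prob(["def", "add", "("])))
print(greedy_complete(["def"]))
```

Note that `greedy_complete` only ever looks at tokens to its left, which is why context appearing after the cursor cannot be used by this family of models.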
Masked Language Models
 While auto-regressive language models are powerful for modeling the
probability of sequences, their unidirectional nature makes them less suitable
for producing effective whole-sequence representations for downstream tasks
such as classification.
 One popular bidirectional objective widely used in representation
learning is masked language modeling, where the aim is to predict masked
pieces of text from the surrounding context.
 Examples: CodeBERT (125M) and CuBERT (345M).
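As a rough sketch of how a masked language modeling example is built, the snippet below hides a random subset of tokens and records what the model must recover. The `[MASK]` token name, the 15% default rate, and the function name are assumptions for illustration; real recipes such as BERT's add further details (for example, sometimes keeping or randomly replacing a selected token) that are omitted here.

```python
import random

MASK = "[MASK]"  # assumed BERT-style mask token name


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Return (masked_input, labels) for one masked-LM training example.

    labels[i] holds the original token at masked positions and None
    elsewhere, so the loss is only computed where a token was hidden.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_rate:
            masked.append(MASK)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels


code_tokens = "def add ( a , b ) : return a + b".split()
masked_input, labels = mask_tokens(code_tokens, mask_rate=0.3, seed=1)
print(masked_input)
print(labels)
```

Because the prediction at a masked position can use tokens on both sides, the resulting representations are bidirectional, which is the property the slide contrasts with left-to-right models.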
Encoder-decoder Models
 An encoder-decoder model first uses an encoder to encode an input
sequence, and then uses a left-to-right LM to decode an output sequence
conditioned on the input sequence.
 Popular pretraining objectives include masked span prediction, where
spans of the input sequence are randomly replaced with mask tokens and the
output sequence is the masked contents in order, and denoising sequence
reconstruction, where the input is a corrupted sequence and the output is
the original sequence.
 These pretrained models are useful in many sequence-to-sequence tasks.
 Examples: CodeT5 (220M) and PLBART (406M)
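Masked span prediction can be illustrated with a small helper that builds the encoder input and decoder target from chosen spans. This is a sketch under assumptions: the function name is hypothetical, the spans are passed in explicitly instead of being sampled at random, and the `<extra_id_N>` sentinel naming follows the T5 convention rather than anything stated in the slides.

```python
def mask_spans(tokens, spans):
    """Build (source, target) for masked span prediction.

    spans: sorted, non-overlapping (start, end) index pairs, end exclusive.
    Each span is replaced in the source by one sentinel token; the target
    lists every sentinel followed by the tokens it replaced, in order.
    """
    source, target = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"  # assumed T5-style sentinel name
        source.extend(tokens[cursor:start])
        source.append(sentinel)
        target.append(sentinel)
        target.extend(tokens[start:end])
        cursor = end
    source.extend(tokens[cursor:])
    return source, target


code_tokens = "def add ( a , b ) : return a + b".split()
source, target = mask_spans(code_tokens, [(1, 2), (8, 11)])
print(" ".join(source))  # def <extra_id_0> ( a , b ) : <extra_id_1> b
print(" ".join(target))  # <extra_id_0> add <extra_id_1> return a +
```

The encoder reads the corrupted `source` bidirectionally, while the decoder generates `target` left to right, which combines the two ideas from the previous slides.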
COMPARED MODELS
Existing Models
 Codex: Codex is a Large Language Model (LLM) that has been fine-tuned
on publicly available Python code from GitHub.
 It builds on GPT-3 because of that model's proficiency in generating Python
programs. Despite being considerably smaller than GPT-3, with a total of 12 billion
parameters, Codex still exhibits remarkable performance.
 GPT-Neo: GPT-Neo is a series of large language models that have been trained
on the Pile dataset.
 These models, similar to GPT-3, are available in different sizes, including 125M, 1.3B,
and 2.7B parameter versions.
 The GPT-Neo 2.7B version, in particular, is a transformer model based on
EleutherAI's recreation of the GPT-3 architecture.
Existing Models
 GPT-J: GPT-J, developed by EleutherAI, is an open-source model with 6 billion
parameters, trained on the Pile dataset.
 It largely follows the GPT-2 architecture and stands out as the best-performing
publicly available transformer language model in terms of zero-shot performance
on a range of downstream tasks.
 CodeParrot: CodeParrot is a model based on GPT-2 with 1.5 billion
parameters, which has been fine-tuned on publicly accessible code from
GitHub for the purpose of generating Python code.
Introduced model: PolyCoder
 To overcome the challenges of the available LLMs for code generation,
a new model, PolyCoder, is introduced. It has 2.7 billion parameters and
is trained on a diverse range of repositories sourced from GitHub,
covering 12 distinct programming languages, as shown in the table.
PolyCoder’s Training
 PolyCoder uses the GPT-2 model architecture.
 To investigate the effect of model size scaling, it was trained at three
different model sizes: 2.7 billion, 400 million, and 160 million parameters,
with the largest 2.7B model matching GPT-Neo's capacity to allow a fair
comparison.
 The 2.7 billion parameter model is a 32-layer, 2,560-dimensional Transformer
model with a maximum context window of 2048 tokens, and it was trained
using a batch size of 128 sequences (262K tokens) for a total of 150K steps.
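The "262K tokens" figure follows directly from the batch size and context window quoted above. The short calculation below reproduces it and, as an extra step not stated in the slides, multiplies by the step count to estimate the total tokens processed; it assumes every sequence in every batch fills the full 2048-token window.

```python
# Figures taken from the slide; variable names are illustrative only.
batch_sequences = 128      # sequences per batch
context_window = 2048      # maximum tokens per sequence
training_steps = 150_000   # total optimizer steps

# Assumption: every sequence is packed to the full context window.
tokens_per_batch = batch_sequences * context_window
total_tokens = tokens_per_batch * training_steps

print(tokens_per_batch)  # 262144, i.e. the "262K tokens" on the slide
print(total_tokens)      # 39321600000, roughly 39 billion tokens
```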
PolyCoder’s Training
 The following table compares the design decisions and hyperparameters
used in training different models of code.
PolyCoder’s Training
 The following figure shows the training and validation loss during the
150K-step training process.
Results
Results of Extrinsic evaluations:
 Among the current models, PolyCoder performs less effectively than the comparably
sized GPT-Neo and even the smaller Codex 300M. Overall,
PolyCoder ranks behind Codex and GPT-Neo/J but outperforms CodeParrot.
 Despite being trained exclusively on code, PolyCoder lags behind a model of similar
size, GPT-Neo 2.7B, which was trained on the Pile, a mix of both code and natural
language text.
 This finding suggests that future work could benefit from training on a mix of code
from diverse programming languages and natural language text.
Results of Extrinsic evaluations:
 The following table shows the results of different models on the
HumanEval benchmark, and the number of different types of tokens
seen during the training process.
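The slides do not say which metric the HumanEval table reports. HumanEval results are commonly reported as pass@k, the metric defined alongside the benchmark by Chen et al. (2021): the probability that at least one of k sampled programs passes the unit tests. The sketch below assumes that metric and uses the unbiased estimator from that paper; the function name and the example numbers are illustrative, not results from the deck.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem.

    n: number of programs sampled for the problem
    c: number of those samples that pass all unit tests
    k: sample budget being scored (k <= n)
    """
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    # 1 - P(all k chosen samples are failures)
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only: 10 samples, 5 of which pass.
print(pass_at_k(n=10, c=5, k=1))
```

A benchmark score is then the average of this value over all problems.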
Results of Intrinsic Evaluations
 Interestingly, PolyCoder surpasses Codex and all other models on the C
language. Among open-source models only, PolyCoder outperforms the
similarly sized GPT-Neo 2.7B in C, JavaScript, Rust, Scala, and TypeScript.
 In the remaining 11 languages apart from C, all other open-source models, including
the newly introduced PolyCoder, perform significantly worse (higher
perplexity) than Codex.
 This observation could imply that for languages where larger models don't yield extra
benefits, training the model solely on code might be sufficient or even slightly more
advantageous than training on a combination of natural language and code.
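The intrinsic results above are stated in terms of perplexity, where lower is better. As a reminder of what that number measures, here is a minimal sketch of the standard definition: the exponential of the average negative log-probability the model assigns to each token of held-out text. The function name and the example probabilities are made up for illustration.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probabilities a model assigned to each true token."""
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# A model that gives every token probability 0.25 is as uncertain as a
# uniform choice among 4 options.
print(perplexity([0.25, 0.25, 0.25, 0.25]))

# Hypothetical comparison: higher probabilities on the true tokens
# give a lower perplexity.
print(perplexity([0.9, 0.8, 0.7]) < perplexity([0.5, 0.4, 0.3]))
```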
Conclusions
 We've presented the results of a systematic evaluation of large language models for
code. The findings generally indicate that performance improves with bigger models
and longer training.
 Based on the results, we infer that GPT-Neo's superior performance over PolyCoder in
certain languages suggests that training on both natural language text and code can
enhance code modeling.
 However, it's noteworthy that for the C programming language, PolyCoder
outperforms all models, including Codex, by achieving a lower perplexity.
Thank
You
