A Comprehensive Review of
Large Language Models for
Code Generation
Presented By: Sai Pragna Kancheti
INTRODUCTION:
 ChatGPT-like chatbots have become popular in recent times. These chatbots are natural
language processing tools developed for general-purpose use that apply artificial
intelligence to generate text after a user enters a prompt.
 Although these chatbots are built for general purposes, they are also good at
generating code from user prompts using Large Language Models.
 In this presentation, we systematically review Large Language Models for code
generation based on user prompts.
 At the end, based on the results, we present some insights for further research
in this direction.
What are LLMs?
 A large language model is a more advanced kind of language model, trained on
vast volumes of text data using deep learning techniques.
 These models can generate human-like text and perform a variety of natural
language processing tasks.
 The complexity of a language model can range from simple n-gram models to
more complex neural network models.
 Examples: GPT-3 (Generative Pretrained Transformer 3), BERT (Bidirectional
Encoder Representations from Transformers), RoBERTa (Robustly Optimized
BERT Approach), etc.
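The range from simple n-gram models to neural networks mentioned above can be made concrete with a minimal bigram model (a toy sketch for illustration only, not part of the reviewed paper):

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count next-word frequencies for each word across the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_prob(counts, prev, nxt):
    """P(next | prev), estimated from bigram counts."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram(corpus)
print(next_word_prob(model, "the", "cat"))  # 2 of 3 continuations of "the"
```

A neural language model replaces these count-based estimates with a learned function over token embeddings, but the task is the same: predict the next token from context.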
LLMs for code generation
 The recent models excel at tasks like code completion and code synthesis
from natural language descriptions.
 One such promising model developed in recent times is that of Austin et al.
(2021), which has demonstrated significant progress toward AI-based
programming assistance.
 One of the largest of these models, Codex (Chen et al., 2021), has been
deployed as an in-IDE developer assistant that automatically generates code
based on the user's context in the real-world production tool GitHub Copilot.
 Despite the enormous success of large language models of code, the most
powerful models are not publicly accessible.
LLMs for code
generation
Some of the existing models of
code, their sizes, and their
availability (open source or
not) are shown in the
figure.
Challenges With the available LLMs for code
Generation
 Although these models can show good performance for code generation based
on the user prompt, the following challenges need to be addressed for
further development in this scope:
 There was no large open-source language model trained almost exclusively on
code from multiple programming languages.
 Lack of powerful models that are publicly accessible.
 Unavailability of access to the models' internals.
 This prevents these models from being applied to code generation tasks and
inhibits research in this particular field for low-resource organizations.
PRETRAINING
METHODS
Types of Pretraining Methods
Left-to-Right Language Models
 Auto-regressive, left-to-right language models predict the likelihood of a
token based on the sequence of tokens that came before it.
 These models' sequential, left-to-right operation is especially useful for
tasks related to program generation, such as code auto-completion.
 However, because code isn't often written in a single left-to-right pass,
utilizing context that appears "after" the point of generation is difficult.
 Examples: CodeParrot, GPT-Neo, GPT-J (6B), Codex (12B), GPT-NeoX (20B),
and Google’s 137B model (Austin et al., 2021).
 These types of models are considered in this review.
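The left-to-right generation described above can be sketched as a greedy decoding loop. The scoring table below is invented purely for illustration, standing in for a trained model's next-token distribution:

```python
def greedy_decode(next_token_scores, prompt, max_len=10, eos="<eos>"):
    """Left-to-right generation: repeatedly pick the highest-scoring next
    token given only the tokens produced so far."""
    tokens = list(prompt)
    while len(tokens) < max_len:
        scores = next_token_scores(tokens)  # dict: candidate token -> score
        best = max(scores, key=scores.get)
        if best == eos:
            break
        tokens.append(best)
    return tokens

def toy_lm(tokens):
    """Hypothetical stand-in for a trained LM: scores depend on the last token."""
    table = {"def": {"f": 1.0}, "f": {"(": 1.0}, "(": {")": 1.0},
             ")": {":": 1.0}, ":": {"<eos>": 1.0}}
    return table.get(tokens[-1], {"<eos>": 1.0})

print(greedy_decode(toy_lm, ["def"]))  # ['def', 'f', '(', ')', ':']
```

Real systems condition on the full prefix (not just the last token) and often sample instead of taking the argmax, but the one-token-at-a-time loop is the same.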
Masked Language Models
 While auto-regressive language models are powerful for modeling the
probability of sequences, their unidirectional nature makes them less suitable
for producing effective whole-sequence representations for downstream tasks
such as classification.
 One popular bidirectional objective function widely used in representation
learning is masked language modeling, where the aim is to predict masked
text pieces based on the surrounding context.
 Examples: CodeBERT (125M) and CuBERT (345M).
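The masked-language-modeling objective can be sketched as follows. This is a simplified toy (whitespace tokenization, independent masking per token); the 15% masking rate follows BERT's convention:

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=1):
    """Build a masked-LM training example: the model sees `masked` and must
    predict the original token at each masked position, using context on
    both sides (bidirectional)."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok  # loss is computed only at masked positions
        else:
            masked.append(tok)
    return masked, targets

tokens = "def add ( a , b ) : return a + b".split()
masked, targets = mask_tokens(tokens)
print(masked)   # original sequence with [MASK] at sampled positions
print(targets)  # position -> original token (the prediction targets)
```

Unlike the left-to-right setting, the model may use tokens both before and after each `[MASK]` when predicting it, which is what makes the learned representations bidirectional.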
Encoder-decoder Models
 An encoder-decoder model first uses an encoder to encode an input
sequence, and then uses a left-to-right LM to decode an output sequence
conditioned on the input sequence.
 Popular pretraining objectives include masked span prediction, where the
input sequence is randomly masked with multiple masks and the output
sequence is the masked content in order, and denoising sequence
reconstruction, where the input is a corrupted sequence and the output
is the original sequence.
 These pretrained models are useful in many sequence-to-sequence tasks.
 Examples: CodeT5 (220M) and PLBART (406M).
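Masked span prediction can be illustrated with a small sketch. The `<extra_id_N>` sentinel naming follows T5's convention, and the spans here are chosen by hand rather than sampled:

```python
def span_corrupt(tokens, spans):
    """Masked span prediction (T5-style): each span is replaced by one
    sentinel token in the encoder input; the decoder target lists each
    sentinel followed by the span's original contents, in order."""
    inp, tgt = [], []
    cursor = 0
    for sid, (start, end) in enumerate(spans):  # spans: sorted, non-overlapping
        sentinel = f"<extra_id_{sid}>"
        inp.extend(tokens[cursor:start])
        inp.append(sentinel)
        tgt.append(sentinel)
        tgt.extend(tokens[start:end])
        cursor = end
    inp.extend(tokens[cursor:])
    return inp, tgt

tokens = "public static int add ( int a , int b )".split()
inp, tgt = span_corrupt(tokens, [(2, 4), (7, 8)])
print(inp)  # sentinels replace the masked spans in the encoder input
print(tgt)  # decoder target: each sentinel followed by its masked contents
```

The encoder reads `inp` bidirectionally, while the left-to-right decoder reconstructs only the masked material, which keeps the target sequence short.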
COMPARED MODELS
Existing Models
 Codex: Codex is a large language model (LLM) that has been specifically
fine-tuned on Python code publicly available on GitHub.
 This model builds on GPT-3, owing to GPT-3's substantial proficiency in generating
Python programs. Despite being considerably smaller than GPT-3, with a total of 12 billion
parameters, Codex still exhibits remarkable performance.
 GPT-Neo: GPT-Neo is a series of large language models trained on the Pile
dataset.
 These models, similar to GPT-3, are available in different sizes, including 125M, 1.3B,
and 2.7B parameter versions.
 The GPT-Neo 2.7B version, in particular, is a transformer model based on
EleutherAI's recreation of the GPT-3 architecture.
Existing Models
 GPT-J: GPT-J, developed by EleutherAI, is an open-source model with 6 billion
parameters, trained on the Pile dataset.
 It largely adheres to the GPT-2 architecture and stands out as the highest-performing
publicly available transformer language model in terms of zero-shot performance
on a range of downstream tasks.
 CodeParrot: CodeParrot is a model based on GPT-2, possessing 1.5 billion
parameters, which has been specifically fine-tuned on publicly accessible code from
GitHub for the purpose of generating Python code.
Introduced model- PolyCoder
 To overcome the challenges of
available LLMs for code
generation, a new model,
PolyCoder, is introduced. It
has 2.7 billion parameters
and is trained on a diverse
range of repositories sourced
from GitHub, encompassing
12 distinct programming
languages, as shown in the
table.
PolyCoder’s Training
 PolyCoder uses the GPT-2 model architecture.
 To investigate the effect of model size scaling, it was trained at three
different model sizes: 2.7 billion, 400 million, and 160 million parameters,
with the largest 2.7B model matching GPT-Neo's capacity to allow a fair
comparison.
 The 2.7 billion parameter model is a 32-layer, 2,560-dimensional Transformer
model with a maximum context window of 2,048 tokens, and it was trained
with a batch size of 128 sequences (262K tokens) for a total of 150K steps.
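As a rough sanity check on these hyperparameters, a back-of-envelope GPT-2-style parameter count lands near 2.7B. This estimate ignores biases and layer norms, and the vocabulary size assumes GPT-2's 50,257 tokens, which may differ from PolyCoder's actual tokenizer:

```python
def approx_gpt2_params(n_layer, d_model, vocab_size=50257, n_ctx=2048):
    """Rough GPT-2-style parameter count: per layer, attention weights
    (Q, K, V, output projections: 4*d^2) plus the MLP with a 4x hidden
    expansion (8*d^2), plus token and position embeddings."""
    per_layer = 4 * d_model**2 + 8 * d_model**2
    embeddings = (vocab_size + n_ctx) * d_model
    return n_layer * per_layer + embeddings

print(approx_gpt2_params(32, 2560) / 1e9)  # ≈ 2.65 billion
```

The 32-layer, 2,560-dimensional configuration thus accounts for essentially all of the advertised 2.7B parameters, with embeddings contributing only about 5%.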
PolyCoder’s Training
 The following table
compares design decisions
and hyperparameters used
in training different
models of code.
PolyCoder’s Training
 The following figure shows
the training and
validation loss during the
150K-step training
process.
Results
Results of Extrinsic evaluations:
 Among the current models, PolyCoder performs less effectively than the comparably
sized GPT-Neo and even the smaller Codex 300M. Overall, PolyCoder ranks
after Codex and GPT-Neo/J, but outperforms CodeParrot.
 Despite being trained exclusively on code, PolyCoder lags behind a model of similar
size, GPT-Neo 2.7B, which was trained on the Pile, a mix of both code and natural
language texts.
 This finding implies that future studies could profit from mixing code from diverse
programming languages along with natural language text.
Results of Extrinsic evaluations:
 The following table
shows the results of
different models on the
HumanEval benchmark,
and the number of
different types of tokens
seen during the training
process.
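HumanEval results are typically reported as pass@k, estimated with the unbiased formula of Chen et al. (2021): draw k samples from n generations of which c pass the unit tests. The sample counts below are hypothetical, for illustration only:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator (Chen et al., 2021): the probability that
    at least one of k samples, drawn without replacement from n generations
    containing c correct ones, passes the tests."""
    if n - c < k:
        return 1.0  # every size-k draw must include a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical numbers: 200 generations per problem, 30 of them correct.
print(pass_at_k(200, 30, 1))   # ≈ 0.15
print(pass_at_k(200, 30, 10))  # ≈ 0.81
```

Computing pass@k this way, rather than naively as `(c/n)**k`-style arithmetic on a few samples, keeps the estimate unbiased even when n is much larger than k.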
Results of Intrinsic Evaluations
 Interestingly, PolyCoder surpasses Codex and all other models on the C
language. Considering only open-source models, PolyCoder outperforms the
similarly sized GPT-Neo 2.7B in C, JavaScript, Rust, Scala, and TypeScript.
 In the remaining 11 languages apart from C, all other open-source models, including
the newly introduced PolyCoder, exhibit significantly lower performance (higher
perplexity) compared to Codex.
 This observation could imply that for languages where larger models don't yield extra
benefits, training the model solely on code might be sufficient or even slightly more
advantageous than training on a combination of natural language and code.
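Perplexity, the intrinsic metric used in these evaluations, is the exponential of the average per-token negative log-likelihood on held-out code. A quick sketch with invented per-token probabilities:

```python
import math

def perplexity(log_probs):
    """Perplexity = exp of the average negative log-likelihood per token.
    Lower is better: the model is less 'surprised' by the held-out code."""
    nll = -sum(log_probs) / len(log_probs)
    return math.exp(nll)

# Invented per-token log-probabilities from two hypothetical models
# scoring the same held-out file:
confident = [math.log(0.5)] * 8  # assigns p=0.5 to every token
uncertain = [math.log(0.1)] * 8  # assigns p=0.1 to every token
print(perplexity(confident))  # ≈ 2
print(perplexity(uncertain))  # ≈ 10
```

Intuitively, a perplexity of p means the model is as uncertain as if it were choosing uniformly among p tokens at each step, which is why a lower perplexity on C indicates PolyCoder models that language more accurately.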
Conclusions
 We've presented the results of a systematic evaluation of large language models for
code. The findings generally indicate that performance improves with bigger models
and extended training durations.
 Based on the results, we infer that GPT-Neo's superior performance over PolyCoder in
certain languages suggests that training on both natural language text and code can
enhance code modeling.
 However, it's noteworthy that for the C programming language, PolyCoder
outperforms all models, including Codex, by achieving a lower perplexity.
Thank
You
