SlideShare a Scribd company logo
Chain-Of-Thought Prompting
Presented at 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
What is chain-of-thought Prompting
● Chain of Thought(C-o-T): Method for
breaking down complex problems into
simpler, manageable steps.
● Crucial for enhancing performance in
arithmetic, common sense, and symbolic
reasoning.
● Example: PaLM 540B model, using 8 C-o-T
exemplars, demonstrates superior
performance over fine-tuned GPT-3.
● Sets a new state-of-the-art performance in the
GSM8K benchmark of math word problems.
Chain-Of-Thought Prompting
● Concept and Application: Enables language models to generate coherent
series of intermediate reasoning steps leading to the final answer.
● Advantages:
○ Problem Decomposition: Allows models to tackle multi-step problems effectively by
breaking them down.
○ Interpretability and Debugging: Provides an insight into how the model arrives at answers,
aiding in error analysis.
○ Task Versatility: Applicable to a range of tasks including math word problems,
commonsense reasoning, and symbolic manipulation.
○ Accessibility in Large Models: Can be elicited in large, off-the-shelf language models via
few-shot prompting with examples.
Chain-Of-Thought Prompting
● Natural Language Problem-Solving: Demonstrates how
C-o-T converts complex tasks into a series of logical
steps.
● Example of C-o-T in Action:
○ Problem: “Jane has 12 flowers. After giving 2 to her mom,
and 3 to her dad, how many does she have left?”
○ C-o-T: “Jane gives 2 flowers to her mom, she has 10... then
after she gives 3 to her dad, she will have 7... so the
answer is 7.”
● Benefits in Diverse Reasoning Tasks: Proven
effectiveness in arithmetic, commonsense, and
symbolic reasoning.
● Interpretability: Provides a window into model's thought
process, enhancing transparency.
Arithmetic Reasoning - Setup
● Though simple for humans, arithmetic reasoning
is a task where language models often struggle
● Benchmark Overviews: GSM8K, SVAMP, ASDiv,
AQuA, and MAWPS benchmarks.
● Prompting Methods Compared: Standard few-
shot prompting vs. chain-of-thought prompting.
● Methodology of Study: Using few-shot exemplars
with chains of thought for model prompting.
Arithmetic Reasoning - Results
● Performance Leap with PaLM 540B: PaLM 540B model
significantly outperforms standard models in arithmetic
reasoning tasks.
● Benchmark Comparisons: Clear superiority of C-o-T prompting
evident across various math word problem benchmarks
including GSM8K, SVAMP, ASDiv, AQuA, and MAWPS.
● State-of-the-Art Achievement: Chain-of-thought prompting sets
new performance records, especially notable in the GSM8K
benchmark.
● Model Size and Performance: Notably, the benefits of C-o-T
prompting become pronounced with larger models (around 100B
parameters).
Arithmetic Reasoning - Ablation Study
● Variation Testing: Analyzed the impact of different
prompting methods on arithmetic reasoning.
● Equation-Only vs. Full C-o-T: Just providing equations is
less effective, especially for complex problems.
● Computation Amount Not Key: More computation (using
dots) doesn't enhance performance, underscoring C-o-T's
unique contribution.
● Sequential Reasoning Crucial: Placing C-o-T after the answer
doesn't improve results, highlighting the need for sequential
reasoning steps.
Robustness of Chain of Thought
● Impact of Different Exemplars: Explored how changes in
exemplars affect C-o-T performance.
● Annotator Influence: Assessed how different annotators'
interpretations impact the model's reasoning ability.
● Consistency Across Scenarios: Demonstrated C-o-T's
effectiveness remains robust under various conditions.
Common Sense Reasoning
● C-o-T Enhances Commonsense Tasks:
Showcases significant improvement in
models' ability to reason in common sense
scenarios.
● Notable Performance Gains: Details how C-
o-T outperforms standard approaches in
nuanced, context-rich problems.
● Real-World Examples: Provides specific
instances where C-o-T improved outcomes
in commonsense reasoning.
Symbolic Reasoning
● Effective in Symbolic Tasks: Highlights C-o-T's
success in interpreting and solving symbolic
reasoning problems.
● Complex Input Handling: Emphasizes C-o-T’s
capability to process and analyze intricate symbolic
data.
● Case Study Examples: Shares specific examples
from the paper demonstrating C-o-T’s impact.
Discussions
● Study Insights: Summarizes major observations and learnings from the
research.
● AI Development Impact: Discusses how these findings can shape the
future of language model development.
● Exploring Potential and Limits: Reflects on the broad applications and any
identified limitations of C-o-T.
● Future Research Suggestions: Proposes directions for further
investigation based on study outcomes.
Conclusion
● Key Findings: Chain-of-thought (C-o-T) prompting significantly enhances
reasoning abilities in large language models.
● Effective with Large Models: Most effective with larger models (e.g., PaLM
540B), especially for complex, multi-step problems.
● Beyond Simple Equations: C-o-T outperforms simple equation-only prompting,
especially in semantically complex tasks.
● Facilitates Natural Language Reasoning: Enables models to reason through
problems step-by-step in natural language.
● Broader Implications: These insights open up new possibilities for AI in
complex problem-solving and reasoning tasks.
FAQs
FAQ 1: What is the role of prompt engineering?
Answer: Prompt sensitivity is crucial. Different prompts affect models in varied ways. C-o-T is robust to
different annotators and exemplar variations, but prompt engineering can still significantly enhance
performance.
FAQ 2: Will C-o-T prompting improve performance for my task?
Answer: C-o-T is broadly applicable, especially beneficial when tasks require multi-step reasoning, use large
models, and when the scaling curve is relatively flat​​.
FAQ 3: Why is prompting with the equation only not enough?
Answer: While useful for simpler tasks, equation-only prompting falls short in complex scenarios like GSM8K,
where the semantic challenge requires natural language reasoning steps​
Thank You.

More Related Content

What's hot

Chat GPT and Generative AI in Higher Education - Empowering Educators and Lea...
Chat GPT and Generative AI in Higher Education - Empowering Educators and Lea...Chat GPT and Generative AI in Higher Education - Empowering Educators and Lea...
Chat GPT and Generative AI in Higher Education - Empowering Educators and Lea...
Alain Goudey
 
‘Big models’: the success and pitfalls of Transformer models in natural langu...
‘Big models’: the success and pitfalls of Transformer models in natural langu...‘Big models’: the success and pitfalls of Transformer models in natural langu...
‘Big models’: the success and pitfalls of Transformer models in natural langu...
Leiden University
 
An Introduction to Generative AI - May 18, 2023
An Introduction  to Generative AI - May 18, 2023An Introduction  to Generative AI - May 18, 2023
An Introduction to Generative AI - May 18, 2023
CoriFaklaris1
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practices
DianaGray10
 
LLMs Bootcamp
LLMs BootcampLLMs Bootcamp
LLMs Bootcamp
Fiza987241
 
Understanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix GohUnderstanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix Goh
NUS-ISS
 
ChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptxChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptx
Jesus Rodriguez
 
OpenAI Chatgpt.pptx
OpenAI Chatgpt.pptxOpenAI Chatgpt.pptx
OpenAI Chatgpt.pptx
Nawroz University
 
Microsoft + OpenAI: Recent Updates (Machine Learning 15minutes! Broadcast #74)
Microsoft + OpenAI: Recent Updates (Machine Learning 15minutes! Broadcast #74)Microsoft + OpenAI: Recent Updates (Machine Learning 15minutes! Broadcast #74)
Microsoft + OpenAI: Recent Updates (Machine Learning 15minutes! Broadcast #74)
Naoki (Neo) SATO
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?
Itai Yaffe
 
A Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptxA Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptx
SaiPragnaKancheti
 
Increase Productivity with ChatGPT
Increase Productivity with ChatGPTIncrease Productivity with ChatGPT
Increase Productivity with ChatGPT
PrinceGarg95
 
Using Generative AI in the Classroom .pptx
Using Generative AI in the Classroom .pptxUsing Generative AI in the Classroom .pptx
Using Generative AI in the Classroom .pptx
JonathanDietz3
 
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Mihai Criveti
 
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
taozen
 
A comprehensive guide to prompt engineering.pdf
A comprehensive guide to prompt engineering.pdfA comprehensive guide to prompt engineering.pdf
A comprehensive guide to prompt engineering.pdf
StephenAmell4
 
𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈: 𝐂𝐡𝐚𝐧𝐠𝐢𝐧𝐠 𝐇𝐨𝐰 𝐁𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐈𝐧𝐧𝐨𝐯𝐚𝐭𝐞𝐬 𝐚𝐧𝐝 𝐎𝐩𝐞𝐫𝐚𝐭𝐞𝐬
𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈: 𝐂𝐡𝐚𝐧𝐠𝐢𝐧𝐠 𝐇𝐨𝐰 𝐁𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐈𝐧𝐧𝐨𝐯𝐚𝐭𝐞𝐬 𝐚𝐧𝐝 𝐎𝐩𝐞𝐫𝐚𝐭𝐞𝐬𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈: 𝐂𝐡𝐚𝐧𝐠𝐢𝐧𝐠 𝐇𝐨𝐰 𝐁𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐈𝐧𝐧𝐨𝐯𝐚𝐭𝐞𝐬 𝐚𝐧𝐝 𝐎𝐩𝐞𝐫𝐚𝐭𝐞𝐬
𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈: 𝐂𝐡𝐚𝐧𝐠𝐢𝐧𝐠 𝐇𝐨𝐰 𝐁𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐈𝐧𝐧𝐨𝐯𝐚𝐭𝐞𝐬 𝐚𝐧𝐝 𝐎𝐩𝐞𝐫𝐚𝐭𝐞𝐬
VINCI Digital - Industrial IoT (IIoT) Strategic Advisory
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit Dubey
Rohit Dubey
 
Large Language Models.pdf
Large Language Models.pdfLarge Language Models.pdf
Large Language Models.pdf
BLINXAI
 
Let's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchersLet's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchers
Steven Van Vaerenbergh
 

What's hot (20)

Chat GPT and Generative AI in Higher Education - Empowering Educators and Lea...
Chat GPT and Generative AI in Higher Education - Empowering Educators and Lea...Chat GPT and Generative AI in Higher Education - Empowering Educators and Lea...
Chat GPT and Generative AI in Higher Education - Empowering Educators and Lea...
 
‘Big models’: the success and pitfalls of Transformer models in natural langu...
‘Big models’: the success and pitfalls of Transformer models in natural langu...‘Big models’: the success and pitfalls of Transformer models in natural langu...
‘Big models’: the success and pitfalls of Transformer models in natural langu...
 
An Introduction to Generative AI - May 18, 2023
An Introduction  to Generative AI - May 18, 2023An Introduction  to Generative AI - May 18, 2023
An Introduction to Generative AI - May 18, 2023
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practices
 
LLMs Bootcamp
LLMs BootcampLLMs Bootcamp
LLMs Bootcamp
 
Understanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix GohUnderstanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix Goh
 
ChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptxChatGPT, Foundation Models and Web3.pptx
ChatGPT, Foundation Models and Web3.pptx
 
OpenAI Chatgpt.pptx
OpenAI Chatgpt.pptxOpenAI Chatgpt.pptx
OpenAI Chatgpt.pptx
 
Microsoft + OpenAI: Recent Updates (Machine Learning 15minutes! Broadcast #74)
Microsoft + OpenAI: Recent Updates (Machine Learning 15minutes! Broadcast #74)Microsoft + OpenAI: Recent Updates (Machine Learning 15minutes! Broadcast #74)
Microsoft + OpenAI: Recent Updates (Machine Learning 15minutes! Broadcast #74)
 
Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?Why do the majority of Data Science projects never make it to production?
Why do the majority of Data Science projects never make it to production?
 
A Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptxA Comprehensive Review of Large Language Models for.pptx
A Comprehensive Review of Large Language Models for.pptx
 
Increase Productivity with ChatGPT
Increase Productivity with ChatGPTIncrease Productivity with ChatGPT
Increase Productivity with ChatGPT
 
Using Generative AI in the Classroom .pptx
Using Generative AI in the Classroom .pptxUsing Generative AI in the Classroom .pptx
Using Generative AI in the Classroom .pptx
 
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
Retrieval Augmented Generation in Practice: Scalable GenAI platforms with k8s...
 
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
The Rise of the LLMs - How I Learned to Stop Worrying & Love the GPT!
 
A comprehensive guide to prompt engineering.pdf
A comprehensive guide to prompt engineering.pdfA comprehensive guide to prompt engineering.pdf
A comprehensive guide to prompt engineering.pdf
 
𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈: 𝐂𝐡𝐚𝐧𝐠𝐢𝐧𝐠 𝐇𝐨𝐰 𝐁𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐈𝐧𝐧𝐨𝐯𝐚𝐭𝐞𝐬 𝐚𝐧𝐝 𝐎𝐩𝐞𝐫𝐚𝐭𝐞𝐬
𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈: 𝐂𝐡𝐚𝐧𝐠𝐢𝐧𝐠 𝐇𝐨𝐰 𝐁𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐈𝐧𝐧𝐨𝐯𝐚𝐭𝐞𝐬 𝐚𝐧𝐝 𝐎𝐩𝐞𝐫𝐚𝐭𝐞𝐬𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈: 𝐂𝐡𝐚𝐧𝐠𝐢𝐧𝐠 𝐇𝐨𝐰 𝐁𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐈𝐧𝐧𝐨𝐯𝐚𝐭𝐞𝐬 𝐚𝐧𝐝 𝐎𝐩𝐞𝐫𝐚𝐭𝐞𝐬
𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐯𝐞 𝐀𝐈: 𝐂𝐡𝐚𝐧𝐠𝐢𝐧𝐠 𝐇𝐨𝐰 𝐁𝐮𝐬𝐢𝐧𝐞𝐬𝐬 𝐈𝐧𝐧𝐨𝐯𝐚𝐭𝐞𝐬 𝐚𝐧𝐝 𝐎𝐩𝐞𝐫𝐚𝐭𝐞𝐬
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit Dubey
 
Large Language Models.pdf
Large Language Models.pdfLarge Language Models.pdf
Large Language Models.pdf
 
Let's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchersLet's talk about GPT: A crash course in Generative AI for researchers
Let's talk about GPT: A crash course in Generative AI for researchers
 

Similar to Chain-Of-Thought Prompting.pptx

A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
Volodymyr Lyubinets. One startup's journey of building ML pipelines for text ...
Volodymyr Lyubinets. One startup's journey of building ML pipelines for text ...Volodymyr Lyubinets. One startup's journey of building ML pipelines for text ...
Volodymyr Lyubinets. One startup's journey of building ML pipelines for text ...
Lviv Startup Club
 
Scaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language ModelsScaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language Models
taeseon ryu
 
Fdp session rtu session 1
Fdp session rtu session 1Fdp session rtu session 1
Fdp session rtu session 1
sprsingh1
 
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
Infoshare
 
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Pramati Technologies
 
Paper presentation on LLM compression
Paper presentation on LLM compression Paper presentation on LLM compression
Paper presentation on LLM compression
SanjanaRajeshKothari
 
Om0010 operations management
Om0010 operations managementOm0010 operations management
Om0010 operations management
smumbahelp
 
Om0010 operations management
Om0010 operations managementOm0010 operations management
Om0010 operations management
Study Stuff
 
Scolari's ICCD17 Talk
Scolari's ICCD17 TalkScolari's ICCD17 Talk
Scolari's ICCD17 Talk
NECST Lab @ Politecnico di Milano
 
Goal Decomposition and Abductive Reasoning for Policy Analysis and Refinement
Goal Decomposition and Abductive Reasoning for Policy Analysis and RefinementGoal Decomposition and Abductive Reasoning for Policy Analysis and Refinement
Goal Decomposition and Abductive Reasoning for Policy Analysis and Refinement
Emil Lupu
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
Daniel Marcous
 
An introduction to CP Optimizer
An introduction to CP OptimizerAn introduction to CP Optimizer
An introduction to CP Optimizer
Philippe Laborie
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
Ivo Andreev
 
Simulacion luis garciaguzman-21012011
Simulacion luis garciaguzman-21012011Simulacion luis garciaguzman-21012011
Simulacion luis garciaguzman-21012011
lideresacademicos
 
Om0010 operations management
Om0010 operations managementOm0010 operations management
Om0010 operations management
smumbahelp
 
CH-1.1 Introduction (1).pptx
CH-1.1 Introduction (1).pptxCH-1.1 Introduction (1).pptx
CH-1.1 Introduction (1).pptx
satvikkushwaha1
 
[Cryptica 22] Deep Learning on Tabular Data, Predicting Profitability - Peiyu...
[Cryptica 22] Deep Learning on Tabular Data, Predicting Profitability - Peiyu...[Cryptica 22] Deep Learning on Tabular Data, Predicting Profitability - Peiyu...
[Cryptica 22] Deep Learning on Tabular Data, Predicting Profitability - Peiyu...
DataScienceConferenc1
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09
Greg Makowski
 

Similar to Chain-Of-Thought Prompting.pptx (20)

A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
Exploiting the Enumeration of All Feature Model Configurations: A New Perspec...
 
Volodymyr Lyubinets. One startup's journey of building ML pipelines for text ...
Volodymyr Lyubinets. One startup's journey of building ML pipelines for text ...Volodymyr Lyubinets. One startup's journey of building ML pipelines for text ...
Volodymyr Lyubinets. One startup's journey of building ML pipelines for text ...
 
Scaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language ModelsScaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language Models
 
Fdp session rtu session 1
Fdp session rtu session 1Fdp session rtu session 1
Fdp session rtu session 1
 
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
 
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
 
Paper presentation on LLM compression
Paper presentation on LLM compression Paper presentation on LLM compression
Paper presentation on LLM compression
 
Om0010 operations management
Om0010 operations managementOm0010 operations management
Om0010 operations management
 
Om0010 operations management
Om0010 operations managementOm0010 operations management
Om0010 operations management
 
Scolari's ICCD17 Talk
Scolari's ICCD17 TalkScolari's ICCD17 Talk
Scolari's ICCD17 Talk
 
Goal Decomposition and Abductive Reasoning for Policy Analysis and Refinement
Goal Decomposition and Abductive Reasoning for Policy Analysis and RefinementGoal Decomposition and Abductive Reasoning for Policy Analysis and Refinement
Goal Decomposition and Abductive Reasoning for Policy Analysis and Refinement
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
 
An introduction to CP Optimizer
An introduction to CP OptimizerAn introduction to CP Optimizer
An introduction to CP Optimizer
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
 
Simulacion luis garciaguzman-21012011
Simulacion luis garciaguzman-21012011Simulacion luis garciaguzman-21012011
Simulacion luis garciaguzman-21012011
 
Om0010 operations management
Om0010 operations managementOm0010 operations management
Om0010 operations management
 
CH-1.1 Introduction (1).pptx
CH-1.1 Introduction (1).pptxCH-1.1 Introduction (1).pptx
CH-1.1 Introduction (1).pptx
 
[Cryptica 22] Deep Learning on Tabular Data, Predicting Profitability - Peiyu...
[Cryptica 22] Deep Learning on Tabular Data, Predicting Profitability - Peiyu...[Cryptica 22] Deep Learning on Tabular Data, Predicting Profitability - Peiyu...
[Cryptica 22] Deep Learning on Tabular Data, Predicting Profitability - Peiyu...
 
Production model lifecycle management 2016 09
Production model lifecycle management 2016 09Production model lifecycle management 2016 09
Production model lifecycle management 2016 09
 

Recently uploaded

AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
HarisZaheer8
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!
GDSC PJATK
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Jeffrey Haguewood
 
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdfNunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
flufftailshop
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 

Recently uploaded (20)

AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!Finale of the Year: Apply for Next One!
Finale of the Year: Apply for Next One!
 
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...
 
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdfNunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
Nunit vs XUnit vs MSTest Differences Between These Unit Testing Frameworks.pdf
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 

Chain-Of-Thought Prompting.pptx

  • 1. Chain-Of-Thought Prompting Presented at 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
  • 2. What is chain-of-thought Prompting ● Chain of Thought(C-o-T): Method for breaking down complex problems into simpler, manageable steps. ● Crucial for enhancing performance in arithmetic, common sense, and symbolic reasoning. ● Example: PaLM 540B model, using 8 C-o-T exemplars, demonstrates superior performance over fine-tuned GPT-3. ● Sets a new state-of-the-art performance in the GSM8K benchmark of math word problems.
  • 3. Chain-Of-Thought Prompting ● Concept and Application: Enables language models to generate coherent series of intermediate reasoning steps leading to the final answer. ● Advantages: ○ Problem Decomposition: Allows models to tackle multi-step problems effectively by breaking them down. ○ Interpretability and Debugging: Provides an insight into how the model arrives at answers, aiding in error analysis. ○ Task Versatility: Applicable to a range of tasks including math word problems, commonsense reasoning, and symbolic manipulation. ○ Accessibility in Large Models: Can be elicited in large, off-the-shelf language models via few-shot prompting with examples.
  • 4. Chain-Of-Thought Prompting ● Natural Language Problem-Solving: Demonstrates how C-o-T converts complex tasks into a series of logical steps. ● Example of C-o-T in Action: ○ Problem: “Jane has 12 flowers. After giving 2 to her mom, and 3 to her dad, how many does she have left?” ○ C-o-T: “Jane gives 2 flowers to her mom, she has 10... then after she gives 3 to her dad, she will have 7... so the answer is 7.” ● Benefits in Diverse Reasoning Tasks: Proven effectiveness in arithmetic, commonsense, and symbolic reasoning. ● Interpretability: Provides a window into model's thought process, enhancing transparency.
  • 5. Arithmetic Reasoning - Setup ● Though simple for humans, arithmetic reasoning is a task where language models often struggle ● Benchmark Overviews: GSM8K, SVAMP, ASDiv, AQuA, and MAWPS benchmarks. ● Prompting Methods Compared: Standard few- shot prompting vs. chain-of-thought prompting. ● Methodology of Study: Using few-shot exemplars with chains of thought for model prompting.
  • 6. Arithmetic Reasoning - Results ● Performance Leap with PaLM 540B: PaLM 540B model significantly outperforms standard models in arithmetic reasoning tasks. ● Benchmark Comparisons: Clear superiority of C-o-T prompting evident across various math word problem benchmarks including GSM8K, SVAMP, ASDiv, AQuA, and MAWPS. ● State-of-the-Art Achievement: Chain-of-thought prompting sets new performance records, especially notable in the GSM8K benchmark. ● Model Size and Performance: Notably, the benefits of C-o-T prompting become pronounced with larger models (around 100B parameters).
  • 7. Arithmetic Reasoning - Ablation Study ● Variation Testing: Analyzed the impact of different prompting methods on arithmetic reasoning. ● Equation-Only vs. Full C-o-T: Just providing equations is less effective, especially for complex problems. ● Computation Amount Not Key: More computation (using dots) doesn't enhance performance, underscoring C-o-T's unique contribution. ● Sequential Reasoning Crucial: Placing C-o-T after the answer doesn't improve results, highlighting the need for sequential reasoning steps.
  • 8. Robustness of Chain of Thought ● Impact of Different Exemplars: Explored how changes in exemplars affect C-o-T performance. ● Annotator Influence: Assessed how different annotators' interpretations impact the model's reasoning ability. ● Consistency Across Scenarios: Demonstrated C-o-T's effectiveness remains robust under various conditions.
  • 9. Common Sense Reasoning ● C-o-T Enhances Commonsense Tasks: Showcases significant improvement in models' ability to reason in common sense scenarios. ● Notable Performance Gains: Details how C- o-T outperforms standard approaches in nuanced, context-rich problems. ● Real-World Examples: Provides specific instances where C-o-T improved outcomes in commonsense reasoning.
  • 10. Symbolic Reasoning ● Effective in Symbolic Tasks: Highlights C-o-T's success in interpreting and solving symbolic reasoning problems. ● Complex Input Handling: Emphasizes C-o-T’s capability to process and analyze intricate symbolic data. ● Case Study Examples: Shares specific examples from the paper demonstrating C-o-T’s impact.
  • 11. Discussions ● Study Insights: Summarizes major observations and learnings from the research. ● AI Development Impact: Discusses how these findings can shape the future of language model development. ● Exploring Potential and Limits: Reflects on the broad applications and any identified limitations of C-o-T. ● Future Research Suggestions: Proposes directions for further investigation based on study outcomes.
  • 12. Conclusion ● Key Findings: Chain-of-thought (C-o-T) prompting significantly enhances reasoning abilities in large language models. ● Effective with Large Models: Most effective with larger models (e.g., PaLM 540B), especially for complex, multi-step problems. ● Beyond Simple Equations: C-o-T outperforms simple equation-only prompting, especially in semantically complex tasks. ● Facilitates Natural Language Reasoning: Enables models to reason through problems step-by-step in natural language. ● Broader Implications: These insights open up new possibilities for AI in complex problem-solving and reasoning tasks.
  • 13. FAQs FAQ 1: What is the role of prompt engineering? Answer: Prompt sensitivity is crucial. Different prompts affect models in varied ways. C-o-T is robust to different annotators and exemplar variations, but prompt engineering can still significantly enhance performance. FAQ 2: Will C-o-T prompting improve performance for my task? Answer: C-o-T is broadly applicable, especially beneficial when tasks require multi-step reasoning, use large models, and when the scaling curve is relatively flat​​. FAQ 3: Why is prompting with the equation only not enough? Answer: While useful for simpler tasks, equation-only prompting falls short in complex scenarios like GSM8K, where the semantic challenge requires natural language reasoning steps​