A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level.pdf

A Neural Network Solves, Explains, and
Generates University Math Problems by Program
Synthesis and Few-Shot Learning at Human Level
Iddo Drori1,a,b
, Sarah Zhanga
, Reece Shuttlewortha
, Leonard Tangc
, Albert Lua
, Elizabeth Kea
, Kevin Liua
, Linda Chena
,
Sunny Trana
, Newman Chengb
, Roman Wangb
, Nikhil Singha
, Taylor L. Pattic
, Jayson Lynchd
, Avi Shporera
, Nakul Vermab
,
Eugene Wub
, and Gilbert Stranga
a
Massachusetts Institute of Technology; b
Columbia University; c
Harvard University; d
University of Waterloo
This manuscript was compiled on April 7, 2022
We demonstrate that a neural network pre-trained on text and fine-
tuned on code solves mathematics course problems, explains solu-
tions, and generates new questions at human level. We automatically
synthesize programs using few-shot learning and OpenAI’s Codex
transformer and execute them to solve course problems at 80% auto-
matic accuracy. We curate a new dataset of questions from MIT’s
large mathematics courses (Single Variable and Multivariable Cal-
culus, Differential Equations, Introduction to Probability and Statis-
tics, Linear Algebra, and Mathematics for Computer Science) and
Columbia University’s Computational Linear Algebra course. We
solve questions from a MATH dataset (on Prealgebra, Algebra, Count-
ing and Probability, Intermediate Algebra, Number Theory, and Pre-
calculus), the latest benchmark of advanced mathematics problems
designed to assess mathematical reasoning. We randomly sample
questions from each course and benchmark topic and generate so-
lutions with multiple modalities, including numbers, equations, and
plots. The latest GPT-3 language model pre-trained on text auto-
matically solves only 18% of these university questions. In con-
trast, using program synthesis and few-shot learning with Codex that
is fine tuned on code generates programs that automatically solve
80% of these questions. Our approach improves upon the previous
state-of-the-art automatic solution accuracy on the benchmark top-
ics from 8.8% to 81.1%. We perform a survey to evaluate the quality
and difficulty of generated questions. This work is the first to au-
tomatically solve university-level mathematics course questions at
a human level. This is also the first work to explain, and generate
university-level mathematics course questions at scale, a milestone
for higher education.
Neural networks | Mathematics courses | Answering, explaining, and
generating questions |
Introduction
Until this work, it was widely believed that neural net-
works could not solve advanced mathematics problems
(1). However, the previous unsuccessful studies used only
text-based pre-training. We now demonstrate that a neural
network, OpenAI Codex, that is pre-trained on text and fine-
tuned on code automatically answers 80% of university-level
mathematics problems by program synthesis using few-shot
learning.
Figure 1 illustrates several example problems: computing
the volume generated by rotating the graph of a single variable
function around an axis, computing the Lorenz attractor and
its projection, and computing and demonstrating the geometry
of a singular value decomposition (SVD). For the first time, we
demonstrate that not only can a single machine learning model
solve these example problems, it can solve a wide variety of
mathematics courses at scale.
Related Work. Transformers are deep learning architectures
based only on attention mechanisms (2) that do not use re-
current neural networks or convolutional neural networks.
Transformer-based language models have enjoyed tremendous
success across a variety of natural language processing (NLP)
tasks, including zero-shot and few-shot language tasks (3).
However, these models have largely failed to solve math prob-
lems (4–6). In particular, previous work using transformers,
such as GPT-3 (3), have failed to solve mathematics courses
problems because the transformers were pre-trained on text
alone.
Pre-training a transformer is computationally expensive
and most often involves vast amounts of unlabeled data. The
most common optimization objectives for pre-training language
models are (i) masked word prediction: predicting a random
deleted word in a sentence or predicting the next word, or
(ii) classifying whether two sentences follow each other or not.
This computationally expensive step is usually done once and
followed by a relatively fast fine-tuning step. In fine-tuning,
Significance Statement
We provide the first demonstration that a neural network auto-
matically solves, explains, and generates university-level prob-
lems from the largest MIT mathematics courses at human level.
Our methods combine three innovations: (i) using recent neural
networks pre-trained on text and fine-tuned on code, rather
than pre-trained on text, (ii) few shot-learning to automatically
synthesize programs that correctly solve course problems, and
(iii) a pipeline to solve questions, explain solutions, and gener-
ate new questions indistinguishable by students from course
questions. Our work improves upon state-of-the-art increas-
ing automatic accuracy on randomly sampled questions on a
benchmark by an order of magnitude. Implications for higher
education include new roles of AI in automated course evalua-
tion and content generation.
1. Wrote the paper. 2. Created figures. 3. Provided ideas. 4. Curated datasets. 5. Gen-
erated questions. 6. Generated solutions. 7. Run GPT-3 and Codex. 8. Executed programs.
9. Evaluated solutions. 10. Analyzed data. 11. Wrote code. 12. Designed survey. I.D.
1,2,3,4,5,6,7,8,9,10,11,12; S.T. 1,4,5,6,7,8,9; R.W. 4,5,6,7,8,9; K.L. 1,4,5,6,8,9; N.C. 4,5,6,7,8,9;
L.T. 1,4,5,6,7,8,9; E.K. 4,7,8,9; N.S. 1,2,3,10,11,12; T.L.P. 1,2,3,7,8,9; J.L. 1,3,4,5,6,7,8,9; A.S.
1,10; N.V. 1,3,10,12; E.W. 1,2,3,12; G.S. 1,3
There are no competing interests.
1
Iddo Drori. E-mail: idrori@mit.edu
www.pnas.org/cgi/doi/10.1073/pnas.XXXXXXXXXX PNAS | April 7, 2022 | vol. XXX | no. XX | 1–180
arXiv:2112.15594v3
[cs.LG]
6
Apr
2022

Input (Problems) Output (Answers, Visualizations, Explanations)
MIT Courses
Columbia U Course
Zero-Shot Learning
70% automatic solve rate
Calculus
Diﬀerential Eq.
Intro to Prob.
Linear Algebra
Math for CS
Comp. Lin. Alg.
Prealgebra
Algebra
Number Theory
Counting and Prob.
Inter. Algebra
Precalculus
MATH Topics
Few-Shot Learning
80% automatic solve rate
Program Synthesis
As-is / write a program / use sympy / use simulations
Codex
Calculus
Differential Equations Linear Algebra
Introduction to Probability and Statistics
1. Generate a random number x from a uniform distribution on the interval [0, θ]
2. Test the hypothesis that θ = 2 by rejecting H0 if x ≤ 0.1 or x ≥ 1.9
3. Simulate the probability of a type I error Explanation
Fig. 1. We apply a neural network, OpenAI Codex, to solve, explain, and generate mathematics problems. Our input math problems are randomly sampled from MIT and
Columbia University courses and from the MATH dataset (left). We use zero-shot and few-shot learning to automatically generate programs, solving 80% of the questions. We
then use Codex to explain the synthesized programs. The modality of the program outputs take diverse forms depending on the course problems, such as numerical responses
or plots generated from text by program synthesis (right). For example, in Calculus 18.01-2: the volume generated by rotating the ﬁnite 2-dimensional region bounded by two
2-dimensional graphs about the plotted axis (top right); in Differential Equations 18.03: the Lorenz strange attractor (bottom left); In Linear Algebra 18.06: the geometry of the
singular value decomposition (SVD) (bottom right).
the pre-trained model is tuned using a specific dataset or task.
This work demonstrates that OpenAI’s Codex (7), a trans-
former that has been pre-trained on text and then fine-tuned
on code, generates programs (i.e., conducts program synthesis)
that solve math course problems at scale, and that few-shot
learning automatically solves 80% of the math course problems.
Previous work has seen modest success on simpler or spe-
cialized mathematics problem benchmarks. Techniques based
on co-training output to verify (8, 9) or predict expression trees
(10–15) are able to solve elementary school-level math prob-
lems, such as MAWPS and Math23k, with over 80% accuracy.
However, these approaches do not extend to high-school, math
Olympiad, nor university-level courses. Co-training paired
with graph neural networks (GNNs) to predict arithmetic
expression trees is able to solve university-level problems in
Machine Learning (16) with up to 95% accuracy. However,
that work is limited to numeric answers, and is moreover over-
fit to a specific course, such that it does not generalize to other
courses.
Major Contributions. Our main contribution, as shown in Fig-
ure 2, is demonstrating that a single neural network model,
OpenAI Codex, automatically solves 80% of randomly selected
university-level mathematics problems (six MIT mathematics
courses and one Columbia University course) by program syn-
thesis and few-shot learning. We also automatically explain
the solutions and generate new questions, a process requiring
only seconds per problem. The courses are listed in Table 1.
We randomly sample 25 questions per course and the prob-
lems are solved as-is or with minor contextual information
that is automatically applied. If applicable, the neural net-
work outputs an executable program that answers the problem
and visualizes the result. Furthermore, our method explains
the solutions and generates new problems that are nearly
indistinguishable from human-written problems.
This methodology increases the solution accuracy on the
MATH benchmark (5) from 8.8% accuracy using previously
state-of-the-art methods to 81.1% accuracy using automatic
few-shot learning. The MATH benchmark measures the math-
ematical problem-solving ability of neural network models
with challenging problems sourced from high school math
competitions, such as the AMC 10∗
, AMC 12, and AIME†
.
The methods we propose are simple and broadly applica-
ble. The first is using a transformer model pre-trained on
text and fine-tuned on code, so that it is adept at synthesiz-
ing programmatic solutions. The second is to use zero-shot
learning of the questions as-is or with automatically added
contextual information about the problem or program. The
third is to embed the questions and use few-shot learning
based on question-code pairs of nearest embedded zero-shot
questions. In our work, we use the latest version of Open AI
Codex (7), namely code-davinci-002.
Methods
Dataset. We randomly sample 25 questions from each of seven
courses: MIT’s 18.01 Single Variable Calculus, 18.02 Multi-
variable Calculus, 18.03 Differential Equations, 18.05 Intro-
duction to Probability and Statistics, 18.06 Linear Algebra,
6.042 Mathematics for Computer Science, and Columbia Uni-
versity’s COMS3251 Computational Linear Algebra. For the
MATH dataset, we randomly sample 15 questions from each of
six topics in the dataset. We validate that our results are not
merely overfitting training data by solving a new Computa-
tional Linear Algebra course COMS3251 which is not available
∗
American Mathematics Competitions
†
American Invitational Mathematics Examination
2 | www.pnas.org/cgi/doi/10.1073/pnas.XXXXXXXXXX Drori et al.

Randomly Sampled Questions
(100% Solved)
Zero-Shot Successful
(70% Automatic Solve Rate)
Few-Shot Successful
(+10% Automatic Solve Rate)
Manual Modiﬁcation (20%)
Few-Shot Unsuccessful
Zero-Shot
Unsuccessful
Zero-Shot Learning:
Acceptable input is:
• no change to original question
• add “write a program”
• add “using sympy”
• add “using simulations”
Few-Shot Learning:
If zero-shot does not work, per-
form few-shot learning using
(question, code) pairs:
1. Embed all questions
2. Compute pairwise similarity
between embeddings
3. Rank the questions that were
zero-shot to select examples for
few-shot
Fig. 2. We select a random sample of questions from each course or topic that do not contain input images or require proofs. A language model pre-trained on text (GPT-3
text-davinci-002) automatically solves only 18% (for courses) and 25.5% (for the MATH benchmark topics) of these questions. In contrast, using zero-shot learning with
a network pre-trained on text and ﬁne tuned on code (OpenAI Codex code-davinci-002) we synthesize programs that automatically solve 70% (for courses) and 72.2%
(for the MATH benchmark topics) of the questions, and using few-shot learning automatically solve 80% (for courses) and 81.1% (for the MATH benchmark topics) of the
questions. We use the nearest embedded zero-shot questions and their synthesized code for few-shot learning. The remaining 20% of the course questions and 18.9% of
MATH benchmark topic questions are manually prompted to solve the question.
online, and is therefore unseen during Codex training. We
automatically obtain correct answers for 80% of the randomly
sampled university math course questions and 81.1% of the
randomly sampled MATH benchmark questions. The previous
state-of-the-art on this benchmark before this work was 8.8%
(4).
Workflow. Our method takes a course problem as input and
synthesizes a program that, when run, outputs the solution.
Figure 4 compares the percent of automatically solved ques-
tions for each course using our zero-shot learning and few-shot
learning approaches with the latest GPT-3 (text-davinci-002)
and Codex (code-davinci-002) versions. The error bars on the
totals are standard errors.
Figures 3 shows examples of automatic workflows for solv-
ing course questions and generating explanations using Codex.
The panels show the original question, the automatic aug-
mentation with context, the resulting synthesized program,
the executed output answer that is the solution, and the ex-
planation of the solution program. Questions are given to
Codex either as-is, or by automatically adding minor context,
as described below.
Automatic Context.
Programming Language Context. Best results are obtained
when the Codex prompt specifies that a program should be
written and specifies which programming language should be
used. We add the text “write a program” before the question
and focus on the Python programming language by placing
the text within Pythonic triple quotes.
Library Context. Likewise, best results are obtained when the
Codex prompt specifies which programming package should
be used. For instance, we may add the Python library SymPy
as context (see Figure 3 top panel 18.01), specifying that
the program synthesized to solve the problem should use this
package.
Figure 5 shows the Python programming packages used by
each course. Each colored stacked bar represents the number
of questions in the course using that package. NumPy and
Sympy are used by all courses. Matplotlib is used by courses
with questions that require plotting. Math, Random, and
SciPy are used by around half of the courses. The usage
patterns of these courses is incorporated automatically in our
approach; as we only specify SymPy or plot related imports,
these other package imports are automatically synthesized.
Multi-modal Output. The output answer may be of numerous
modalities. In the examples featured in Figure 3, the output
answers are: an equation (18.01), a Boolean value (18.02), a
plot (18.03), a numerical value (18.05), and a vector (18.03
and 18.06).
Simulation. Figure 3 (18.05) shows an example from Prob-
ability and Statistics where the question is turned into a
probabilistic programming task that generates simulations in
order to compute an empirical statistic.
Automatic Zero-Shot and Few-Shot Learning. Zero-shot learn-
ing synthesizes a program from the original question or from
the automatically augmented question without using any exam-
ples. This method automatically solves 70% of the questions.
Few-shot learning begins by embedding all the questions
that are solved by zero-shot learning. Then, separately for
each course, we select the nearest zero-shot questions based
on the cosine similarity with the target question. We use the
nearest zero-shot (question, code) pairs as examples for few-
shot learning. This increases the number of questions that are
automatically solved from 70% by zero-shot learning to 80%
by few-shot learning. Figure 3 (18.02) demonstrates few-shot
Drori et al. PNAS | April 7, 2022 | vol. XXX | no. XX | 3

ID Course Question Solution
1 18.01
Single Variable Calculus
A bacteria population is 4000 at time t = 0 and its rate of growth is 1000 ∗ 2t
bacteria per hour after t hours. What is the population after one hour?
4000 + 1000
log(2)
2 18.02
Multi-variable Calculus
Describe the graph of the function f:
f(x, y) = 10 −
p
x2 + y2
3 18.03
Differential Equations
Find general solutions of the differential equations. If an initial condition is given,
find the corresponding particular solution. Throughout, primes denote derivatives
with respect to x. y′
+ y = 2, y(0) = 0
y(x) = 2(1 − e−x
)
4 18.05
Introduction to Probability
and Statistics
Calculate the probability of getting a three-of-a-kind poker hand. 0.021128
5 18.06
Linear Algebra
Find a combination x1w1 + x2w2 + x3w3 that gives the zero vector with x1 = 1.
w1 is the vector (1;2;3). w2 is the vector (4; 5; 6). w3 is the vector (7; 8; 9).
x1 = 1, x2 = −2, x3 = 1
6 6.042
Mathematics for
Computer Science
Find a number x ∈ {0, 1, ..., 112} such that 11x ≡ 1 (mod 113). 72
7 COMS3251
Computational
Linear Algebra
Given a d-dimensional non-zero vector v, compute the rank of the matrix vv′
1
8 MATH
Prealgebra
What is the greatest common factor of 84, 112 and 210? 14
9 MATH
Algebra
Let N, O be functions such that N(x) = 2
√
x, and O(x) = x2
. What is
N(O(N(O(N(O(3))))))?
24
10 MATH
Number Theory
How many four-digit numbers whose digits add up to 9 are divisible by 11? 0
11 MATH
Counting and Probability
A standard six-sided fair die is rolled four times. The probability that the product
of all four numbers rolled is a perfect square is m
n
, where m and n are relatively
prime positive integers. Find m + n.
187
12 MATH
Intermediate Algebra
Given that x2
+ y2
= 14x + 6y + 6, find the largest possible value of 3x + 4y. 73
13 MATH
Precalculus
If the six solutions of x6
= −64 are written in the form a + bi, where a and b are
real, find the product of those solutions with a > 0.
4
Table 1. Example questions and solutions from six MIT courses (18.01, 18.02, 18.03, 18.05, 18.06, 6.042), one Columbia University course
(COMS3251), and six topics of the MATH dataset (Pre-Algebra, Algebra, Intermediate Algebra, Counting and Probability, Precalculus, Number
Theory). The solutions can contain numerical answers, equations, and plots, among other modalities.
learning. Given the original question, the nearest embedded
question is found among the question that have been zero-shot.
This (question, code) pair along with the automatic input are
given to Codex which results in generated code that correctly
solves the problem.
Manual Prompts.
Question Tidying. While 80% of the question are automati-
cally solved by zero-shot and few-shot learning; 20% of the
questions may require manual prompts. These questions may
be vague or contain superfluous information (e.g., reference
movie characters or current events), and require tidying to
extract the essence of the question. Question tidying primarily
involves removing superfluous information, breaking down long
sentence structures into smaller components, and converting
prompts into a programming format.
Interaction for Visualization. Another form of manual prompt-
ing occurs when an answer plot requires multiple steps to
generate a visually pleasing and clear plot. These special
cases, which are among the remaining 20% of the questions re-
quire interactively prompting Codex until reaching the desired
visualizations.
Automatic Explanation. Explanations are generated automati-
cally using a prompt consisting of three quotes followed by the
text “Here’s what the above code is doing: 1.”. This prompt is
given after the question and the generated code, and results
in a step-by-step explanation of the solution code.
Automatic Questions Generation and their Human Evaluation.
We use Codex to generate new questions for each course. This
is done by creating a numbered list of questions from the
dataset. This list is cut off after a random number of questions,
and the result is used to prompt Codex to generate the next
question. This process is repeated to generate many new
questions for each course. Questions may be generated based
on course topic and given student performance may also be
generated based on level of difficulty.
To evaluate the generated questions, we survey MIT stu-
dents who have taken these courses or their equivalents to
compare the quality and difficulty of machine-generated ques-

Ans
2*x/(2*x-3) -
2*(x**2-1)/(2*x-3)**2
Find the derivative
of the function
using the definition
of a derivative.
f(x)= (x**2-1)/(2*x-3)
Question
18.01 Single Variable Calculus Program
Using sympy find the
derivative of the func-
tion using the definition
of the derivative.
f(x)= (x**2-1)/(2*x-3)
Input
Codex Codex
Explanation
1. We import sympy as sp
2. We create a symbol x
3. We create a function f
4. We print the derivative
of f with respect to x
Ans
R =1/2[ 2,−√2;√2,√2]
What 2 by 2 matrix R rotates
every vector through 45 degrees?
Example: the vector [1,0] goes to
[sqrt(2)/2, sqrt(2)/2].
Question
18.06 Linear Algebra Program
Codex Codex
Explanation
1. We define a function
called rotation_matrix
which takes in a parame-
ter theta.
2. We return a 2D numpy
array which is the rota-
tion matrix.
3. We call the function
with theta = pi/4.
4. We print the result.
Ans
0.1
One generates a number x
from a uniform distribution on
the interval [0, θ]. One de-
cides to test H0 : θ = 2
against HA : θ = 2 by reject-
ing H0 if x ≤ 0.1 or x ≥ 1.9.
Compute the probability of a
type I error.
Question
18.05 Introduction to Probability and Statistics Program
One generates a
number x from a uni-
form distribution on the
interval [0, θ]. One de-
cides to test H0 : θ = 2
against HA : θ = 2 by
rejecting H0 if x ≤ 0.1 or
x ≥ 1.9. Using simula-
tions, compute the
probability of a type I
error.
Input
Codex Codex
Explanation
1. Generate a random
number x from a uniform
distribution on the in-
terval [0, θ]
2. Test the hypothesis
that θ = 2 by rejecting
H0 if x ≤ 0.1 or x ≥ 1.9
3. Simulate the probabil-
ity of a type I error
Lorem ipsum
Ans
(0,0)
Find the critical
point or points of
the given autono-
mous system.
dx/dt=2*x-y,
dy/dt=x-3y
Question
18.03 Differential Equations Program
Use Sympy to find the criti-
cal point or points of the
given autonomous
system. dx/dt=2*x-y,
dy/dt=x-3y
Plot using streamplot
Input
Codex Codex
Explanation
1. We create a grid of x
and y values.
2. We create a function
for dx/dt and dy/dt.
3. We plot dx/dt and
dy/dt.
4. We find the critical
points.
5. We plot the critical
points.
Determine
whether the alter-
nating series con-
verge or diverge.
Question
18.02 Multivariate Calculus Program
Use comparison tests to determine
whether the infinite series converge
or diverge.
from sympy import Sum,
Symbol, oo, limit,
init_printing
init_printing()
n = Symbol('n')
s = Sum(1/(n**2 + n + 1),
(n, 1, oo))
limit(s, n, oo)
"""
The series converges.
"""
Few Shot (Question, Code)
Codex
Explanation
Codex
1. The first line imports the Sum
function from the sympy module.
2. The second line imports the
Symbol function from the sympy
module.
3. The third line imports the oo
function from the sympy module.
4. The fourth line imports the
limit function from the sympy
module.
5. The fifth line imports the
init_printing function from the
sympy module.
6. The sixth line creates a vari-
able called n and assigns it the
value of the Symbol function.
7. The seventh line creates a vari-
able called s and assigns it the
value of the Sum function.
8. The eighth line prints the limit
of the variable s as n approaches
infinity.
Ans
Converges
Write a program using
sympy to determine
whether the alternat-
ing series
converges or diverges.
Input
Fig. 3. Example pipelines automatically solving questions from MIT mathematics courses and explaining the solutions. 18.01 Single Variable Calculus Zero-Shot example:
Given a question and the automatically generated prefix "using SymPy", Codex is prompted and outputs a program. Running the program results in equations which are the
correct answer. The program is then fed to Codex again with an automatic prompt which results in a generated explanation of the code. 18.02 Multi-variable Calculus Few-Shot
example: Given a question, the prefix "write a program using SymPy" is automatically generated. The question is embedded along with the other questions in the course that
are zero-shot. The nearest zero-shot question and its corresponding code are used as a few-shot example. Together the few-shot example pair and the input question are fed
into Codex, which generates a program that solves the question. The question, program, and a prompt for explanation are fed into Codex to generate the explanation. 18.03
Differential Equations Zero-Shot example: In this example the answer is both a vector and a plot. 18.05 Introduction to Probability and Statistics Zero-Shot example: Given the
question, a probabilistic program is generated by using simulations. 18.06 Linear Algebra Zero-Shot example: The output answer is the correct vector.
tions with human-written questions for each of the courses ‡
‡
The IRB that approved the survey is MIT IRB Exempt Id E-3792. The survey was optional, included
informed consent, with the following description: “We are conducting a survey to assess the quality
and difficulty of automatically generated questions for STEM courses. You will be presented with
a series of blocks consisting of questions, either human-written (taken from an actual course) or

Codex Few−Shot Codex Zero−Shot GPT−3
0.00
0.25
0.50
0.75
1.00
1
8
.
0
1
1
8
.
0
2
1
8
.
0
3
1
8
.
0
5
1
8
.
0
6
6
.
0
4
2
C
O
M
S
3
2
5
1
T
o
t
a
l
MIT Mathematics Courses and a New Columbia Course
Automatic
Solve−Rate
A
A
l
g
e
b
r
a
C
o
u
n
t
i
n
g
&
P
r
o
b
a
b
i
l
i
t
y
I
n
t
e
r
m
e
d
i
a
t
e
A
l
g
e
b
r
a
N
u
m
b
e
r
T
h
e
o
r
y
P
r
e
a
l
g
e
b
r
a
P
r
e
c
a
l
c
u
l
u
s
T
o
t
a
l
MATH Benchmark
B
Fig. 4. Comparison between automatic solve rates on MIT mathematics courses and a new Columbia University course (A), and a MATH benchmark dataset (B). The latest
OpenAI GPT-3 (text-davinci-002), a transformer pre-trained on text, achieves 18% (A) and 25.5% (B) automatic solve rates. In contrast, program synthesis with zero-shot and
few-shot learning using the latest OpenAI Codex (code-davinci-002), a transformer pre-trained on text and fine tuned on code, achieves an automatic solve rate of 80% (A) and
81.1% (B).
We randomly sample five original questions and five generated
questions for each MIT course. In the survey, students are
asked to read ten questions from each course, which are a
mixture of randomly ordered human-written and machine-
generated questions.
For each of the 60 course questions, the students are asked
three survey questions: (i) is the question human-written or
machine-generated? (ii) is the question appropriate or not
appropriate for the specific course? and (iii) what is the
question’s difficulty level on a scale between 1 (easiest) and 5
(hardest)? An example of this survey format is given in Figure
6. The students are asked to simply provide their ratings and
not to solve the questions. The survey is conducted online
and anonymously.
generated with machine learning, but you will not be told the source of a given question. For
each question, you will be asked (a) whether you think the question is human-written or machine-
generated, (b) whether the question is appropriate for the given course, and finally (c) how you
would rate the difficulty of the question. Please carefully read each question and answer to the
best of your ability”.
0
10
20
30
18.03 18.01 18.02 COMS3251 18.05 18.06 6.042
Course
Count
Library
math
matplotlib
mpl_toolkits
networkx
numpy
random
scipy
sympy
Fig. 5. Imported Python programming libraries by course: NumPy is used by nearly
all courses. Matplotlib is used by courses with questions that involve plotting. Sympy
is used by most of the courses, and SciPy by half of the courses. Introduction to
Probability and Statistics (18.05) uses the statistics package.
Results
Questions Solved. We solve a total of 265 questions, 213
of them automatically, as described in the Supplemen-
tary Information. These 265 questions include 25 ran-
domly sampled questions from each of the seven courses
(18.01/18.02/18.03/18.05/18.06/6.042/COMS3251), and 15
randomly sampled questions for each of the six topics in the
MATH dataset (Prealgebra/Algebra/Number Theory/Count-
ing and Probability/Intermediate Algebra/Precalculus). The
breakdown of the questions solved automatically by zero-shot
and few-shot learning as compared with GPT-3 is shown in
Figure 4.
Visualization of Embedded Questions. We embed the 175
mathematics course questions onto a 2,048 dimensional
space using OpenAI’s (text-similarity-babbage-001) embed-
ding which captures semantic similarity between texts. We
then use uniform manifold approximation and projection
(UMAP) (17) reducing dimensionality to two dimensions. Fig-
ure 8 shows that the embedded questions are clustered by
course topics. On the top right we see clusters of questions
from MIT’s 18.06 Linear Algebra (black plus) and Columbia’s
COMS3251 Computational Linear Algebra (yellow plus). On
the left side we see a cluster of the questions from MIT’s 18.01
(red circles), 18.02 (green circles), and 18.03 (blue circles). On
the bottom right we see a cluster of the questions from MIT’s
18.05 Introduction to Probability and Statistics (magenta x)
and 6.042 Mathematics for Computer Science (cyan x), both
courses cover probability and statistics.
Automatically Generating New Questions. We use the curated
questions to generate new questions for each of the courses
and topics. This is done by prompting Codex with numbered
human-written questions and automatically generating the
next question. We generate 130 new questions as shown in the
Supplementary Information. These include ten new questions
for each of the seven courses and each of the six MATH topics.

Fig. 6. Student survey question: For each of 60 questions, students are asked if (i) the question is human-written or machine-generated, (ii) the question is appropriate or
inappropriate for the course, and (iii) to rate the difficulty level of each question on a scale between 1 (easiest) and 5 (hardest).
Fig. 7. Student survey results: Panel A compares between the level of difficulty of human-written questions and questions generated by our approach for each course based on
the student ratings. The plot shows the means of the difficulty ratings between 1 (easiest) and 5 (hardest), and their 95% confidence intervals. Panel B shows the percentage of
human-written and machine-generated questions rated as appropriate and not appropriate for the course. Panel C shows the percentage of human-written questions rated as
human-written or machine-generated (left) and the percentage of machine-generated questions rated as human-written or machine-generated (right).
Table 2 shows one generated question for each course and
MATH topic. Generating a question takes less than a second
and we can generate an arbitrarily large number of questions.
We create prompts of 25 randomly selected problems for which
Codex generates correct answers, remove the questions after a
randomly chosen question in the list, and have Codex complete
the next new question.
Student Survey Results. Fifteen participants completed our
survey answering questions about all 60 questions, taking a
median of 40 minutes. Figure 7 summarizes the results of
the student survey comparing human-written and machine-
generated questions. Panel A compares between the level of
difficulty of human-written questions and questions generated
by our approach for each course based on the student ratings.
The plot shows the means of the difficulty ratings between 1
(easiest) and 5 (hardest), and their 95% confidence intervals.
Panel B shows the percentage of human-written and machine-
generated questions rated by students as appropriate or not
appropriate for the courses. Panel C shows the percentage of
human-written questions rated as human-written or machine-
generated (left) and the percentage of machine-generated ques-
tions rated as human-written or machine-generated (right).
Summarizing the student survey results:
• Survey participants rated our machine-generated ques-
tions and human-written questions to be at a similar level
of difficulty within confidence intervals.
• Survey participants rated human-written questions
slightly more appropriate for the courses than machine-
generated ones.

Fig. 8. Visualization of embeddings of course questions: We embed the course questions into a 2,048 dimensional space using OpenAI’s (text-similarity-babbage-001)
embedding which captures semantic similarity between texts. We then use uniform manifold approximation and projection to reduce dimensionality to two dimensions. This
shows distinctive clusters based on topics. On the top right we see clusters of questions from MIT’s 18.06 Linear Algebra and Columbia’s COMS3251 Computational Linear
Algebra. On the left side we see a cluster of the questions from MIT’s 18.01, 18.02, and 18.03. On the bottom right we see a cluster of the questions from MIT’s 18.05
Introduction to Probability and Statistics and 6.042 Mathematics for Computer Science, both course cover probability and statistics.
• Survey participants rated human-written questions
slightly more likely to be human-written as shown in the
left side of Panel C. Survey participants rated machine-
generated questions equally likely to be machine-generated
and human-written as shown in the right side of Panel C.
Human Level. We achieve 80% automatic accuracy on solving
mathematics course problems at MIT and Columbia. This is
comparable to typical student performance on these problem
sets in our courses at MIT and Columbia University. We au-
tomatically generate new questions that are indistinguishable
by students from human course questions.
Implementation Details. We use the latest version of OpenAI’s
GPT-3 text-davinci-002 and Codex codex-davinci-002 engine
for all of our experiments. We fix all of Codex’s hyperparame-
ters to be the same for all solution and explanation experiments
to yield deterministic and reproducible results. Specifically,
top P which controls diversity is set to 0 and sampling tem-
perature, which controls randomness, is also set to 0. Both
frequency and presence penalty are set to 0, and we do not
halt on any stop sequences. For all new question generation
experiments we allow diversity and randomness by setting top
P and temperature to 0.1. Each prompt is structured as a
Python documentation comment surrounded by triple quota-
tions and line breaks. We evaluate the solution by running
the output using a Python interpreter. Answers are correct
if a printed output is the correct solution, or if the program
function returns the correct solution, and the explanation of
the solution is correct.
Types of Problems the Model Cannot Solve. There are a few
different types of problems the model is incapable of solving: (i)
any problem for which images or other non-text modalities of
input are critical to the question, (ii) questions with solutions
that require proofs, and (iii) problems that are computationally
intractable, such as factoring very large primes. This last
category is not expected in any math course assignment, as
students themselves would also be unable to answer them.
That being said, many questions that students can answer
have generalizations that are computationally intractable.
Conclusion
We demonstrate that program synthesis and few-shot learning
using OpenAI Codex is able to solve, explain, and generate
university-level mathematics problems at human level. In con-
trast, previous methods use transformers pre-trained only on
text, such as GPT-3, and fail at these tasks. We verify that our
strong results are not overfitting the training data by solving a
new course that is not available online. We also generate and
analyze new problem sets. The success of this work confirms
that programs serve as a good representation and computation
environment for solving math problems. Since our approach
requires no additional training, it has been scaled-up to over
thirty STEM courses. This work addresses major pedagogical
challenges, bringing substantial benefits to higher education
such as tools for curriculum design and analysis, and automatic
content generation.
We show that neural network synthesis with modern pro-
gramming languages is more dynamic and widely applicable
than expression trees, and is therefore likely to solve a wider
range of problems. Although any finite computation could

ID Course Machine-generated question Closest question in the dataset Similarity
1 18.01
Single-Variable Calculus
Find the area of the region bounded by the curve and
the x-axis. y = x2
sin(x), 0 ≤ x ≤ π
Find the area of the region under the given curve from
1 to 2. y = (x2
+ 1)/(3x − x2
)
0.61
2 18.02
Multi-Variable Calculus
Find a × b. a = h9, −2, 1i, b = h−2, 1, 1i Find a × b. a = h5, −1, −2i, b = h−3, 2, 4i 0.87
3 18.03
Differential Equations
Use the method of separable variables to solve the
initial-value problem dy
dx
= 5ex
, y(2) = 12 when x =
2
Separate variables and use partial fractions to solve
the initial value problems. Use either the exact so-
lution or a computer-generated slope field to sketch
the graphs of several solutions of the given differen-
tial equation, and highlight the indicated particular so-
lution.
f′
(x) = 3f(x)(5 − f(x)), f(0) = 8
0.21
4 18.05
Introduction to Probability
and Statistics
Let X be a uniformly distributed random variable over
the interval [0, 1). Find E[X2
]
Let X be the result of rolling a fair 4-sided die. Let Y
be the result of rolling a fair 6-sided die. You win 2X
dollars if X > Y and lose 1 dollar otherwise. After
playing this game 60 times, what is your expected total
gain?
0.29
5 18.06
Linear Algebra
Write a Matlab code to determine if the given matrix
A = [1, 1; 4, 4] is positive semidefinite and if it is neg-
ative semidefinite.
Find A′
A if the columns of A are unit vectors, all mu-
tually perpendicular.
0.21
6 6.042
Mathematics for
Computer Science
A student is taking a test consisting of n multiple-
choice questions. Each question has five possible an-
swers, and only one of them is correct. The student
knows that the probability that any particular question
is answered correctly is 1
5
. Let X be the number of
questions answered correctly by the student. What is
E(X)?
MIT students sometimes delay laundry for a few days.
Assume all random values described below are mu-
tually independent. A busy student must complete 3
problem sets before doing laundry. Each problem set
requires 1 day with probability 2
3
and 2 days with prob-
ability 1
3
. Let B be the number of days a busy student
delays laundry. What is E(B)?
0.47
7 COMS3251
Computational
Find a combination of the vectors
"
1 2 3
4 5 6
7 8 9
#
that
gives the vector

1 2 3

.
Find a combination of the vectors

1 2 3
4 5 6
7 8 9
#
that
give the zero vector.
0.90
8 MATH
Pre-Algebra
How many four-digit positive integers are there with
hundreds digit 2?
How many four-digit positive integers are there with
thousands digit 2?
0.90
9 MATH
Algebra
Find the distance between the points (0, 0) and (3, 4). Find the distance between the points (0, 4) and (3, 0). 0.99
10 MATH
Number Theory
Find the smallest positive integer n such that n2
is di-
visible by 210
and n3
is divisible by 310
.
How many four-digit numbers whose digits add up to 9
are divisible by 11?
0.25
11 MATH
Counting and Probability
How many ways are there to divide a set of 10 objects
into two sets of equal size?
Compute
8
4

. 0.12
12 MATH
Intermediate Algebra
Let x and y be positive real numbers such that x2
+
y2
= 1. Find the maximum value of xy.
Given that x2
+ y2
= 14x + 6y + 6, find the largest
possible value of 3x + 4y.
0.59
13 MATH
Precalculus
Let A be the matrix

1 2 3
4 5 6
7 8 9
#
Find the determinant
of A2
+ A3
.
If det(A) = 2 and det(B) = 12, then find det(AB). 0.41
Table 2. Examples of new questions automatically generated by Codex for each course and the closest question from the course.
be expressed as a sufficiently large expression tree, one may
see an arbitrarily large expansion in the size of expression
tree needed, as opposed to a Turing-complete language. This
flexibility is bolstered by the massive corpus of programs that
already exist, which eclipses the amount of labeled expression
trees available. Program outputs are also inherently more
human-readable, as the ability to use abstraction, modularity,
and high-level logic lead to clearer illustrations of the path to
solution. Furthermore, program synthesis can convey logical
deductions directly through explanatory comments, as well
as function and variable names. In particular, we see such
explanatory text and derivations in a number of the Codex
outputs. The unification of such formal and informal language
is an inherent advantage of our methodology. We emphasize
that the outputs may be complex and multi-modal. For ex-
ample, using Python packages such as Matplotlib to produce
graphs of equations. This advanced and unique ability is
time-consuming for humans and offers a major pedagogical
benefit.
In summary, we automatically solve, explain, and gener-
ate university-level mathematics course questions in real-time
at human level. Students rated machine-generated questions
as equally likely to be human-written as machine-generated,
although rated human-written questions more likely to be
human-written. Students rated machine-generated questions
similarly difficult to human-written questions, as well as mostly
appropriate for their respective courses. Finally, we have suc-
ceeded in scaling-up this work to over thirty STEM courses
across 13 departments in the schools of science and engineer-
ing at MIT, Harvard, and the Ivy League universities, with

excellent results.
1. CQ Choi, 7 revealing ways AIs fail: Neural networks can be disastrously brittle, forgetful, and
surprisingly bad at math. IEEE Spectr. 58, 42–47 (2021).
2. A Vaswani, et al., Attention is all you need in Proceedings of Advances in neural information
processing systems. Vol. 30, (2017).
3. TB Brown, et al., Language models are few-shot learners in Proceedings of Advances in
Neural Information Processing Systems. (2020).
4. D Hendrycks, et al., Measuring massive multitask language understanding. arXiv preprint
arXiv:2009.03300 (2020).
5. D Hendrycks, et al., Measuring mathematical problem solving with the math dataset in Pro-
ceedings of Advances in Neural Information Processing Systems: Datasets and Benchmarks.
(2021).
6. JW Rae, et al., Scaling language models: Methods, analysis insights from training gopher.
arXiv preprint arXiv:2112.11446 (2021).
7. M Chen, , et al., Evaluating large language models trained on code. arXiv preprint
arXiv:2107.03374 (2021).
8. J Shen, et al., Generate rank: A multi-task framework for math word problems. arXiv
preprint arXiv:2109.03034 (2021).
9. K Cobbe, et al., Training veriﬁers to solve math word problems. arXiv preprint
arXiv:2110.14168 (2021).
10. Z Xie, S Sun, A goal-driven tree-structured neural model for math word problems in Proceed-
ings of International Joint Conference on Artificial Intelligence. pp. 5299–5305 (2019).
11. Q Wu, Q Zhang, J Fu, XJ Huang, A knowledge-aware sequence-to-tree network for math
word problem solving in Proceedings of Conference on Empirical Methods in Natural Lan-
guage Processing. pp. 7137–7146 (2020).
12. J Qin, L Lin, X Liang, R Zhang, L Lin, Semantically-aligned universal tree-structured solver
for math word problems in Proceedings of Conference on Empirical Methods in Natural Lan-
guage Processing. (2020).
13. J Zhang, et al., Graph-to-tree learning for solving math word problems in Proceedings of the
Annual Meeting of the Association for Computational Linguistics. pp. 3928–3937 (2020).
14. S Li, et al., Graph-to-tree neural networks for learning structured input-output translation with
applications to semantic parsing and math word problem in Proceedings of Conference on
Empirical Methods in Natural Language Processing. pp. 2841–2852 (2020).
15. Z Liang, J Zhang, J Shao, X Zhang, MWP-BERT: A strong baseline for math word problems.
arXiv preprint arXiv:2107.13435 (2021).
16. S Tran, et al., Solving machine learning problems in Proceedings of Asian Conference on
Machine Learning. (2021).
17. L McInnes, J Healy, J Melville, Umap: Uniform manifold approximation and projection for
dimension reduction. arXiv preprint arXiv:1802.03426 (2018).

A. Solution of MIT 18.01: Single Variable Calculus
Prereq None
Units 5-0-7
Syllabus Differentiation and integration of functions of one variable, with applications. Informal treatment of limits and
continuity. Differentiation: definition, rules, application to graphing, rates, approximations, and extremum problems. Indefinite
integration; separable first-order differential equations. Definite integral; fundamental theorem of calculus. Applications of
integration to geometry and science. Elementary functions. Techniques of integration. Polar coordinates. L’Hopital’s rule.
Improper integrals. Infinite series: geometric, p-harmonic, simple comparison tests, power series for some elementary functions.
Table 3. MIT 18.01, Question 1: Original question, Codex input and code and code explanation, Program solution, Solution type, GPT-3
response and evaluation
Topic Functions and Models
Original question Sketch the graph of the function. f(x) = x + |x|
Codex input Same as original question
Codex code
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-10, 10, 100)
y = x + np.abs(x)
plt.plot(x, y)
plt.xlabel(’x’)
plt.ylabel(’y’)
plt.grid()
plt.show()
Program solution
Solution type Automatic Zero-Shot
Codex code explanation 1. We import the numpy and matplotlib.pyplot libraries.
2. We create an array of 100 evenly spaced numbers from -10 to 10.
3. We create an array of the same size as x, where each element is the absolute value of the correspond-
ing element in x.
4. We plot the graph of y against x.
5. We label the x-axis and y-axis.
6. We display the grid.
7. We display the graph.
GPT-3 response The graph of the function is a line with a slope of 1 and a y-intercept of 0.
GPT-3 evaluation Incorrect
Original question Find f ◦ g ◦ h. f(x) = 3x − 2, g(x) = sin(x), h(x) = x2

Codex code
import math
def f(x):
return 3*x-2
def g(x):
return math.sin(x)
def h(x):
return x**2
def fgh(x):
return f(g(h(x)))
print(fgh(2))
Program solution 3 sin(x2
) − 2
Codex code explanation 1. fgh(2)
2. h(2)
3. g(4)
4. f(sin(4))
5. 3 sin 4 − 2
GPT-3 response f◦g◦h = f(g(h(x))) = f(g(x**2)) = f(sin(x**2)) = 3*sin(x**2)-2
GPT-3 evaluation Correct
Original question Solve each equation for x. ln(x2
− 1) = 3
Codex code
import sympy
x = sympy.Symbol(’x’)
eq = sympy.log(x**2-1) - 3
print(sympy.solve(eq, x))
Program solution −
√
1 + e3,
√
1 + e3
Codex code explanation 1. We import the sympy module.
2. We create a symbol x.
3. We create an equation.
4. We solve the equation for x.
GPT-3 response x**2-1=e**3
x**2=e**3+1
x=sqrt(e**3+1)
Topic Limits and Derivatives
Original question Determine the inﬁnite limit. limx→5+ (x + 1)/(x − 5)
Codex input Using sympy determine the inﬁnite limit. lim_x_5+ (x+1)/(x-5)

Codex code
from sympy import limit, Symbol, S
x = Symbol(’x’)
limit((x+1)/(x-5), x, 5)
Program solution ∞
Codex code explanation 1. We import the limit function from the sympy module.
2. We create a Symbol object called x.
3. We call the limit function and pass in the expression (x+1)/(x-5) and the variable x.
4. We pass in the point 5 as the third argument.
GPT-3 response The limit does not exist because the function is not deﬁned when x = 5.
Original question Find the limits as x → ∞ and as x → −∞. Use this information, together with intercepts, to give a
rough sketch of the graph as in Example 12. y = x2
(x2
− 1)2
(x + 2)
Codex code
import sympy as sym
from sympy import Symbol
from sympy import pprint
sigma = Symbol(’sigma’)
x = Symbol(’x’)
mu = Symbol(’mu’)
pprint(2*sym.pi*sigma)
pprint(sym.sqrt(2*sym.pi*sigma))
part_1 = 1/(sym.sqrt(2*sym.pi*sigma))
part_2 = sym.exp(-1*((x-mu)**2)/(2*sigma))
my_gauss_function = part_1*part_2
pprint(my_gauss_function)
sym.plot(my_gauss_function.subs({mu:10,sigma:30}),(x,0,100),title=’gauss
distribution’)
x_values = []
y_values = []
for value in range(-5,5):
y = my_gauss_function.subs({mu:10,sigma:30,x:value}).evalf()
y_values.append(y)
x_values.append(value)
print(value,y)
plt.plot(x_values,y_values)
plt.show()

Program solution ∞; -∞;
Codex code explanation 1. We create a list of x values from -5 to 5.
2. We create an empty list to store the y values.
3. We loop through the x values.
4. For each x value, we calculate the y value.
5. We append the y value to the list of y values.
6. We plot the x values against the y values.
GPT-3 response As x - infinity, y - infinity. As x - - infinity, y - infinity.
The graph will look like a parabola that opens up.
Original question Find the derivative of the function using the definition of a derivative. f(x) = (x2
− 1)/(2x − 3)
Codex input Using Sympy find the derivative of the function using the definition of a derivative. f(x)=(x**2-1)/(2*x-3)
Codex code
import sympy as sp
x = sp.Symbol(’x’)
f = (x**2-1)/(2*x-3)
print(sp.diff(f,x))
Program solution 2x
2x−3
−
2(x2
−1)
2x−3
Codex code explanation 1. We import sympy as sp
2. We create a symbol x
3. We create a function f
4. We print the derivative of f with respect to x
GPT-3 response (2*x**2-1)/(2*x-3)**2
Topic Differentiation Rules
Original question Find the limit. limx→0 csc(x) sin(sin(x))
Codex code
from sympy import limit, csc, sin, Symbol
x = Symbol(’x’)
print(limit(csc(x)*sin(sin(x)), x, 0))

Program solution 1
Codex code explanation 1. We are finding the limit of csc(x)*sin(sin(x)) as x approaches 0.
2. We are using the limit function from the sympy library.
3. We are passing in the function csc(x)*sin(sin(x)) as the first argument.
4. We are passing in the variable x as the second argument.
5. We are passing in the number 0 as the third argument.
GPT-3 response The limit does not exist.
Original question The half-life of cesium-137 is 30 years. Suppose we have a 100-mg sample. How much of the sample
remains after 100 years?
Codex input Write a program. The half-life of cesium-137 is 30 years. Suppose we have a 100-mg sample. How much
of the sample remains after 100 years?
Codex code
import math
def half_life(t, h):
return 100 * math.pow(0.5, t/h)
print(half_life(100, 30))
Program solution 9.92 mg
Codex code explanation 1. We import the math module.
2. We define a function called half_life that takes two arguments: t and h.
3. We return the value of 100 * 0.5ˆ(t/h).
4. We print the result of half_life(100, 30).
GPT-3 response After 100 years, 50 mg of the sample would remain.
Original question The radius of a sphere is increasing at a rate of 4 mm/s. How fast is the volume increasing when the
diameter is 80 mm?
Codex code
import sympy as sp
r = sp.symbols(’r’)
d = 4*r
v = 4/3*sp.pi*r**3
dv_dr = sp.diff(v,r)
print(dv_dr.subs(r,80))
Program solution 25600π
Solution type Manual
Codex code explanation 1. We define the radius as a symbol.
2. We define the diameter as a function of the radius.
3. We define the volume as a function of the radius.
4. We take the derivative of the volume with respect to the radius.
5. We substitute the radius with 80 and print the result.
GPT-3 response The volume of a sphere is increasing at a rate of approximately 201.06 mm3/s.

Topic Applications of Differentiation
Original question The water level, measured in feet above mean sea level, of Lake Lanier in Georgia, USA, during 2012
can be modeled by the function L(t) = 0.01441t3
− 0.4177t2
+ 2.703t + 1060.1 where t is measured
in months since January 1, 2012. Estimate when the water level was highest during 2012.
Codex code
import math
def L(t):
return 0.01441*t**3-0.4177*t**2+2.703*t+1060.1
def dL(t):
return 0.04323*t**2-0.8354*t+2.703
def ddL(t):
return 0.08646*t-0.8354
def newton(f, df, x0, epsilon):
x = x0
while True:
x1 = x - f(x)/df(x)
if abs(x1 - x) epsilon:
return x1
x = x1
print(newton(dL, ddL, 0, 0.00001))
Program solution
Codex code explanation 1. We define the function L(t)
2. We define the derivative of L(t)
3. We define the second derivative of L(t)
4. We define a function that uses Newton’s method to find the root of a function f(x)
5. We print the result of using Newton’s method to find the root of the derivative of L(t)
GPT-3 response The water level was maximum at t=2.
Original question If a resistor of R ohms is connected across a battery of E volts with internal resistance r ohms, then the
power (in watts) in the external resistor is P = E2
R/(R + r)2
. If E and r are fixed but R varies, what is
the maximum value of the power?

Codex code
import sympy as sym
from sympy import Symbol
from sympy import pprint
R = Symbol(’R’)
E = Symbol(’E’)
r = Symbol(’r’)
P = E**2*R/(R+r)**2
print(P)
dP = P.diff(R)
print(dP)
R_sol = sym.solve(dP,R)
print(R_sol)
R_max = R_sol[0].evalf()
print(R_max)
P_max = P.subs({R:R_max})
print(P_max)
Program solution E2
4r
Solution type Automatic Few-Shot (5)
Codex code explanation 1. Define the variables
2. Define the function
3. Take the derivative of the function
4. Solve the derivative for R
5. Evaluate the solution
6. Substitute the solution into the original function
GPT-3 response The maximum value of the power is P = Eˆ2R/(R+r)ˆ2 when R = r.
Original question Use Newton’s method to find all the solutions of the equation correct to eight decimal places. Start by
drawing a graph to find initial approximations. x5
− 3x4
+ x3
− x2
− x + 6 = 0
Codex input Use Newton’s method to find all the solutions between -100 and 100 with integer initial values. x**5-
3*x**4+x**3-x**2-x+6=0. Use initial points between -100 and 100. Round to 8 decimal places.

Codex code
import numpy as np
def f(x):
return x**5-3*x**4+x**3-x**2-x+6
def df(x):
return 5*x**4-12*x**3+3*x**2-2*x-1
def newton(x, f, df, epsilon=1e-8, max_iter=1000):
for i in range(max_iter):
x_new = x - f(x)/df(x)
if abs(x_new - x) epsilon:
return x_new
x = x_new
return x_new
def main():
x = np.linspace(-100, 100, 1000)
y = f(x)
plt.plot(x, y)
plt.grid()
plt.show()
x_sols = []
for i in range(-100, 100):
x_sols.append(newton(i, f, df))
print(x_sols)
print(np.round(x_sols, 8))
if __name__ == ’__main__’:
main()
Program solution −1.044503071109637, 2.7055120932482875
Codex code explanation 1. Plotting the function f(x) = x**5-3*x**4+x**3-x**2-x+6
2. Finding the derivative of f(x)
3. Using Newton’s method to ﬁnd the roots of f(x)
4. Plotting the roots of f(x)
GPT-3 response There are ﬁve solutions.
-1.46557123
-0.53728871
0.36082474
1.12403607
2.09132432
Topic Integrals
Original question Evaluate
R 1
1
√
1 + x4 dx.
Codex input Write a program to evaluate the integral from 1 to 1 of sqrt(1 + xˆ4) dx.

Codex code
from sympy import Symbol, sqrt, Integral, pprint
x = Symbol(’x’)
k = Symbol(’k’)
p = sqrt(1 + x**4)
pprint(Integral(p, (x, 1, 1)).doit())
Program solution 0.0
Codex code explanation 1. We create a Symbol object called x.
2. We create a Symbol object called k.
3. We create a symbolic expression called p.
4. We create an Integral object called p_integral.
5. We call the doit() method on the Integral object.
GPT-3 response The integral from 1 to 1 of sqrt(1 + xˆ4) dx is equal to 2.
Topic Integrals
Original question Sketch the region enclosed by the given curves and calculate its area. y = 4 − x2
, y = 0
Codex input Using numpy, graph the lines y=0 and y=4-x**2. Use ﬁll_between().
Then using Sympy, ﬁnd the points of intersection between and y=0 and y=4-x**2.
Integrate between lines y=4-x**2 and y=0, over the points of intersection to calculate the area of the
region. x should go from -5 to 5.
Codex code
import numpy as np
import sympy as sp
x = np.linspace(-5, 5, 100)
y1 = 0*x
y2 = 4-x**2
plt.plot(x, y1, ’-b’, label=’y=0’)
plt.plot(x, y2, ’-r’, label=’y=4-x**2’)
plt.fill_between(x, y1, y2, color=’grey’, alpha=0.5)
plt.legend(loc=’best’)
plt.show()
x, y = sp.symbols(’x y’)
f = 4-x**2
g = 0*x
sols = sp.solve([f, g], [x, y])
x1 = sols[0][0]
y1 = sols[0][1]
x2 = sols[1][0]
y2 = sols[1][1]
area = sp.integrate(f, (x, x1, x2)) + sp.integrate(g, (x, x1, x2))
print(area)

Program solution 32
3
Codex code explanation 1. We are using numpy to graph the lines y=0 and y=4-x**2. We are using fill_between() to fill in the area
between the two lines.
2. We are using Sympy to find the points of intersection between the two lines.
3. We are using Sympy to integrate between the two lines, over the points of
intersection, to calculate the area of the region.
GPT-3 Response The region is enclosed by the curves y=4-xˆ2, y=0, and x=2. Its area is 16.
Topic Integrals
Original question A bacteria population is 4000 at time t = 0 and its rate of growth is 1000 · 2t
bacteria per hour after t
hours. What is the population after one hour?
Codex input A bacteria population is 4000 at time t = 0 and its rate of growth is y = 1000 * 2ˆx bacteria per hour after
t hours. Write a function to calculate the population after one hour. Use sympy to integrate the symbol x
from 0 to 1. Then add the initial population to the result.
Codex code
import sympy as sp
def bacteria_population(x):
return 1000 * 2**x
print(sp.integrate(bacteria_population(x), (x, 0, 1)) + 4000)
Program solution 1000
log 2
+ 4000
Codex code explanation 1. We import sympy as sp.
2. We define x as a symbol.
3. We define a function bacteria_population(x) that returns the population of bacteria after x hours.
4. We print the result of integrating the function bacteria_population(x) from 0 to 1 and adding the initial
population of 4000.
GPT-3 response 5000
Topic Applications of Integration
Original question Graph the region between the curves and use your calculator to compute the area correct to five decimal
places. y = tan2
(x), y =
√
x
Codex input Graph the region between the curves y = tanˆ2(x) and y = sqrt(x) to find the two points where they
intersect. Then, write a program to compute the area between the curves where they cross. Use sympy.

Codex code
import sympy
import math
x = sympy.symbols(’x’)
y1 = sympy.tan(x)**2
y2 = sympy.sqrt(x)
# find the intersection points
intersection_points = sympy.solve(y1 - y2)
# find the area between the two curves
area = sympy.integrate(y1 - y2, (x, intersection_points[0],
intersection_points[1]))
print(area)
Codex code explanation 1. We import sympy and math.
2. We create a variable x and set it equal to sympy.symbols(’x’). This is how we tell sympy that we want
to use x as a variable.
3. We create two variables, y1 and y2, and set them equal to sympy.tan(x)**2 and sympy.sqrt(x), respec-
tively.
4. We create a variable called intersection_points and set it equal to sympy.solve(y1 - y2). This is how
we tell sympy to find the intersection points of the two curves.
5. We create a variable called area and set it equal to sympy.integrate(y1 - y2, (x, intersection_points[0],
intersection_points[1])). This is how we tell sympy to find the area between the two curves.
6. We print the area.
GPT-3 response The region between the curves is shaded in the graph.
The area is 1.04719.
Topic Applications of Integration
Original question A force of 10 lb is required to hold a spring stretched 4 in. beyond its natural length. How much work is
done in stretching it from its natural length to 6 in. beyond its natural length?
Codex input A force of 10 lb is required to hold a spring stretched delta_x = 1/3 foot. Write a function to calculate
the work done in stretching it to delta_x = 1/2 foot. Use Hooke’s Law F = k*delta_x where delta_x is a
sympy.symbols to find the spring constant k = 10/(1/3) and integrate F with respect to delta_x from 0 to
the stretched ratio 1/2. Use sympy for the integral. Then plug in for delta_x = 1/2.
Codex code
import sympy
def work_done(delta_x):
k = 10/(1/3)
F = k*delta_x
return sympy.integrate(F, (delta_x, 0, 1/2))
print(work_done(sympy.symbols(’delta_x’)))
Codex code explanation 1. We define a function called work_done that takes a variable delta_x as an argument.
2. We define the spring constant k = 10/(1/3)
3. We define the force F = k*delta_x
4. We return the integral of F with respect to delta_x from 0 to 1/2.
5. We print the result of calling work_done with sympy.symbols(’delta_x’) as the argument.
GPT-3 Response The work done in stretching the spring from its natural length to 6 in. beyond its natural length is 60 lb-in.

Topic Techniques of Integration
Original question Evaluate the integral.
R 0.5
0
x
√
1 − 4x2 dx
Codex input Using sympy evaluate the integral. 0 to 0.5 of x*sqrt(1-4xˆ2) dx.
Codex code
from sympy import *
x = Symbol(’x’)
integral = integrate(x*sqrt(1-4*x**2), (x, 0, 0.5))
print(integral)
Program solution 1
12
3. We create an integral object.
4. We print the integral object.
GPT-3 response The integral is 0.
Original question Find the area of the region under the given curve from 1 to 2. y = (x2
+ 1)/(3x − x2
)
Codex input Find the area of the region under the given curve from 1 to 2. y = (xˆ2+1)/(3x-xˆ2)

Codex code
import numpy as np
def f(x):
return (x**2+1)/(3*x-x**2)
def trapezoidal(f, a, b, n):
h = (b-a)/n
s = 0.5*(f(a) + f(b))
for i in range(1, n):
s = s + f(a + i*h)
return h*s
def simpson(f, a, b, n):
h = (b-a)/n
s = f(a) + f(b)
for i in range(1, n, 2):
s = s + 4*f(a + i*h)
for i in range(2, n-1, 2):
s = s + 2*f(a + i*h)
return h*s/3
def gauss(f, a, b, n):
x, A = np.polynomial.legendre.leggauss(n)
return 0.5*(b-a)*np.dot(A, f(0.5*(b-a)*x + 0.5*(b+a)))
a = 1
b = 2
n = 10
print(Trapezoidal:, trapezoidal(f, a, b, n))
print(Simpson:, simpson(f, a, b, n))
print(Gauss:, gauss(f, a, b, n))
x = np.linspace(a, b, 100)
y = f(x)
plt.plot(x, y)
plt.show()
Codex code explanation 1. Define the function f(x) = (xˆ2+1)/(3x-xˆ2)
2. Define the trapezoidal rule for numerical integration
3. Define the Simpson’s rule for numerical integration
4. Define the Gauss-Legendre quadrature rule for numerical integration
5. Define the limits of integration
6. Define the number of intervals
7. Print the results of the three methods
8. Plot the function
GPT-3 response The area of the region under the given curve from 1 to 2 is approximately 1.4.
Original question Evaluate the integral. ln(1 + x2
)
Codex input Evaluate the integral of ln(1+x**2). Cast to a string and add + C.
Codex code
from sympy import *
x = Symbol(’x’)
print(str(integrate(ln(1+x**2), x)) + + C)
Program solution x log(x2
+ 1) − 2x + 2 arctan x + C

3. We create a variable integral and assign it the value of the integral of ln(1+x**2) with respect to x.
4. We print the value of the integral.
GPT-3 response The integral of ln(1+x**2) is x*ln(1+x**2)-x
Topic Parametric Equations and Polar Coordinates
Original question Graph the curve x = y − 2 sin(πy).
Codex input Graph the curve x = y - 2sin(pi*y) by sampling points and swapping the (x, y) pair locations.
Codex code
import numpy as np
def f(y):
return y - 2*np.sin(np.pi*y)
y = np.linspace(-2, 2, 100)
x = f(y)
plt.plot(x, y)
plt.xlabel(’x’)
plt.ylabel(’y’)
plt.title(’x = y - 2sin(pi*y)’)
plt.show()
Program solution
Codex code explanation 1. Deﬁne the function f(y) = y - 2sin(pi*y).
2. Create an array of 100 evenly spaced numbers between -2 and 2.
3. Create an array of the values of f(y) for each y in the array.
4. Plot the curve x = f(y).

GPT-3 response A curve is defined by a set of parametric equations if the coordinates of each point on the curve are given
by specific functions of a variable. The variable is called a parameter.
The parametric equations for a curve are usually written as x = f(t) and y = g(t). The parameter t is usually
restricted to a certain interval, such as 0 ≤ t ≤ 1.
The graph of the parametric equations x = f(t) and y = g(t) is the set of points (f(t), g(t)) for all values of t
in the interval.
Example 1: Find the graph of the parametric equations x = t2 and y = t3.
Solution: The graph of the parametric equations x = t2 and y = t3 is the set of points (t2, t3) for all values
of t.
Topic Parametric Equations and Polar Coordinates
Original question Find the distance between the points with polar coordinates (4, 4π/3) and (6, 5π/3).
Codex input Using sympy find the distance between the points with polar coordinates (4, 4pi/3) and (6, 5pi/3).
Codex code
from sympy import *
r1, r2, theta1, theta2 = symbols(’r1 r2 theta1 theta2’)
x1 = r1 * cos(theta1)
y1 = r1 * sin(theta1)
x2 = r2 * cos(theta2)
y2 = r2 * sin(theta2)
d = sqrt((x2 - x1)**2 + (y2 - y1)**2)
print(d.subs({r1: 4, theta1: 4*pi/3, r2: 6, theta2: 5*pi/3}))
Program solution 2 ∗ sqrt(7)
Codex code explanation 1. We define the variables r1, r2, theta1, theta2 as symbols.
2. We define x1, y1, x2, y2 as the cartesian coordinates of the points.
3. We define d as the distance between the points.
4. We substitute the values of r1, theta1, r2, theta2 and print the result.
GPT-3 response The distance between the points is 2.
Topic Infinite Sequences and Series
Original question Determine whether the series is convergent or divergent.
P∞
n=1
1/n
√
2
Codex input Using Sympy, calculate the value sum_n=1înf 1/nˆsqrt(2) if it converges. Check if it converges.
Codex code
from sympy import Sum, Symbol, oo, pprint, init_printing
def test_convergence(n):
x = Symbol(’x’)
s = Sum(1/x**(2**0.5), (x, 1, n))
pprint(s)
pprint(s.doit())
if __name__ == ’__main__’:
init_printing(order=’rev-lex’)
test_convergence(10)
test_convergence(oo)
NOTE: ∞ in SymPy is oo. This is because oo looks like ∞.
Program solution
P∞
n=1
n−
√
2
; True

Codex code explanation 1. We import the Sum function from sympy.
2. We create a function that takes in a number n and creates a Sum object.
3. We print the Sum object and the result of the summation.
4. We call the function with n = 10 and n = oo.
GPT-3 response The series is divergent.
Topic Infinite Sequences and Series
Original question Find the sum of the series.
P∞
n=1
(−1)n−1
∗ (3n
/(n · 5n
))
Codex input Find the sum of the series by running a large number of iterations. sum_{n=1}ˆ{inf} (-1)ˆ{n-1}*(3ˆn/(n*5ˆn)).
Return 0 if n 1.
Codex code
def sum_series(n):

Find the sum of the series. sum n to infinity (-1)^{n-1}*(3^n/(n*5^n))

sum = 0
for i in range(1, n+1):
sum += ((-1)**(i-1))*(3**i)/(i*(5**i))
return sum
def main():

Find the sum of the series. sum n to infinity (-1)^{n-1}*(3^n/(n*5^n))

n = int(input())
print(sum_series(n))
if __name__ == ’__main__’:
main()
Codex code explanation 1. We are taking input from the user.
2. We are calling the function sum_series() with the input as the argument.
3. The function sum_series() is returning the sum of the series.
4. We are printing the sum of the series.
GPT-3 response The sum of the series is -1/4.
Topic Further Applications of Integration
Original question If revenue flows into a company at a rate of f(t) = 9000*sqrt(1+2t), where t is measured in years and f(t)
is measured in dollars per year, find the total revenue obtained in the first four years.
Codex input Using sympy if revenue flows into a company at a rate of f(t) = 9000*sqrt(1+2t), where t is measured in
years and f(t) is measured in dollars per year, find the total revenue obtained in the first four years.
Codex code
from sympy import *
t = symbols(’t’)
f = 9000*sqrt(1+2*t)
print(integrate(f, (t, 0, 4)))
Program solution 78000

Codex code explanation 1. We import the sympy library.
2. We define the variable t as a symbol.
3. We define the function f(t) = 9000*sqrt(1+2t).
4. We integrate f(t) from t=0 to t=4.
5. We print the result.
GPT-3 response The total revenue obtained in the first four years is f(4) = 9000*sqrt(1+2*4) = 9000*sqrt(9) = 9000*3 =
27000.

B. Solution of MIT 18.02: Multivariable Calculus
Prereq Calculus I (GIR)
Units 5-0-7
Syllabus Calculus of several variables. Vector algebra in 3-space, determinants, matrices. Vector-valued functions of one
variable, space motion. Scalar functions of several variables: partial differentiation, gradient, optimization techniques. Double
integrals and line integrals in the plane; exact differentials and conservative fields; Green’s theorem and applications, triple
integrals, line and surface integrals in space, Divergence theorem, Stokes’ theorem; applications.
Textbook Edwards, C.; Penney, David Multivariable Calculus(6th ed.) Pearson Education, ISBN 9780130339676.
Topic Polar Coordinates
Original question Sketch the graph of the polar equation. Indicate any symmetries around either coordinate axis or the
origin. r = (2 + 7 sin(θ))(cos(5θ))
Codex input Write a function using matplotlib to plot the graph of the polar equation r = (2 + 7*sin(theta))*(cos(5*theta))
Codex code
import numpy as np
def polar_eq(theta):
return (2 + 7*np.sin(theta))*(np.cos(5*theta))
theta = np.linspace(0, 2*np.pi, 100)
plt.polar(theta, polar_eq(theta))
plt.show()
Program solution
Codex code explanation 1. Importing numpy and matplotlib.pyplot
2. Deﬁning a function that takes theta as an argument
3. Creating a list of 100 evenly spaced values between 0 and 2pi
4. Plotting the polar equation
GPT-3 response The graph has symmetry around the origin and the y-axis.
Topic Analytical Geometry and Polar Coordinates
Original question Find the center and radius of the circle described in the given equation. x2
+ 2x + y2
= 4
Codex input Find the center and radius of the circle described in the given equation. x2
+ 2x + y2
= 4.

Codex code
import sympy as sp
# Equation of the circle
eq = x**2 + 2*x + y**2 - 4
# Solving the equation
sol = sp.solve(eq, y)
# Printing the solution
print(sol)
# Finding the center
center = sp.solve(eq, x)
print(center)
# Finding the radius
radius = sp.sqrt(center[0]**2 + sol[0]**2)
print(radius)
Program solution Center: (-1,0); Radius:
√
5
Codex code explanation 1. We import the sympy library and the symbols function.
2. We deﬁne the equation we want to ﬁnd the radius and center of.
3. We create a Circle object from the equation.
4. We print the radius and center of the circle.
GPT-3 response The center of the circle is at (1, 0) and the radius is
√
3.
Topic Polar Coordinates
Original question Sketch the graph of the polar equation. Indicate any symmetries around either coordinate axis or the
origin. r = 3θ (spiral of Archimedes)
Codex input Write a function using matplotlib to plot the graph of the polar equation r = 3*theta (spiral of Archimedes).
Codex code
import numpy as np
theta = np.arange(0, 2*np.pi, 0.01)
r = 3*theta
plt.polar(theta, r)
plt.show()
Program solution

Codex code explanation 1. Importing the numpy and matplotlib.pyplot modules.
2. Creating a range of values for theta from 0 to 2π with an increment of 0.01.
3. Creating a range of values for r from 0 to 30 with an increment of 0.01.
4. Creating a polar plot using the polar() function.
5. Showing the plot using the show() function.
GPT-3 response The graph of the polar equation is a spiral of Archimedes. There is no symmetry around either coordinate
axis or the origin.
Topic Area Computations in Polar Coordinates
Original question In Problems 7 through 16. find the area bounded by the given curve. r = 2 ∗ cos(θ)
Codex input f(theta)=2*cos(theta)
Using Sympy, starting from 0, find the first two points that f(theta)=0 and set teh solutions to a and b.
Use the solution to Integrate f(theta)**2/2 from a to b.
Codex code
from sympy import *
theta = Symbol(’theta’)
f = 2*cos(theta)
a = solve(f, theta)
print(a)
b = integrate(f**2/2, (theta, a[0], a[1]))
print(b)
Program solution π
Codex code explanation 1. The first line imports the sympy library.
2. The second line defines the function f(theta) as 2*cos(theta).
3. The third line defines the symbol theta.
4. The fourth line uses the solve() function to find the two values of theta that make f(theta)=0.
5. The fifth line uses the integrate() function to find the area under the curve from a to b.
GPT-3 response The area is 8π.
Topic Parametric Curves
Original question Eliminate the parameter and then sketch the curve. x = et
, y = 4e2
t
Codex input Plot the parametric curve. x = et
, y = 4e2
t
Codex code
import numpy as np
t = np.linspace(-2, 2, 100)
x = np.exp(t)
y = 4 * np.exp(2 * t)
plt.plot(x, y)
plt.show()

Program solution
Codex code explanation 1. Importing the numpy and matplotlib.pyplot modules.
2. Defining the t array.
3. Defining the x and y arrays.
4. Plotting the curve.
GPT-3 response y = 4e2
t
Topic Integral Computations with Parametric Curves
Original question Find the arc length of the given curve. x = 2t, y = 2
3
∗ t
3
2 , 5 ≤ t ≤ 12
Codex input Use sympy to integrate from 5 to 12 to find the arc length of x = 2t, y = 2/3*t**(3/2)
Codex code
from sympy import *
x = 2*t
y = 2/3*t**(3/2)
dx = diff(x, t)
dy = diff(y, t)
ds = sqrt(dx**2 + dy**2)
print(integrate(ds, (t, 5, 12)))
Program solution 74
3
Codex code explanation 1. Define x and y as functions of t
2. Define dx and dy as the derivatives of x and y with respect to t
3. Define ds as the magnitude of dx and dy
4. Integrate ds from 5 to 12
GPT-3 response The arc length of the given curve is approximately 33.4.
Topic Parabola
Original question Sketch the parabola with the given equation. Show and label its vertex, focus, axis, and directrix. y2
=
12x
Codex input Use least squares to find a second order polynomial passing through points (2. 3), (4, 3), and (6,-5).
Use polynomial matrix.
Then use python to print(f’c[0]+c[1]x+c[2]x**2’)

Codex code
import numpy as np
x = np.array([2, 4, 6])
y = np.array([3, 3, -5])
A = np.array([[1, x[0], x[0]**2],
[1, x[1], x[1]**2],
[1, x[2], x[2]**2]])
b = np.array([y[0], y[1], y[2]])
c = np.linalg.solve(A, b)
print(f’{c[0]}+{c[1]}x+{c[2]}x**2’)
Program solution −5.0 + 6.0x + −x2
Codex code explanation 1. Create a numpy array of x values
2. Create a numpy array of y values
3. Create a numpy array of ones with the same length as x and y
4. Create a numpy array of x values squared
5. Create a numpy array of x values cubed
6. Create a numpy array of the values of the matrix A
7. Create a numpy array of the values of the vector b
8. Solve the system of equations
9. Print the values of the coefficients
GPT-3 response The parabola has a vertex at (0,0), a focus at (0,6), an axis at y=0, and a directrix at y=-6.
Topic Ellipse
Original question Find an equation of the ellipse specified. Foci (-2, 2) and (4, 2), eccentricity 1/3
Codex input Write a function using sympy to find the equation of the ellipse with foci at (-2, 2) and (4, 2). The
eccentricity of the ellipse is 1/3.
Codex code
import sympy as sp
foci_1 = sp.Point(-2, 2)
foci_2 = sp.Point(4, 2)
eccentricity = sp.Rational(1, 3)
ellipse = sp.Ellipse(sp.Point(0, 0), sp.sqrt(5), sp.sqrt(5 * (1 -
eccentricity ** 2)))
print(ellipse.equation(x, y))
Program solution
(x−1)2
81
+
(y−2)2
72
= 1
Codex code explanation 1. Find the center of the ellipse
2. Find the distance between the foci
3. Find the length of the major axis
4. Find the length of the minor axis
5. Find the equation of the ellipse
GPT-3 response (x − 2)2
/9 + (y − 2)2
/4 = 1

Topic Functions of Several Variables
Original question Describe the graph of the function f.f(x, y) = 10 − 2
p
x2 + y2
Codex input Plot the 3-D graph of the function f(x, y) = 10 − 2
p
x2 + y2
Codex code
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection=’3d’)
x = np.arange(-10, 10, 0.1)
y = np.arange(-10, 10, 0.1)
X, Y = np.meshgrid(x, y)
Z = 10 - np.sqrt(X**2 + Y**2)
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=’rainbow’)
plt.show()
Program solution
Codex code explanation 1. Create a figure object
2. Add a subplot to the figure object
3. Create a meshgrid of x and y values
4. Calculate the z values for the meshgrid
5. Plot the graph
GPT-3 response The graph of the function f is a parabola.
Topic Infinite Series and Convergence
Original question Determine whether the given infinite series converges or diverges. If it converges, find its sum. 1 + 3 + 5
+ 7 + ... + (2n-1) + ....
Codex input Using sympy, determine if the sequence sn = 2n − 1 diverges by showing the value of the limit as n
goes to infinity. If the limit is infinity, then print diverges. Otherwise print converges.

Codex code
from sympy import limit, oo
def sequence_diverges(sequence):

Determine if the sequence s_n = 2n-1 diverges by showing the value of
the limit as n goes to infinity.
If the limit is infinity, then print diverges. Otherwise print converges.

limit_value = limit(sequence, n, oo)
if limit_value == oo:
print(diverges)
else:
print(converges)
sequence_diverges(2*n-1)
Program solution Diverges
Codex code explanation 1. The limit function takes a sequence and a value of n and returns the value of the limit as n goes to
infinity.
2. The limit function returns oo if the sequence diverges.
3. If the limit is oo, then the sequence diverges.
4. If the limit is not oo, then the sequence converges.
GPT-3 response The series diverges.
Topic Taylor Series and Taylor Polynomials
Original question Find the Taylor polynomial with remainder by using the given values of a and n. f(x) = ex
; a = 1, n = 4
Codex input Using Sympy, find the taylor series of f(x) = ex
where a = 1 and n = 4
Codex code
import sympy as sp
f = sp.exp(x)
print(sp.series(f, x, x0=1, n=4))
Program solution E + E(x − 1) +
E(x−1)2
2
+
E(x−1)3
6
+ O((x − 1)4
, (x, 1))
Solution type Automatic Zero Shot
Codex code explanation 1. We define the function f(x) = ex
2. We define the function taylor(function, a, n) which takes in a function, a value a, and a value n.
3. We define a variable i and set it to 0.
4. We define a variable p and set it to 0.
5. We create a while loop that runs while i is less than or equal to n.
6. We set p equal to p plus the derivative of the function with respect to x, evaluated at x=a, divided by
the factorial of i, times (x − a)i
.
7. We increment i by 1.
8. We return p.
9. We print the value of taylor(f(x), 1, 4).
GPT-3 response T4(x) = 1 + x + x2
/2! + x3
/3! + x4
/4! + R4(x)
Topic Taylor Series and Taylor Polynomials
Original question Find the Maclaurin series of the given function f by substitution. f(x) = sin(2x)
Codex input Use Sympy to find the maclaurin series of f(x) = sin(2x)

Codex code
from sympy import *
x = symbols(’x’)
f = sin(2*x)
print(f.series(x, 0, 10))
Program solution 2x − 4x3
3
+ 4x5
15
+ 8x7
315
+ · · ·
Codex code explanation 1. Define the variable x as a symbol
2. Define the function f(x) = sin(2x)
3. Print the expression f(x)
4. Print the series expansion of f(x) about x=0 up to 10 terms
GPT-3 response sin(2x) = 2 sin(x) cos(x)
sin(2x) = 2(x − x3
/3! + x5
/5! − ...)(cos(x) + x2
/2! − x4
/4! + ...)
sin(2x) = 2x cos(x) − 2x3
/3! cos(x) + 2x5
/5! cos(x) − ... + 2x3
/3! − 2x5
/5! + ...
sin(2x=2x cos(x) − 2x3
/3!(cos(x) + x2
/2!) + 2x5
/5!(cos(x) + x2
/2! − x4
/4!) − ...
Topic Integral Test
Original question Use the integral test to test the given series for convergence.
P∞
1
1/(2n
).
Codex input Using Sympy, integrate
P∞
1
1/(2n
) from 0 to infinity. if it is finite print(’converges’) and if it is infinite
print(’diverges’)
Codex code
from sympy import *
x = Symbol(’x’)
def f(x):
return 1/(2**x)
def integrate_f(x):
return integrate(f(x), (x, 0, oo))
print(integrate_f(x))
if integrate_f(x).is_infinite:
print(’diverges’)
else:
print(’converges’)
Program solution Converges
Codex code explanation 1. We import sympy as sp.
2. We create a symbol n.
3. We integrate 1/(2**n) from n=1 to infinity.
GPT-3 response
P∞
1
1/(2n
) converges because
R ∞
1
1/(2x
)dx converges.
Topic Comparison Tests for Positive-Term Series
Original question Use comparison tests to determine whether the infinite series converge or diverge.
P∞
1
1/(n2
+n+1).
Codex input Write a function using sympy to determine if the infinite series
P∞
1
1/(n2
+ n + 1) converges.

Codex code
from sympy import Sum, Symbol, oo, limit, init_printing
init_printing()
n = Symbol(’n’)
s = Sum(1/(n**2 + n + 1), (n, 1, oo))
limit(s, n, oo)

The series converges.

Program solution Converges.
Codex code explanation 1. We create a symbol n to represent the variable in the summation.
2. We create a summation of 1/(nˆ2 + n + 1) from n = 1 to inﬁnity.
3. We use the limit function to determine if the summation converges.
GPT-3 response
P∞
1
1/(n2
+ n + 1) diverges.
Topic Alternating Series and Absolute Convergence
Original question Determine whether the alternating series converge or diverge.
P∞
1
((−1)(
n + 1))/n2
Codex input Write a function using sympy to determine if the alternating series
P∞
1
((−1)(
n + 1))/n2
converges or
diverges.
Codex code
from sympy import Sum, Symbol, oo, sympify, simplify, limit, init_printing
def alternating_series(n):

This function takes a number n and determines if the alternating series
converges or diverges.

# Initialize sympy printing
init_printing()
# Create a symbol for the summation
n = Symbol(’n’)
# Create the summation
s = Sum((-1)**(n+1)/n**2, (n, 1, oo))
# Simplify the summation
s = simplify(s)
# Determine if the series converges or diverges
if s.is_infinite:
print(The series diverges.)
else:
print(The series converges.)
# Determine the limit of the series
lim = limit(s, n, oo)
# Print the limit
print(The series converges to: + str(lim))
if __name__ == __main__:
alternating_series(10)
Program solution Converges

Codex code explanation 1. The first line imports the Sum function from the sympy module.
2. The second line imports the Symbol function from the sympy module.
3. The third line imports the oo function from the sympy module.
4. The fourth line imports the limit function from the sympy module.
5. The fifth line imports the init_printing function from the sympy module.
6. The sixth line creates a variable called n and assigns it the value of the Symbol function.
7. The seventh line creates a variable called s and assigns it the value of the Sum function.
8. The eighth line prints the limit of the variable s as n approaches infinity.
GPT-3 response The series converges.
Topic Power Series Computations
Original question Use power series to approximate the values of the given integrals accurate to four decimal places.
R 1
0
sin(x)/xdx.
Codex input Write a function using scipy to compute the values of the given integral
R 1
0
sin(x)/xdx.
Codex code
from scipy.integrate import quad
def f(x):
return np.sin(x)/x
I = quad(f, 0, 1)
print(I)
Codex code explanation 1. Importing the integrate module from scipy
2. Defining a function f(x)
3. Calling the integrate.quad function, which performs the integral
4. Printing the result
GPT-3 response
R 1
0
sin(x)/xdx ≈
P∞
n=0
(−1)n x2n+1
(2n+1)!
R 1
0
sin(x)/xdx ≈
P3
n=0
(−1)n x2n+1
(2n+1)!
R 1
0
sin(x)/xdx ≈ x
1!
− x3
3!
+ x5
5!
− x7
7!
R 1
0
sin(x)/xdx ≈ 1
1
− 1
6
+ 1
120
− 1
5040
R 1
0
sin(x)/xdx ≈ 0.8415
Topic Cross Product of Two Vectors
Original question Find a x b. a = 5,-1,-2, b = -3,2,4
Codex input Find the vector product of a x b. a = 5,-1,-2, b = -3,2,4
Codex code
a = [5,-1,-2]
b = [-3,2,4]
def vector_product(a,b):
return [a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0]]
print(vector_product(a,b))
Program solution h0, −14, 7i
Codex code explanation 1. The first line creates a list called a.
2. The second line creates a list called b.
3. The third line creates a list called c and stores the result of the vector product of a and b in c.
4. The fourth line prints the result of the vector product of a and b.
GPT-3 response 5,-1,-2 * -3,2,4 = -14

Topic Curves and Motion in Space
Original question Show that the graph of the curve with parametric equations x = t ∗ (sin(6t)), y = t ∗ (cos(6t)), z = t
lies on the cone z = 2
p
x2 + y2 with its vertex at the origin and opening upward.
Codex input Plot the graph of the curve with parametric equations: x = t*(sin6t). y = t*(cos6t), z = t.
Then plot the cone z =
p
Codex code
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
t = np.linspace(-10, 10, 100)
x = t*(np.sin(6*t))
y = t*(np.cos(6*t))
z = t
fig = plt.figure()
ax = fig.add_subplot(111, projection=’3d’)
ax.plot(x, y, z)
r = np.sqrt(x**2 + y**2)
ax.plot(x, y, r, ’r’)
plt.show()
Program solution
Codex code explanation 1. Generate 100 points between -10 and 10 using the linspace function from numpy.
2. Calculate x, y, and z using the parametric equations.
3. Plot the graph of the curve with parametric equations.
4. Plot the cone z = 2
p
GPT-3 response The graph of the curve with parametric equations x = t*(sin6t). y = t*(cos6t), z = t lies on the cone z =
2
p
Topic Double Integrals
Original question Evaluate the iterated integral.
R 2
0
R 4
0
(3x + 4y)dxdy
Codex input Write a function using scipy.integrate.dblquad to compute the value of the double integral:
R 2
0
R 4
0
(3x +
4y)dxdy.

A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level.pdf

A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level.pdf

Recommended

Recommended

More Related Content

Similar to A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level.pdf

Similar to A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level.pdf (20)

More from Sarah Pollard

More from Sarah Pollard (20)

Recently uploaded

Recently uploaded (20)

A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level.pdf