Matt Welsh
matt@ziggylabs.ai
Large language
models and the
end of
programming
Who is this guy?!?
Who is this guy?!?
Been writing code since 1984 (on a VIC-20)
PhD in CS from UC Berkeley
Prof of CS at Harvard
Wrote one of the first books about Linux
Engineering leadership at Google, Apple, OctoAI
Founded a couple of AI startups
Who is this guy?!?
Been writing code since 1984 (on a VIC-20)
PhD in CS from UC Berkeley
Prof of CS at Harvard
Wrote one of the first books about Linux
Engineering leadership at Google, Apple, OctoAI
Founded a couple of AI startups
Speaker at Craft Conference 2024
This is everyone here
Large
Language
Models
This is everyone here
*** COMPUTER SCIENCE IS DOOMED ***
Computer Science has always been about one thing:
Translating ideas into programs.
CS is the study of how to take a problem and map it onto
instructions that can be executed by a Von Neumann machine.
7
*** COMPUTER SCIENCE IS DOOMED ***
Critically, the goal of CS has always been that programs
are implemented, maintained, and understood by humans.
But -- spoiler alert! -- humans suck at all of these things.
8
Let’s just make programming easier!
Fifty years of programming language research has done
nothing to improve the state of affairs.
No amount of improvement to type systems, debugging, static
analysis, linters, or documentation is going to magically
solve this problem.
9
Let’s just make programming easier!
FORTRAN (1957)
10
Let’s just make programming easier!
BASIC (1964)
11
Let’s just make programming easier!
APL (1966)
12
Let’s just make programming easier!
Rust (2010)
13
This is how I program now...
14
This is how I program now...
15
This is how I program now...
16
What is the algorithm being implemented here?
How would you write it down as code?
What, if anything, could you prove about this program?
17
18
(Not Donald Knuth)
How much does it cost to replace one human with AI?
Let’s assume (for now) that LLMs will be able to do most,
or all, of the programming work done by a human software
developer.
What would it cost to replace a human SWE with LLM calls?
19
How much does it cost to replace one human with AI?
Typical SWE salary: $220,000
20
How much does it cost to replace one human with AI?
Typical SWE salary: $220,000
Benefits, taxes, free breakfast, lunch, dinner, snacks,
masseuse, shuttle bus, on-site doctor, bowling alley, ...
$92,000
Total: $312,000
21
How much does it cost to replace one human with AI?
Typical SWE salary: $220,000
Benefits, taxes, free breakfast, lunch, dinner, snacks,
masseuse, shuttle bus, on-site doctor, bowling alley, ...
$92,000
Total: $312,000
Number of working days per year: 260
Total cost for one-human-SWE-day: $1200
22
How much does it cost to replace one human with AI?
Average lines of finalized code checked in per day ~= 100
23
How much does it cost to replace one human with AI?
Average lines of finalized code checked in per day ~= 100
Average number of GPT-4o tokens per line ~= 10
Price for GPT-4o = $0.005 / 1K tokens (input)
$0.015 / 1K tokens (output)
24
How much does it cost to replace one human with AI?
Average lines of finalized code checked in per day ~= 100
Average number of GPT-4o tokens per line ~= 10
Price for GPT-4o = $0.005 / 1K tokens (input)
$0.015 / 1K tokens (output)
Assume 5:1 input-to-output token ratio
Total cost for one-human-SWE-day equivalent work: $0.04
25
How much does it cost to replace one human with AI?
$0.04 / day
(30,000x cheaper) 26
$1200 / day
How much does it cost to replace one human with AI?
The robot does not take breaks.
The robot does not require catered
lunches or on-site massage.
The robot takes the same length of
time whether it’s a prototype or
final production code.
The robot makes plenty of
mistakes, but makes them
incredibly quickly.
27
IBM 607 ad - 1953
28
“150 Extra Engineers
An IBM Electronic Calculator
speeds through thousands of
intricate computations so quickly
that on many complex problems
it’s just like having 150 EXTRA
Engineers.”
Beyond Programming
Conventional programs are about
giving direct instructions to
simple symbolic machines.
But what if the machine can
understand natural language
and perform complex reasoning
directly?
29
Teaching, not programming
Gradually, programming will be replaced by models that can:
- Understand problem definitions in natural language
- Perform sophisticated reasoning and cognition
- Find novel solutions to problems
- Learn how to do new things on their own
30
>> This radically changes how we think about computation. <<
The Natural Language Computer
31
Natural
language
"program" Large Language
Model
ChatGPT
The Natural Language Computer
32
Natural
language
"program"
External tools
("peripherals")
{ API }
Large Language
Model
ChatGPT
The Natural Language Computer
33
Natural
language
"program"
External tools
("peripherals")
{ API }
Large Language
Model
Task
Task
Task
ChatGPT
The Natural Language Computer
34
Natural
language
"program"
External tools
("peripherals")
{ API }
Large Language
Model
Short-term
memory
Vector DB
Long-term
memory
Task
Task
Task
ChatGPT
The Natural Language Computer
35
Natural
language
"program"
External tools
("peripherals")
{ API }
Large Language
Model
Short-term
memory
Vector DB
Long-term
memory
Task
Task
Task
ChatGPT
Two key questions...
1. What is the extent of LLMs’ reasoning abilities?
2. What is the right way to “program” in natural language?
36
How well can LLMs reason?
38
https://arxiv.org/abs/2206.04615
214
benchmarks
measuring a
wide range of
reasoning
tasks
Measuring LLM reasoning - Google’s BIG-Bench
Measuring LLM reasoning - Google’s BIG-Bench
39
As models get
larger, they
get better
(but still not
great overall)
40
Example: BIG-Bench “checkmate_in_one” task
In the following chess position, find a
checkmate-in-one move.
1. e4 e5 2. Nf3 d6 3. d4 exd4 4. Nxd4 Nf6 5. Nc3
Qe7 6. Bd3 d5 7. O-O dxe4 8. Re1 Be6 9. Nxe6 fxe6
10. Bxe4 Nxe4 11. Nxe4 Nd7 12. Bg5 Qb4 13. Qg4
Qd4 14. Qxe6+ Be7
41
In the given chess position, the move for White to
checkmate in one is:
15. Qxe7#
This move places the Black king in check and there
is no legal move for Black to escape the check,
resulting in checkmate.
In the following chess position, find a
checkmate-in-one move.
1. e4 e5 2. Nf3 d6 3. d4 exd4 4. Nxd4 Nf6 5. Nc3
Qe7 6. Bd3 d5 7. O-O dxe4 8. Re1 Be6 9. Nxe6 fxe6
10. Bxe4 Nxe4 11. Nxe4 Nd7 12. Bg5 Qb4 13. Qg4
Qd4 14. Qxe6+ Be7
Example: BIG-Bench “checkmate_in_one” task
42
In the following chess position, find a
checkmate-in-one move.
1. e4 e5 2. Nf3 d6 3. d4 exd4 4. Nxd4 Nf6 5. Nc3
Qe7 6. Bd3 d5 7. O-O dxe4 8. Re1 Be6 9. Nxe6 fxe6
10. Bxe4 Nxe4 11. Nxe4 Nd7 12. Bg5 Qb4 13. Qg4
Qd4 14. Qxe6+ Be7 15.
What if we add an extra “step”
to the prompt?
Example: BIG-Bench “checkmate_in_one” task
43
In the given chess position, the checkmate-in-one
move is:
15. Nd6#
This move delivers checkmate because the knight
on d6 controls key squares around the Black king,
and there is no way for Black to capture or block
the check.
In the following chess position, find a
checkmate-in-one move.
1. e4 e5 2. Nf3 d6 3. d4 exd4 4. Nxd4 Nf6 5. Nc3
Qe7 6. Bd3 d5 7. O-O dxe4 8. Re1 Be6 9. Nxe6 fxe6
10. Bxe4 Nxe4 11. Nxe4 Nd7 12. Bg5 Qb4 13. Qg4
Qd4 14. Qxe6+ Be7 15.
Whoops!
Example: BIG-Bench “checkmate_in_one” task
What’s the right way to program
in natural language?
45
How things are done today...
Natural language != plain text
Existing frameworks treat LLM prompts and responses as
plain strings.
46
Natural language != plain text
Existing frameworks treat LLM prompts and responses as
plain strings.
But natural language...
- Has a great deal of ambiguity
- Can encode complex algorithmic concepts
- Carries detailed semantic information
- Requires different syntactic structures depending on the
language used (Chinese, English, and Yoruba are not the
same!)
47
Probably the wrong way to do it...
48
49
Flowplay.ai
50
Wordware.ai
51
http://www.literateprogramming.com/knuthweb.pdf
Knuth, 1984
Evolving Computer Science
52
Slide rule
1859-1975
Evolving Computer Science
53
Slide rule
1859-1975
Computer science
1959-2030
Evolving Computer Science
54
Over time, CS looks more like EE: A more technical skill set
necessary in some very specialized occupations.
Evolving Computer Science
55
Over time, CS looks more like EE: A more technical skill set
necessary in some very specialized occupations.
The vast majority of people building “software” will not be
programming: they will be interacting with an AI.
Evolving Computer Science
56
Over time, CS looks more like EE: A more technical skill set
necessary in some very specialized occupations.
The vast majority of people building “software” will not be
programming: they will be interacting with an AI.
AI greatly expands access to computing to anyone who can
express themselves in natural language.
57
8.1 Billion
non-programmers
World population
2024
58
30 Million
programmers
8.1 Billion
non-programmers
World population
2024
59
30 Million
programmers
8.1 Billion
non-programmers
World population
2024
Natural language opens up
computing to everyone
Evolving Computer Science
60
The network is the computer.
-- John Gage, 1984
Evolving Computer Science
61
The network is the computer.
-- John Gage, 1984
The model is the computer.
-- Matt Welsh, 2024
matt@ziggylabs.ai
@mdwelsh

Large Language Models and the End of Programming

  • 1.
  • 2.
  • 3.
    Who is thisguy?!? Been writing code since 1984 (on a VIC-20) PhD in CS from UC Berkeley Prof of CS at Harvard Wrote one of the first books about Linux Engineering leadership at Google, Apple, OctoAI Founded a couple of AI startups
  • 4.
    Who is thisguy?!? Been writing code since 1984 (on a VIC-20) PhD in CS from UC Berkeley Prof of CS at Harvard Wrote one of the first books about Linux Engineering leadership at Google, Apple, OctoAI Founded a couple of AI startups Speaker at Craft Conference 2024
  • 5.
  • 6.
  • 7.
    *** COMPUTER SCIENCEIS DOOMED *** Computer Science has always been about one thing: Translating ideas into programs. CS is the study of how to take a problem and map it onto instructions that can be executed by a Von Neumann machine. 7
  • 8.
    *** COMPUTER SCIENCEIS DOOMED *** Critically, the goal of CS has always been that programs are implemented, maintained, and understood by humans. But -- spoiler alert! -- humans suck at all of these things. 8
  • 9.
    Let’s just makeprogramming easier! Fifty years of programming language research has done nothing to improve the state of affairs. No amount of improvement to type systems, debugging, static analysis, linters, or documentation is going to magically solve this problem. 9
  • 10.
    Let’s just makeprogramming easier! FORTRAN (1957) 10
  • 11.
    Let’s just makeprogramming easier! BASIC (1964) 11
  • 12.
    Let’s just makeprogramming easier! APL (1966) 12
  • 13.
    Let’s just makeprogramming easier! Rust (2010) 13
  • 14.
    This is howI program now... 14
  • 15.
    This is howI program now... 15
  • 16.
    This is howI program now... 16 What is the algorithm being implemented here? How would you write it down as code? What, if anything, could you prove about this program?
  • 17.
  • 18.
  • 19.
    How much doesit cost to replace one human with AI? Let’s assume (for now) that LLMs will be able to do most, or all, of the programming work done by a human software developer. What would it cost to replace a human SWE with LLM calls? 19
  • 20.
    How much doesit cost to replace one human with AI? Typical SWE salary: $220,000 20
  • 21.
    How much doesit cost to replace one human with AI? Typical SWE salary: $220,000 Benefits, taxes, free breakfast, lunch, dinner, snacks, masseuse, shuttle bus, on-site doctor, bowling alley, ... $92,000 Total: $312,000 21
  • 22.
    How much doesit cost to replace one human with AI? Typical SWE salary: $220,000 Benefits, taxes, free breakfast, lunch, dinner, snacks, masseuse, shuttle bus, on-site doctor, bowling alley, ... $92,000 Total: $312,000 Number of working days per year: 260 Total cost for one-human-SWE-day: $1200 22
  • 23.
    How much doesit cost to replace one human with AI? Average lines of finalized code checked in per day ~= 100 23
  • 24.
    How much doesit cost to replace one human with AI? Average lines of finalized code checked in per day ~= 100 Average number of GPT-4o tokens per line ~= 10 Price for GPT-4o = $0.005 / 1K tokens (input) $0.015 / 1K tokens (output) 24
  • 25.
    How much doesit cost to replace one human with AI? Average lines of finalized code checked in per day ~= 100 Average number of GPT-4o tokens per line ~= 10 Price for GPT-4o = $0.005 / 1K tokens (input) $0.015 / 1K tokens (output) Assume 5:1 input-to-output token ratio Total cost for one-human-SWE-day equivalent work: $0.04 25
  • 26.
    How much doesit cost to replace one human with AI? $0.04 / day (30,000x cheaper) 26 $1200 / day
  • 27.
    How much doesit cost to replace one human with AI? The robot does not take breaks. The robot does not require catered lunches or on-site massage. The robot takes the same length of time whether it’s a prototype or final production code. The robot makes plenty of mistakes, but makes them incredibly quickly. 27
  • 28.
    IBM 607 ad- 1953 28 “150 Extra Engineers An IBM Electronic Calculator speeds through thousands of intricate computations so quickly that on many complex problems it’s just like having 150 EXTRA Engineers.”
  • 29.
    Beyond Programming Conventional programsare about giving direct instructions to simple symbolic machines. But what if the machine can understand natural language and perform complex reasoning directly? 29
  • 30.
    Teaching, not programming Gradually,programming will be replaced by models that can: - Understand problem definitions in natural language - Perform sophisticated reasoning and cognition - Find novel solutions to problems - Learn how to do new things on their own 30 >> This radically changes how we think about computation. <<
  • 31.
    The Natural LanguageComputer 31 Natural language "program" Large Language Model ChatGPT
  • 32.
    The Natural LanguageComputer 32 Natural language "program" External tools ("peripherals") { API } Large Language Model ChatGPT
  • 33.
    The Natural LanguageComputer 33 Natural language "program" External tools ("peripherals") { API } Large Language Model Task Task Task ChatGPT
  • 34.
    The Natural LanguageComputer 34 Natural language "program" External tools ("peripherals") { API } Large Language Model Short-term memory Vector DB Long-term memory Task Task Task ChatGPT
  • 35.
    The Natural LanguageComputer 35 Natural language "program" External tools ("peripherals") { API } Large Language Model Short-term memory Vector DB Long-term memory Task Task Task ChatGPT
  • 36.
    Two key questions... 1.What is the extent of LLMs’ reasoning abilities? 2. What is the right way to “program” in natural language? 36
  • 37.
    How well canLLMs reason?
  • 38.
    38 https://arxiv.org/abs/2206.04615 214 benchmarks measuring a wide rangeof reasoning tasks Measuring LLM reasoning - Google’s BIG-Bench
  • 39.
    Measuring LLM reasoning- Google’s BIG-Bench 39 As models get larger, they get better (but still not great overall)
  • 40.
    40 Example: BIG-Bench “checkmate_in_one”task In the following chess position, find a checkmate-in-one move. 1. e4 e5 2. Nf3 d6 3. d4 exd4 4. Nxd4 Nf6 5. Nc3 Qe7 6. Bd3 d5 7. O-O dxe4 8. Re1 Be6 9. Nxe6 fxe6 10. Bxe4 Nxe4 11. Nxe4 Nd7 12. Bg5 Qb4 13. Qg4 Qd4 14. Qxe6+ Be7
  • 41.
    41 In the givenchess position, the move for White to checkmate in one is: 15. Qxe7# This move places the Black king in check and there is no legal move for Black to escape the check, resulting in checkmate. In the following chess position, find a checkmate-in-one move. 1. e4 e5 2. Nf3 d6 3. d4 exd4 4. Nxd4 Nf6 5. Nc3 Qe7 6. Bd3 d5 7. O-O dxe4 8. Re1 Be6 9. Nxe6 fxe6 10. Bxe4 Nxe4 11. Nxe4 Nd7 12. Bg5 Qb4 13. Qg4 Qd4 14. Qxe6+ Be7 Example: BIG-Bench “checkmate_in_one” task
  • 42.
    42 In the followingchess position, find a checkmate-in-one move. 1. e4 e5 2. Nf3 d6 3. d4 exd4 4. Nxd4 Nf6 5. Nc3 Qe7 6. Bd3 d5 7. O-O dxe4 8. Re1 Be6 9. Nxe6 fxe6 10. Bxe4 Nxe4 11. Nxe4 Nd7 12. Bg5 Qb4 13. Qg4 Qd4 14. Qxe6+ Be7 15. What if we add an extra “step” to the prompt? Example: BIG-Bench “checkmate_in_one” task
  • 43.
    43 In the givenchess position, the checkmate-in-one move is: 15. Nd6# This move delivers checkmate because the knight on d6 controls key squares around the Black king, and there is no way for Black to capture or block the check. In the following chess position, find a checkmate-in-one move. 1. e4 e5 2. Nf3 d6 3. d4 exd4 4. Nxd4 Nf6 5. Nc3 Qe7 6. Bd3 d5 7. O-O dxe4 8. Re1 Be6 9. Nxe6 fxe6 10. Bxe4 Nxe4 11. Nxe4 Nd7 12. Bg5 Qb4 13. Qg4 Qd4 14. Qxe6+ Be7 15. Whoops! Example: BIG-Bench “checkmate_in_one” task
  • 44.
    What’s the rightway to program in natural language?
  • 45.
    45 How things aredone today...
  • 46.
    Natural language !=plain text Existing frameworks treat LLM prompts and responses as plain strings. 46
  • 47.
    Natural language !=plain text Existing frameworks treat LLM prompts and responses as plain strings. But natural language... - Has a great deal of ambiguity - Can encode complex algorithmic concepts - Carries detailed semantic information - Requires different syntactic structures depending on the language used (Chinese, English, and Yoruba are not the same!) 47
  • 48.
    Probably the wrongway to do it... 48
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
    Evolving Computer Science 53 Sliderule 1859-1975 Computer science 1959-2030
  • 54.
    Evolving Computer Science 54 Overtime, CS looks more like EE: A more technical skill set necessary in some very specialized occupations.
  • 55.
    Evolving Computer Science 55 Overtime, CS looks more like EE: A more technical skill set necessary in some very specialized occupations. The vast majority of people building “software” will not be programming: they will be interacting with an AI.
  • 56.
    Evolving Computer Science 56 Overtime, CS looks more like EE: A more technical skill set necessary in some very specialized occupations. The vast majority of people building “software” will not be programming: they will be interacting with an AI. AI greatly expands access to computing to anyone who can express themselves in natural language.
  • 57.
  • 58.
  • 59.
    59 30 Million programmers 8.1 Billion non-programmers Worldpopulation 2024 Natural language opens up computing to everyone
  • 60.
    Evolving Computer Science 60 Thenetwork is the computer. -- John Gage, 1984
  • 61.
    Evolving Computer Science 61 Thenetwork is the computer. -- John Gage, 1984 The model is the computer. -- Matt Welsh, 2024
  • 62.