6/5/2019 Japanese Society for Artificial Intelligence (JSAI)
PFN Fellow Hiroshi Maruyama
Twitter: @maruyama
How to Understand “Artificial Intelligence”?
1
Agenda
1. What is “Artificial Intelligence”?
2. A new computation model
3. Implications for science and engineering
2
In this talk, I first point out that the word “AI” is used in many different
ways, which causes confusion.
Then I focus on “AI as the frontier of computer science” and argue that a
new computation model is emerging.
Finally, I discuss how this new computation model affects science and
engineering.
3
(1) Artificial intelligence as the pursuit of intelligence

Approaches to intelligence (diagram):
- Neuroscience / brain science: observes the activity of neurons and the brain
- Psychology: observes human behavior
- Economics: models humans as rational agents (game theory, optimization)
- Robotics / mechanical engineering: models humans as control systems (differential equations, control theory)
- Computer science (artificial intelligence): models intelligence as software
The term “AI,” or Artificial Intelligence, is used with different meanings in
different contexts.
There are at least three different interpretations.
First, AI is a research field that studies “intelligence.”
There are several different approaches to intelligence: neuroscience tries
to understand intelligence by observing neuron activity in the brain,
psychology by observing human behavior, economics models human
intelligence as an independent and rational agent, robotics researchers use
differential equations and control theory to understand human motion,
and we computer scientists try to understand intelligence by building
software that mimics human intelligence.
“Artificial intelligence (AI)” is a research field, but...

Research field | Derived technologies | Application products
Physics | Internal combustion engines, semiconductors | Automobiles, computers
Artificial intelligence (AI) | Search algorithms, speech recognition, image recognition | Transit routing, AI speakers, autonomous driving

The derived technologies and products of physics are not called “physics,”
but those of AI are sometimes called “AI”...
AI is a research field just like physics, but there is one important difference.
In physics, there are technologies derived from physical theories, and there
are products based on these derived technologies, but they are not called
“physics.”
On the other hand, in AI, the derived technologies as well as products are
sometimes called “AI,” and THAT is the source of confusion.
4
(2) ăƒă‚€ăƒ—ăšă—ăŠăźă€ŒAI」
‱ AIă«èžă„ăŠăżăŸ- 珏4ć›žè¶…æœȘć©šç€ŸäŒš 4/13
– ç€ŸäŒšć•éĄŒè§Łæ±ș枋AI 「AIăČろし」
‱ 保è‚Čæ‰€ăźć…„æ‰€ăŻAIがæ±șめる æ•°ç§’ă§éžè€ƒćŻèƒœă« | NHKăƒ‹ăƒ„ăƒŒă‚č 2/12
– ćŻŒćŁ«é€šăźăƒžăƒƒăƒăƒłă‚°æŠ€èĄ“
‱ “AIにèȠけăȘい”äșșæă‚’è‚Čæˆă›ă‚ˆ ïœžäŒæ„­ăƒ»æ•™è‚Č æœ€ć‰ç·šïœž 4/25
– 「è‡Șćˆ†ăŸăĄăźä»•äș‹ăŒAIにć„Șわれる」
‱ è»Šă„ă™ăźć€©æ‰ăƒ›ăƒŒă‚­ăƒłă‚°ćšćŁ«ăźéș蚀3/16
– ă€ŒïŒĄïŒ©ăŻè‡Șă‚‰ăźæ„ćż—ă‚’æŒă€ă‚ˆă†ă«ăȘり、äșșé–“ăšćŻŸç«‹ă™ă‚‹ă ă‚ă†ă€‚ïŒĄïŒ©ăźćˆ°æ„ăŻă€
äșș類ćČäžŠă€æœ€ć–„ăźć‡ș杄äș‹ă«ăȘă‚‹ă‹ă€ăŸăŸăŻæœ€æ‚Șたć‡ș杄äș‹ă«ăȘるだろう。」
‱ マăƒȘă‚ȘAIăźă‚†ăăˆïœž 4/5
– ă€Œæ­»ă€ă‚’ç†è§Łă§ăăȘいAIäșșé–“ăšă€ă€Œç”Ÿă€ă«ćžŒæœ›ăŒæŒăŠăȘい民ćčŽăźäș€æ”ăźæžœăŠă«ăŻă€
どんăȘç”æœ«ăŒćŸ…ăŁăŠă„ă‚‹ăźă‹ă€‚
5
ăƒ‡ăƒŒă‚żă‚ąăƒŠăƒȘティ
クă‚čăƒ»æœ€é©ćŒ–
æ©Ÿæą°ă«ă‚ˆă‚‹è‡Ș拕挖
汎甚AIïŒˆăŸă èŠ‹ăŹæŠ€
èĄ“ïŒ‰
That leads to the second interpretation – “AI as something sophisticated.”
Here are some examples of recent coverage by NHK.
In the program “Ask AI,” statistical analysis of social data is used for
drawing some recommendations for social issues.
The news on “Nursery Matching AI” refers to a combinatorial optimization
problem.
Discussion on how job security is threatened by AI is actually talking about
automation.
The program on Dr. Hawking’s Last Will and the drama “Mario the AI” are
dealing with hypothetical technologies that have never been realized.
The reason I call this interpretation “AI as hype” is that, in this usage,
people are not very concerned with what “AI” actually is. The speaker means
one thing, the listener may understand something completely different, and
neither of them minds.
If the speaker uses “AI” in order to intentionally confuse the listener,
tragedy happens.
This page is intentionally removed
6
This page is intentionally removed
7
(3) ă€Œæƒ…ć ±æŠ€èĄ“ăźăƒ•ăƒ­ăƒłăƒ†ă‚Łă‚ąă€ăšă—ăŠăźäșșć·„çŸ„èƒœ
ć†ćž°ć‘Œăłć‡șしはAIăźæŠ€èĄ“!?ć†ćž°ć‘Œăłć‡șしはAIăźæŠ€èĄ“!?
P. H. Winston, Artificial Intelligence
ISBN-13: 978-0201084542, 1977
The “state of the art” of the 1970s
Source: Wikipedia
The third interpretation of “AI” is “AI as the frontier of computer science (CS).”
Many CS ideas came from AI research.
This is one of the early textbooks on AI.
It covers things like
- Symbolic programming (LISP programming language and dynamic
memory management i.e., garbage collection)
- Recursive call (e.g., to solve the puzzle “Tower of Hanoi”)
- Search algorithm (e.g., depth-first search, A* algorithm, alpha-beta
pruning, means-end analysis)
- Syntax analysis
All of these technologies are now part of core computer science, and people
no longer call them “AI.”
8
“AI technologies” absorbed into computer science

1956–1974: First AI boom
‱ Symbolic processing (LISP)
‱ Means-End Analysis
‱ Natural language processing
→ absorbed as: recursive calls, garbage collection, dynamic programming, compiler-compilers, ...

1980–1987: Second AI boom
‱ Knowledge representation (e.g., frames)
‱ Expert systems
‱ Ontology
→ absorbed as: object orientation, modeling languages, the Semantic Web, constraint solvers, ...

2008–: Third AI boom
‱ Statistical machine learning and deep learning
‱ Blackbox optimization
→ a new computation model
There have been three waves of AI research.
In the first wave of AI, research topics include
- Symbolic programming (lisp)
- Means-end analysis
- Natural language processing
and they are incorporated into the core CS as
- Dynamic memory management (garbage collection)
- Search algorithms
- Formal language theory
In the second wave of AI, research topics include
- Knowledge representation (e.g., Frame theory)
- Expert system
- Ontology
And they are incorporated into the core CS as
- Object-oriented programming
- Modeling language
- Semantic Web
9
So, if you look at what is happening in AI research, you can predict where
CS is headed.
In the current wave of AI research, focus areas include
- Statistical machine learning, especially deep learning
- Blackbox optimization
And I believe that these technologies will lead us to a new programming
paradigm, or new computation model.
Agenda
1. What is “Artificial Intelligence”?
2. A new computation model
   – Deep learning
   – Blackbox optimization
3. Implications for science and engineering
10
I said that AI research advances computer science.
Let me discuss the two technologies that are currently driving CS:
- Deep learning
- Blackbox optimization
and how they form a new computation model.
æ·±ć±€ć­Šçż’ăšăŻäœ•ă‹ – (çŠ¶æ…‹ă‚’æŒăŸăȘい*)é–ąæ•°
Y = f(X)X Y
è¶…ć€šæŹĄć…ƒă€ïŒˆé€Ł
ç¶šć€‰æ•°ăƒ»ă‚«ăƒ†ă‚Ž
ăƒȘć€‰æ•°ăźä»»æ„ăź
ç”„ăżćˆă‚ă›ïŒ‰
ćˆ€ćˆ„ăƒ»ćˆ¶ćŸĄăȘら
ă°æŻ”èŒƒçš„äœŽæŹĄć…ƒă€
生成ăȘă‚‰ă°è¶…ć€š
æŹĄć…ƒ
*çŠ¶æ…‹ă‚’ć…„ćŠ›Xăźäž€éƒšăšă™ă‚Œă°ă€é–ąæ•°ăăźă‚‚ăźăŻçŠ¶æ…‹ă‚’ă‚‚ăŸăȘいもぼべみăȘせる
First, deep learning.
Very simply put, deep learning is a function.
A function takes X as input and produces Y as output.
Usually we are dealing with a very high-dimensional vector as input. For
example, if the input is an image of 100×100 pixels, the input is a vector with
10,000 variables.
Output Y could be simple – if the function is to classify the input image into
either a cat or an airplane, the dimension of Y is 2. If the output is also an
image, the dimension is much higher.
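A minimal sketch of this view of a DNN as a plain function, in Python with NumPy (the shapes and the toy two-layer architecture are my own illustration, not the actual networks discussed):

import numpy as np

def f(x, params):
    # A deep neural net is just a parameterized function Y = f(X; theta).
    # Toy example: flattened 100x100 image (10,000 dims) -> 2 class probabilities.
    W1, b1, W2, b2 = params
    h = np.maximum(0.0, x @ W1 + b1)      # hidden layer with ReLU
    logits = h @ W2 + b2                  # 2-dimensional output
    e = np.exp(logits - logits.max())     # softmax turns logits into probabilities
    return e / e.sum()

rng = np.random.default_rng(0)
params = (rng.normal(0, 0.01, (10_000, 128)), np.zeros(128),
          rng.normal(0, 0.01, (128, 2)), np.zeros(2))
x = rng.random(10_000)                    # stand-in for a flattened input image
print(f(x, params))                       # two probabilities, e.g. cat vs. airplane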
11
æ™źé€šăź(挔çčč的ăȘ)é–ąæ•°ăźäœœă‚Šæ–č: 䟋 æ‘‚æ°ă‹ă‚‰èŻæ°ăžăźć€‰æ›
double c2f(double c) {
return 1.8*c + 32.0;
}
double c2f(double c) {
return 1.8*c + 32.0;
}
慄抛: C
ć‡ș抛: F
ただし、FはCă‚’èŻæ°ă§èĄšă—ăŸă‚‚ăź
慄抛: C
ć‡ș抛: F
ただし、FはCă‚’èŻæ°ă§èĄšă—ăŸă‚‚ăź
芁求
ă‚ąăƒ«ă‚ŽăƒȘă‚șム
F = 1.8 * C + 32F = 1.8 * C + 32ăƒąăƒ‡ăƒ«
äșșăŒæŒă€äșșăŒæŒă€
ć…ˆéš“çš„çŸ„
識
ăƒąăƒ‡ăƒ«ăŒæ—ąçŸ„ïŒèšˆçź—æ©Ÿæ§‹ïŒˆă‚ąăƒ«ă‚ŽăƒȘă‚șăƒ ïŒ‰ăŒæ§‹æˆćŻèƒœă§ă‚ă‚‹ćż…èŠ
Let’s consider programming a very simple function – computing Fahrenheit
from Celsius.
In conventional programming, which I call “deductive programming” here,
we start from the requirement – what the input is and what the output is.
Then we convert it into a “model” – by which I mean a precise mathematical
relationship between the input and the output.
In this case, we use our knowledge on the relationship between Celsius
and Fahrenheit and represent it as a mathematical formula.
Once you have the model, you can build an algorithm to actually compute it.
Note that in deductive programming, we assume that the model is known
and the algorithm is constructible.
12
æ·±ć±€ć­Šçż’ăź(澰箍的ăȘ)やりæ–č – èš“ç·Žăƒ‡ăƒŒă‚żă§ć…„ć‡șćŠ›ă‚’äŸ‹ç€ș
èš“ç·Žăƒ‡ăƒŒă‚żă‚»ăƒƒăƒˆèš“ç·Žăƒ‡ăƒŒă‚żă‚»ăƒƒăƒˆ
èŠłæžŹ
èš“ç·ŽïŒˆă»ăŒè‡Șć‹•ă§ăƒ‘ăƒ©ăƒĄă‚żÎžă‚’æ±șćźšïŒ‰
ăƒąăƒ‡ăƒ«ïŒă‚ąăƒ«ă‚ŽăƒȘă‚șムがæœȘçŸ„ă§ă‚‚ă‚ˆă„!
On the other hand, with deep learning, programming is done by giving
examples – which I call inductive programming.
In our case, first we obtain two thermometers, one in Celsius and the other
in Fahrenheit, and read their measurements from time to time.
Then you get input-output examples, which we call a training data set.
Once the training data set is ready, we train a deep neural net semi-
automatically.
Note that inductive programming is possible even when we do not know
the model or the algorithm. This opens opportunities to solve very
interesting real-world problems.
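To make the contrast concrete, here is a minimal sketch of the inductive route for the same Celsius-to-Fahrenheit problem: a plain linear model fitted to example pairs by gradient descent (my own toy illustration; real deep learning would use a neural net and a framework, but the idea of never writing the formula down is the same).

import numpy as np

rng = np.random.default_rng(0)
c = np.linspace(-20, 40, 50)                      # Celsius readings
f = 1.8 * c + 32.0 + rng.normal(0, 0.5, c.shape)  # noisy Fahrenheit readings

# Fit f_hat(c) = a*c + b by gradient descent on the squared error.
a, b, lr = 0.0, 0.0, 1e-3
for _ in range(20_000):
    err = a * c + b - f
    a -= lr * (err * c).mean()
    b -= lr * err.mean()

print(a, b)   # converges near 1.8 and 32.0 without the formula ever being coded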
13
è‡Șć‹•é‹è»ąăźăŸă‚ăźă‚»ă‚°ăƒĄăƒłăƒ†ăƒŒă‚·ăƒ§ăƒł
https://www.youtube.com/watch?v=lGOjchGdVQs
ăƒąăƒ‡ăƒ«ăŒäžæ˜Ž:äșșæ‰‹ă«ă‚ˆă‚‹æ­Łè§Łă‚ąăƒŽăƒ†ăƒŒă‚·ăƒ§ăƒł
Here are some examples of real world applications of inductive
programming.
First is semantic segmentation for autonomous driving. The top left panel
is the video input. For each pixel of the input, our function has to calculate
what class, such as road surface (purple), sidewalk (pink), vehicle (blue),
tree (green), etc., the pixel belongs to. The output is shown in the right
bottom panel.
In this task, it is very hard to define a proper model, because you cannot
give a precise mathematical condition under which a particular pixel should be
classified as “road surface.”
Instead of a model, we supply examples. The bottom left panel shows a
human-annotated image, meaning that human annotators gave the correct
answer to each pixel. We should supply thousands of these human-
annotated images to get high accuracy. Even though we do not know the
model, we can prepare a training data set and obtain reasonably accurate
results.
14
ăƒąăƒ‡ăƒ«ăŒäžæ˜Ž:äșșæ‰‹ă«ă‚ˆă‚‹æ­Łè§Łă‚ąăƒŽăƒ†ăƒŒă‚·ăƒ§ăƒł
The second example is a voice-controlled robot. In this demo, the operator
says “Can you move the brown fluffy thing to the bottom?” to move the
teddy bear. Note that the top and bottom are from the operator's
perspective, as in the top-view image of the boxes shown on the
left.
Next, she says “Can you move the tissue box to the left?” and then uses
another expression, “white and blue box,” to refer to the same object.
Again, this system is very hard to program in the conventional way,
because we cannot encode all the possible natural language expressions
in the system.
How did we train this system? We showed the top view of the boxes to human
annotators, asked “What would you say if you wanted to move this object
there?”, and collected thousands of example sentences. With these sentences
as a training data set, we trained the system.
15
PaintsChainer: an app for automatic colorization of line drawings
The algorithm is unknown: formulated as an inverse problem
The third example is where the model is known but the algorithm is
unknown.
This is an application to colorize a line drawing. The left image is a line-
drawing input and the right image is the output. You can also give hints in
terms of what color is used in each region.
How did we train this neural net? We collected a large number of anime
images from the Internet and applied a known algorithm, edge detection, to
obtain the corresponding line drawings. Using many such pairs, we trained
our neural net.
Because the edge detection algorithm is well known, the mathematical
relationship between a line drawing and the corresponding color image is
known in a sense -- that is, the model is known.
However, how to compute a colored image from a line-drawing input is not
known, meaning that the algorithm is unknown.
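A minimal sketch of this pair-generation idea in Python/NumPy (a crude gradient-magnitude edge detector standing in for whatever detector was actually used; the arrays and thresholds are illustrative only):

import numpy as np

def edge_map(img):
    # Crude edge detector: gradient magnitude of the grayscale image,
    # thresholded into a white background with dark "line drawing" strokes.
    gray = img.mean(axis=2)                  # HxWx3 -> HxW
    gy, gx = np.gradient(gray)
    mag = np.sqrt(gx**2 + gy**2)
    return (mag < 0.1).astype(np.float32)    # 1 = white, 0 = line

def make_training_pairs(color_images):
    # Each pair is (line drawing, original color image); the network then
    # learns the inverse mapping, for which no explicit algorithm is known.
    return [(edge_map(img), img) for img in color_images]

fake_images = [np.random.rand(64, 64, 3) for _ in range(4)]   # stand-ins
pairs = make_training_pairs(fake_images)
print(pairs[0][0].shape, pairs[0][1].shape)                    # (64, 64) (64, 64, 3)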
16
17
Deep learning as a universal computation mechanism
‱ Orders of magnitude more parameters
  – Can approximate arbitrary high-dimensional nonlinear functions*
  → Pseudo Turing complete!
Input (extremely high-dimensional) → Output (extremely high-dimensional)
* G. Cybenko. Approximation by superpositions of a sigmoidal function.
  Mathematics of Control, Signals, and Systems, 2(4):303–314, 1989.
I said that a deep neural net is a function. How powerful is it? Is it powerful
enough to represent any computable function? The answer to this
question is yes.
Because a deep neural net has a very large number of parameters, it is
theoretically proven that for any computable function, there exists a
complex enough deep neural network that approximates the function
within the given error margin.
In other words, a deep neural network is “pseudo Turing complete,” meaning
that we have a universal computational mechanism, which can be
programmed with examples. This is a significant breakthrough in
computation, I believe.
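For reference, the statement behind this claim, in the one-hidden-layer form proved by Cybenko (with σ a sigmoidal activation): for any continuous f on [0,1]^n and any Δ > 0 there exist N, weights w_i, and scalars α_i, b_i such that

G(x) = \sum_{i=1}^{N} \alpha_i \, \sigma(w_i^{\top} x + b_i), \qquad
\sup_{x \in [0,1]^n} \left| G(x) - f(x) \right| < \varepsilon .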
Fundamental limitations of statistical machine learning (1)
‱ The DNN is trained on data observed in the past and then used to predict new values.
‱ It cannot predict correctly unless the future is essentially the same as the past.
Even though deep learning is a very powerful computing mechanism, it is
not an almighty intelligence. It is statistical modeling, and hence there are
fundamental limitations. I will explain three such limitations.
The first is that a deep neural net is trained using data observed in the past,
and the trained deep neural net (DNN) is then used to predict future events.
Thus, we always assume that the future is essentially the same as the past.
18
Fundamental limitations of statistical machine learning (2)
‱ The training data set is finite
  – Inputs close to the training data: interpolation (accurate prediction)
  – Inputs far from the training data: extrapolation (??)
‱ Powerless against rare events that never appear in the training data
The second limitation is this.
Even if the future and the past look the same, the training data set is
always finite. If we are dealing with an infinite set as the domain of the
function, which is almost always the case, an input to the function is in
general something that is never seen in the training data set. If the input is
close to some of the data points in the training data set, the prediction is
very accurate. This situation is called “interpolation.”
On the other hand, if the input is far from any of the data points in the
training data set, the prediction is very hard. This situation is called
“extrapolation.” So deep learning is powerful for frequent events but
powerless for rare events that do not appear in the training data set.
Often customers ask us to build an “AI” system that can deal with
emergency situations. As far as deep learning is concerned, this is
impossible: we always need a lot of “emergency” samples in the training
data set.
19
Fundamental limitations of statistical machine learning (3)
‱ Essentially probabilistic
  Original distribution → (i.i.d. sampling) → training data → (training) → trained DNN
  Bias in the sampling step is unavoidable!
‱ “100% correctness” cannot be guaranteed
The third and probably the most serious limitation is that the computation
of deep learning is essentially probabilistic. Because we consider deep
learning as statistical modeling, we always assume that there is an original
probability distribution which is unknown. We further assume that the
training data set is created by random sampling, or i.i.d. (Independent and
Identically Distributed). After we obtain the training data set, the DNN is
trained with a known training algorithm.
But the problem lies in the first step, random sampling. Because it is random
sampling, there is always bias. For example, when you throw a die a hundred
times, what is the chance of getting ‘one’ on all hundred throws? It is very
unlikely, and you may say that it will never happen, but there is no guarantee
that it never happens. This means that deep learning comes with no guarantee
of correctness: there is no such thing as a 100% correct deep neural net.
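A quick back-of-the-envelope check of the dice example (my own arithmetic, not from the slides):

P(\text{all 100 throws show `one'}) = \left(\tfrac{1}{6}\right)^{100} \approx 1.5 \times 10^{-78} > 0 .

The probability is astronomically small, but it is not zero, which is exactly the point.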
20
æ·±ć±€ć­Šçż’ăšăŻäœ•ă‹ïŒˆăŸăšă‚ïŒ‰
‱ é–ąæ•°ïŒˆăƒ—ăƒ­ă‚°ăƒ©ăƒ ïŒ‰ăźäœœă‚Šæ–č
– ăƒąăƒ‡ăƒ«ă‚„ă‚ąăƒ«ă‚ŽăƒȘă‚șムがわからăȘăăŠă‚‚ă€èš“ç·Žăƒ‡ăƒŒă‚żă‚»ăƒƒăƒˆăŒă‚
ă‚Œă°äœœă‚Œă‚‹
– æ•™ćž«äżĄć·ăźäžŽăˆæ–čæŹĄçŹŹă§ă€é©šăă‚ˆă†ăȘこべが

‱ æœŹèłȘçš„ă«ç”±èšˆăƒąăƒ‡ăƒȘング
– ć…ƒćˆ†ćžƒăŒç‹Źç«‹ăƒ»ćŒćˆ†ćžƒă§ă‚ă‚‹ă“ăšăŒć‰æ
– èż‘äŒŒă—ă‹ă§ăăȘă„ïŒˆăƒă‚€ă‚ąă‚čăŒć…„ă‚‹ă“ăšă‚’éżă‘ă‚‰ă‚ŒăȘă„ïŒ‰
Here is a recap of what deep learning is.
Deep learning is a new programming model. It is inductive, not deductive,
and the programming can be done without knowing the model or the
algorithm. Innovative thinking about how to give teacher signals may solve
many interesting problems.
On the other hand, it is fundamentally statistical modeling. It assumes
that the original distribution is sampled i.i.d.
There is always bias, and what you get is always an approximation.
21
Consumer Electronics Show (CES) 2016
Autonomous driving demo at CES, trained with deep reinforcement learning
Blackbox optimization: autonomous driving via reinforcement learning
Let me turn to the second technology that current AI research focuses on:
blackbox optimization.
This is an autonomous car demo at the Consumer Electronics Show in 2016.
The silver cars were trained using deep reinforcement learning.
The red car was controlled manually by our vice president, and he
sometimes hit or blocked the other cars.
But as you can see, the silver cars avoid collisions.
22
ćŒ·ćŒ–ć­Šçż’ăŻă€ć ±é…Źé–ąæ•°ă‚’æœ€ć€§ćŒ–ă™ă‚‹æœ€é©ćŒ–ć•éĄŒ
23
X: sensor input
Y: actuator output
Y = f(X, Ξ)
u(S, Y): reward function (utility function)
S: current state
How was this system trained? We used an external simulator. The current
situation in the simulator is supplied as a simulated sensor input X to the
learning algorithm. The learning algorithm computes the actuator output Y
so that it maximizes the reward function, or utility function, u(S, Y).
Thus, this is an optimization problem.
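In symbols (my paraphrase of the slide's notation): training adjusts the parameters Ξ of the policy f so that the chosen action maximizes the expected reward,

Y = f(X; \theta), \qquad
\theta^{*} = \arg\max_{\theta} \; \mathbb{E}\bigl[\, u(S, f(X; \theta)) \,\bigr] .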
24
䌝由的ăȘæ•°ç†æœ€é©ćŒ–ăšăźé•ă„
䌝由的ăȘæ•°ç†æœ€é©ćŒ–
- ć˜äœ“æł•ïŒˆç·šćœąç©ș間
- 憅ç‚čæł•ïŒˆć‡žç©ș間
ćŠčç”šé–ąæ•°ăźćœąăŒäș‹ć‰ă«çŸ„ら
れどいる敏題
ăƒ–ăƒ©ăƒƒă‚Żăƒœăƒƒă‚Żă‚čæœ€é©ćŒ–
- ć€šè…•ăƒăƒłăƒ‡ă‚Łăƒƒăƒˆć•éĄŒ
- ćźŸéš“èšˆç”»æł•
- é€ČćŒ–èšˆçź—
- ćŒ·ćŒ–ć­Šçż’
- ăƒ™ă‚€ă‚șæœ€é©ćŒ–
- ćŠčç”šé–ąæ•°ăźćœąăŒäș‹ć‰ă«äžæ˜Ž
- ć€–éƒšă‚Șăƒ©ă‚Żăƒ«ă‚’ç”šă„ăŠćŠčç”šé–ąæ•°ăźć€‹ćˆ„
ăźć€€ă‚’æ±‚ă‚ă‚‹
x
u(x)
This type of optimization problem is very different from traditional
optimization problems in operations research, such as linear programming
(the simplex method) and convex optimization (interior-point methods), where
the utility function is known in advance.
Blackbox optimization, such as reinforcement learning, assumes no prior
knowledge of the internals of the utility function – thus it is called
“blackbox.” Instead, it receives signals about the utility function by asking an
external “oracle” for the value of the utility function u(X) at a given parameter X.
The idea of blackbox optimization is not new. Multi-armed bandit problems
and design of experiments can both be considered blackbox
optimization. More recently, a technique called Bayesian optimization has
attracted a lot of attention, especially in the context of hyperparameter
tuning for deep learning.
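A minimal sketch of the oracle idea in Python (plain random search standing in for the smarter methods listed above; the hidden utility function is of course invisible to the optimizer):

import random

def optimize_blackbox(oracle, sample, n_trials=1000):
    # Maximize a utility function we can only query, never inspect.
    # oracle(x) returns u(x); sample() draws a candidate x.
    best_x, best_u = None, float("-inf")
    for _ in range(n_trials):
        x = sample()
        u = oracle(x)          # the only access we have to u
        if u > best_u:
            best_x, best_u = x, u
    return best_x, best_u

hidden_u = lambda x: -(x - 3.0) ** 2 + 5.0     # unknown to the optimizer
print(optimize_blackbox(hidden_u, lambda: random.uniform(-10, 10)))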
25
‱ ăƒă‚€ăƒ‘ăƒŒăƒ‘ăƒ©ăƒĄăƒŒă‚żæœ€é©ćŒ–ăƒ©ă‚€ăƒ–ăƒ©ăƒȘ
– ä»»æ„ăźç›źçš„é–ąæ•°ă§äžŠćˆ—ă‹ă€é«˜ćșŠăȘæžćˆˆă‚Šä»˜ăă§é«˜é€Ÿă«æŽąçŽą
‱ “Define-by-Runâ€œă«ćŸșă„ăć•éĄŒă‚’ă‚łăƒŒăƒ‰ă§èš˜èż°ă™ă‚‹ă ă‘
– æŽąçŽąçŻ„ć›Čă€æœ€é©ćŒ–ăƒ­ă‚žăƒƒă‚Żă‚’ćˆ„é€”èš˜èż°ă™ă‚‹ćż…èŠăŻăȘい
For example, Optuna is an open-source implementation of Bayesian
optimization. It can handle computationally expensive oracles, such as
deep learning training runs, and prunes unpromising trials.
It also features “define-by-run,” meaning that the exploration space can be
dynamically constructed as the exploration proceeds.
Blackbox optimization is a very powerful tool. It does not assume anything
about the utility function, yet it can optimize for it.
In a sense, with blackbox optimization you specify what you want but not
how to solve it. The “how” part is automatically discovered by the
underlying algorithm.
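A minimal sketch of Optuna's define-by-run style (the toy objective below stands in for an expensive training run; the parameter names are illustrative):

import optuna

def objective(trial):
    # The search space is defined on the fly, inside the objective ("define-by-run").
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    layers = trial.suggest_int("layers", 1, 5)
    # Toy stand-in for the validation accuracy of an expensive training run.
    return -(lr - 0.01) ** 2 - 0.1 * (layers - 3) ** 2

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)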
26
What the third AI wave brings to computer science

Computation whose steps are specified explicitly:
double c2f(double c) {
  return 1.8*c + 32.0;
}

Computation whose steps are not (or cannot be) specified explicitly:
training data / reward function → blackbox computation!
With these two technologies arising from AI research, namely deep learning
and blackbox optimization, we are now witnessing an evolution of
computing.
That is, from what I call whitebox computation, where you have to specify
“how to compute,” to what I call blackbox computation, where you only
specify “what to compute” and let the machine figure out how to compute it.
ă€Œèšˆçź—ă€ăźé€Č挖
ăƒ›ăƒŻă‚€ăƒˆăƒœăƒƒă‚Żă‚čèšˆçź— (How) ăƒ–ăƒ©ăƒƒă‚Żăƒœăƒƒă‚Żă‚čèšˆçź— (What)
ç†è«–çš„èƒŒæ™Ż é›ąæ•Łæ•°ć­Šă€ăƒ–ăƒŒăƒ«ä»Łæ•° çąșçŽ‡è«–ă€æƒ…ć ±ćčŸäœ•
èšˆçź—ăƒĄă‚«ăƒ‹ă‚șム ăƒăƒ„ăƒŒăƒȘăƒłă‚°æ©Ÿæą° æ·±ć±€ć­Šçż’ă€ăƒ™ă‚€ă‚șæœ€é©ćŒ–ă€ 

ćŻŸè±Ąăšă™ă‚‹ć•éĄŒ ăƒąăƒ‡ăƒ«ăƒ»ă‚ąăƒ«ă‚ŽăƒȘă‚șăƒ ăŒć­˜ćœšă€äœŽ
æŹĄć…ƒ
ăƒąăƒ‡ăƒ«ăƒ»ă‚ąăƒ«ă‚ŽăƒȘă‚șăƒ ăŒäžæ˜Žă€è¶…é«˜æŹĄć…ƒ
ăƒ—ăƒ­ă‚°ăƒ©ăƒŸăƒłă‚° 挔çččçš„ăƒ»äșșæ‰‹ă«ă‚ˆă‚‹ (äșș間ぼèȘçŸ„
èƒœćŠ›ă«ćˆ¶çŽ„ă•ă‚Œă‚‹)
ćž°çŽçš„ăƒ»æŽąçŽąă«ćŸșă„ă
çČŸćșŠ ćŽłćŻ†è§ŁăŒćŸ—ă‚‰ă‚Œă‚‹ èż‘äŒŒè§Łăźăż
蚭蚈論 目äș€æ€§ă€ç‹Źç«‹æ€§ă€ăƒąă‚žăƒ„ăƒŒăƒ«ćŒ–ă€
閱濃ぼ戆雱
ć…šäœ“æœ€é©æ€§ă€ç”±ćˆ
27
Revolution of computing – from whitebox computing to blackbox computing.
Agenda
1. What is “Artificial Intelligence”?
2. A new computation model
3. Implications for science and engineering
28
I discussed that a new computing paradigm, blackbox computation, is
emerging.
If this is the case, what are its implications for our society?
Here I discuss two areas: science and engineering.
A principle of science: “explain many phenomena with few parameters”
The law of universal gravitation
A value that guides science: Occam’s razor
First, how are scientific methodologies affected by blackbox computation?
Take for example the law of gravity, Sir Isaac Newton’s beautiful law. Why
is it beautiful? Because this simple equation can explain both the
movement of celestial bodies and a falling apple. Behind this, there is a
fundamental value in science called Occam’s Razor which says “The
explanation requiring the fewest assumptions is most likely to be correct.”
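For reference, the law in question, in its standard form (two bodies of masses m_1 and m_2 at distance r, with gravitational constant G):

F = G \, \frac{m_1 m_2}{r^{2}}

One short formula with a handful of quantities, and it covers planets and apples alike.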
29
30
ćźŸć•éĄŒăźć€šăăŻă€ć€šăƒ‘ăƒ©ăƒĄă‚żç©ș間:
䟋: ExRNA にćŸșă„ăăŒă‚“èšș断
がんèšș断
ç§‘ć­Šè€…ăŻă€æ”Żé…çš„ăȘć› ć­ă‚’æŽą
しがち
æ·±ć±€ć­Šçż’ăŻć€šăƒ‘ăƒ©ăƒĄă‚żăźăŸăŸ
よいèšșæ–­ç”æžœă‚’ć‡șす
https://www.preferred-networks.jp/en/news
On the other hand, many problems in the real world are not simple.
We are working with the National Cancer Center to diagnose cancer by
analyzing exRNA in blood. Usually RNAs stay inside cells, but some fragments
may move out of the cell. These RNA fragments are called exRNA. It is
believed that by analyzing the expressions of these exRNA in blood we can
diagnose various kinds of cancer.
There are more than 4,000 different types of exRNA, and scientists have
been searching for a small number of predominant types that are indicative
of a particular cancer.
Our team applied deep learning to this problem. We took all 4,000+
types of exRNA into account and, by looking at the whole pattern of these
expressions, achieved order-of-magnitude higher accuracy in
cancer diagnosis.
31
Diffracted X-ray Tracking (single-molecule X-ray tracking)
Nonlinear, high-dimensional, non-equilibrium, excited-state, dynamic material properties:
an area that physicists have traditionally avoided as “dirty”
Nonlinear, high-dimensional modeling with deep learning
Note: I am told that non-equilibrium physics has made rapid progress in recent years and is actively studied.
Some physicists have told me that physical phenomena that are nonlinear, high-
dimensional, non-equilibrium, excited, and dynamic are considered “dirty”
and are avoided by many physicists.
Diffracted X-ray Tracking is one such problem: using very high-energy X-rays
diffracted by a tiny gold crystal attached to an endpoint of a protein
molecule, you can observe the dynamic behavior of the protein molecule.
But this modeling has been very hard with traditional tools because of its
nonlinear, high-dimensional, and dynamic nature.
With deep learning in our science toolbox, we have a much better chance of
analyzing these phenomena.
32
ç§‘ć­Šă«ăƒă‚€ă‚ąă‚čはăȘかったか?
科歩ぼ敏題
(äșșが)è§Łă‘ă‚‹ć•éĄŒ è§Łăăčき敏題
Looking at these complex problems, a question arises: has our science been
focusing on problems that can be solved rather than on problems that should
be solved?
33
æ·±ć±€ć­Šçż’ă«ă‚ˆă‚‹ç§‘ć­Šăźć€‰ćŒ–:é«˜æŹĄć…ƒç§‘ć­Š
èŠłćŻŸ
äșˆæžŹ
ćˆ¶ćŸĄ
ć°‘ăƒ‘ăƒ©ăƒĄă‚żç§‘ć­Š(æ·±ć±€ć­Šçż’ć‰)
ć€šăƒ‘ăƒ©ăƒĄă‚żç§‘ć­Š
ïŒˆæ·±ć±€ć­Šçż’ă‚’ć‰æăšă™ă‚‹ç§‘ć­ŠïŒ‰
ăƒąăƒ‡ăƒ«(ăƒ›ăƒŻă‚€ăƒˆăƒœăƒƒă‚Żă‚č)
ç†è§Ł
ăƒąăƒ‡ăƒ«(ăƒ–ăƒ©ăƒƒă‚Żăƒœăƒƒă‚Żă‚č)
èŠłćŻŸ
äșș間ぼèȘçŸ„é™ç•Œă«ă‚ˆăŁăŠæ‹˜æŸă•ă‚Œă‚‹
äșˆæžŹ
ćˆ¶ćŸĄ
äșș間ぼèȘçŸ„é™ç•Œă«æ‹˜æŸă•ă‚ŒăȘい!
What does it mean in terms of scientific methodologies?
In the past, we observed mother nature and came up with a simple
model of how nature works. Once we understand the mechanism, we
can use this knowledge to predict the movement of celestial bodies and
control the trajectory of a cannon shell. The model should be simple
enough to be understood by a human being. This means that in this mode
of scientific activity, which I call low-dimensional science, the complexity
of the model is limited by human cognitive capability.
On the other hand, now we have a computational tool that can deal with
high-dimensional models. I call this high-dimensional science. In high-
dimensional science, we derive a model by applying, for example, deep
learning, whose internal workings are too complex for a mere human to
understand, but still useful for predicting or controlling the physical world.
Probably this is a controversial view – later today, there is a special session
on AI and Physics, and I expect to see further discussions on this topic.
34
https://japan.cnet.com/blog/maruyama/2019/05/01/entry_30022958/
Last month I wrote a blog post on high-dimensional science. If you are
interested, please read it (sorry, it is only in Japanese at the moment).
äž–ăźäž­ăźæ”ă‚Œ – “Software 2.0”
35
https://petewarden.com/2017/11/13/deep-learning-is-eating-software/
“Software is eating the world,”
Marc Andreessen, 2011
The second implication of blackbox computation is on the engineering side.
In 2011, Marc Andreessen said that “software is eating the world.”
More recently Pete Warden said that “deep learning is eating software.”
I think they are right.
䞞汱たäșˆæƒł:
2020ćčŽă«ăŻă€æ–°ă—ăäœœă‚‰ă‚Œă‚‹ă‚œăƒ•ăƒˆă‚Šă‚§ă‚ąăź50%ä»„äžŠăŒ
ăƒ–ăƒ©ăƒƒă‚Żăƒœăƒƒă‚Żă‚čèšˆçź—ă‚’äœ•ç­‰ă‹ăźćœąă§ćˆ©ç”šă™ă‚‹
ăƒ‡ă‚žă‚żăƒ«èšˆçź—æ©Ÿç™șæ˜Žä»„æ„ă€æœ€ć€§ăźăƒ‘ăƒ©ăƒ€ă‚€ăƒ ă‚·ăƒ•ăƒˆ
ă“ăźæŠ€èĄ“ă‚’ă©ăźă‚ˆă†ă«ă†ăŸăäœżă†ă‹?
This is my personal conjecture – by the end of 2020, more than
half of newly developed software will use blackbox computation in one way or
another.
And this is the largest paradigm shift since digital computers were first
invented 70 years ago.
The question is how we can make good use of this new technology.
There are many challenges in applying blackbox computing to real-world
applications.
36
ćž°çŽçš„ăƒ—ăƒ­ă‚°ăƒ©ăƒŸăƒłă‚°ă«ăŠă‘ă‚‹ć“èłȘäżèšŒăźæœŹèłȘ的ăȘ難しさ
37
ć‡ș慾:çŸłć·ć†Źæšč,トップスă‚čă‚€ăƒŒă‚·ăƒłăƒă‚žă‚Šăƒ , 2018/03
テă‚čトが曰難
ăƒ‡ăƒăƒƒă‚°ăƒ»æ”čć€‰ăŒ
曰難
One challenge is quality assurance.
This is a slide taken from Prof. Fuyuki Ishikawa’s presentation.
Because inductive programming often deals with real-world problems, and
because it is inductive:
1. There are an infinite number of requirements.
2. We cannot have absolute test oracles.
3. It is hard to explain how the output is computed.
4. The behavior of the system is hard to predict.
These points make deep learning-based systems harder to test and debug.
ăă‚‚ăă‚‚èŠæ±‚ä»•æ§˜ăŒæ›žă‘ă‚‹ăźă‹?
ćŒ·ćŒ–ć­Šçż’ă«ăŠă„ăŠă€èĄçȘăźăƒšăƒŠăƒ«ăƒ†ă‚Łă‚’焥限性にするず?
38
拕かăȘă„ă‚Żăƒ«ăƒž
ćŠčç”šăšćź‰ć…šæ€§ăźăƒăƒ©ăƒłă‚čă‚’ćźšé‡çš„ă«èŠä»¶ăšă—ăŠæ›žăć‡șă™ćż…èŠ!
Another challenge is how to define the requirements for a blackbox
optimization problem.
I showed you an autonomous driving demo that uses reinforcement learning.
As I discussed, this system does not guarantee 100% avoidance of collision,
because ultimately we use deep learning, which works statistically.
Then what happens if we increase the penalty of collision to infinity in
order to make the cars “100% safe”?
The result would be cars that never move.
This is logical: if the cars move, there is a non-zero probability of collision. If
you want to bring that probability down to absolute zero, the cars must not
move.
This example reminds us that safety and utility are always a tradeoff.
In traditional system development, we pretend as if we can achieve both
objectives simultaneously. If you formalize the task as an optimization
problem, suddenly you have to explicitly quantify how much safety can
be sacrificed in order to achieve utility.
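One way to write that tradeoff down (my own formalization, not from the slides): with a collision penalty weight λ in the objective,

\max_{\pi}\; \mathbb{E}\bigl[\,\mathrm{progress}(\pi)\,\bigr] \;-\; \lambda \, P(\mathrm{collision} \mid \pi),
\qquad \lambda \to \infty \;\Rightarrow\; \pi^{*} = \text{“do not move”} .

Choosing a finite λ is exactly the quantitative safety-versus-utility decision that the requirements must make explicit.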
39
èŠæ±‚ăźé›Łă—ă•
‱ IJCAI 2017 Keynote by Stuart Russell, “Provably Beneficial AI”
– äșș:ă€Œă‚łăƒŒăƒ’ăƒŒă‚’ăšăŁăŠăăŠă€
– ロボット: ă‚čă‚żăƒăžèĄŒăă€ćˆ—ă«äžŠă‚“ă§ă„ă‚‹ä»–ăźćźąă‚’æźșă—ăŠă‚łăƒŒ
ăƒ’ăƒŒă‚’ăšăŁăŠăă‚‹
– äșșăźæŒ‡ç€șăŻćžžă«äžćźŒć…š
ă©ăĄă‚‰ă‚‚ă€æœ€é©ćŒ–ć•éĄŒă«ăŠă‘ă‚‹ă€Œæœ€é©ăšăŻäœ•ă‹ă€ăźć•éĄŒă‚’æè”·
äșșć·„çŸ„èƒœç ”ç©¶ă§ăźæœȘè§Łæ±ș敏題:ă€Œăƒ•ăƒŹăƒŒăƒ ć•éĄŒă€
Another example is a fictitious robot case presented by Stuart
Russell at IJCAI 2017.
Suppose there is a smart robot.
You say to the robot “get me coffee.”
The robot goes downstairs to Starbucks, sees many people in the line, kills
everybody, and gets coffee for you.
Of course you do not want the robot to kill people.
Instead of saying “get me coffee,” you may want to say “get me coffee
without killing people.”
But there are an infinite number of things that you do not want the robot to do.
This is actually one form of the “frame problem,” an open problem in AI.
Both the autonomous driving case and the robot case show us the difficulty
of specifying the utility function in optimization.
“Just as the engineering discipline of civil engineering was needed to
build bridges and buildings safely, an engineering discipline is needed
for analysis and decision-making based on data and machine
learning.”*
https://medium.com/@mijordan3/artificial-intelligence-the-
revolution-hasnt-happened-yet-5e1d5812e1e7
*Summarized by Maruyama
What should we do? The need for a new engineering discipline
How should we overcome these challenges?
Last year, Prof. Michael Jordan of UC Berkeley wrote a blog post about the
necessity of an engineering discipline for machine learning. It is a long text,
but my understanding is this:
Just as we have an engineering discipline called “civil engineering” for building
bridges and buildings safely, we should have a similar engineering
discipline for using machine learning and data analytics in decision making.
40
ć·„ć­ŠăšăŻă€ŒæŠ€èĄ“ăšç€ŸäŒšăźć„‘çŽ„ă€
理論e.g., æ§‹é€ èšˆçź—ïŒ‰  ćź‰ć…šäż‚æ•°
æ–°æŠ€èĄ“ăŒç€ŸäŒšă«ć—ćźčă•ă‚Œă‚‹ăźăŻă€Œć·„ć­Šă€ăšă—ăŠç†Ÿæˆă•ă‚ŒăŠă‹ă‚‰
ćœŸæœšć·„ć­Šăƒăƒłăƒ‰ăƒ–ăƒƒă‚Żă€969ăƒšăƒŒă‚ž
æˆ‘ă€…ăŻăȘăœćź‰ćżƒă—ăŠæ©‹ă‚’æžĄă‚Œă‚‹ăźă‹? ćœŸæœšć·„ć­Šăźè†šć€§ăȘçŸ„èŠ‹ăŒă‚ă‚‹ă‹ă‚‰
What is engineering? Let’s consider why we feel safe when crossing a
bridge. It is because there is a huge body of knowledge called “Civil
Engineering” behind the bridge, and we, society as a whole, trust it.
These volumes are the Civil Engineering Handbook, which runs to thousands of
pages. What is in this body of knowledge? Most of it is theory, such as
material theory, structural theory, and so on. In addition to these theories, there
is a small set of numbers called “safety factors,” shown in this table. For
each subfield of civil engineering there is a safety factor, and if we put
them all together, the overall safety factor is around three. This means that
they first calculate a bridge design that is strong enough in every
aspect of the relevant theories, and then build an actual bridge
that is three times stronger than what the theories predict to be safe.
In other words, civil engineers are humble. They know that their theories
are not perfect. There are things that are beyond their knowledge. The
safety factors are determined according to their best judgement, and
society trusts them.
41
To me, engineering is a contract between the technology and the society.
We need something similar for the new computation paradigm.
42
If you are interested, please join the Machine Learning Systems Engineering (MLSE) research group of the Japan Society for Software Science and Technology!
MLSE (Machine Learning Systems Engineering) is a special interest group
of the Japan Society for Software Science and Technology, established in April
2018.
We have a lot of discussions, both domestically and globally, on these topics.
We are planning MLSE 2019, a two-day overnight workshop on
7/6–7/7 in Hakone. You are welcome to join. The paper submission deadline is
6/11.
43
What we should do as engineers
1. “Artificial intelligence” has many meanings
   – Misunderstandings that cause harm should be corrected
2. Deep learning and blackbox optimization are a new programming paradigm
   – We have a duty to communicate their potential and their limitations accurately
3. Maturation into an engineering discipline is needed
   – Persistently seek social consensus that acknowledges the blackbox nature
Face technology with sincerity
Here is the take-away message of my talk today: as researchers and
engineers in the field of AI, we should
1. Be aware that the word “AI” is used for different things in different
contexts
‱ Clarify the meaning whenever possible
2. Understand the emerging computation paradigm
‱ Explain what it is. Do not oversell or undersell
3. Make it an engineering discipline
‱ Admit its limitations and try to reach social acceptance
In short, I urge you to be sincere to technologies.
44
Thank You
Twitter: @maruyama