Jsai

6/5/2019 人工知能学会
PFN Fellow 丸山宏
Twitter: @maruyama
「人工知能」をどのように読み解くか
How to understand “Artificial Intelligence?”
1

Agenda
1. 「人工知能」とは何か
2. 新しい計算モデル
3. 科学・工学への展開
2
In this talk, first I point out that the word “AI” is used in many ways and it is
causing confusion.
Then, I will focus on “AI as frontier of computer science” and argue that a
new computation model is emerging.
Finally, I will discuss how this new computation model affects on science
and engineering.
2

3
(1) 「知性の探求」としての人工知能
神経科学
脳科学
神経科学
脳科学
心理学心理学
経済学経済学
ロボティクス
機械工学
ロボティクス
機械工学
コンピュータ科学
(人工知能)
コンピュータ科学
(人工知能)
神経細胞・脳の
活動を観測
人の行動を観測
合理的エージェント
としてモデル化
- ゲーム理論
- 最適化
制御システムとして
モデル化
- 微分方程式
- 制御理論
ソフトウェアと
してモデル化
The term “AI,” or Artificial Intelligence, is used in different meanings in
different contexts.
There are at least three different interpretations.
First, AI is a research field to study “intelligence.”
There are several different approaches to intelligence; neuroscience tries
to understand intelligence by observing neuron activities in a brain,
psychology by observing human behaviors, economics models human
intelligence as an independent and rational agent, robotic researchers use
differential equations and control theory to understand human motions,
and we, computer scientists try to understand intelligence by building a
software mimicking human intelligence.
3

「人工知能(AI)」とは学問領域だが、、
学問領域派生した技術応用製品
物理学
人工知能
(AI)
内燃機関
半導体
自動車
コンピュータ
探索アルゴリズム
音声認識
画像認識
乗換案内
AIスピーカー
自動運転
これを物理と
は呼ばない
「AI」と呼ば
れることも…
AI is a research field just like physics, but there is one important difference.
In physics, there are technologies derived from physical theories, and there
are products based on these derived technologies, but they are not called
“physics.”
On the other hand, in AI, the derived technologies as well as products are
sometimes called “AI,” and THAT is the source of confusion.
4

(2) ハイプとしての「AI」
• AIに聞いてみた- 第4回超未婚社会 4/13
– 社会問題解決型AI 「AIひろし」
• 保育所の入所はAIが決める数秒で選考可能に | NHKニュース 2/12
– 富士通のマッチング技術
• “AIに負けない”人材を育成せよ～企業・教育最前線～ 4/25
– 「自分たちの仕事がAIに奪われる」
• 車いすの天才ホーキング博士の遺言3/16
– 「ＡＩは自らの意志を持つようになり、人間と対立するだろう。ＡＩの到来は、
人類史上、最善の出来事になるか、または最悪の出来事になるだろう。」
• マリオ～AIのゆくえ～ 4/5
– 「死」を理解できないAI人間と、「生」に希望が持てない少年の交流の果てには、
どんな結末が待っているのか。
5
データアナリティ
クス・最適化
機械による自動化
汎用AI（まだ見ぬ技
術）
That leads to the second interpretation – “AI as something sophisticated.”
Here are some examples of recent coverage by NHK.
In the program “Ask AI,” statistical analysis of social data is used for
drawing some recommendations for social issues.
The news on “Nursery Matching AI” refers to a combinatorial optimization
problem.
Discussion on how job security is threatened by AI is actually talking about
automation.
The program on Dr. Hawking’s Last Will and the drama “Mario the AI” are
dealing with hypothetical technologies that have never been realized.
The reason I call this interpretation “AI as a hype” is that, in this use,
people are less concerned with what “AI” is. The speaker means one thing,
and the listener may understand it as completely another, and they do not
bother.
If the speaker uses “AI” in order to intentionally confuse the listener,
tragedy happens.
5

This page is intentionally removed
6
6

This page is intentionally removed
7
7

(3) 「情報技術のフロンティア」としての人工知能
再帰呼び出しはAIの技術!?再帰呼び出しはAIの技術!?
P. H. Winston, Artificial Intelligence
ISBN-13: 978-0201084542, 1977
1970年代の「最先端」
Source: Wikipedia
The third interpretation of “AI” is “AI as frontier of computer science (CS).”
Many CS ideas were came from AI research.
This is one of the early textbooks on AI.
It covers things like
- Symbolic programming (LISP programming language and dynamic
memory management i.e., garbage collection)
- Recursive call (e.g., to solve the puzzle “Tower of Hanoi”)
- Search algorithm (e.g., depth-first search, A* algorithm, alpha-beta
pruning, means-end analysis)
- Syntax analysis
All of these technologies are now in the core computer science and people
do not call them “AI” anymore.
8

Computer Scienceに取り込まれる「AI技術」
1956-1974 第1次人工知能ブーム
• 記号処理 (LISP)
• Means-End Analysis
• 自然言語処理
• 記号処理 (LISP)
• Means-End Analysis
• 自然言語処理
• 知識表現 (e.g. フレーム)
• エキスパートシステム
• オントロジー
• 知識表現 (e.g. フレーム)
• エキスパートシステム
• オントロジー
2008 第3次人工知能ブーム
• 統計的機械学習・深層学習
• ブラックボックス最適化
2008 第3次人工知能ブーム
• 統計的機械学習・深層学習
• ブラックボックス最適化
- 再帰呼び出し
- ガーベージコレクション
- 動的計画法
- コンパイラ・コンパイラ
- :
- 再帰呼び出し
- ガーベージコレクション
- 動的計画法
- コンパイラ・コンパイラ
- :
- オブジェクト指向
- モデリング言語
- セマンティックWeb
- 制約ソルバー
- :
- オブジェクト指向
- モデリング言語
- セマンティックWeb
- 制約ソルバー
- :
新しい計算モデル
There have been three waves of AI research.
In the first wave of AI, research topics include
- Symbolic programming (lisp)
- Means-end analysis
- Natural language processing
and they are incorporated into the core CS as
- Dynamic memory management (garbage collection)
- Search algorithms
- Formal language theory
In the second wave of AI, research topics include
- Knowledge representation (e.g., Frame theory)
- Expert system
- Ontology
And they are incorporated into the core CS as
- Object-oriented programming
- Modeling language
- Semantic Web
9

So, if you look at what is happening in AI research, you can predict where
CS is to go.
In the current wave of AI research, focus areas include
- Statistical machine learning, especially deep learning
- Blackbox optimization
And I believe that these technologies will lead us to a new programming
paradigm, or new computation model.
9

Agenda
– 深層学習
– ブラックボックス最適化
10
I said AI is to advance computer science.
Let me discuss these two technologies that are currently driving CS.
- Deep learning
- Blackbox optimization
and how they will make a new computation model.
10

深層学習とは何か – (状態を持たない*)関数
Y = f(X)X Y
超多次元、（連
続変数・カテゴ
リ変数の任意の
組み合わせ）
判別・制御なら
ば比較的低次元、
生成ならば超多
次元
*状態を入力Xの一部とすれば、関数そのものは状態をもたないものとみなせる
First, deep learning.
Very simply put, deep learning is a function.
A function takes X as input and produces Y as output.
Usually we are dealing with a very high dimensional vector as input. For
example, if the input is a image of 100x100 pixels, the input is a vector with
10,000 variables.
Output Y could be simple – if the function is to classify the input image into
either a cat or an airplane, the dimension of Y is 2. If the output is also an
image, the dimension is much higher.
11

普通の(演繹的な)関数の作り方: 例摂氏から華氏への変換
double c2f(double c) {
return 1.8*c + 32.0;
}
return 1.8*c + 32.0;
}
入力: C
出力: F
ただし、FはCを華氏で表したもの
入力: C
出力: F
ただし、FはCを華氏で表したもの
要求
アルゴリズム
F = 1.8 * C + 32F = 1.8 * C + 32モデル
人が持つ人が持つ
先験的知
識
モデルが既知／計算機構（アルゴリズム）が構成可能である必要
Let’s consider programming a very simple function – computing Fahrenheit
from Celsius.
In conventional programming, which I call here “deductive programming,”
we start from the requirement – what is input and what is the output.
Then, we convert it into a “model” – by which I mean precise mathematical
relationship between the input and output.
In this case, we use our knowledge on the relationship between Celsius
and Fahrenheit and represent it as a mathematical formula.
Once you have the model, you can build an algorithm to actually compute it.
Note that in deductive programming, we assume that the model is known
and the algorithm is constructible.
12

深層学習の(帰納的な)やり方 – 訓練データで入出力を例示
訓練データセット訓練データセット
観測
訓練（ほぼ自動でパラメタθを決定）
モデル／アルゴリズムが未知でもよい!
On the other hand, with deep learning, programming is done by giving
examples – which I call inductive programming..
In our case, first we obtain two thermometers, one in Celsius and the other
in Fahrenheit, and read their measurements from time to time.
Then you get input-output examples, which we call a training data set.
Once the training data set is ready, we train a deep neural net semi-
automatically.
Note that inductive programming is possible even when we do not know
the model or the algorithm. This opens opportunities to solve very
interesting real-world problems.
13

自動運転のためのセグメンテーション
https://www.youtube.com/watch?v=lGOjchGdVQs
モデルが不明:人手による正解アノテーション
Here are some examples of real world applications of inductive
programming.
First is semantic segmentation for autonomous driving. The top left panel
is the video input. For each pixel of the input, our function has to calculate
what class, such as road surface (purple), sidewalk (pink), vehicle (blue),
tree (green), etc., the pixel belongs to. The output is shown in the right
bottom panel.
In this task, it is very hard to define a proper model, because you can not
give precise mathematical conditions with which a particular pixel is
classified as “road surface.”
Instead of a model, we supply examples. The bottom left panel shows a
human-annotated image, meaning that human annotators gave the correct
answer to each pixel. We should supply thousands of these human-
annotated images to get high accuracy. Even though we do not know the
model, we can prepare training data set and obtain reasonably accurate
results.
14

モデルが不明:人手による正解アノテーション
The second example is a voice-controlled robot. In this demo, the operator
says “Can you move the brown fluffy thing to the bottom?” to move the
teddy bear. Note that the top and bottom are from the operators
perspective, as shown in the top view image of the boxes as shown on the
left.
Next, she says “Can you move the tissue box to the left?” and then use
another expression “white and blue box” to refer to the same object.
Again, this system is very hard to program in the conventional way,
because we cannot encode all the possible natural language expressions
in the system.
How did we train this system? We show the top view of the boxes and ask
human annotators “what would you say if you want to move this to there?”
and collected thousands of example sentences. With these example
sentences as a training data set, we trained the system.
15

線画への自動着色アプリPaintsChainer
アルゴリズムが不明:逆問題として定式化
The third example is where the model is known but the algorithm is
unknown.
This is an application to colorize a line drawing. The left image is a line-
drawing input and the right image is the output. You can also give hints in
terms of what color is used in each region.
How did we train this neural net? We collected a lot of Anime images from
the Internet and applied a known algorithm called “edge detection” to
obtain the corresponding line drawing. Using many such pairs we trained
our neural net.
Because the edge detection algorithm is well known, the mathematical
relationship between a line drawing and the corresponding color image is
known in a sense -- that is, the model is known.
However, how to compute a colored image from a line-drawing input is not
known, meaning that the algorithm is unknown.
16

17
汎用計算機構としての深層学習
• 桁違いに多いパラメタ
– 任意の多次元非線形関数を近似*
 疑似的にチューリング完全!
出力（超多次元）
入力（超多次元）
* G. Cybenko. Approximations by superpositions
of sigmoidal functions. Mathematics of Control,
Signals, and Systems, 2(4):303–314, 1989.
I said that deep neural net is a function. How powerful is it? Is it powerful
enough to represent any computable functions? The answer to this
question is yes.
Because a deep neural net has a very large number of parameters, it is
theoretically proven that for any computable function, there exists a
complex enough deep neural network that approximates the function
within the given error margin.
In other words, deep neural network is “pseudo Turing complete,” meaning
that we have a universal computational mechanism, which can be
programmed with examples. This is a significant breakthrough in
computation, I believe.
17

統計的機械学習の本質的限界 (1)
訓練データ DNN
将来が過去と同じでないと正しく予測できない
時間軸
過去に観測されたデー
タに基づいて訓練
訓練
訓練済みモデルに基いて
新しい値を予測
Even though deep learning is a very powerful computing mechanism, it is
not an almighty intelligence. It is statistical modeling and hence, there are
fundamental limitations. I will explain three such limitations.
First is that deep neural net is trained using the data observed in the past
and use the trained deep neural net (DNN) to predict future events. Thus,
we always assume that the future is essentially the same as the past.
18

• 訓練データセットは有限
訓練データセット
内挿
外挿
??
訓練データに現れない、希少な事象に対して無力
The second limitation is this.
Even if the future and the past look the same, the training data set is
always finite. If we are dealing with an infinite set as the domain of the
function, which is almost always the case, an input to the function is in
general something that is never seen in the training data set. If the input is
close to some of the data points in the training data set, the prediction is
very accurate. This situation is called “interpolation.”
On the other hand, if the input is far from the any of the data points in the
training data set, the prediction is very hard. This situation is called
“extrapolation.” So deep learning is powerful for frequent events but
powerless for rare events that do not appear in the training data set.
Often customers ask us to build an “AI” system that can deal with
emergency situations. As long as deep learning is concerned, it is
impossible. We always need a lot of “emergency” samples in the training
data set.
19

• 本質的に確率的
元分布
独立・同分布(i.i.d.)
訓練データ
訓練済みDNN
サンプリングサンプリング
にバイアスが
入ることは避
けられない!
「100%の正しさ」は保証できない
The third and probably the most serious limitation is that the computation
of deep learning is essentially probabilistic. Because we consider deep
learning as statistical modeling, we always assume that there is an original
probability distribution which is unknown. We further assume that the
training data set is created by random sampling, or i.i.d. (Independent and
Identically Distributed). After we obtained the training data set, the DNN is
trained with a known training algorithm.
But the problem is in the first “random sampling” step. This is random
sampling, so there is always bias. For example, when you throw a dice for a
hundred times, what is the chance of getting ‘one’ for all the hundred
throws? It is very unlikely, and you may say that it will never happen. But
there is no guarantee that it never happens because of the bias. This
means that deep learning has no guarantee of its correctness. There is no
such thing as 100% correct deep neural net.
20

深層学習とは何か（まとめ）
• 関数（プログラム）の作り方
– モデルやアルゴリズムがわからなくても、訓練データセットがあ
れば作れる
– 教師信号の与え方次第で、驚くようなことが…
• 本質的に統計モデリング
– 元分布が独立・同分布であることが前提
– 近似しかできない（バイアスが入ることを避けられない）
Here is the recap of what deep learning is.
Deep learning is a new programming model. It is inductive, not deductive,
and the programming can be done without knowing the model or the
algorithm. Innovative thinking on how to give teacher signals may solve
many interesting problems.
On the other hand, it is fundamentally a statistical modeling. It assumes
that the original distribution is sampled i.i.d.
There is always bias, and what you get is always an approximation.
21

Consumer Electronics Show (CES) 2016
CESにおける自動運転デモ、深層強化学習による訓練を行ったもの
ブラックボックス最適化:強化学習による自動運転
Let me turn to the second technology that current AI research focuses on:
blackbox optimization.
This is an autonomous car demo at Consumer Electronics Show in 2016.
The silver cars were trained using deep reinforcement learning.
The red car was controlled manually by our vice president, and he
sometimes hits or blocks other cars.
But as you can see the silver cars can avoid collisions.
22

強化学習は、報酬関数を最大化する最適化問題
23
X: センサ入力
Y: アクチュエータ
出力
Y = f(X, θ)
u(S, Y): 報酬関数（効用関数）
S: 現在の状態
How this system was trained? We used an external simulator. The current
situation in the simulator is supplied as a simulated sensor input X to the
learning algorithm. The learning algorithm compute the actuator output Y
so that it maximize the reward function, or the utility function, u(S,Y).
Thus, this is an optimization problem.
23

24
伝統的な数理最適化との違い
伝統的な数理最適化
- 単体法（線形空間）
- 内点法（凸空間）
効用関数の形が事前に知ら
れている問題
ブラックボックス最適化
- 多腕バンディット問題
- 実験計画法
- 進化計算
- 強化学習
- ベイズ最適化
- 効用関数の形が事前に不明
- 外部オラクルを用いて効用関数の個別
の値を求める
x
u(x)
This type of optimization problem is much different from the traditional
optimization problems in operations research, such as linear programming
(simplex method) and convex optimization (interior-point method) where
the utility function is known in advance.
Blackbox optimization, such as reinforcement learning, assumes no prior
knowledge on the internals of the utility function – thus it is called
“blackbox.” Instead, it receives signals on the utility function by asking an
external “oracle” the value of utility function u(X) for a given parameter X.
The idea of blackbox optimization is not new. Multi-arm bandit problem
and design of experiments can both be considered as blackbox
optimization. More recently, a new technique called Bayesian optimization
attracts a lot of attention, especially in the context of hyper-parameter
tuning in deep learning.
24

25
• ハイパーパラメータ最適化ライブラリ
– 任意の目的関数で並列かつ高度な枝刈り付きで高速に探索
• “Define-by-Run“に基づき問題をコードで記述するだけ
– 探索範囲、最適化ロジックを別途記述する必要はない
For example, Optuna is an open-source technology of Bayesian
optimization. It allows very computationally-expensive oracles, such as
deep learning, with pruning of unpromising trials.
It also features “define-by-run,” meaning that the exploration space can be
dynamically constructed as the exploration proceeds.
Blackbox optimization is a very powerful tool. It does not assume anything
about the utility function, yet it can optimize for it.
In a sense, with blackbox optimization you specify what you want but not
how to solve it. The “how” part is automatically discovered by the
underlying algorithm.
25

26
第3次AIがComputer Scienceにもたらすもの
return 1.8*c + 32.0;
}
return 1.8*c + 32.0;
}
計算ステップを明示的に
指定する計算
訓練データ
報酬関数
計算ステップを明
示的に指定しな
い・できない計算
ブラック
ボックス
計算!
With these two technologies arisen from AI research, namely deep learning
and blackbox optimization, we are now witnessing an evolution of
computing.
That is, from
what I call whitebox computation where you have to specify “how to
compute,”
to
what I call blackbox computation where you only specify “what to
compute” and let the machine to figure out how to compute it, which I call
“Blackbox Computing.”
26

「計算」の進化
ホワイトボックス計算 (How) ブラックボックス計算 (What)
理論的背景離散数学、ブール代数確率論、情報幾何
計算メカニズムチューリング機械深層学習、ベイズ最適化、 …
対象とする問題モデル・アルゴリズムが存在、低
次元
モデル・アルゴリズムが不明、超高次元
プログラミング演繹的・人手による (人間の認知
能力に制約される)
帰納的・探索に基づく
精度厳密解が得られる近似解のみ
設計論直交性、独立性、モジュール化、
関心の分離
全体最適性、統合
27
Revolution of computing – from whitebox computing to blackbox computing.
Translations of the table contents follow:
Theoretical Background; Discrete mathematics, Boolean algebra;
Probability theory, Information Geometry
Computational Mechanism; Turing machine; Deep learning, Bayesian
optimization
Target problems; Known model/algorithm, low-dimensional; Unknown
model/algorithm, high-dimensional
Programming; Deductive, manual (constrained by human cognitive
capability); Inductive / search-based
Accuracy; Exact solution; Approximation only
Design principles; Orthogonal, independent, module, separation of
concerns; Global optimum, integration
27

Agenda
28
I discussed that a new computing paradigm, blackbox computation, is
emerging.
If this is the case, then what are its implications to our society?
Here I discuss two areas -- science and engineering.
28

科学の原理:「少ないパラメタで多くの現象を説明する」
万有引力の法則
1/15, 201629
科学を導く価値観:オッカムの剃刀
First, how scientific methodologies are affected by blackbox computation?
Take for example the law of gravity, Sir Isaac Newton’s beautiful law. Why
is this beautiful? Because this simple equation can explain both the
movement of celestial bodies and a dropping apple. Behind this, there is a
fundamental value in science called Occam’s Razor which says “The
explanation requiring the fewest assumptions is most likely to be correct.”
29

30
実問題の多くは、多パラメタ空間:
例: ExRNA に基づくがん診断
がん診断
科学者は、支配的な因子を探
しがち
深層学習は多パラメタのまま
よい診断結果を出す
https://www.preferred-networks.jp/en/news
On the other hand, many problems in the real world are not simple.
We are working with National Cancer Center to diagnose cancer by
analyzing exRNA in blood. Usually RNAs stay in cells but some fragments
may move out of the cell. These RNA fragments are called exRNA. It is
believed that by analyzing the expressions of these exRNA in blood we can
diagnose various kinds of cancer.
There are more than 4,000 different types of exRNA, and scientists have
been searching for a small number of predominant types that are indicative
to a particular cancer.
Our team applied deep learning on this problem. We took all the 4,000+
types of exRNA into account and by looking at the whole pattern of these
expressions, we could achieve order-of-magnitude higher accuracy of
cancer diagnosis.
30

31
X線1分子追跡法
非線形・高次元・非平衡・励起状
態・動的な物性は、従来物理学者が
「汚い領域」として避けてきた領域
深層学習による非線形・
高次元のモデル化
注:非平衡な物理学は近年急速な進歩があ
り活発に研究されているそうです。
Some physicist told me that physical phenomena that is nonlinear, high-
dimensional, non-equilibrium, excited, and dynamic is considered “dirty”
and being avoided by many physicists.
Diffracted X-ray Tracking is one such problem – using very high-energy X-
ray diffracted by tiny gold crystal attached to an endpoint of a protein
molecule, you can observe the dynamic nature of the protein molecule.
But this modeling has been very hard with traditional tools because of its
nonlinear, high-dimensional, and dynamic nature.
With deep learning in our science toolbox, we have much better chance of
analyzing this phenomena.
31

32
科学にバイアスはなかったか?
科学の問題
(人が)解ける問題解くべき問題
By looking at these complex problems, a question arises if our science has
been focusing on problems that can be solved, not problems that should be
solved.
32

33
深層学習による科学の変化:高次元科学
観察
予測
制御
少パラメタ科学(深層学習前)
多パラメタ科学
（深層学習を前提とする科学）
モデル(ホワイトボックス)
理解
モデル(ブラックボックス)
観察
人間の認知限界によって拘束される
予測
制御
人間の認知限界に拘束されない!
What does it mean in terms of scientific methodologies?
In the past, we observed the mother nature and came up with a simple
model on how the nature works. Once we understand the mechanism, we
can use this knowledge to predict the movement of celestial bodies and
control the trajectory of a cannon shell. The model should be simple
enough to be understood by a human being. This means that in this mode
of scientific activities, which I call low-dimensional science, the complexity
of the model is limited by the human cognitive capability.
On the other hand, now we have a computational tool that can deal with
high-dimensional models. I call this high-dimensional science. In high-
dimensional science, we derive a model by applying, for example, deep
learning, whose internal workings are too complex for a mere human to
understand, but still useful for predicting or controlling the physical world.
Probably this is a controversial view – later today, there is a special session
on AI and Physics, and I expect to see further discussions on this topic.
33

34
https://japan.cnet.com/blog/maruyama/2019/05/01/entry_30022958/
Last month I wrote a blog on high-dimensional science. If you are
interested in, please read the blog (sorry it is in Japanese at this moment).
34

世の中の流れ – “Software 2.0”
35
https://petewarden.com/2017/11/13/deep-learning-is-eating-software/
“Software is eating the world,”
Marc Andreessen, 2011
The second implication of blackbox computation is on the engineering side.
In 2011, Marc Andreessen said that “software is eating the world.”
More recently Pete Warden said that “deep learning is eating software.”
I think they are right.
35

丸山の予想:
2020年には、新しく作られるソフトウェアの50%以上が
ブラックボックス計算を何等かの形で利用する
デジタル計算機発明以来、最大のパラダイムシフト
この技術をどのようにうまく使うか?
This is my personal conjecture – by the end of the next 2020, more than
half of newly developed software use blackbox computation in one way or
another.
And this is the largest paradigm shift since digital computers were first
invented 70 years ago.
The question is, how we can make good use of this new technology?
There are many challenges on how to apply blackbox computing in real
world applications.
36

帰納的プログラミングにおける品質保証の本質的な難しさ
37
出典:石川冬樹,トップエスイーシンポジウム, 2018/03
テストが困難
デバッグ・改変が
困難
One challenge is quality assurance.
This is a slide taken from Prof. Fuyuki Ishikawa’s presentation.
Because inductive programming often deals with real world problems and
also it is inductive,
1. There are infinite number of requirements
2. We cannot have test oracles that are absolute
3. It is hard to explain how the output is computed
4. The behavior of the system is hard to predict.
These points make deep learning-based systems harder to test and debug.
37

そもそも要求仕様が書けるのか?
強化学習において、衝突のペナルティを無限大にすると?
38
動かないクルマ
効用と安全性のバランスを定量的に要件として書き出す必要!
Another challenge is how to define the requirements for a blackbox
optimization problem.
I showed you autonomous driving demo using reinforcement learning.
As I discussed, this system does not guarantee 100% avoidance of collision,
because ultimately we use deep learning, which works statistically.
Then what happens if we increase the penalty of collision to infinity in
order to make the cars “100% safe”?
The result would be cars that never move.
This is logical – if cars move, there are non-zero probabilities of collision. If
you want make the probabilities down to absolute zero, the cars must not
move.
This example reminds us that the safety and the utility are always tradeoff.
In traditional system development, we pretend as if we can achieve both
objectives simultaneously. If you formalize the task as an optimization
problem, suddenly you have to explicitly quantify how much of safety can
be sacrificed in order to achieve utility.
38

39
要求の難しさ
• IJCAI 2017 Keynote by Stuart Russell, “Provably Beneficial AI”
– 人:「コーヒーをとってきて」
– ロボット: スタバへ行き、列に並んでいる他の客を殺してコー
ヒーをとってくる
– 人の指示は常に不完全
どちらも、最適化問題における「最適とは何か」の問題を提起
人工知能研究での未解決問題:「フレーム問題」
Another example is a fictious robot case, that was presented by Stuart
Russell at IJCAI 2017.
Suppose there is a smart robot.
You say to the robot “get me coffee.”
The robot goes downstairs to Starbucks, sees many people in the line, kills
everybody, and gets coffee for you.
Of course you do not want the robot to kill people.
Instead of saying “get me coffee,” you may want to say “get me coffee
without killing people.”
But there are infinite number of things that you do not want the robot to do.
This is actually one form of the “frame problem,” an open problem in AI.
Both the autonomous driving case and the robot case tell us the difficulty
of specifying the utility function in optimization.
39

「橋やビルを安全に作るた
めに土木工学という工学的
ディシプリンが必要であっ
たのと同様に、データや機
械学習を利用して分析や意
思決定を行うための工学的
ディシプリンが必要だ」*
https://medium.com/@mijordan3/artificial-intelligence-the-
revolution-hasnt-happened-yet-5e1d5812e1e7
*丸山による要約
我々は何をすべきか:新しい工学の必要性
How should we overcome these challenges?
Last year, Prof. Michael Jordan of UC Berkeley wrote a blog about the
necessity of engineering discipline for machine learning. It is a long text,
but my understanding is this:
As we have an engineering discipline called “civil engineering” for building
bridges and buildings safely, we should have a similar engineering
discipline for using machine learning or data analytics for decision making.
40

工学とは「技術と社会の契約」
理論（e.g., 構造計算）＊安全係数
新技術が社会に受容されるのは「工学」として熟成されてから
土木工学ハンドブック、969ページ
我々はなぜ安心して橋を渡れるのか? 土木工学の膨大な知見があるから
What is engineering? Let’s consider why we feel safe when crossing a
bridge. It is because there is a huge body of knowledge called “Civil
Engineering” behind the bridge, and we, society as a whole, trust it.
These volumes are Civil Engineering Handbook which have thousands of
pages. What are in this body of knowledge? Most of it are theories such as
material theory, structural theory, etc. In addition to these theories, there
is a small set of numbers called “safety factors” shown in this table. For
each subfield of civil engineering there is a safety factor, and if we put
them altogether, the overall safety factor is around three. This means that,
they first calculate a design of a bridge that is strong enough in every
aspect of the relevant theories, and then finally they build actual bridge
that is three times stronger than what the theories predict to be safe.
In other words, civil engineers are humble. They know that their theories
are not perfect. There are things that are beyond their knowledge. The
safety factors are determined according to their best judgement, and the
society trusts them.
41

To me, engineering is a contract between the technology and the society.
We need something similar for the new computation paradigm.
41

42
興味ある方は、日本ソフトウェア科学会機械学習工学研究会へ!
MLSE (Machine Learning Systems Engineering) is a Special Interest Group
of Japan Society for Software Science and Technology, established in April
2018.
We have a lot of discussions both domestically and globally on these topics.
We are planning to have MLSE 2019, a two-day, overnight workshop on
7/6-7/7 in Hakone. You are welcome to join. Paper submission deadline is
6/11.
42

43
技術者として我々がすべきこと
1. 「人工知能」は多義
– 不幸をもたらす誤解があれば、正すべき
2. 深層学習・ブラックボックス最適化は新しいプログラミングパラダイム
– その可能性と限界を正しく伝える責務
3. 工学としての熟成が必要
– ブラックボックス性を認めた上での社会合意を粘り強く求める
技術に対して誠実に向き合ってください
Here is the take-away message of my talk today: as researchers and
engineers in the field of AI, we should
1. Be aware that the word “AI” is used for different things in different
context
• Clarify the meaning whenever possible
2. Understand the emerging computation paradigm
• Explain what it is. Do not oversell or undersell
3. Make it an engineering discipline
• Admit its limitations and try to reach social acceptance
In short, I urge you to be sincere to technologies.
43

44
Thank You
Twitter: @maruyama
44

Jsai

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Jsai

Similar to Jsai (20)

More from Hiroshi Maruyama

More from Hiroshi Maruyama (15)

Recently uploaded

Recently uploaded (20)

Jsai