AI alignment from the perspective of
Active Inference
Roman Leventov
Moscow, 22-23 April 2023
Research and practice conference
"Modern Systems Engineering and Management"
Free Energy Principle: physical modelling basics
The FEP formalism assumes that the world is modelled as a set of
variables x that comprise a random dynamical system [1], in discrete or
continuous time:
x'(t) = f(x, t) + w(t),
where x' is the rate of change of the variables' states, f is a
state-dependent function (the flow), and w is noise.
1. Friston, K., Da Costa, L., Sakthivadivel, D. A. R., Heins, C., Pavliotis, G. A., Ramstead, M., & Parr, T. (2022).
Path integrals, particular kinds, and strange things (arXiv:2210.12761). arXiv. http://arxiv.org/abs/2210.12761
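As a minimal sketch of the random dynamical system above, x'(t) = f(x, t) + w(t) can be integrated numerically with the Euler-Maruyama scheme. The scalar state, the linear flow f(x, t) = -x, the step size and the noise scale are illustrative assumptions, not taken from the talk.

```python
import math
import random


def simulate(f, x0, dt, n_steps, noise_std, seed=0):
    """Euler-Maruyama integration of x'(t) = f(x, t) + w(t) for a scalar x."""
    rng = random.Random(seed)
    x, t = x0, 0.0
    xs = [x]
    for _ in range(n_steps):
        # White noise integrates to a Gaussian increment whose std scales with sqrt(dt).
        dw = rng.gauss(0.0, noise_std * math.sqrt(dt))
        x = x + f(x, t) * dt + dw
        t += dt
        xs.append(x)
    return xs


def flow(x, t):
    """Hypothetical flow: relaxation towards zero."""
    return -x


path = simulate(flow, x0=1.0, dt=0.01, n_steps=1000, noise_std=0.1)
noiseless = simulate(flow, x0=1.0, dt=0.01, n_steps=1000, noise_std=0.0)
```

With `noise_std=0.0` the path reduces to the deterministic decay driven by the flow alone; with noise, each run is one sample path of the same system.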
Free Energy Principle basics: sparse coupling conjecture
A system is (approximately) causally separated from the environment
between t_0 and now. μ are internal states, s are sensory states, a are
active states, b = (s, a) are boundary states, η are external states.
Illustration from Friston, K. (2019). A free energy principle for a particular physics (arXiv:1906.10184). arXiv.
http://arxiv.org/abs/1906.10184
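In the cited FEP literature, this separation is usually written as a sparsely coupled flow over the four kinds of states. The following is a sketch in the notation x' = f + w used earlier; the subscripted flows and noise terms are labels assumed here, and the explicit time argument of the flows is omitted.

```latex
\begin{aligned}
\eta'(t) &= f_\eta(\eta, s, a) + w_\eta(t) \\
s'(t)    &= f_s(\eta, s, a) + w_s(t) \\
a'(t)    &= f_a(s, a, \mu) + w_a(t) \\
\mu'(t)  &= f_\mu(s, a, \mu) + w_\mu(t)
\end{aligned}
```

Internal states μ are not driven directly by external states η, nor η by μ: all influence between them passes through the boundary states b = (s, a).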
FEP: path integral formulation (path-tracking dynamics)
Semantics are only associated with physical dynamics rather than static
physical states [1]. Semantics = commuting mapping from physical objects to
mathematical objects.
μ_t, b_t, η_t are paths (trajectories) of states, i.e., physical dynamics.
∀ b_t: ∃ p(η_t | b_t), a conditional density; μ_t is the path of least action
of internal states ⇒ ∃ q: μ_t → p(η_t | b_t), a semantic mapping from the path
of internal system states to beliefs about external state trajectories (a
mathematical object) [2].
VFE lemma [2]: system state dynamics can be seen as a form of Bayesian
inference of q(η_t), a variational density over external paths, wrt. some
prior and evidence b_t. ⇒ duality of physical and belief (mathematical)
dynamics ("Bayesian mechanics") [3].
1. Fields, C., Friston, K., Glazebrook, J. F., & Levin, M. (2022). A free energy principle for generic quantum systems.
Progress in Biophysics and Molecular Biology, 173, 36–59. https://doi.org/10.1016/j.pbiomolbio.2022.05.006
2. Friston, K., Da Costa, L., Sakthivadivel, D. A. R., Heins, C., Pavliotis, G. A., Ramstead, M., & Parr, T. (2022). Path
integrals, particular kinds, and strange things (arXiv:2210.12761). arXiv. http://arxiv.org/abs/2210.12761
3. Ramstead, M. J. D., Sakthivadivel, D. A. R., Heins, C., Koudahl, M., Millidge, B., Da Costa, L., Klein, B., & Friston,
K. J. (2023). On Bayesian Mechanics: A Physics of and by Beliefs (arXiv:2205.11543). arXiv.
https://doi.org/10.48550/arXiv.2205.11543
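To make the VFE lemma more concrete, here is a minimal sketch of variational free energy for a discrete toy model, using the standard definition F[q] = Σ q(η) (ln q(η) − ln p(η, b)), which equals KL[q || p(η | b)] − ln p(b). The two external states, the prior and the likelihood values are made-up assumptions, and single states stand in for the paths discussed in the talk.

```python
import math


def variational_free_energy(q, joint):
    """F[q] = sum over eta of q(eta) * (ln q(eta) - ln p(eta, b)), for a fixed observed b."""
    return sum(qi * (math.log(qi) - math.log(pi))
               for qi, pi in zip(q, joint) if qi > 0.0)


# Hypothetical generative model with two external states.
prior = [0.5, 0.5]        # p(eta)
likelihood = [0.9, 0.2]   # p(b | eta) for the one observed boundary state b

joint = [p * l for p, l in zip(prior, likelihood)]  # p(eta, b)
evidence = sum(joint)                               # p(b)
posterior = [j / evidence for j in joint]           # p(eta | b)
surprise = -math.log(evidence)                      # -ln p(b)

f_prior = variational_free_energy(prior, joint)          # q = prior: a loose bound
f_posterior = variational_free_energy(posterior, joint)  # q = posterior: the bound is tight
```

Free energy is an upper bound on surprise for any q and touches it exactly when q is the Bayesian posterior, which is why minimising it can be read as (approximate) Bayesian inference.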
Three important assumptions, or “moves”
Generalisation: q(η_t) encodes beliefs about the present, not the future, but
we assume that smart systems decompose their beliefs into facts (current
state of the world) + generative model (e.g., scientific laws).
Assuming that systems "use" q(η) to "choose" their next action to minimise
expected free energy (~ integral of future surprise), i.e., perform Active
Inference, is induction (if the system is a black box), unless systems are
explicitly designed [1] to do this or proven to explicitly do this.
Meta-theoretical move [2]: assuming that scientists (observers) observe
themselves as Active Inference systems "reifies" FEP as the basis of
semantics and rationality (i.e., a form of Bayesian epistemology; Deutsch
disapproves).
1. Friston et al. (2022). Designing Ecosystems of Intelligence from First Principles (arXiv:2212.01354). arXiv.
http://arxiv.org/abs/2212.01354
2. Ramstead, M. J. D., Sakthivadivel, D. A. R., & Friston, K. J. (2022). On the Map-Territory Fallacy Fallacy
(arXiv:2208.06924). arXiv. http://arxiv.org/abs/2208.06924
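The second move, choosing the next action by minimising expected free energy, can be sketched for a one-step discrete model. The decomposition into risk plus ambiguity is the form commonly used in the discrete-state active inference literature, not something stated in the talk; the states, observation likelihood, preference distribution and action names below are all hypothetical.

```python
import math


def expected_free_energy(q_states, likelihood, preferred_obs):
    """One-step expected free energy of an action, as risk + ambiguity.

    q_states[s]      : predicted distribution over states under the action
    likelihood[s][o] : p(o | s)
    preferred_obs[o] : prior (preferred) distribution over observations
    """
    n_states = len(q_states)
    n_obs = len(preferred_obs)
    # Predicted observations under the action.
    q_obs = [sum(q_states[s] * likelihood[s][o] for s in range(n_states))
             for o in range(n_obs)]
    # Risk: KL divergence from predicted to preferred observations.
    risk = sum(qo * math.log(qo / po)
               for qo, po in zip(q_obs, preferred_obs) if qo > 0.0)
    # Ambiguity: expected entropy of the observation likelihood.
    ambiguity = sum(
        q_states[s] * -sum(p * math.log(p) for p in likelihood[s] if p > 0.0)
        for s in range(n_states)
    )
    return risk + ambiguity


likelihood = [[0.9, 0.1], [0.1, 0.9]]   # p(o | s), hypothetical
preferred_obs = [0.8, 0.2]              # hypothetical prior over observations
actions = {"stay": [0.9, 0.1], "switch": [0.1, 0.9]}  # predicted states per action

G = {name: expected_free_energy(q, likelihood, preferred_obs)
     for name, q in actions.items()}
best = min(G, key=G.get)
```

Note that the "preference" enters only as another probability distribution, i.e., as a belief about which observations the system expects to encounter.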
Active Inference: against goals (objectives)
An Active Inference system's behaviour is caused (generated) by its
beliefs q(η) rather than its goals.
“Goals” appear only as future world states on highly predicted trajectories that
the system reflexively notices and records in memory to save computations in
the future. But even if thus recorded, goals remain in principle ephemeral and
discardable at any iteration in the active inference cycle (= OODA cycle).
See also: flaneuring (Taleb), open-endedness (Stanley & Lehman), lean
(Ries), etc., https://ailev.livejournal.com/1254147.html
⇒ Align beliefs instead of “specifying” goals. (Applies to alignment
between any intelligent systems on the same or different system levels, not
just to human–AI alignment. Cf. “managing with context, not control”.)
Definition of alignment
Informally: alignment is learning about each other, i.e., increasing mutual
capacity for predicting (signals from) each other.
FEP (with reference frame): Alignment is a physical interaction process (=
information exchange [1]) between two systems during which their internal
dynamics entail belief structures (or update their prior beliefs, from
their own perspectives) which decompose into causal generative
models with smaller transformation error [2] (caveat: acyclic graphs only)
and the fact beliefs (current world states) that are closer after causal
model transformation wrt. some distance measure (KL/JS divergence?).
Quantum FEP (w/o RF): quantum RF alignment across holographic
screen [1] = entanglement.
1. Fields, C., Friston, K., Glazebrook, J. F., & Levin, M. (2022). A free energy principle for generic quantum
systems. Progress in Biophysics and Molecular Biology, 173, 36–59.
https://doi.org/10.1016/j.pbiomolbio.2022.05.006
2. Rischel, E. F., & Weichwald, S. (2021). Compositional abstraction error and a category of causal models.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 1013–1023.
https://proceedings.mlr.press/v161/rischel21a.html
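The "distance measure" part of the definition can be sketched with the Jensen-Shannon divergence between two systems' fact beliefs. This sketch assumes the causal model transformation has already been applied, so both belief vectors range over the same three world states; the numbers are made up for illustration.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL[p || q] in nats (requires q > 0 wherever p > 0)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded above by ln 2."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical fact beliefs of systems A and B before and after an information exchange.
a_before, b_before = [0.7, 0.2, 0.1], [0.1, 0.3, 0.6]
a_after, b_after = [0.5, 0.25, 0.25], [0.3, 0.3, 0.4]

divergence_before = js(a_before, b_before)
divergence_after = js(a_after, b_after)
became_closer = divergence_after < divergence_before
```

JS divergence is symmetric and always finite, which makes it more convenient than plain KL when neither system's beliefs are privileged as the reference.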
Learning and aligning full world models is intractable
While AI architecture could be chosen to explicitly include a world
model [1,2,3], the architecture of human intelligence couldn't be chosen!
Discovering large causal graphs is extremely expensive: the search space
size grows as 2^(d*d), where d is the number of variables [4].
Humans (and other universal intelligences) learn many "local", incoherent
models, which they select contextually [5]. Monolithic q(η) doesn't exist.
Solution: design belief sharing (communication) protocols [1] and learning
environments that foster world model alignment without explicitly tracking
the world models.
1. Friston et al. (2022). Designing Ecosystems of Intelligence from First Principles (arXiv:2212.01354). arXiv.
http://arxiv.org/abs/2212.01354
2. LeCun, Y. (2022). A Path Towards Autonomous Machine Intelligence.
3. Zhou, G., Yao, L., Xu, X., Wang, C., Zhu, L., & Zhang, K. (2023). On the Opportunity of Causal Deep Generative
Models: A Survey and Future Directions (arXiv:2301.12351). arXiv. https://doi.org/10.48550/arXiv.2301.12351
4. Atanackovic, L., Tong, A., Hartford, J., Lee, L. J., Wang, B., & Bengio, Y. (2023). DynGFN: Bayesian Dynamic Causal
Discovery using Generative Flow Networks (arXiv:2302.04178). arXiv. https://doi.org/10.48550/arXiv.2302.04178
5. Fields, C., & Glazebrook, J. F. (2022). Information flow in context-dependent hierarchical Bayesian inference. Journal
of Experimental & Theoretical Artificial Intelligence, 34(1), 111–142. https://doi.org/10.1080/0952813X.2020.1836034
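A short sketch of how quickly this search space grows, reading 2^(d*d) as the number of directed graphs over d variables in which each of the d*d ordered pairs (self-loops included) either carries an edge or does not. The helper name is hypothetical.

```python
def num_directed_graphs(d):
    """Number of directed graphs on d labelled variables, self-loops allowed: 2^(d*d)."""
    return 2 ** (d * d)


# Search space size for a few small numbers of variables.
sizes = {d: num_directed_graphs(d) for d in (2, 5, 10)}
for d, size in sizes.items():
    print(f"d = {d:2d}: {size} candidate graphs")
```

Already at d = 10 the count has 31 decimal digits, so exhaustive search over causal structures is out of reach for any realistic world model.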
Hierarchy of alignment
The world model of a (self-modelling) Active Inference system could be
informally (because levels are still interdependent) separated into three
levels, roughly corresponding to self-modelling, world modelling, and
world state recognition:
1. Methodological (meta-)models: mathematics, philosophy of science,
meta-ethics, epistemology, rationality, semantics, communication, etc.
2. Science: laws of physics, chemistry, biology, intelligence, economics
3. Facts: the world state in terms of the models from 1. and 2.
Methodological alignment > scientific alignment > fact alignment
Goals are theory-of-mind-based objects that we should fact-learn about
each other to coordinate them in the context of a cooperative system
“game”.
LLMs are a dead end?
In LLMs, world models q(η) are hopelessly entangled with recognition
(perception, encoder) and planning (actor, in LeCun’s terms) “computations”.
Using human feedback as a signal, even during LLM pre-training [1], doesn't
explicitly transfer to them the ontologies that they should learn. (However,
the language feedback approach [2] could be shaped into something that we
want.)
Aligning with (and even productively communicating with) a system whose
world model is vastly larger and more complex is possible in principle, but
harder (cf. “humans don’t trade with ants”).
LeCun: LLMs are doomed [3] (for related but separate reasons).
1. Korbak, T., Shi, K., Chen, A., Bhalerao, R., Buckley, C. L., Phang, J., Bowman, S. R., & Perez, E. (2023).
Pretraining Language Models with Human Preferences (arXiv:2302.08582). arXiv.
https://doi.org/10.48550/arXiv.2302.08582
2. Scheurer, J., Korbak, T., & Perez, E. (2023). Imitation Learning from Language Feedback.
https://www.lesswrong.com/posts/mCZSXdZoNoWn5SkvE/imitation-learning-from-language-feedback-1
3. LeCun, Y. (2023, April 6). Do large language models need sensory grounding for meaning and understanding?
Yes! https://www.youtube.com/watch?v=x10964w00zk&t=1m30s
Active Inference is an essential, but not an exhaustive
perspective for ensuring AI alignment
Active Inference doesn’t capture the full complexity of behaviour of
intelligent systems.
Other general [1,2] and AI architecture-specific perspectives on alignment
should be taken simultaneously.
Constructor-theoretic perspective on alignment (non-Bayesian
probability)?
1. Boyd, A. B., Crutchfield, J. P., & Gu, M. (2022). Thermodynamic machine learning through maximum work production.
New Journal of Physics, 24(8), 083040. https://doi.org/10.1088/1367-2630/ac4309
2. Vanchurin, V. (2020). The World as a Neural Network. Entropy, 22(11), 1210. https://doi.org/10.3390/e22111210
AI alignment is essential, but not sufficient for the AGI
transition to go well
Control theory and system "zombie-fication" [1] perspective (aligned
zombies)
Game-theoretic and collective intelligence perspective (actors cannot
align from a multi-polar trap). Collective activity should produce aligned
supra-systems.
● The Collective Intelligence Project, https://cip.org/
Infosec [2] and general system fragility [3] perspectives: AI, bio weapons
of mass destruction
● Need next-gen infra: https://trustoverip.org/, data ownership a-la
https://solidproject.org/, proof-of-humanness a-la
https://worldcoin.org/, etc.
1. Doyle, J. (2021). Universal Laws and Architectures and Their Fragilities.
https://www.youtube.com/watch?v=Bf4hPlwU4ys
2. Ladish, J., & Heim, L. (2022). Information security considerations for AI and the long term future.
https://forum.effectivealtruism.org/posts/WqQDCCLWbYfFRwubf/information-security-considerations-for-ai-and-the-long-term
3. Bostrom, N. (2019). The Vulnerable World Hypothesis. Global Policy, 10(4), 455–476.
https://doi.org/10.1111/1758-5899.12718

(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
 
Microbiology of Central Nervous System INFECTIONS.pdf
Microbiology of Central Nervous System INFECTIONS.pdfMicrobiology of Central Nervous System INFECTIONS.pdf
Microbiology of Central Nervous System INFECTIONS.pdf
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
 
Methods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdfMethods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdf
 
8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
 
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfMending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 
Tissue fluids_etiology_volume regulation_pressure.pptx
Tissue fluids_etiology_volume regulation_pressure.pptxTissue fluids_etiology_volume regulation_pressure.pptx
Tissue fluids_etiology_volume regulation_pressure.pptx
 

AI alignment from the Active Inference perspective 2023.pdf

  • 1. AI alignment from the perspective of Active Inference. Roman Leventov. Moscow, 22-23 April 2023. Scientific and practical conference "Modern Systems Engineering and Management".
  • 2. Free Energy Principle: physical modelling basics. The FEP formalism assumes that the world is modelled as a set of variables x that comprise a random dynamical system [1], in discrete or continuous time: x'(t) = f(x, t) + w(t), where x' is the rate of change of the variables’ states, f is a state-dependent function (the flow), and w is noise. [1] Friston, K., Da Costa, L., Sakthivadivel, D. A. R., Heins, C., Pavliotis, G. A., Ramstead, M., & Parr, T. (2022). Path integrals, particular kinds, and strange things (arXiv:2210.12761). arXiv. http://arxiv.org/abs/2210.12761
  • 3. Free Energy Principle basics: sparse coupling conjecture. A system is (approximately) causally separated from the environment between t_0 and now. μ are internal states, s are sensory states, a are active states, b = (s, a) are boundary states, η are external states. Illustration from Friston, K. (2019). A free energy principle for a particular physics (arXiv:1906.10184). arXiv. http://arxiv.org/abs/1906.10184
  • 4. FEP: path integral formulation (path-tracking dynamics). Semantics are associated only with physical dynamics rather than static physical states [1]. Semantics = a commuting mapping from physical objects to mathematical objects. μ_t, b_t, η_t are paths (trajectories) of states, i.e., physical dynamics. ∀ b_t: ∃ p(η_t | b_t), a conditional density; μ_t is the path of least action of internal states ⇒ ∃ q: μ_t → p(η_t | b_t), a semantic mapping from the path of internal system states to beliefs about external state trajectories (a mathematical object) [2]. VFE lemma [2]: system state dynamics can be seen as a form of Bayesian inference of q(η_t), a variational density over external paths, with respect to some prior and evidence b_t. ⇒ Duality of physical and belief (mathematical) dynamics (“Bayesian mechanics”) [3]. [1] Fields, C., Friston, K., Glazebrook, J. F., & Levin, M. (2022). A free energy principle for generic quantum systems. Progress in Biophysics and Molecular Biology, 173, 36–59. https://doi.org/10.1016/j.pbiomolbio.2022.05.006 [2] Friston, K., Da Costa, L., Sakthivadivel, D. A. R., Heins, C., Pavliotis, G. A., Ramstead, M., & Parr, T. (2022). Path integrals, particular kinds, and strange things (arXiv:2210.12761). arXiv. http://arxiv.org/abs/2210.12761 [3] Ramstead, M. J. D., Sakthivadivel, D. A. R., Heins, C., Koudahl, M., Millidge, B., Da Costa, L., Klein, B., & Friston, K. J. (2023). On Bayesian Mechanics: A Physics of and by Beliefs (arXiv:2205.11543). arXiv. https://doi.org/10.48550/arXiv.2205.11543
  • 5. Three important assumptions, or “moves”. Generalisation: q(η_t) encodes beliefs about the present, not the future, but we assume that smart systems decompose their beliefs into facts (the current state of the world) + a generative model (e.g., scientific laws). Assuming that systems “use” q(η) to “choose” their next action to minimise expected free energy (~ integral of future surprise), i.e., perform Active Inference, is induction (if the system is a black box), unless systems are explicitly designed [1] to do this or proven to explicitly do this. Meta-theoretical move [2]: assuming that scientists (observers) observe themselves as Active Inference systems “reifies” FEP as the basis of semantics and rationality (i.e., a form of Bayesian epistemology; Deutsch disapproves). [1] Friston et al. (2022). Designing Ecosystems of Intelligence from First Principles (arXiv:2212.01354). arXiv. http://arxiv.org/abs/2212.01354 [2] Ramstead, M. J. D., Sakthivadivel, D. A. R., & Friston, K. J. (2022). On the Map-Territory Fallacy Fallacy (arXiv:2208.06924). arXiv. http://arxiv.org/abs/2208.06924
  • 6. Active Inference: against goals (objectives). An Active Inference system’s behaviour is caused (generated) by its beliefs q(η) rather than by its goals. “Goals” appear only as future world states on highly predicted trajectories that the system reflexively notices and records in memory to save computation in the future. But even if thus recorded, goals remain in principle ephemeral and discardable at any iteration of the active inference cycle (= OODA cycle). See also: flaneuring (Taleb), open-endedness (Stanley & Lehman), lean (Ries), etc., https://ailev.livejournal.com/1254147.html ⇒ Align beliefs instead of “specifying” goals. (Applies to alignment between any intelligent systems on the same or different system levels, not just to human–AI alignment. Cf. “managing with context, not control”.)
  • 7. Definition of alignment. Informally: alignment is learning about each other, i.e., increasing mutual capacity for predicting (signals from) each other. FEP (with reference frame): alignment is a physical interaction process (= information exchange [1]) between two systems during which their internal dynamics entail belief structures (or update their prior beliefs, from their own perspectives) which decompose into causal generative models with smaller transformation error [2] (caveat: acyclic graphs only) and fact beliefs (current world states) that are closer after causal model transformation with respect to some distance measure (KL/JS divergence?). Quantum FEP (without RF): quantum RF alignment across the holographic screen [1] = entanglement. [1] Fields, C., Friston, K., Glazebrook, J. F., & Levin, M. (2022). A free energy principle for generic quantum systems. Progress in Biophysics and Molecular Biology, 173, 36–59. https://doi.org/10.1016/j.pbiomolbio.2022.05.006 [2] Rischel, E. F., & Weichwald, S. (2021). Compositional abstraction error and a category of causal models. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 1013–1023. https://proceedings.mlr.press/v161/rischel21a.html
  • 8. Learning and aligning full world models is intractable. While AI architecture could be chosen to explicitly include a world model [1, 2, 3], the architecture of human intelligence couldn’t be chosen! Discovering large causal graphs is extremely expensive: the search space size grows as 2^(d·d), where d is the number of variables [4]. Humans (and other universal intelligences) learn many “local”, incoherent models, which they select contextually [5]. A monolithic q(η) doesn’t exist. Solution: design belief sharing (communication) protocols [1] and learning environments that foster world model alignment without explicitly tracking the world models. [1] Friston et al. (2022). Designing Ecosystems of Intelligence from First Principles (arXiv:2212.01354). arXiv. http://arxiv.org/abs/2212.01354 [2] LeCun, Y. (2022). A Path Towards Autonomous Machine Intelligence. [3] Zhou, G., Yao, L., Xu, X., Wang, C., Zhu, L., & Zhang, K. (2023). On the Opportunity of Causal Deep Generative Models: A Survey and Future Directions (arXiv:2301.12351). arXiv. https://doi.org/10.48550/arXiv.2301.12351 [4] Atanackovic, L., Tong, A., Hartford, J., Lee, L. J., Wang, B., & Bengio, Y. (2023). DynGFN: Bayesian Dynamic Causal Discovery using Generative Flow Networks (arXiv:2302.04178). arXiv. https://doi.org/10.48550/arXiv.2302.04178 [5] Fields, C., & Glazebrook, J. F. (2022). Information flow in context-dependent hierarchical Bayesian inference. Journal of Experimental & Theoretical Artificial Intelligence, 34(1), 111–142. https://doi.org/10.1080/0952813X.2020.1836034
  • 9. Hierarchy of alignment. The world model of a (self-modelling) Active Inference system could be informally (because the levels are still interdependent) separated into three levels, roughly corresponding to self-modelling, world modelling, and world state recognition: 1. Methodological (meta-)models: mathematics, philosophy of science, meta-ethics, epistemology, rationality, semantics, communication, etc. 2. Science: laws of physics, chemistry, biology, intelligence, economics. 3. Facts: the world state in terms of the models from 1. and 2. Methodological alignment > scientific alignment > fact alignment [1]. Goals are theory-of-mind-based objects that we should fact-learn about each other to coordinate them in the context of a cooperative system “game”.
  • 10. LLMs are a dead end? In LLMs, world models q(η) are hopelessly entangled with recognition (perception, encoder) and planning (actor, in LeCun’s terms) “computations”. Using human feedback as a signal even during LLM pre-training [1] doesn’t explicitly transfer to them the ontologies that they should learn. (However, the language feedback approach [2] could be shaped into something that we want.) Aligning with (and even productively communicating with) a system whose world model is vastly larger and more complex is possible in principle, but harder (cf. “humans don’t trade with ants”). LeCun: LLMs are doomed [3] (for related but separate reasons). [1] Korbak, T., Shi, K., Chen, A., Bhalerao, R., Buckley, C. L., Phang, J., Bowman, S. R., & Perez, E. (2023). Pretraining Language Models with Human Preferences (arXiv:2302.08582). arXiv. https://doi.org/10.48550/arXiv.2302.08582 [2] Scheurer, J., Korbak, T., & Perez, E. (2023). Imitation Learning from Language Feedback. https://www.lesswrong.com/posts/mCZSXdZoNoWn5SkvE/imitation-learning-from-language-feedback-1 [3] LeCun, Y. (2023, April 6). Do large language models need sensory grounding for meaning and understanding? Yes! https://www.youtube.com/watch?v=x10964w00zk&t=1m30s
  • 11. Active Inference is an essential, but not an exhaustive perspective for ensuring AI alignment. Active Inference doesn’t capture the full complexity of the behaviour of intelligent systems. Other general [1, 2] and AI architecture-specific perspectives on alignment should be taken simultaneously. Constructor-theoretic perspective on alignment (non-Bayesian probability)? [1] Boyd, A. B., Crutchfield, J. P., & Gu, M. (2022). Thermodynamic machine learning through maximum work production. New Journal of Physics, 24(8), 083040. https://doi.org/10.1088/1367-2630/ac4309 [2] Vanchurin, V. (2020). The World as a Neural Network. Entropy, 22(11), 1210. https://doi.org/10.3390/e22111210
  • 12. AI alignment is essential, but not sufficient for the AGI transition to go well. Control theory and system “zombie-fication” [1] perspective (aligned zombies). Game-theoretic and collective intelligence perspective (actors cannot align from a multi-polar trap): collective activity should produce aligned supra-systems; see The Collective Intelligence Project, https://cip.org/. Infosec [2] and general system fragility [3] perspectives: AI, bio weapons of mass destruction; need next-gen infra: https://trustoverip.org/, data ownership a-la https://solidproject.org/, proof-of-humanness a-la https://worldcoin.org/, etc. [1] Doyle, J. (2021). Universal Laws and Architectures and Their Fragilities. https://www.youtube.com/watch?v=Bf4hPlwU4ys [2] Ladish, J., & Heim, L. (2022). Information security considerations for AI and the long term future. https://forum.effectivealtruism.org/posts/WqQDCCLWbYfFRwubf/information-security-considerations-for-ai-and-the-long-term [3] Bostrom, N. (2019). The Vulnerable World Hypothesis. Global Policy, 10(4), 455–476. https://doi.org/10.1111/1758-5899.12718
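
Sketch for slide 2: the random dynamical system x'(t) = f(x, t) + w(t) can be simulated numerically in continuous time with the Euler-Maruyama scheme. This is only an illustration under stated assumptions, not something taken from the talk: the state is scalar, w is taken to be Gaussian white noise, and the names `simulate`, `relaxing_flow` and `noise_scale` are hypothetical.

```python
import math
import random


def simulate(flow, x0, dt, steps, noise_scale, seed=0):
    """Euler-Maruyama integration of x'(t) = f(x, t) + w(t) for a scalar x.

    Assumption: w is Gaussian white noise with amplitude noise_scale.
    """
    rng = random.Random(seed)
    x, t = x0, 0.0
    path = [x]
    for _ in range(steps):
        # Noise increment over one step: its variance is proportional to dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + flow(x, t) * dt + noise_scale * dw
        t += dt
        path.append(x)
    return path


def relaxing_flow(x, t):
    # Hypothetical flow f(x, t) that pulls the state back toward zero.
    return -x


path = simulate(relaxing_flow, x0=1.0, dt=0.01, steps=1000, noise_scale=0.1)
print(len(path), path[0])
```

Setting `noise_scale=0.0` recovers the deterministic flow, which is a quick way to check the integrator before adding noise.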
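
Sketch for slide 7: the definition of alignment leaves the distance measure between fact beliefs open ("KL/JS divergence?"). The snippet below shows how those two candidates could be computed, assuming (purely for illustration) that each system's fact beliefs are a discrete distribution over the same finite set of world states. The belief vectors `beliefs_a` and `beliefs_b` are invented numbers.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions over the same outcomes.

    Raises ZeroDivisionError if q assigns zero mass where p does not.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by ln 2."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Hypothetical fact beliefs of two systems over three world states.
beliefs_a = [0.7, 0.2, 0.1]
beliefs_b = [0.5, 0.3, 0.2]

print("KL(a || b):", kl_divergence(beliefs_a, beliefs_b))
print("KL(b || a):", kl_divergence(beliefs_b, beliefs_a))
print("JS(a, b):  ", js_divergence(beliefs_a, beliefs_b))
```

KL divergence is asymmetric, so it measures how surprising one system's beliefs are from the other's perspective. JS divergence is symmetric, which may fit the "mutual" framing of the informal definition better.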
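
Sketch for slide 8: to give a feel for why discovering large causal graphs is expensive, the snippet below counts candidate graph structures. It assumes the slide's growth rate is read as 2 to the power d·d, i.e. one binary choice per entry of a d × d adjacency matrix. The function name is hypothetical.

```python
def adjacency_matrix_count(d):
    """Number of binary d x d adjacency matrices: each of the d*d entries is 0 or 1."""
    return 2 ** (d * d)


for d in (2, 5, 10, 20):
    count = adjacency_matrix_count(d)
    print(f"d = {d:2d}: about 10^{len(str(count)) - 1} candidate graphs")
```

Even a modest number of variables puts exhaustive search out of reach, which is the motivation the slide gives for belief-sharing protocols over explicit tracking of full world models.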