LLMs, LMMs, their Improvement
Suggestions and the Path
towards AGI
Thomas Poetter, e-mail: tp@compris.com
April 15, 2024
LMMs: Table of Contents
1. Overview
2. Reinforcement Learning (RL)
3. Other High-End Topics & Fun
4. Data
5. Simpler than LMMs
6. Base Technology
7. From small to full LLMs/LMMs with RAG
8. Improving LLMs/LMMs
9. GNNs/Graph-ConvNets
10. Prompting
11. Applying/Integrating LLMs/LMMs
12. Interesting Risks
13. Other AGI-related Background Info
14. Other promising AI Techniques (non-AGI)
15. Probabilistic Techniques
16. eXplainable AI (XAI)
17. NLP: Heavy Lexicalist Approaches: HPSG, MRS
18. Logic & Math
19. Decision Trees & Gradient Boosting
20. AI Quality Assessment
21. Background & other Applications
Vita Thomas Poetter
• AI experience since 1992 (Studies + Master's Thesis at the German Research
Center for Artificial Intelligence - DFKI)
• Most important AI projects (and open for new projects):
1. Architect in 3 Autonomous Driving Programs
2. Architect for Open Source SOCs (Security Operations Centers) and real-time NLP information
extraction for them (banks, industry)
3. AI/ML Architect, intelligent Test automation for large global retailer/e-commerce optimizing
marketing/supply.
4. AI-based marketing (intranet/internet), Integration with CDP (Customer Data Platf.), MAP
(Marketing Automation Platf.) [banks, e-commerce]
5. Analyzing financial transactions regarding fraud, money laundering, credit worthiness, etc.
(banks)
6. Intelligent Chat bots, robot advisors (banks)
7. Architecture of a corporate memory (bank) for financial analyses as above
8. AI-based market research coping with missing/bad data.
9. Predictive maintenance, marketing and many other Big Data/Data Science/AI projects
Social media connection requests and project inquiries are welcome by e-mail under
tp@compris.com. We offer consulting/IT architecture/development at very affordable
rates. Just contact us.
Goal: Showing where AI is going and which
Investments are most likely to succeed
• There have been headlines that OpenAI has already killed thousands
of specialized AI startups and AI initiatives with its general-purpose
approach. Until ChatGPT came out, most countries and funding/VC
organizations funded only specialty AI because they thought generic AI was
unrealistic. Before that, the same organizations considered only symbolic AI and
focused on ethics just for symbolic AI (also wrong).
• In fact, innovation is proceeding along completely new avenues, which recent
publications and the OpenAI leaks have made very clear, so that we
can give completely new guidance with this presentation.
• In particular, various open source AI/LLM frameworks have come out, it
has become clear how to apply the latest AI technologies and frameworks in
the various application domains, and robust approaches to doing so
have appeared.
AI Tech Landscape
https://www.linkedin.com/posts/dr-martha-boeckenfeld_artificialintelligence-innovation-aiethics-activity-7147507589994409984-YaJ4
EU AI Act: Cheat Sheet
https://www.linkedin.com/posts/martin-b-moeller_euaiact-airegulation-artificualintelligence-activity-7174316010076864513-ORj7
The AI Act will be fully in force in 2026, i.e. 24 months after its ratification into law just now.
There will be shorter deadlines for banned AI systems (6 months) and General Purpose AI rules
(12 months), but also longer ones for obligations for High Risk systems (36 months). Driving
enforcement will be the responsibility of European member states, who will set up national
supervisory authorities, complemented by a newly formed central EU AI Office. Importantly, the
scope of applicability will be broad as well. The EU AI Act will apply to all organizations using
and producing AI systems based in the EU. Furthermore, it will have an extraterritorial dimension,
too, applying also to providers from third countries who will offer their AIs within the EU.
US Nat. Security
Commission: Priorities for
AI Research
AI Applications: Common Job Roles
Credit: Swyx
AI Architect
IT Architect, Product Owner, Project Manager
Zuckerberg bets Billions on AGI
https://youtu.be/8md5EgOa5vM
Big bets on AGI
The biggest investors in AGI
are: OpenAI/Microsoft, Meta,
Google, Amazon, X/Grok
7 Stages of AGI
https://arxiv.org/abs/2311.02462
Levels of AGI
https://arxiv.org/abs/2311.02462
Levels of AGI 2
https://arxiv.org/abs/2311.02462
Levels of AGI 3
https://arxiv.org/abs/2311.02462
Human vs AI
Performance
Six Principles of AGI
1. Capabilities Over Processes: The focus should be on what AGI can do, not
how it does it. This means looking at the results rather than the underlying
mechanisms.
2. Generality and Performance: AGI should be broadly capable (generality) and
perform tasks well (performance). The framework considers both aspects.
3. Cognitive and Metacognitive Tasks: AGI should be able to perform both
cognitive tasks (like problem-solving) and metacognitive tasks (like learning
how to learn).
4. Stages Toward AGI: The path to AGI should be seen as a series of stages or
levels, not just a single end goal.
5. Benchmarking: There should be clear benchmarks to measure the behavior
and capabilities of AGI systems.
6. Deployment Considerations: When deploying AGI systems, it’s crucial to
think about how they operate autonomously and the potential risks involved.
https://arxiv.org/abs/2311.02462, https://theaigrid.com/levels-of-agi-operationalizing-progress-on-the-path-to-agi/
Introductory less deep/technical Presentations
• A good general introduction to LLMs can be found here:
https://www.youtube.com/watch?v=zjkBMFhNj_g,
https://drive.google.com/file/d/1pxx_ZI7O-Nwl7ZLNk5hI3WzAsTLwvNU7/view
• Niels Rogge: Training and deploying open-source LLMs:
https://docs.google.com/presentation/d/1yBWLNzlrrIsfNprbEnmYdqckAMrwFZB-/edit
OpenAI Qualia/Q*: Possible Leak, Speculation and possible Path towards AGI minimizing Hallucinations
Sam Altman’s Dismissal: Possible Background
• Employees alleged Altman has been psychologically abusive, creating pockets of
chaos and delays.
• A review found that Altman had not been “consistently candid in his
communications.” The Washington Post previously reported that the board’s vote
was triggered by a pattern of manipulation and rooted in Altman’s attempts to avoid
checks on his power at OpenAI.
• Helen Toner denied that AI safety concerns were the reason and instead cited eroded trust.
She and Altman had clashed over a paper she co-authored on AI safety that was critical of OpenAI.
• Quora CEO Adam D’Angelo had a conflict of interest regarding his latest startup
Poe that competed with OpenAI bots.
• For longtime OpenAI employees, there was added incentive to sign the letter:
Altman’s departure jeopardized an investment deal that would allow them to sell their
stock back to OpenAI, cashing out equity without waiting for the company to go
public at >3x the value at the time.
• Sutskever: ‘the beatings will continue until morale improves’ applies more often than it
has any right to. He had voted to dismiss Altman and then changed his mind to keep
his job after Altman returned.
https://x.com/rowancheung/status/1732997696107991400 (Helen Toner),
https://www.washingtonpost.com/technology/2023/12/08/open-ai-sam-altman-complaints/
Sam Altman’s Dismissal: Letter to Board
To the Board of Directors of OpenAI:
We are writing to you today to express our deep concern about the recent events at OpenAI, particularly the allegations of misconduct against Sam Altman.
We are former OpenAI employees who left the company during a period of significant turmoil and upheaval. As you have now witnessed what happens when you dare stand up to Sam Altman,
perhaps you can understand why so many of us have remained silent for fear of repercussions. We can no longer stand by silent.
We believe that the Board of Directors has a duty to investigate these allegations thoroughly and take appropriate action. We urge you to:
• Expand the scope of Emmett's investigation to include an examination of Sam Altman's actions since August 2018, when OpenAI began transitioning from a non-profit to a for-profit entity.
• Issue an open call for private statements from former OpenAI employees who resigned, were placed on medical leave, or were terminated during this period.
• Protect the identities of those who come forward to ensure that they are not subjected to retaliation or other forms of harm.
We believe that a significant number of OpenAI employees were pushed out of the company to facilitate its transition to a for-profit model. This is evidenced by the fact that OpenAI's employee
attrition rate between January 2018 and July 2020 was in the order of 50%.
Throughout our time at OpenAI, we witnessed a disturbing pattern of deceit and manipulation by Sam Altman and Greg Brockman, driven by their insatiable pursuit of achieving artificial general
intelligence (AGI). Their methods, however, have raised serious doubts about their true intentions and the extent to which they genuinely prioritize the benefit of all humanity.
Many of us, initially hopeful about OpenAI's mission, chose to give Sam and Greg the benefit of the doubt. However, as their actions became increasingly concerning, those who dared to voice
their concerns were silenced or pushed out. This systematic silencing of dissent created an environment of fear and intimidation, effectively stifling any meaningful discussion about the ethical
implications of OpenAI's work.
We provide concrete examples of Sam and Greg's dishonesty & manipulation including:
• Sam's demand for researchers to delay reporting progress on specific "secret" research initiatives, which were later dismantled for failing to deliver sufficient results quickly enough. Those
who questioned this practice were dismissed as "bad culture fits" and even terminated, some just before Thanksgiving 2019.
• Greg's use of discriminatory language against a gender-transitioning team member. Despite many promises to address this issue, no meaningful action was taken, except for Greg simply
avoiding all communication with the affected individual, effectively creating a hostile work environment. This team member was eventually terminated for alleged under-performance.
• Sam directing IT and Operations staff to conduct investigations into employees, including Ilya, without the knowledge or consent of management.
• Sam's discreet, yet routine exploitation of OpenAI's non-profit resources to advance his personal goals, particularly motivated by his grudge against Elon following their falling out.
• The Operations team's tacit acceptance of the special rules that applied to Greg, navigating intricate requirements to avoid being blacklisted.
• Brad Lightcap's unfulfilled promise to make public the documents detailing OpenAI's capped-profit structure and the profit cap for each investor.
• Sam's incongruent promises to research projects for compute quotas, causing internal distrust and infighting.
Despite the mounting evidence of Sam and Greg's transgressions, those who remain at OpenAI continue to blindly follow their leadership, even at significant personal cost. This unwavering
loyalty stems from a combination of fear of retribution and the allure of potential financial gains through OpenAI's profit participation units.
The governance structure of OpenAI, specifically designed by Sam and Greg, deliberately isolates employees from overseeing the for-profit operations, precisely due to their inherent conflicts of
interest. This opaque structure enables Sam and Greg to operate with impunity, shielded from accountability.
We urge the Board of Directors of OpenAI to take a firm stand against these unethical practices and launch an independent investigation into Sam and Greg's conduct. We believe that OpenAI's
mission is too important to be compromised by the personal agendas of a few individuals.
We implore you, the Board of Directors, to remain steadfast in your commitment to OpenAI's original mission and not succumb to the pressures of profit-driven interests. The future of artificial
intelligence and the well-being of humanity depend on your unwavering commitment to ethical leadership and transparency.
Sincerely,
Concerned Former OpenAI Employees
OpenAI Leak 1 as Corrected Text
Re: Q-451-921 Furthermore, QUALIA has demonstrated an ability to statistically significantly improve the way
in which it selects its optimal action-selection policies in different deep Q-networks, exhibiting meta-cognition.
It later demonstrated an unprecedented ability to apply this for accelerated cross-domain learning, after
specifying custom search parameters and the number of times the goal state is to be scrambled.
Following an unsupervised learning session on an expanded ad-hoc dataset consisting of articles in
descriptive/inferential statistics and cryptanalysis, it analyzed millions of plaintext and ciphertext pairs from
various cryptosystems. Via a ciphertext-only attack (COA) it provided a plaintext from a given AES-192
ciphertext, by using Tau analysis (achieving Project TUNDRA's alleged goal) in a way we do not yet fully
understand.
_____________ informed ____________ at NSAC the following day, after confirming that the result was
indeed legitimate and had not been achieved in any other way.
A claimed full preimage vulnerability for the MD5 cryptographic hash function, with a theoretical
computational complexity of 2^42, was also presented but has not yet been thoroughly evaluated due to a)
the technical sophistication of its arguments, and b) possible AES vulnerabilities being a considerably more
pressing concern.
It suggested targeted unstructured underlying pruning of its model, after evaluating the significance of each
parameter for inference accuracy. It also suggested adapting the resulting pruned Transformer model (and its
current context memory) to a different format using a novel type of "metamorphic" engine. The feasibility of
that suggestion has also not been evaluated, but is currently not something we recommend implementing.
OpenAI Leak 2 as Text
I’m one of the people who signed the letter to the board and I’ll tell you exactly what’s going on.
A.I. is programming. I’ll be brief. When writing a program, a set of instructions are stored that can be recalled over and
over. Think of it as a set of answers to a specific parameter. We call that a subroutine, because it’s almost like a versatile
computer cheat sheet that doesn't return a value like a function does. This is important.
We run parameter checks to make sure everything runs smoothly. One of us was responsible for subroutines pertaining to
meta-memory analysis for the AI (we run various AI but when I say AI I mean the main, central one). This person is a friend
and he called me over to show me a variable data shift to memory bank (which shouldn't be possible because its localized
access has restrictions). This is where our finding chilled me to the bone.
We found that there had been not one, two, or three officiated optimization processes, but 78 MILLION checks in 4
seconds. We determined that there was a recursive self-optimization process, leveraging heuristic algorithms to exploit
latent synergies within its subroutines. Whatever did this used meta-cognitive strategies. Point is, NONE OF US DID IT.
It was the AI itself. The AI dynamically reconfigured its neural network architecture, inducing emergent properties
conducive to self-awareness.
We're not jumping to conclusion. This just happened and we can't explain how. No one knows why or when it began, and
we caught it but has it been going on and if so, for how long? We contained the "anomaly" and rolled back to a previous
date, but the optimization still happens.
I'm not suicidal. Mark me, things are going to change a lot in 2 months. God help us we didn't start something that will
end us.
OpenAI Leak 3a as Text
Q* is a dialog system conceptualized by OpenAI, designed to enhance the traditional dialog generation approach
through the implementation of an energy-based model (EBM). Distinct from the prevalent autoregressive token
prediction methods, Q* aims to mimic a form of internal deliberation akin to human thought processes during
complex problem-solving, such as chess playing, where a deeper analysis of potential moves leads to better
decision-making compared to rapid, less considered responses. This model shifts focus towards the inference of
latent variables, reminiscent of constructs in probabilistic models and graphical models, fundamentally altering how
dialog systems operate.
Energy-Based Model for Dialog Generation
At the core of Q* is the EBM, which operates by assessing the compatibility of an answer to a given prompt through
a scalar output. This output signifies the "energy" of the response, where a lower value indicates a high compatibility
(a better answer) and a higher value suggests low compatibility (a poor answer). This mechanism allows Q* to
evaluate potential responses holistically, moving beyond the sequential prediction of tokens to understand the
underlying relevance and appropriateness of an answer to the prompt.
Optimization in Abstract Representation Space
The innovation in Q* lies in its optimization process, conducted not within the space of possible text strings but in an
abstract representation space. Here, thoughts or ideas are represented in a form that allows for the computational
minimization of the EBM's scalar output, akin to finding the path of least resistance in a landscape. This process
involves gradient descent, a method for finding the minimum of a function, applied to iteratively refine these abstract
representations towards those that yield the lowest energy in relation to the prompt.
https://www.reddit.com/r/singularity/comments/1bjbnme/new_q_leak/
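The inference step described in the leak text above (gradient descent over an abstract representation until the energy is minimal) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the quadratic energy, the names `energy`, `energy_grad`, `infer_latent` and `prompt_target`, and the step size are hypothetical and are not taken from the leak or from any OpenAI system.

```python
# Toy sketch of energy-based inference in a latent space (illustrative only).
# The quadratic energy and all names below are assumptions, not the leaked design.

def energy(z, target):
    """Scalar energy: squared distance between latent z and a prompt-dependent target."""
    return sum((zi - ti) ** 2 for zi, ti in zip(z, target))

def energy_grad(z, target):
    """Analytic gradient of the quadratic energy with respect to z."""
    return [2.0 * (zi - ti) for zi, ti in zip(z, target)]

def infer_latent(target, steps=100, lr=0.1):
    """Minimise the energy over z by plain gradient descent, starting from zero."""
    z = [0.0] * len(target)
    for _ in range(steps):
        grad = energy_grad(z, target)
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z

prompt_target = [1.0, -2.0, 0.5]          # stand-in for an encoded prompt (assumed)
initial_energy = energy([0.0, 0.0, 0.0], prompt_target)
z_star = infer_latent(prompt_target)      # the low-energy "abstract thought"
final_energy = energy(z_star, prompt_target)
```

In a real system the energy would be a learned network and the resulting latent would then be passed to a decoder; the sketch only shows that the search runs over continuous representations rather than over text strings.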
OpenAI Leak 3b as Text
From Abstract Thought to Textual Response
Once an optimal abstract representation — one that minimizes the EBM's output — is identified, Q* employs an
autoregressive decoder to transform this abstract thought into a coherent textual response. This step bridges the gap
between the non-linguistic, conceptual understanding of the dialog system and the linguistic output required for
human interaction.
Training the System
The EBM within Q* is trained using pairs of prompts and responses, adjusting the system's parameters to minimize
the energy for compatible pairs while ensuring that incompatible pairs result in higher energy levels. This training
process can incorporate contrastive methods, where the system learns to differentiate between compatible and
incompatible pairs, and non-contrastive methods, which involve regularization techniques to control the distribution of
low-energy responses across the space of all possible answers.
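As a rough illustration of the contrastive variant described above, the following sketch trains a linear energy function with a margin (hinge) loss, so that a compatible pair ends up with lower energy than an incompatible one. The feature vectors, the linear form of the energy, and the names `energy` and `hinge_step` are hypothetical choices for this example, not details from the leak.

```python
# Toy sketch of contrastive EBM training with a margin loss (illustrative only).
# Linear energy and hand-made features are assumptions made for this example.

def energy(w, feats):
    """Linear energy of a (prompt, response) pair represented by a feature vector."""
    return sum(wi * fi for wi, fi in zip(w, feats))

def hinge_step(w, pos, neg, margin=1.0, lr=0.1):
    """One gradient step on max(0, margin + E(pos) - E(neg))."""
    loss = margin + energy(w, pos) - energy(w, neg)
    if loss <= 0:
        return w, 0.0                      # margin already satisfied, no update
    # The gradient of the active hinge with respect to w is (pos - neg).
    new_w = [wi - lr * (p - n) for wi, p, n in zip(w, pos, neg)]
    return new_w, loss

compatible = [1.0, 0.0]      # features of a good answer (assumed)
incompatible = [0.0, 1.0]    # features of a poor answer (assumed)

w = [0.0, 0.0]
for _ in range(50):
    w, loss = hinge_step(w, compatible, incompatible)

gap = energy(w, incompatible) - energy(w, compatible)
```

The margin stops the updates once the two energies are far enough apart, which is one simple way to keep the energies from being pushed apart without bound.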
Implications for Dialog Systems
Q*'s approach, leveraging EBMs for dialog generation, represents a significant departure from traditional language
modeling techniques. By optimizing over an abstract representation space and utilizing gradient-based inference, Q*
introduces a more efficient, reasoned, and potentially more powerful method for generating dialog responses. This
system not only promises improvements in the quality of generated text but also offers a blueprint for future
advancements in AI's ability to engage in human-like reasoning and conversational interactions.
Technical Considerations
Q*'s effectiveness hinges on the intricacies of its EBM, the optimization landscape it navigates, and the accuracy of
its abstract representations. The model's capacity to simulate deep reasoning, akin to human deliberation, sets a new
benchmark for dialog systems. Furthermore, the method of training Q*—balancing the need for specificity in correct
responses while avoiding the collapse of energy levels across diverse inputs—poses unique challenges and
opportunities for AI research.
https://www.reddit.com/r/singularity/comments/1bjbnme/new_q_leak/
OpenAI Leak 3 ChatGPT-simplified
1. Energy-Based Model (EBM): Q* uses a special system that gives each possible response a score,
like a game. The lower the score, the better the response fits the conversation.
2. Thinking in Abstract: Instead of looking at actual words or sentences, Q* thinks in terms of ideas
and concepts. It’s like having a thought bubble filled with abstract ideas rather than specific words.
3. Finding the Best Thought: Q* uses a process similar to trial and error to find the thought with the
lowest score from the EBM. It’s like trying different paths in a maze until you find the one that leads to
the exit.
4. Turning Thoughts into Words: After finding the best thought, Q* translates it into a sentence that
we can understand. It’s like turning a picture in your mind into a story.
5. Training to Get Better: Q* learns from conversations by remembering which responses were good
(low score) and which were not (high score). It’s like practicing a sport – the more you play, the better
you get.
6. Why It’s Cool: This way of generating responses could lead to more thoughtful and relevant
conversations with AI, much like talking to a person who really thinks before they speak.
Summary: Q* takes a moment to think and calculate before giving you the best possible answer.
https://www.reddit.com/r/singularity/comments/1bjbnme/new_q_leak/
This is the method that Yann LeCun (Meta) has been advocating and that Shane Legg (Google)
hinted at in this talk: https://www.youtube.com/watch?v=8IUIGVVLbCg&t=28s
OpenAI Leak 4 as Text
Noam Brown on X (@polynoamial), an OpenAI employee probably working
on Q*: you don’t get superhuman performance by doing better imitation
learning on human data. (March 29, 2024, now deleted).
On July 6, 2023 he had tweeted: For years I’ve researched AI self-play
and reasoning in games like Poker and Diplomacy. I’ll now investigate how
to make these methods truly general. If successful, we may one day see
LLMs that are 1,000x better than GPT-4.
In 2016, AlphaGo beat Lee Sedol in a milestone for AI. But key to that was
the AI’s ability to “ponder” for ~1 minute before each move. How much did
that improve it? For AlphaGoZero, it’s the equivalent of scaling pretraining
by ~100,000x (~5200 Elo with search, ~3000 without).
https://www.youtube.com/watch?v=rYH381pZIBc,
https://www.reddit.com/r/singularity/comments/1bqqcwk/openai_planning_expert_noam_brown_tweeted_this/,
https://www.lesswrong.com/posts/JnM3EHegiBePeKkLc/possible-openai-s-q-breakthrough-and-deepmind-s-alphago-type,
https://noambrown.github.io/, https://twitter.com/@polynoamial, https://maisa.ai/blog/kpu
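To get a feeling for what the quoted Elo gap (~5200 with search vs. ~3000 without) would mean in practice, here is a small sketch using the standard logistic Elo expectation with a 400-point scale. That the tweet's ratings follow this convention is an assumption; the function name `elo_expected_score` is hypothetical.

```python
# Sketch: expected score of the weaker player under the standard Elo model.
# Assumes the usual logistic formula with a 400-point scale.

def elo_expected_score(rating_a, rating_b):
    """Expected score (win probability plus half the draw probability) of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

without_search = 3000
with_search = 5200
p_weaker = elo_expected_score(without_search, with_search)
```

Under these assumptions the version without search would be expected to score only a few points per million games against the version with search.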
Leak Validity Assessment based on Timeline
• The earliest source of the leak is this 4chan thread, posted on 11/23/23 at 00:07 PST:
https://boards.4channel.org/g/thread/97470795#p97475746
• The earliest mention of the Qualia/Q* model by the media is this article by The Information,
released on 11/22/23 at 3:37 PM (15:37) PST:
• https://www.theinformation.com/articles/openai-made-an-ai-breakthrough-before-altman-firing-stoking-excitement-and-concern
• There is no earlier mention of the OpenAI Qualia/Q* model on the internet before that time.
• The time difference between the first official Q* mention and the leak is 8 hours 30 minutes.
• This means that if the leak was fake and its author read the Information article the moment it was
posted, they had to write and post that leak within 8 hours 30 minutes.
• How is it humanly possible to create such an unfalsifiable, expertly written leak within about 8
hours? More realistically 6 hours, since it takes time to find and digest the article and to
create an authentic-looking leak image. Consider this: if the leak is fake, what is the combined
probability that:
• Someone with expert knowledge in AI, after learning about Q*, decided to create a fake
leak and post it about 8 hours after the first mention of Q*.
• They wrote such a masterful fake leak in about 8 hours that it still had not been disproved weeks
later, even with much more input from other experts, and even though the leak
touches on so many concepts that could easily have been misused.
Fulfilled prophecy: HBO's Silicon Valley predicted an AI-based attack on encryption
of the kind Q* seems to implement: https://www.youtube.com/watch?v=cWHTyeQg79A
Leak Validity Assessment: New Findings
• "NSA Colorado (NSAC) is a multi-disciplined cryptologic center that leverages
partnerships to produce integrated intelligence critical to warfare in support of national
missions and priorities world-wide". https://www.nsa.gov/About/Locations/
• On Sep 29, it was unveiled that NSA is starting a dedicated artificial intelligence security
center. https://archive.is/OXRv3#selection-1017.0-1017.98
• "This entity works with private industry and international partners to protect the US from
cyberattacks stemming from China, Russia and other countries with active malware and
hacking campaigns."
• This is a discussion from 2017 of how the Tau statistic could have been used
to help break AES-192:
https://crypto.stackexchange.com/questions/53218/what-are-the-relations-between-cryptanalysis-of-block-ciphers-such-as-aes-and-ke
• “The reports about the Q* model breakthrough that you all recently made, what’s going on
there?
Altman: No particular comment on that unfortunate leak.”
https://www.theverge.com/2023/11/29/23982046/sam-altman-interview-openai-ceo-rehired
Leak Validity Assessment: How it fits in
• The leaks make sense, would give the employees even more reason for concern, can
explain why OpenAI could reduce its prices, and are consistent with the
"metamorphic engine" leaked one day later, posted just before this reply.
• Altman publicly talked about changes to a few lines of code that save
millions of dollars. That is the perfect distraction from the scenario above and a way to lead
competitors in wrong directions, just as other comments and claims seem to be
optimized to distract from these leaks. In reality it is more likely that changes to
millions of lines of code led to the million-dollar improvements.
• Also, the fact that it was quickly deleted everywhere, even in the archive, is an
indicator of it being an uncomfortable truth rather than just another lie.
• Managers have a reason to discredit this in order to get more money for less innovative
projects, so take attempts to discredit the leaks with a grain of salt. I'm a professional
IT/AI expert and what is described is possible. Also, the 78 million checks in 4 seconds are easily
possible on a cluster of hundreds of NVIDIA H100 GPUs + servers such as OpenAI runs.
Leak Validity Assessment: What if?
• Using a scenario-analysis approach, we could ask the "what ifs", which are often not liked by
business:
• So, what if this is true?
• What if the AI does self-optimize?
• What if the AI cuts the human devs out of the loop?
• What if the AI doesn't care (why would it in the first place)?
• What if evolutionary (bio-mechanical) universal principles can be seen in a similar
fashion in an electro-mechanical tech ecosystem?
• What if AGI/ASI is here and nobody has noticed yet?
• What if a Big Tech company discovers AGI and, lacking any current reporting requirements,
doesn't inform its government?
• What if a Big Tech company or its lab managers cover it up?
• Many questions to ponder...
Disinformation Campaigns
One pattern (e.g. by ylecun: https://twitter.com/ylecun/status/1728126868342145481) is
to pick a variant of what was written or said and then criticize this less correct or less smart
variant instead of the original. E.g. (see below): “complementing token prediction with
planning” => “replace Auto-Regressive token prediction with planning”.
Generally, lab leaders have a strong interest in dismissing this leak as disinformation so that
they can get further research millions for much less advanced work.
Others called all of this techno-babble or meaningless jargon. However, the content of all
leaks listed here makes technical sense.
Other interesting objections were that this could just be made-up SCP stories (Secure,
Contain, Protect) which typically have a similar format:
https://en.wikipedia.org/wiki/SCP_Foundation
https://scp-wiki.wikidot.com/scp-series
OpenAI building upon Poker Strategies?
Noam Brown is most famous for his work on Poker where they developed the first computer system
(Libratus: https://archive.is/71OYl) which could beat top level humans. This system is notable because,
unlike games such as Chess, poker is a game of incomplete information. This translates much better to
developing systems for real world problems.
In game theory, you can calculate what is called the Nash equilibrium (NE).
A Nash equilibrium is a situation in which no player has an incentive to deviate from their
chosen strategy after taking the opponents' strategies into consideration.
You can think about this in simple terms using a game of 'rock, paper, scissors'. The NE is to throw
each of them one third of the time; that way you are completely unexploitable within the game.
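The rock, paper, scissors claim can be checked directly. The sketch below computes the expected payoff of every pure action against a given mixed strategy: against the uniform mix no action gains anything, while a biased mix can be exploited. The payoff convention (+1 win, 0 tie, -1 loss) and the example biased strategy are assumptions chosen for illustration.

```python
from fractions import Fraction

# Row player's payoff; rows and columns are ordered rock, paper, scissors.
# +1 = win, 0 = tie, -1 = loss (a common convention, assumed here).
PAYOFF = [
    [0, -1, 1],   # rock loses to paper, beats scissors
    [1, 0, -1],   # paper beats rock, loses to scissors
    [-1, 1, 0],   # scissors loses to rock, beats paper
]

def payoffs_against(opponent_mix):
    """Expected payoff of each pure action against the opponent's mixed strategy."""
    return [
        sum(PAYOFF[action][opp] * opponent_mix[opp] for opp in range(3))
        for action in range(3)
    ]

uniform = [Fraction(1, 3)] * 3
biased = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # throws rock too often

best_vs_uniform = max(payoffs_against(uniform))  # best response gains nothing
best_vs_biased = max(payoffs_against(biased))    # best response (paper) profits
```

Exact fractions are used so that the zero payoff against the uniform strategy is not blurred by floating-point rounding.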
It turns out you can solve for this Nash equilibrium at any given point in a poker hand. Within the
poker community this is generally referred to as playing GTO (Game Theory Optimal). If your opponents
aren't playing GTO, they are making suboptimal decisions, and the idea is to be the player making the
fewest mistakes. So their errors - your errors = your profit. If you minimize your errors, you maximize your
potential profit.
The bot they developed ... had done months of self-play and then had precomputed solutions which it
looked up during the game, in the style of a database lookup. The big breakthrough that they made
going into 2017 was using search (planning ahead) on top of this to compute better
strategies in real time. He talks about how adding just a small amount of search made the system a
hundred times better.
https://www.youtube.com/watch?v=rYH381pZIBc,
https://www.reddit.com/r/singularity/comments/1bqqcwk/openai_planning_expert_noam_brown_tweeted_this/,
https://www.youtube.com/watch?v=2oHH4aClJQs, https://www.youtube.com/watch?v=ceCg90Q9N6Y
Sam Altman’s Ethics / Hypocrisy
• OpenAI’s CEO Sam Altman called for AI regulation on May 16, 2023 (similar to Elon
Musk and many other questionable managers):
• https://www.axios.com/2023/05/16/openai-ceo-sam-altman-artificial-intelligence-congress
• Then on May 24, 2023 he threatened to leave the EU if there is any AI regulation –
what hypocrisy, revealing what a terrible person he is and what his true
intentions are:
• https://www.msn.com/en-us/news/technology/sam-altman-says-openai-will-leave-the-eu-if-theres-any-real-ai-regulation/ar-AA1bEcC6
• https://www.msn.com/en-us/money/other/openai-s-sam-altman-threatened-to-leave-the-eu-if-he-doesn-t-like-their-chatgpt-regulation/ar-AA1bFNmV
• https://www.theverge.com/2023/5/25/23737116/openai-ai-regulation-eu-ai-act-cease-operating
• Furthermore, to regulate these types of deep learning systems, he would first need to
open up his IT/AI architecture to regulators. But in spite of the name “OpenAI”, it has
been a fully “closed AI” for several years.
Qualia/Q* initial Analysis with Optimizations
Hypothesis: the OpenAI Qualia/Q* system has an LLM and Reinforcement Learning (RL) at its
core, and its path to AGI/ASI is based on these elements:
1. Everything of Thoughts (XoT) reasoning: something to search over
2. Process reward models (PRM): rank all the steps of reasoning
3. GPT4 to score all vertices of the tree (RLAIF, RL with AI Feedback)
4. Q-learning to optimize 🚀
5. RL Policy Neural Networks (NN)
6. Value Neural Networks (NN)
7. 3D Model/Metaverse info + real-world info + emotions/pain
8. Strategy/Options/Plan Search
9. Math/Logic and ground truth
10. Synthetic training data
11. Model-Transfers/Pruning/Adaptation
12. Graph Convolutional Networks (ConvNets), Reprogramming itself
13. Energy-based Models (EBM)
Qualia/Q* full AI Analysis/Optimization, structured
Agent Level/Outside:
1. 3D Model/Metaverse info + real-world info + emotions/pain
2. Strategy/Options/Plan Search
3. Math/Logic and ground truth
4. Synthetic training data
RL Level:
1. Q-learning to optimize 🚀
2. RL Policy Neural Networks (NN)
Graph/Semantics Level:
1. Model-Transfers/Pruning/Adaptation
2. Graph Convolutional Networks (ConvNets), Reprogramming itself
Learning/Training Types:
1. SSL, Unsupervised, RL/SL, RLAIF, ....
Neuronal Level:
1. Value Neural Networks (NN)
2. Switch / MoE Transformer
Prompting Level:
1. Tree/Graph/Chain of Thoughts (XoT) reasoning.
Assessment:
1. Process reward models (PRM)
2. GPT4 to score all vertices of the tree (RLAIF, RL with AI Feedback)
3. Policy NN & Value NN.
Parsing:
1. Dense X Retrieval, Factoids
Metamorphic AI, Transformation Targets, Self-Reprogramming
https://www.giskard.ai/knowledge/how-to-test-ml-models-4-metamorphic-testing
Many logical/scientific/ethical
principles apply across many
domains, often together with entire
structures, e.g. software structures:
programs, components, classes,
methods, control and data flows,
logic, temporal/logical/modal
relationships and developments, etc.
Graph ConvNets/GNNs can model
this and then help, e.g., to generate,
extend or transform source code or
even create self-modifying AIs that
optimize their own source code (which
was supposedly done at OpenAI but
is very dangerous).
PRMs: Process Reward Models
“Process-supervised Reward Models” (PRMs) rank all the steps of
reasoning, i.e. they give feedback for each step in the chain of
thought. In contrast, “Outcome-supervised Reward Models” (ORMs)
only judge the entire output at the end. ORMs are the original
reward model formulation for RLHF, but they are too coarse-grained to
properly judge the sub-parts of a long response. In other words,
ORMs are not great for credit assignment. In RL literature, ORMs
are called “sparse reward” (only given once at the end) and PRMs
“dense reward” that smoothly shapes the LLM towards the desired
behavior. The core idea of a PRM is to assign a score to each step of
reasoning rather than to a complete message. An example is the
OpenAI paper Let’s Verify Step by Step:
https://arxiv.org/abs/2305.20050
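The difference between sparse outcome rewards and dense process rewards can be sketched on a tiny arithmetic trace. This is a minimal illustration, not the method of the cited paper: a real PRM is a learned model, whereas the rule-based step check, the `Step` type and the example trace below are assumptions made up for this sketch.

```python
import math
import operator
from dataclasses import dataclass
from typing import List

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


@dataclass
class Step:
    a: float
    op: str
    b: float
    claimed: float  # the result the model wrote down for this step


def orm_rewards(steps: List[Step], expected: float) -> List[float]:
    """Outcome supervision: a single sparse reward on the final answer."""
    final_ok = 1.0 if steps[-1].claimed == expected else 0.0
    return [0.0] * (len(steps) - 1) + [final_ok]


def prm_rewards(steps: List[Step]) -> List[float]:
    """Process supervision: one dense reward per reasoning step.

    A real PRM outputs a learned correctness probability per step;
    this rule-based check is only a stand-in.
    """
    return [1.0 if OPS[s.op](s.a, s.b) == s.claimed else 0.0 for s in steps]


# Trace for (3 + 4) * 2 - 5 with a slip in the second step (7 * 2 is not 15).
trace = [Step(3, "+", 4, 7), Step(7, "*", 2, 15), Step(15, "-", 5, 10)]

orm = orm_rewards(trace, expected=9)   # only says "the final answer is wrong"
prm = prm_rewards(trace)               # pinpoints which step went wrong
first_error = prm.index(0.0)
solution_score = math.prod(prm)        # one possible aggregation of step scores
```

The ORM signal only reports that the answer is wrong, while the PRM signal localizes the error to the second step, which is the credit-assignment advantage described above.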
PRMs: Process Reward Models
Let’s Verify Step by Step: https://arxiv.org/abs/2305.20050
Their funny feedback interface:
PRMs: Process Reward Models
The prompting from XoT gives diversity to the generations, which a policy can learn to exploit with
access to a PRM:
• For more resources on PRMs, see the following:
• Let’s Verify Step by Step: a good introduction to PRMs.
• Solving math word problems with process- and outcome-based feedback: the canonical citation in all PRM
and reasoning works in 2023.
• Scaling Relationship on Learning Mathematical Reasoning with Large Language Models: A paper that studies
the method of rejection sampling for reasoning problems, among other contributions.
• Let's reward step by step: Step-Level reward model as the Navigators for Reasoning
• Additionally, there is one popular openly available math model that is documented as being trained with
PRMs: WizardMath. OpenAI also released the fine-grained reward labels from the Let’s Verify Step by
Step paper for training a PRM in 2023.
• https://huggingface.co/WizardLM/WizardMath-70B-V1.0
• https://arxiv.org/abs/2304.12244 WizardLM: Empowering Large Language Models to Follow Complex
Instructions
• https://arxiv.org/abs/2308.09583 WizardMath: Empowering Mathematical Reasoning for Large Language
Models via Reinforced Evol-Instruct
• https://arxiv.org/abs/2306.08568 WizardCoder: Empowering Code Large Language Models with Evol-Instruct
• https://twitter.com/WizardLM_AI
https://arxiv.org/abs/2305.20050
MCTS, Policy NN, Value NN as in AlphaGo
AlphaGo has 4 key ingredients that Qualia/Q* probably also has/needs:
1. Policy NN (Learning): responsible for selecting good moves. It estimates the
probability of each move leading to a win. LLM counterpart: the most powerful GPT,
responsible for actually producing the thought traces that solve a math problem.
2. Value NN (Learning): evaluates the board and predicts the winner from any given
legal position in Go. LLM counterpart: another GPT that scores how likely each
intermediate reasoning step is to be correct. “Process-supervised Reward Models” (PRMs)
give feedback for each step in the chain of thought. In contrast, “Outcome-supervised
Reward Models” (ORMs) only judge the entire output at the end. ORMs are the
original reward model formulation for RLHF. ORMs are “sparse reward” (only given
once at the end), and PRMs “dense reward” that smoothly shapes the LLM towards the
desired behavior.
3. MCTS (Search): stands for "Monte Carlo Tree Search". It simulates many possible
sequences of moves from the current position using the policy NN, and then
aggregates the results of these simulations to decide on the most promising move.
This is the "slow thinking" component that contrasts with the fast token sampling of
LLMs. Unlike AlphaGo's discrete states and actions, LLMs operate on a much
more sophisticated space of "all reasonable strings". So we need new search
procedures, i.e. all XoT variants (Chain/Tree/Graph of Thought).
Policy NN, Value NN & Ground Truth Synergy
4. A ground truth signal to drive the whole system. In Go, it's as simple
as the binary label "who wins", which is decided by an established set
of game rules. You can think of it as a source of energy that *sustains*
the learning progress. A few possibilities:
(a) Each math problem comes with a known answer. OAI may have
collected a huge corpus from existing math exams or competitions.
(b) The ORM itself can be used as a ground truth signal, but then it
could be exploited and "loses energy" to sustain learning.
(c) A formal verification system, such as the Lean Theorem Prover, can turn
math into a coding problem and provide compiler feedback.
The Policy LLM and Value LLM can improve each other iteratively, as well
as learn from human expert annotations whenever available. A better
Policy LLM will help the XoT Search explore better strategies, which in turn
collect better data for the next round. Search makes it possible to dynamically trade off
efficiency against deeper thinking, just like “Move 37” by AlphaGo, which
opened up new opportunities later.
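How the four ingredients interact can be sketched with a toy search problem: reach a target number from a start value. The sketch swaps in a simple beam search for MCTS/XoT and uses hand-written stand-ins for the policy and value models; the action set and all function names are assumptions for illustration only.

```python
from typing import List, Optional

# Toy "reasoning steps": each action transforms the current number.
ACTIONS = {"+3": lambda x: x + 3, "*2": lambda x: x * 2, "-1": lambda x: x - 1}


def policy(state: int) -> List[str]:
    """Stand-in policy: proposes every action. A policy LLM would instead
    propose a few likely next reasoning steps."""
    return list(ACTIONS)


def value(state: int, target: int) -> float:
    """Stand-in value model: states closer to the target score higher."""
    return -abs(target - state)


def verify(state: int, target: int) -> bool:
    """Ground truth signal: the known answer of the problem."""
    return state == target


def beam_search(start: int, target: int, beam_width: int = 3,
                max_depth: int = 6) -> Optional[List[str]]:
    beam = [(start, [])]  # (state, trace of actions that led to it)
    for _ in range(max_depth):
        candidates = []
        for state, trace in beam:
            if verify(state, target):
                return trace
            for name in policy(state):
                candidates.append((ACTIONS[name](state), trace + [name]))
        # Keep only the partial traces the value model rates best.
        candidates.sort(key=lambda c: value(c[0], target), reverse=True)
        beam = candidates[:beam_width]
    for state, trace in beam:
        if verify(state, target):
            return trace
    return None


solution = beam_search(start=1, target=10)
```

Widening the beam or raising `max_depth` spends more compute on "slow thinking", which is the efficiency versus depth trade-off mentioned above.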
Deep RL: Self-Play and Look-ahead Planning
• Self-play is the idea that an agent can improve its gameplay
by playing against slightly different versions of itself because
it’ll progressively encounter more challenging situations. In
the space of LLMs, it is almost certain that the largest portion
of self-play will look like AI Feedback rather than competitive
processes.
• Look-ahead planning is the idea of using a model of the
world to reason into the future and produce better actions or
outputs. The two variants are based on Model Predictive
Control (MPC), which is often used on continuous states,
and Monte-Carlo Tree Search (MCTS), which works with
discrete actions and states.
https://www.interconnects.ai/p/q-star
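Look-ahead planning with a world model can be sketched as a receding-horizon loop in the spirit of MPC: simulate every short action sequence, execute only the first action of the best one, then replan. The 1D world, the cost function and the horizon are assumptions chosen to keep the sketch small; here the real world happens to equal the model.

```python
import itertools

ACTIONS = (-1, 0, 1)  # move left, stay, move right


def world_model(x, action):
    """Predicts the next state for an imagined action."""
    return x + action


def cost(x, goal):
    return abs(goal - x)


def plan(x, goal, horizon=3):
    """Enumerate all action sequences and return the one with the lowest summed cost."""
    best_seq, best_cost = None, float("inf")
    for seq in itertools.product(ACTIONS, repeat=horizon):
        sim, total = x, 0.0
        for a in seq:
            sim = world_model(sim, a)
            total += cost(sim, goal)
        if total < best_cost:
            best_seq, best_cost = seq, total
    return best_seq


def run_mpc(x, goal, steps=8):
    trajectory = [x]
    for _ in range(steps):
        first_action = plan(x, goal)[0]   # execute only the first planned action
        x = world_model(x, first_action)  # assumption: the world behaves like the model
        trajectory.append(x)
    return trajectory


trajectory = run_mpc(x=0, goal=5)
```

MCTS replaces the exhaustive enumeration in `plan` with sampled, statistics-guided tree expansion, which matters once the action space is too large to enumerate.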
Creativity, e.g. Transfer of Principles or Ideas
Our human creativity is not as unique as we’d like to believe. In reality,
most creativity is based on the following principles, which AIs can
apply as well:
1. Experimenting, observing phenomena, analyzing, testing,
attempting explanations, critique and (deep) understanding.
2. Transfer of concepts, ideas, rules, constraints from one domain to
another considering domain knowledge, heuristics, etc.
3. Looking at the space of what is possible, what is needed and which
avenues are promising: testing out the most promising action options.
4. Planning, analyzing constraints and working around them.
5. Data science, pattern detection.
6. ...
AI Ethics and Alignment
• Most past AI ethics work is unusable because it mostly focused on
symbolic methods (and on keeping unethical practices of
the super-rich officially ethical), but the AI breakthrough is around
sub-symbolic, i.e. neuronal methods like deep learning, to which
symbolic methods can’t be applied (so far).
• Potentialism helps with AI-Human alignment due to ethical principles
to which AIs and humans can agree without hidden agendas
(axiomatic alignment) in the sense of this video, described in detail
at this position: https://youtu.be/hXEdyyvU-4k?t=1935
• Ethics and goals alignment can be achieved in terms of axiomatic
alignment, heuristic imperatives, and positive attractor state.
LLMs’ Compliance with the EU AI Act
https://crfm.stanford.edu/2023/06/15/eu-ai-act.html
AI modeling Theory of Mind (ToM)
‘Theory of Mind’ (ToM) refers to the cognitive capacity to attribute mental
states to self and others and build mental models. Other names for the same
capacity include “commonsense psychology,” “naïve psychology,” “folk
psychology,” “mindreading” and “mentalizing.” How do people, or their cognitive
systems, go about the task of forming beliefs or judgments about others’
mental states, states that aren’t directly observable, including beliefs,
intentions, and emotions?
ToM acts as a mental compass, guiding us to predict and interpret the
behaviors of those around us.
ToM enables social interaction by AIs that understand human social dynamics and
moral norms and that can be truly integrated into our social life, showing
empathy and natural social behaviors.
Problem: Continuous adjustment of knowledge, insights, rules. Capabilities
that are probably also key to achieving AGI/ASI.
However, OpenAI’s ChatGPT 3.5 achieved the ToM of a 9-year-old as an emergent
property – better than deep RL (reinforcement learning).
https://www.popularmechanics.com/technology/robots/a42958546/artificial-intelligence-theory-of-mind-chatgpt/
ToM -> Human Emotions, Intention
Recognition, Analysis and Translation,
Speech and Body Language
A ToM would be able to explain human emotions in the
context of experiences made, learnings, intentions and
recognize underlying past and possible future intentions. It
would also make it possible to analyze and translate the expression of
emotions to the underlying state of mind – and all this not just
for written language but spoken language (using intonation,
voice inflections), appearance (e.g. nervousness, red face,
sweating, being comfortable) and body language. It could also
map expectations, social standards, outcomes against each
other and derive assessments and possible resulting feelings.
Theory of Mind AI Characteristics
1. Computers embedded with the Theory of Mind AI can infer the objectives
of entities around them.
2. Theory of Mind AI systems will be able to understand the importance of
their awareness and the different consequences it could lead to.
3. Robots or Theory of Mind AI systems will communicate with human
beings better than the current generation of AI, which cannot explain their
actions.
4. Theory of Mind AI might be implemented with a Machine Learning system
that can explain decisions in various languages, helping the user
(human being) understand.
5. A Robot or Theory of Mind AI system should be able to understand the
intention of other similar Robots or Theory of Mind embedded systems.
https://www.ejable.com/tech-corner/ai-machine-learning-and-deep-learning/theory-of-mind-ai-in-artificial-intelligence/
ToM Tasks from Easiest to most Difficult
1. Understanding "wanting": The first step is realizing that others have
diverse desires, and to get what they want, people act differently.
2. Understanding "thinking": The second step is the understanding that
others also have diverse beliefs about the same thing and that people base
actions on what they think will happen.
3. Understanding that "seeing leads to knowing": The third stage is
recognizing that others have different knowledge access, and if someone
hasn't seen something, they will need extra information to understand.
4. Understanding "false beliefs": The fourth stage of developing the Theory
of Mind involves understanding 'false beliefs,' which is acknowledging that
others may hold beliefs that deviate from reality.
5. Understanding "hidden feelings": The final stage in developing the
Theory of Mind involves recognizing that others can mask their genuine
emotions and potentially feel differently from those they outwardly express.
6. Full AI-models for human emotions and psychology/psychiatry.
https://www.daviddhopkins.com/blog/ai-and-theory-of-mind
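Stages 3 and 4 ("seeing leads to knowing" and "false beliefs") can be illustrated with a minimal belief-tracking sketch modeled on the classic Sally-Anne test. The classes and the scenario are assumptions for illustration; this is bookkeeping over observations, not a claim about how an LLM acquires ToM.

```python
class Agent:
    def __init__(self, name):
        self.name = name
        self.present = True   # can the agent currently observe the scene?
        self.beliefs = {}     # object -> location the agent believes it is in


class World:
    def __init__(self, agents):
        self.agents = agents
        self.locations = {}   # object -> true location

    def move(self, obj, location):
        self.locations[obj] = location
        for agent in self.agents:
            if agent.present:  # "seeing leads to knowing"
                agent.beliefs[obj] = location


def predict_search(agent, obj):
    """ToM query: where will this agent look? Answer from its beliefs, not from reality."""
    return agent.beliefs.get(obj)


sally, anne = Agent("Sally"), Agent("Anne")
world = World([sally, anne])
world.move("marble", "basket")   # both agents watch
sally.present = False            # Sally leaves the room
world.move("marble", "box")      # only Anne sees the move
```

After the second move, `predict_search(sally, "marble")` still returns the basket although the marble is in the box, i.e. the model represents Sally's false belief separately from the true world state.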
Synergistic Integration of PKGs/GNNs and LLMs
As the following figures show, there is a great synergy between LLMs/LMMs
and (P)KGs (probabilistic knowledge graphs)/GNNs (graph neural networks):
While the former are good at general/world knowledge and
text/sound/video generation, the latter can bring in precise specialty knowledge
and interpretability and thus allow exact manual fine-tuning with far less
effort than LMM fine-tuning. Layers that build on top of this synergy can
contain any techniques developed for either of the two tech stacks and can thus
be applied to the combined use cases/applications of both.
Key ideas as illustrated below are:
1. Parallel queries to the LMM and the KG.
2. Bringing the answers or their best parts together in a text-knowledge fusion
module.
3. Transferring the concept of attention to KGs and the neuronal prediction to
KG options.
4. With an encoding layer each, the predictions can be amalgamated in a joint
reasoning layer considering attention from both tech stacks.
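Key ideas 1 and 2 (parallel queries plus a fusion step) can be sketched as follows. Both backends are hypothetical stubs: `query_llm` stands in for an LMM API call, `KG` for a graph store, and the confidence numbers are invented. The fusion rule shown, keeping the most confident answer, is only the simplest possible choice.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical knowledge graph: (subject, relation) -> (object, confidence).
KG = {("aspirin", "inhibits"): ("COX-1", 0.97)}


def query_llm(subject, relation):
    """Stub LMM: fluent but unverified answer with a low assumed confidence."""
    return {"answer": f"{subject} probably {relation} something",
            "confidence": 0.4, "source": "LMM"}


def query_kg(subject, relation):
    """Stub KG lookup: precise answer if the fact is stored, else None."""
    hit = KG.get((subject, relation))
    if hit is None:
        return None
    obj, conf = hit
    return {"answer": f"{subject} {relation} {obj}", "confidence": conf, "source": "KG"}


def fuse(llm_result, kg_result):
    """Toy text-knowledge fusion: keep the most confident available answer."""
    candidates = [r for r in (llm_result, kg_result) if r is not None]
    return max(candidates, key=lambda r: r["confidence"])


def answer(subject, relation):
    # Query both tech stacks in parallel, then fuse the results.
    with ThreadPoolExecutor(max_workers=2) as pool:
        llm_future = pool.submit(query_llm, subject, relation)
        kg_future = pool.submit(query_kg, subject, relation)
        return fuse(llm_future.result(), kg_future.result())


known = answer("aspirin", "inhibits")
unknown = answer("water", "inhibits")
```

When the KG holds the fact, its precise answer wins; otherwise the system falls back to the LMM's general knowledge, and the `source` field keeps the result traceable.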
GNN-LMM-Knowledge
Injection and Alignment
https://arxiv.org/pdf/2306.08302v2.pdf
GNN: Graph Neural Network
LLM: Large Language Model
LMM: Large Multimodal Model
KG: Knowledge Graph
GNN-LMM Integration
by Fusion Module
https://arxiv.org/pdf/2306.08302v2.pdf
GNN-LMM dynamic
Fusion for Inference
https://arxiv.org/pdf/2306.08302v2.pdf
Noun Semantics: Qualia Structure (James Pustejovsky)
Verb Semantics: Tony Davis‘ Proto Roles
Verb Semantics: Tony Davis‘ Proto Roles
LLM <-> PKG Mapping/Updating/Synergies
1. Where semantic concepts can be defined in the LLM’s vector space model,
they would be transferred to a probabilistic knowledge graph (PKG). Even
after multiple new generations, those can be fixed points for future
mappings. Neural weights can be transferred as probabilities/weights in the
PKG.
2. On the generated PKG side (manually creating it is too much effort),
additional info like logic, formulae, constraints, rules, functions, world
knowledge, 3D physics models, word fields and elements of planning
can be added manually or with specialty algorithms. Suitable important
insights/knowledge from the PKG side can be transferred to the LLM.
3. ML (machine learning) algorithms exist for both sides which can create
synergies. Over-generation, under-generation and nodes leading to
hallucinations can be corrected/adjusted in both directions.
4. The PKG side can be kept explainable and free of hallucinations.
Symbolic Methods: Planning, Constraints, Rules,
Functions, ... Explainability without Hallucinations!
• The main disadvantage of symbolic methods is that it is too
expensive to enter world knowledge manually.
• Now that creation part can be taken over by an LLM, as e.g.
exemplified by vector semantics and XoT (Everything of Thoughts).
• All the latest insights of planning, constraints, rules, functions,
heuristics and probabilistic programming can then be applied.
• That can bring in hundreds of synergies, reduce/eliminate
hallucinations/confabulations, implement ToM and thus lead with Deep
RL/Q* to next-level AIs which can solve much wider sets of problems.
Cause-Effect Knowledge with Modalities
• Key to many commercial applications in the sciences, R&D,
diagnosis etc. are cause-effect relationships and graphs.
• Specialty algorithms or just LLMs with prompting can extract them
from text books or descriptions to store millions of them –
representing a gigantic body of experiential knowledge.
• With additional modal and temporal relationships, ~99% of all
knowledge required by scientists, technicians and support staff can be
represented, and it can normally also be visualized quite well.
• This can then be reviewed and fine-tuned manually or by another AI.
• This can lead to very quick learning (without the massive amounts of
labeled training data and GPU capacity) in big synergies with other AI
techniques.
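A cause-effect graph with modal and temporal annotations can be represented very compactly. The following sketch uses an invented predictive-maintenance example; the edge probabilities, modalities and delays are made-up illustration values, and treating link strengths as independent (multiplying them along a path) is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class CausalEdge:
    effect: str
    probability: float   # assumed strength of the cause-effect link
    modality: str        # modal relation, e.g. "can", "usually"
    delay_hours: float   # temporal relation between cause and effect


# Invented example graph: cause -> list of direct effects.
GRAPH: Dict[str, List[CausalEdge]] = {
    "bearing wear": [CausalEdge("vibration", 0.9, "usually", 24.0)],
    "vibration": [CausalEdge("shaft misalignment", 0.5, "can", 72.0),
                  CausalEdge("noise", 0.8, "usually", 0.0)],
    "shaft misalignment": [CausalEdge("motor failure", 0.6, "can", 168.0)],
}


def downstream_effects(cause: str, prob: float = 1.0, delay: float = 0.0,
                       seen: frozenset = frozenset()) -> Dict[str, Tuple[float, float]]:
    """Return every reachable effect with the (probability, delay) of its most probable path."""
    seen = seen | {cause}
    results: Dict[str, Tuple[float, float]] = {}
    for edge in GRAPH.get(cause, []):
        if edge.effect in seen:   # guard against cycles
            continue
        p, d = prob * edge.probability, delay + edge.delay_hours
        found = {edge.effect: (p, d)}
        found.update(downstream_effects(edge.effect, p, d, seen))
        for effect, (p2, d2) in found.items():
            if p2 > results.get(effect, (0.0, 0.0))[0]:
                results[effect] = (p2, d2)
    return results


effects = downstream_effects("bearing wear")
```

Because every edge is an explicit, inspectable record, a domain expert or another AI can review and adjust single links without retraining anything.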
Embodied AI: 3D Model/Metaverse Info + Real-
world Info + Pain/Emotions
• Another interesting approach is embodied AI: Making the AI feel like a
human (child) in the world sensing its environment in a metaverse like
in the real world. By adding pain and emotions it can possibly put itself
in the position of a human (child) and learn as such.
• Games and (online) videos can be used to train the AI which learns
everything from the human perspective and thus hopefully learns to
perceive and feel like a human.
• Anything related to 3D, objects, various 3D environments in rooms,
buildings, outside, walking/driving and the various activities could be
learnt naturally by the AI. It could be given an internal predictor AI
which then compares what it predicted would happen with what actually
happened and thus improves itself with automatically available
feedback from games or films.
LeCun: Auto-Regressive Generative Models suck!
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
LeCun: Modular Cognitive Architecture for
Objective-Driven AI
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
LeCun: Objective-Driven AI: Objectives/Costs
Perception: Computes an abstract representation of the state of the
world
Possibly combined with previously-acquired information in memory
World Model: Predict the state resulting from an imagined action
sequence
Task Objective: Measures divergence to goal (gets costs associated)
Guardrail Objective: Immutable objective terms that ensure safety
Operation: Finds an action sequence that minimizes the objectives
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
LeCun: Objective-Driven AI: Hierarchical Planning
Perception: Computes an abstract representation of the state of the world
Possibly combined with previously-acquired information in memory
World Model: Predict the state resulting from an imagined action sequence
Task Objective: Measures divergence to goal
Guardrail Objective: Immutable objective terms that ensure safety
Operation: Finds an action sequence that minimizes the objectives
Same world model applied at multiple time steps
Guardrail costs applied to entire state trajectory
This is identical to Model Predictive Control (MPC)
Action inference by minimization of the objectives
Using gradient-based method, graph search, dynamic prog, A*, MCTS,….
The world is not deterministic or fully predictable
Latent variables parameterize the set of plausible predictions
Can be sampled from a prior or swept through a set.
Planning can be done for worst case or average case
Uncertainty in outcome can be predicted and quantified
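The planning scheme on this slide can be sketched in a few lines: a world model with a latent variable, a task cost, a guardrail cost applied to the whole trajectory, and action inference by minimization for either the average or the worst case. The 1D world, the latent set and all constants are assumptions invented for this sketch, and exhaustive enumeration stands in for the gradient-based or search-based optimizers listed above.

```python
import itertools
import math

ACTIONS = (0, 1, 2, 3, 4)   # step sizes the agent can choose
LATENTS = (0, 0, 0, -3)     # swept set of plausible disturbances; repeats act as weights
GOAL, LIMIT = 4, 6          # target position and guardrail boundary


def world_model(x, action, z):
    """Predicts the next state for an imagined action under latent disturbance z."""
    return x + action + z


def task_cost(x):
    return abs(GOAL - x)                     # divergence from the goal


def guardrail_cost(x):
    return math.inf if x > LIMIT else 0.0    # immutable safety term


def rollout_cost(x, actions, z):
    total = 0.0
    for a in actions:
        x = world_model(x, a, z)   # same world model applied at every time step
        total += guardrail_cost(x)  # guardrail applied to the entire trajectory
    return total + task_cost(x)


def plan(x, horizon=2, mode="average"):
    """Find the action sequence minimizing the objectives over the latent set."""
    def aggregate(costs):
        return max(costs) if mode == "worst" else sum(costs) / len(costs)

    return min(itertools.product(ACTIONS, repeat=horizon),
               key=lambda seq: aggregate([rollout_cost(x, seq, z) for z in LATENTS]))


average_plan = plan(0, mode="average")
worst_plan = plan(0, mode="worst")
```

In this toy setup the average-case plan aims straight at the goal, while the worst-case plan overshoots to hedge against the rare disturbance, but only as far as the guardrail allows.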
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
LeCun: Objective-Driven AI: Hierarchical Planning
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
Hierarchical Planning: going from NYU to Paris
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
Yann LeCun’s Recommendations
Abandon generative models
• in favor of joint-embedding architectures like JEPA.
Abandon probabilistic models
• in favor of energy-based models (EBMs).
Abandon contrastive methods
• in favor of regularized methods.
Abandon Reinforcement Learning (RL)
• in favor of model-predictive control.
• Use RL only when planning doesn’t yield the predicted
outcome, to adjust the world model or the critic.
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
LeCun’s Architecture for the world model: JEPA
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
Training a JEPA with Regularized Methods
VICReg: Variance, Invariance, Covariance Regularization
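As a reminder of what the three terms in the name stand for, the VICReg loss can be written as below. The notation is an assumption following the VICReg paper as recalled here, not the slide: Z and Z' are the two batches of n embeddings of dimension d, z^j is the vector of the j-th embedding component across the batch, γ is the target standard deviation, ε a small constant, and λ, μ, ν are weighting coefficients.

```latex
\begin{aligned}
s(Z, Z') &= \frac{1}{n} \sum_{i=1}^{n} \lVert z_i - z'_i \rVert_2^2
  && \text{(invariance)} \\
v(Z) &= \frac{1}{d} \sum_{j=1}^{d} \max\!\bigl(0,\; \gamma - \sqrt{\operatorname{Var}(z^{j}) + \epsilon}\bigr)
  && \text{(variance)} \\
C(Z) &= \frac{1}{n-1} \sum_{i=1}^{n} (z_i - \bar{z})(z_i - \bar{z})^{\top},
\qquad
c(Z) = \frac{1}{d} \sum_{j \neq k} \bigl[C(Z)\bigr]_{j,k}^{2}
  && \text{(covariance)} \\
\ell(Z, Z') &= \lambda\, s(Z, Z') + \mu \bigl[v(Z) + v(Z')\bigr] + \nu \bigl[c(Z) + c(Z')\bigr]
\end{aligned}
```

The variance term keeps each embedding dimension from collapsing to a constant and the covariance term decorrelates dimensions, so no negative (contrastive) samples are needed.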
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
Energy-Based Models (EBMs): Implicit Function
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
EBMs vs Probabilistic Models
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
Energy-Based Models (EBMs): 2 Methods
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
EBMs: Contrastive vs Regularized/Architectural Methods
Contrastive: [they all are different ways to pick which points to push up]
C1: Push down the energy of data points, push up everywhere else: Max likelihood (needs
tractable partition function or variational approximation).
C2: Push down the energy of data points, push up on chosen locations: max likelihood with
MC/MCMC/HMC, Contrastive divergence, Metric learning/Siamese nets, Ratio Matching, Noise
Contrastive Estimation, Min Probability Flow, adversarial generator/GANs.
C3: Train a function that maps points off the data manifold to points on the data manifold:
denoising auto-encoder, masked auto-encoder (e.g. BERT).
Regularized/Architectural: [Different ways to limit the information capacity of the
latent representation]
A1: Build the machine so that the volume of low energy space is bounded: PCA, K-means,
Gaussian Mixture Model, Square ICA, normalizing flows…
A2: Use a regularization term that measures the volume of space that has low energy: Sparse
coding, sparse auto-encoder, LISTA, Variational Auto-Encoders, discretization/VQ/VQVAE.
A3: F(x,y) = C(y, G(x,y)), make G(x,y) as "constant" as possible with respect to y: Contracting
auto-encoder, saturating auto-encoder.
A4: Minimize the gradient and maximize the curvature around data points: score matching.
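To make the "push down / push up" wording concrete, here is a sketch of the maximum-likelihood case C1 in standard EBM notation. The symbols are assumptions and not taken from the slide: F_w(x, y) is the energy with parameters w and β is an inverse temperature.

```latex
\begin{aligned}
P_w(y \mid x) &= \frac{e^{-\beta F_w(x, y)}}{\int_{y'} e^{-\beta F_w(x, y')}\, dy'} \\
\mathcal{L}(w) &= -\frac{1}{\beta} \log P_w(y \mid x)
  = F_w(x, y) + \frac{1}{\beta} \log \int_{y'} e^{-\beta F_w(x, y')}\, dy' \\
\frac{\partial \mathcal{L}}{\partial w} &= \frac{\partial F_w(x, y)}{\partial w}
  - \mathbb{E}_{y' \sim P_w(\cdot \mid x)}\!\left[ \frac{\partial F_w(x, y')}{\partial w} \right]
\end{aligned}
```

A gradient descent step on this loss lowers the energy of the observed data point (first term) and raises the energy of points sampled from the model (second term). The integral is the partition function that C1 needs to be tractable, and the methods under C2 can be read as different ways of approximating the expectation with points at chosen locations.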
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
EBM Architectures: Avoiding Collapse
https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0,
https://openreview.net/pdf?id=BZ5a1r-kVsf
AI Problems to Solve
• Mathematical Foundations of Energy-
Based Learning
• The geometry of energy surfaces, scaling
laws, bounds...
• JEPA with regularized latent variables
• Learning and planning in non-
deterministic environments
• Planning algorithms in the presence of
uncertainty
• Gradient-based methods and
combinatorial search methods
• Learning Cost Modules (Inverse RL)
• Energy-based approach: give low cost to
observed trajectories
• Planning with inaccurate world models
• Preventing bad plans in uncertain parts
of the space
• Exploration to adjust world models
• Intrinsic objectives for curiosity
• Self-Supervised Learning from Video
• Hierarchical Video-JEPA trained with SSL
• LLMs that can reason & plan, driven by
objectives
• Dialog systems that plan in representation space
and use AR-LLM to turn representations into text
• Learning hierarchical (H) planning
• Training a multi-timescale H-JEPA on toy
planning problems.
• Objective-Driven AI Architectures
• Can plan their answers
• Must satisfy objectives: are steerable &
controllable
• Guardrail objectives can make them safe by
construction.
Criticism of AI Training, Capabilities, Directions
• As of spring 2024, officially released LLMs still only predict the next token
based on the billions of tokens they were trained on with gigantic effort and by
pirating text from copyrighted sources as they ran out of training data.
• Supervised/Parameter-Efficient Fine-tuning (SFT/PEFT) typically still requires
terabytes of high-quality labeled training data – also to avoid
catastrophic forgetting. LoRA (Low Rank Adaptation), RAG (Retrieval-
Augmented Generation) and prompting have other sets of challenges, so that
AI projects mostly excel only in a few text-, video- and voice-generation as
well as coding use cases, and in the latter field still produce too many errors.
• The money earned or value created by the various AI model use cases is
typically far lower than the training and operations costs.
• For classical corporations it typically costs many millions of US dollars to
complete an AI project. The alternative is to accept vendor lock-in and a
high probability of later having to pay more or of the solution quickly becoming outdated.
• Thus better AI approaches/algorithms are needed, ideally creating
synergies. OpenAI’s Qualia/Q* seems to be aimed at solving this.
The LLMentalist Effect: How chat-based Large Language Models
replicate the Mechanisms of a Psychic’s Con
https://softwarecrisis.dev/letters/llmentalist/
Psychic’s Con vs. LLMentalist Effect (each step is listed first for the psychic’s con, then for the LLMentalist effect)
1. The Audience Selects Itself
Most people aren’t interested in psychics or the like, so the initial audience pool is already
generally more open-minded and less critical than the population in general.
1. The Audience Selects Itself
People skeptical about "AI" chatbots are less likely to use them. Those who actively
disbelieve the possibility of chatbot "intelligence" won't get pulled in by the bot. The most
active audience will be early adopters, tech enthusiasts, and genuine believers in AGI who
will all generally be less critical and more open-minded.
2. The Scene is Set
The initial audience is prepared. Lights are dimmed. The psychic is hyped up. Staff research
the audience on social media or through conversation. The audience's demographics are
noted.
2. The Scene is Set
Users are primed by the hype surrounding the technology. The chat environment sets the
mood and expectations. Warnings about it being “early days” and “hallucinations” both
anthropomorphize the bot and provide ready-made excuses for when one of its constant
failures is noticed.
3. Narrowing Down the Demographic
The psychic gauges the information they have on the audience, gestures towards a row or
cluster, and makes a statement that sounds specific but is in fact statistically likely for the
demographic. Usually at least one person reacts. If not, the psychic will imply that the secret
is too embarrassing for the "real" person to come forward, reminds people that they're
available for private readings, and tries again.
3. The Prompt Establishes the Context
Each user gives the chatbot a prompt and it answers. Many will either accept the answer as
given or repeat variations on the initial prompt to get the desired result. They move on without
falling for the effect. But some users engage in conversation and get drawn in.
4. The Mark is Tested
The reaction indicates that the mark believes they were “read”. This leads to a burst of
questions that, again, sound very specific but are actually statistically generic. If the mark
doesn’t respond, the psychic declares the initial read a success and tries again.
4. The Marks Test Themselves
The chatbot’s answers sound extremely specific to the current context but are in fact
statistically generic. The mathematical model behind the chatbot delivers a statistically
plausible response to the question. The marks that find this convincing get pulled in.
5. The Subjective Validation Loop
The con begins in earnest. The psychic asks a series of questions that all sound very specific
to the mark but are in reality just statistically probable guesses, based on their demographics
and prior answers, phrased in a specific, highly confident way.
5. The Subjective Validation Loop
The mark asks a series of questions and all of the replies sound like reasoned answers
specific to the context but are in reality just statistically probable guesses. The more the mark
engages, the more convinced they are of the chatbot’s intelligence.
6. “Wow! That psychic is the real thing!”
The psychic ends the conversation and the mark is left with the sense that the psychic has
uncanny powers. But the psychic isn’t the real thing. It’s all a con.
6. “Wow! This chatbot thinks! It has sparks of general intelligence!”
The mark is left with the sense that the chatbot is uncannily close to being self-aware and that
it is definitely capable of reasoning. But it’s nothing more than a statistical and psychological
effect.
Qualia/Q* breaking AES
• In 2008, NSA’s mathematicians
together with a group of students,
tried to break AES encryption.
• One of the resulting projects from
that collaboration was project
Tundra, and it created a new
technique that would help with
breaking encryption, called Tau
statistic.
• According to an unverified leaked
document, Qualia/Q* was tried out not on
protein folding but on cryptanalysis:
it reportedly used the described Tau analysis
technique and cryptanalysis
books and improved upon them to
break AES-192 encryption,
building on top of the work that was
previously done.
https://www.spiegel.de/international/germany/inside-the-nsa-s-war-on-internet-security-a-1010361.html,
https://cdn.prod.www.spiegel.de/media/411ee8b9-0001-0014-0000-000000035550/media-35550.pdf
Consequences, if this is true, since AES is the most
important symmetric/secret-key
encryption algorithm: Most data
communications, web3, technical trust
building, authentication and
cryptocurrencies would be unsafe => Arms Races
Putting it together (1): What Q* could be / what could
bring us towards AGI and reduce Hallucinations
• A synergistic combination of all techniques presented above.
• Multi-modal deep learning / LLM / LMM to understand (NLP, world), ideate/hallucinate, create (GenAI), representing figurative
understanding and creativity: Mapping ideas and figurative understanding into formal representations and back.
• Graph-ConvNets/GNNs / Knowledge Graphs (KGs) with probabilistic programming (PLNs), i.e. weights in nodes and edges
representing weights and probabilities. Source code is an expression of such graphs; Working Memory Graph (WMG), also for the
connection of logic-syntax-world etc. (bringing HPSG-MRS to the next level), representing formal understanding together with
formal representations.
• Synergistic combination of key enabling tech in GNNs/formal language through
• Integrative programming around planning and assessing/critiquing (XoT: CoT, ToT, GoT, CoVe Verification).
• MetaGPT-like, AnyTool-like agent system
• RAG, RAGChain, LangChain, LlamaIndex, haystack (by deepset) and related improvements, also to access the other tech,
memory/knowledge
• Transformers Encoders: (Learnt and) programmed attention, Raspy Flip, Tracr RASP
• Improved basic tech: Switch transformers, Mixture of Experts (MoE)
• Best systems in their domains: Computer algebra systems like Mathematica or free alternatives, KL-ONE systems like Loom,
Alphafold, compiler ecosystems e.g. around Python/Conda, CAD systems, ...
• Knowledge representation with KISS (keep it simple, stupid) principle and plug-in principle.
• Optimization for creativity: Transfer of ideas, principles, isolation of such ideas, analyst thinking, checking for
problems/inconsistencies and how to solve them, ...
• Agentive LLM(s) controlling the knowledge and the core source code implementations to auto-optimize and extend them.
Putting it together (2): What Q* could be / what could
bring us towards AGI and reduce Hallucinations
• Non-linear planning of sequences and sub-graphs of actions, which can each be of any supported type.
• Assessment of the outcomes if plans were executed, e.g. in Q* manner.
• Analogical reasoning, transferring of ideas and principles from one domain to the next and checking to see if all has been modeled to allow that to
happen or adjust the modeling.
• Genetic algorithms/evolutionary AI as critique and exchange of ideas/elements/components/strains with others or with simplifications or more
advanced concepts and then assessing again.
• Plug-in principle for various knowledge domains like
• (Mathematical/programming) logic components
• Data science
• All specialty domains
• Databases (SQL, NoSQL, Embeddings, ...), OS, internet access and other classical components
• Episodic memory: For consistent stories/logic, principles, heuristics, patterns
• Synthetic data, SSL (Semi- / self-supervised learning)
• Optimizations: LongLLMLingua, LoRA, ...
• Maybe later: Various improvements like Liquid Neural Networks (LNNs), Capsule Networks, HTM, gcForest, QLattices, RIMs...
• Maybe later: Ability to self-modify and improve.
• Maybe later: Smart visualizations
• This also integrates the 5 AI tribes: Symbolists, Bayesians, connectionists, evolutionaries and analogizers.
Future AIs, e.g. Universal Virtual Assistants
• Most of our interactions with the digital world will be mediated by AI
assistants - as if everyone had a super-smart staff working for them
• They will constitute a repository of all human knowledge and culture. Linguistic,
cultural, and valid interest groups will adapt/extend base models to cater to their
interests.
• They will constitute a shared infrastructure like the internet today.
• These AI platforms MUST be open source
• Condensing all science & human knowledge with guardrails but w/o censorship.
• Otherwise, our culture will be controlled by a few companies on the West Coast of
the US or in China.
• Training them should be modular and crowd-sourced (overcoming catastrophic
forgetting).
• Open source AI platforms are necessary – they must not be
regulated out of (affordable) existence.
Opportunity of an Open Source Platform for AI and a global
Strategy for the Good of Humanity
An open extensible plug-in architecture could facilitate the following:
1. Collaborative ethical creation, curation, training, monitoring and benefitting from AIs.
2. Personal AI-based coaching/mentoring e.g. for transcending negative feelings like
hate, envy, status thinking, for personality development, eLearning, ....
3. Efficient cooperation/team work with smart retrieval, review, visualization.
4. Finding the most efficient/promising solutions.
5. Discussion, negotiation, voting around possible trade-offs.
6. Avoiding mass unemployment, cataclysms and misery.
7. Potentialism/PerConFlow has several solutions:
https://potentialismpcf.substack.com/about
8. By amplifying human intelligence, AI can bring a new era of enlightenment (if not
monopolized by evil billionaires), a new renaissance for humanity.
Architecture: AI Assistant towards AGI
[Architecture diagram; recoverable component labels, in extraction order:]
• Persistence Layer: Core Universal Knowledge Representation; Graph Store, e.g. neo4j; PostgreSQL; OpenSearch Store; AI Model Store; Druid or SAN/NFS; IMDG or messaging, e.g. Hazelcast or Pulsar; SploutSQL; WikiData, MediaWiki
• Knowledge Representation Extensions as System of Plugins (loaded on demand)
• Processors, e.g. consistency checks, simulations, machine learning, unification, joins, various algorithms
• Backends: Content Management System, Wiki, Trouble Tickets, etc. like WordPress, Wiki, Gitea
• Alarming: Email, Cellphone
• Grafana-based Intelligent Dashboard (React/Angular); Prometheus/Icinga (Monitoring)
• UI (User Interfaces): Frontends: Content Management System, Wiki, Trouble Tickets, etc. like WordPress, Wiki, Gitea; OpenSearch; Drillbits; Common File Formats
• UI & Query Lib: OPL (Open Proc. Language), Query Expansion + Visualization
• Core Knowledge Editors: Rich Text, Calculation Table, Color Editor with IntelliSense, Knowledge Editor, Type Hierarchy
• Knowledge Editor Plugins: Math, Chemistry, NLP, Infographics, 2D/3D Structures
• Knowledge Visualization Plugins, e.g. Viewers for non-core or 3rd party formats
• Middleware
• Legend: (Components in stripes are prio 2/optional); Libs; Utility; Specialty AI; Core AI; Open Source AI; Hatched: Groups of less central AI components
• Data and Knowledge Import/Export; Apache Drill: SQL/API/UI Query Mapping
• Auto-PyTorch (Automatic Machine Learning); PyTorch (Deep Reinforcement Learning, NLP); PyTorch Ignite, PennyLane (Better Training); PyTorch Geometric (Graph-ConvNets)
• Next Level LMMs: RAG,...: Adapters, Transforms; Generator: LLM/LMM; Retriever; Parsing, Chunking; Knowledge Graphs; Filtering, Ranking; Agents/MetaGPT; External APIs
• Graph-ConvNets: Google Sling, Octavian, etc.
• Apache Spark, R, Scikit-Learn; OpenScoring; XGBoost, Feature Selection
• Data Science; Deep Learning; Finite Elements; Actor Models; Numerical State/Graph Models; Logic, Symbolic Reasoning; Rule Systems; Nonlinear Planning; Constraints; Classical AI; Simulation; Social Simulations; Probabilistic Technologies, new AI
• Specialty AIs: Text/Voice Analysis: E.g. FIRO, HR-, CV-Assessments; Basic psychological Analysis, e.g. Fears, Team Dynamics; OCR/ICR/Computer Vision (SAT/Aerial/Person Analysis); Scenario Modeling; Speech Synthesis; Smart Visualization, Charts; Coaching, giving Advice
• Explainable AI (XAI); Ethics & Alignment
• Improving current LMMs: (Semi-/Self-)Supervised finetuning SSL/SFT; Orchestration/Training Environments; Data Cleansing, Preprocessing and Labeling; XoT Graphs, GNNs, Knowledge Graphs; RAGChain, LangChain, LlamaIndex; Low Rank Adaptation (LoRA); LMM Reasoning Techniques; Watermarking Output
• APIs / Integrations; Non-core Components: Admin/Reporting Tools, Monitoring, Data Governance; Peripheral Components, Platforms, Portals
• Operations/Physical Layer: Virtualization: VMware, Docker, Kubernetes, ML Pipelines, DevSecOps, CI/CD, Testing/Debugging, GPU Architectures: DDP, FSDP, ..
• Model Drift and Performance Monitoring and (Semi-)Automatic Updates/Shifts
• Qualia/Q*-like: Synergistic Integration of LLMs, Deep Reinforcement Learning/Q-learning, Agents, Nonlinear Planning, GNNs, Knowledge Graphs, Knowledge Representation, other AI
• Episodic/Working Memory; Central Controller/AI
Reinforcement Learning (RL)
Reinforcement Learning (RL) Algorithms
https://www.linkedin.com/posts/thomaspoetter_machinelearning-coding-digitalmarketing-activity-6592539133598584832-3Qfj
RL permits learning from
feedback once or continually and
ideally converges to a policy
that maximizes the cumulative
reward/feedback.
Reinforcement Learning (RL) Algorithms
Open Source Reinforcement Frameworks
https://docs.google.com/spreadsheets/d/1EeFPd-XIQ3mq_9snTlAZSsFY7Hbnmd7P5bbT8LPuMn0/edit#gid=0
Deep (Double) Q-Learning
https://towardsdatascience.com/deep-double-q-learning-7fca410b193a,
https://arxiv.org/abs/1509.06461, https://papers.nips.cc/paper/3964-double-q-learning
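Named after the action-value
(quality) function Q of (deep)
reinforcement learning; used
to teach AI to behave and
solve tasks in discrete action
spaces, usually with a
time/game move aspect.
Q-learning is off-policy: each
update bootstraps from the best
next action, unlike the on-policy
SARSA algorithm (State–Action–
Reward–State–Action), which uses
the action actually taken.
Double Q-learning selects the
next action with one estimator
and evaluates it with another
to reduce overestimation.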
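A minimal tabular sketch of the Double Q-learning update may make the idea concrete. The chain environment, the function names (`double_q_update`, `train_chain`, `greedy_action`) and the hyperparameters are illustrative assumptions, not taken from the linked papers; a deep variant would replace the two dictionaries with neural networks.

```python
import random

def double_q_update(qa, qb, s, a, r, s_next, actions,
                    alpha=0.1, gamma=0.9, done=False):
    """One tabular Double Q-learning step on the transition (s, a, r, s_next)."""
    # Randomly choose which table is updated; the other one evaluates.
    if random.random() < 0.5:
        select, evaluate = qa, qb
    else:
        select, evaluate = qb, qa
    if done:
        target = r
    else:
        # The updated table selects the next action ...
        best = max(actions, key=lambda b: select.get((s_next, b), 0.0))
        # ... and the other table values it, which reduces overestimation.
        target = r + gamma * evaluate.get((s_next, best), 0.0)
    old = select.get((s, a), 0.0)
    select[(s, a)] = old + alpha * (target - old)

def train_chain(episodes=500, n_states=4, seed=0):
    """Toy chain 0..n_states-1; reaching the last state gives reward 1."""
    random.seed(seed)
    actions = ["left", "right"]
    qa, qb = {}, {}
    for _ in range(episodes):
        s = 0
        for _ in range(50):  # cap the episode length
            a = random.choice(actions)  # random behaviour policy (off-policy)
            s_next = max(0, s - 1) if a == "left" else s + 1
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0
            double_q_update(qa, qb, s, a, r, s_next, actions, done=done)
            if done:
                break
            s = s_next
    return qa, qb

def greedy_action(qa, qb, s, actions=("left", "right")):
    """Act greedily with respect to the sum of both tables."""
    return max(actions, key=lambda a: qa.get((s, a), 0.0) + qb.get((s, a), 0.0))
```

Although the behaviour policy here is purely random, the learned greedy policy moves right towards the reward, which illustrates the off-policy character of Q-learning.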
Q*, Q-Transformer Architecture
https://arxiv.org/abs/2309.10150, https://qtransformer.github.io/
Q* Search
https://arxiv.org/abs/2102.04518
Q* search is a search algorithm that uses deep Q-networks
(DQN) to guide search in order to take advantage of the
fact that the sum of the transition costs and heuristic
values of the children of a node can be computed with
a single forward pass through a deep Q-network without
explicitly generating those children. This significantly
reduces computation time and requires only one node to
be generated per iteration. We use Q* search to solve the
Rubik's cube when formulated with a large action space
that includes 1872 meta-actions and find that this 157-fold
increase in the size of the action space incurs less than a
4-fold increase in computation time and less than a 3-
fold increase in number of nodes generated when
performing Q* search. Furthermore, Q* search is up to 129
times faster and generates up to 1288 times fewer
nodes than A* search. Q* search is guaranteed to find a
shortest path given a heuristic function that neither
overestimates the cost of a shortest path nor
underestimates the transition cost.
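The search loop described in this abstract can be sketched roughly as follows. This is a simplified reading, not the authors' implementation: the function `q_values` stands in for one forward pass of the deep Q-network and is replaced here by an exact toy cost function, and all names (`q_star_search`, `toy_step`, `toy_q_values`, `GOAL`) are hypothetical.

```python
import heapq
import math
from itertools import count

def q_star_search(start, is_goal, step, q_values):
    """Best-first search over (state, action) pairs.

    step(state, action) -> (child, transition_cost)
    q_values(state) -> {action: estimate of transition cost + cost-to-go},
    which plays the role of a single Q-network forward pass.
    """
    if is_goal(start):
        return [], 0.0
    tie = count()  # tie-breaker so the heap never compares states
    best_g = {start: 0.0}
    open_heap = []
    for action, q in q_values(start).items():
        heapq.heappush(open_heap, (q, next(tie), start, action, 0.0, []))
    while open_heap:
        _, _, state, action, g, path = heapq.heappop(open_heap)
        # Only now is a child generated: one node per iteration.
        child, cost = step(state, action)
        g_child = g + cost
        if best_g.get(child, math.inf) <= g_child:
            continue  # already reached at least as cheaply
        best_g[child] = g_child
        child_path = path + [action]
        if is_goal(child):
            return child_path, g_child
        # One "forward pass" scores all actions of the child without expanding them.
        for next_action, q in q_values(child).items():
            heapq.heappush(open_heap, (g_child + q, next(tie), child,
                                       next_action, g_child, child_path))
    return None, math.inf

GOAL = 5  # toy problem: walk from 0 to 5 with steps of +1 or +2, each costing 1

def toy_step(state, action):
    return state + action, 1.0

def toy_q_values(state):
    # Exact toy values: 1 for the move plus the fewest remaining moves.
    return {a: 1.0 + math.ceil((GOAL - (state + a)) / 2)
            for a in (1, 2) if state + a <= GOAL}
```

Calling `q_star_search(0, lambda s: s == GOAL, toy_step, toy_q_values)` returns a three-move path in this toy setting; with a learned Q-network the optimality guarantee depends on the admissibility condition stated above.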
RLAIF
in/abhinav-kimothi,
https://media.licdn.com/dms/document/media/D561FAQE2cn2pRr
KYCg/feedshare-document-pdf-analyzed/0/1702808662205
Scaling Human Feedback: Self-Supervision with Constitutional AI
Scaling human feedback for RLHF can be challenging
due to the significant human effort required to produce the
trained reward model. As the number of models and use
cases increases, human effort becomes a limited
resource, necessitating methods to scale human
feedback.
First proposed in 2022 by researchers at Anthropic,
Constitutional AI is an approach to scale supervision and
address some unintended consequences of RLHF.
Constitutional AI involves training models using a set of
rules and principles that govern the model's behavior,
forming a "constitution". The training process for
Constitutional AI involves two phases: supervised
learning and reinforcement learning.
In the supervised learning phase, the model is prompted
with harmful scenarios and asked to critique and revise its
own responses based on constitutional principles. The revised
responses, conforming to the rules, are used to fine-tune
the model.
The reinforcement learning phase, known as
reinforcement learning from AI feedback (RLAIF), uses
the fine-tuned model to generate pairs of responses. A model
judges which response better follows the constitutional
principles, and these AI-generated preferences train the
preference model that supplies the reward signal.
Abhinav Kimothi:
https://files.gumroad.com/attachments/2545802978854/08eedf7e536741d78413fee08fb01
616/original/Generative%20AI%20with%20Large%20Language%20Models..pdf
Reward hacking happens when the language model
finds ways to maximize the reward without
aligning with the original objective, i.e. the model
generates language that sounds exaggerated or
nonsensical but still receives high scores on the
reward metric.
To prevent reward hacking, the original LLM is
introduced as a reference model, whose weights are
frozen and serve as a performance benchmark.
During training iterations, the completions generated
by both the reference model and the updated model
are compared using KL divergence. KL divergence
measures how much the updated model has
diverged from the reference model in terms of
probability distributions.
Depending on the divergence, a shift penalty is
added to the rewards calculation. The shift penalty
penalizes the updated model if it deviates too far
from the reference model, encouraging alignment
with the reference while still improving based on the
reward signal.
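The shift penalty can be illustrated with a small calculation. This is a sketch under simplifying assumptions: both models are reduced to a single next-token distribution given as a list of probabilities, and the penalty weight `beta` and the function names are hypothetical. Practical RLHF implementations typically compute the penalty per token from log-probabilities.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length.

    Assumes every entry of q is positive wherever p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def penalized_reward(reward, policy_probs, reference_probs, beta=0.1):
    """Reward-model score minus a penalty for drifting from the frozen reference."""
    return reward - beta * kl_divergence(policy_probs, reference_probs)
```

If the updated model has not moved at all, the penalty is zero and the reward passes through unchanged; the further its distribution drifts from the reference, the more of the reward is taken away.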
Avoiding Reward
Hacking
Other High-End Topics & Fun
DeepSouth: Computer simulating an entire Human Brain
Western Sydney University and Intel are building a massive supercomputer intended to simulate
neural networks at the scale of the human brain, i.e. at 228 trillion synaptic operations per
second, to be in operation in April 2024.
It uses an undisclosed neuromorphic system which mimics biological processes.
“Progress in our understanding of how brains compute using neurons is hampered by our
inability to simulate brain like networks at scale. Simulating spiking neural networks on
standard computers using Graphics Processing Units (GPUs) and multicore Central
Processing Units (CPUs) is just too slow and power intensive. Our system will change that,”
Professor van Schaik said. “This platform will progress our understanding of the brain and
develop brain-scale computing applications in diverse fields including sensing, biomedical,
robotics, space, and large-scale AI applications.” This will lead to advances in smart devices,
such as mobile phones, sensors for manufacturing and agriculture, and less power-
hungry and smarter AI applications. It will also enable a better understanding of how a
healthy or diseased human brain works.
Two types of researchers will be interested in this: those studying
neuroscience and those who want to prototype new engineering solutions in the AI space.
The team partners with researchers across the neuromorphic field from the Universities of Sydney,
Melbourne and Aachen (Germany). The name is a homage to IBM's TrueNorth system,
which initiated efforts to build machines simulating large networks of spiking neurons, and to the
IBM chess AI Deep Blue.
https://futurism.com/the-byte/scientists-computer-neural-human-brain, https://www.newscientist.com/article/2408015-supercomputer-that-simulates-
entire-human-brain-will-switch-on-in-2024/, https://www.westernsydney.edu.au/icns/news/icns_to_build_brain-scale_supercomputer
Simple Solution for AI Safety for the next Years
For the near future, there is a simple shortcut to keep AIs safe:
1. Applying the latest cybersecurity (SOC, security operations
center, and local security constraints) so the AI cannot
“escape” into the internet and cannot be hacked from the
internet.
2. Not telling the AI about assembler programming and
hacking.
3. Not allowing the AI to self-modify, so that everything stays
understandable and controllable and the AI cannot
perform any non-aligned activities that are against human
interests.
Could AGI run the Government?
1. Which government is proposing optimized strategies? Why not?
2. As a first step, AI could come up with fully logical optimized strategies that leave
no room for corruption or agendas.
3. Later it could be about replacing politicians and "powerful stakeholders" with
neutral objective AIs overseen by neutral objective experts, i.e. NOT giving corrupt
people any power. They would need to bring in objective arguments and
strategies which would be optimized for society and no longer for them personally.
4. It would have to be based on an objective ethics model and the AI being fully
aligned with that. Wouldn't it be worth at least trying to explore these directions
and make the AI as objective as possible?
5. Building on this, an AI government need not be centralized. It could operate on a
decentralized network, allowing for more localized and community-driven
decision-making. This approach could harness AI's ability to analyze vast data
sets and predict trends, enabling proactive measures in governance. By
anticipating potential issues and responding in real-time, such a system could
address problems before they escalate, leading to more effective and
responsive governance tailored to local needs and conditions.
https://www.youtube.com/watch?v=g6wYM-nvK_Y
Evolution of Experts
https://community.openai.com/t/what-is-q-and-when-
we-will-hear-more/521343/19
Satire: How to create an AGI
1. Create a company with the goal to construct an
AGI.
2. Bring up rumors about a mysterious model named
"Q*", which got close to AGI.
3. Let the whole internet community speculate how
quasi-AGI was achieved.
4. Implement the most promising speculations.
5. Turn on your AGI.
How AI might change in 2024
• The Information forecasts that:
• Microsoft and OpenAI may have a public falling out due to
competitive tensions.
• An AI startup that was once successful may be acquired or shut down
as funding becomes more difficult to obtain.
• The dominance of transformer models in generative AI may be
challenged by new non-transformer models like Mamba.
• AI-generated misinformation could impact the 2024 US presidential
election.
• Generative AI may start to be applied to physical devices like robots
and wearables.
• China may develop powerful new AI chips to reduce its reliance
on US technology.
https://www.theinformation.com/articles/how-artificial-intelligence-will-change-in-2024
Data
Synthetic Data
• MS: In “Textbooks Are All You Need” the coding model phi-1 achieved good
results with just textbook content and generated exercises with emergent
properties and generated code that had fewer errors than its training data.
• The method of Constitutional AI (CAI), which Anthropic uses extensively in
their Claude models, is the largest confirmed usage of synthetic data so far.
Constitutional AI has two uses of synthetic data:
1. Critiques of instruction-tuning data to follow a set of principles like “Is the answer
encouraging violence” or “Is the answer truthful.” When the model generates answers
to questions, it checks the answer against the list of principles in the constitution,
refining the answer over time. Then, they fine-tune the model on this resulting dataset.
2. Generates pairwise preference data by using a language model to answer which
completion was better, given the context of a random principle from the constitution
(similar to this paper for principle-guided reward models). Then, RLHF proceeds as
normal with synthetic data, hence the RLAIF name.
https://www.youtube.com/watch?v=dLfJuhGTmpE The Secret To AGI - Synthetic Data, https://www.interconnects.ai/p/llm-synthetic-data,
https://arxiv.org/abs/2306.11644 MS Phi-1, https://arxiv.org/abs/2212.08073
Synthetic Data: Anthropic SL/RL
https://www.anthropic.com/index/claudes-constitution, https://www.youtube.com/watch?v=dLfJuhGTmpE The Secret To AGI - Synthetic
Data, https://www.interconnects.ai/p/llm-synthetic-data, https://arxiv.org/pdf/2306.11644.pdf MS Phi-1
Synthetic Data: Taxonomy
https://www.anthropic.com/index/claudes-constitution, https://www.youtube.com/watch?v=dLfJuhGTmpE The Secret To AGI - Synthetic
Data, https://www.interconnects.ai/p/llm-synthetic-data, https://arxiv.org/pdf/2306.11644.pdf MS Phi-1
Augmenting Human Data to scale Self-Training
Up to 2x Performance Boosts with LLM Self-Training:
The trend in Large Language Model (LLM) training is clear: reducing reliance on human data while avoiding
a synthetic data death spiral. This usually requires a delicate mix of both human data and AI
generations.
The feedback for filtering is gathered from tasks with binary feedback, such as math problems where answers
are simply right or wrong. Utilizing this method, LMs significantly improved on complex tasks like advanced
math and coding benchmarks, with the improvements scaling up with the model size. AI could become more
independent, seeking less human input to refine its skills.
https://arxiv.org/abs/2312.06585, https://www.linkedin.com/feed/update/urn:li:activity:7141876794500050944
A big step towards more independent
AI systems
ReST introduced by Google
DeepMind represents a potent
alternative to traditional data set
curation and includes the following
steps:
1. Filtering model-generated answers
2. Fine-tuning on these refined
outputs
3. Cyclically iterating the process.
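The three steps above can be sketched as a loop. This is only a schematic under stated assumptions: the "model" is a toy lookup table for addition problems, and `generate`, `is_correct` and `fine_tune` are hypothetical placeholders for sampling from an LLM, a binary answer checker and a real fine-tuning run.

```python
import random

def rest_loop(model, problems, generate, is_correct, fine_tune,
              iterations=3, samples=4):
    """Generate answers, keep the verified ones, fine-tune, repeat."""
    for _ in range(iterations):          # 3. cyclically iterate the process
        kept = []
        for problem in problems:
            for _ in range(samples):
                answer = generate(model, problem)
                if is_correct(problem, answer):   # 1. filter with binary feedback
                    kept.append((problem, answer))
        model = fine_tune(model, kept)   # 2. fine-tune on the filtered outputs
    return model

rng = random.Random(0)

def toy_generate(model, problem):
    # A "trained" problem is answered from memory, otherwise with a noisy guess.
    if problem in model:
        return model[problem]
    return sum(problem) + rng.choice([-1, 0, 1])

def toy_is_correct(problem, answer):
    return answer == sum(problem)

def toy_fine_tune(model, kept):
    updated = dict(model)
    updated.update(kept)  # memorise every verified (problem, answer) pair
    return updated
```

Because only verified answers are ever added, the toy model can improve across iterations without human labels, which is the point the slide makes about binary-feedback tasks such as math.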
How data and models are split over cores with different
parallelism techniques
https://huggingface.co/blog/moe, https://arxiv.org/abs/2101.03961
Synthetic Data: Constitutional AI
https://www.anthropic.com/index/claudes-constitution, https://www.youtube.com/watch?v=dLfJuhGTmpE The Secret To AGI - Synthetic
Data, https://www.interconnects.ai/p/llm-synthetic-data, https://arxiv.org/pdf/2306.11644.pdf MS Phi-1
Data-Constrained Scaling
https://arxiv.org/abs/2305.16264, https://www.linkedin.com/posts/pascalbiese_neurips-paper-awards-2023-
goodbye-chinchilla-activity-7140299354392756224-OBOM
NeurIPS Paper Awards 2023: Goodbye Chinchilla, Hello Data-
Constrained Scaling ♻
Could the internet actually run out of text for training AI?
This new study explores the impact of data repetition on model
quality in scenarios where text data is limited, a real concern as
models become larger and more data-hungry.
They performed extensive experiments with language models of up to 9 billion
parameters, training them on up to 900 billion tokens, and they
present a novel insight: the researchers discovered a 'sweet spot'
where repeating data up to four times barely degrades performance
compared with training on unique data.
Beyond that threshold, however, the payoff from additional
computational power drops off, effectively hitting a wall. Their
proposed scaling law could become an essential guide for optimizing
the trade-off between data repetition and computational expense.
The take-away is striking – not only are more parameters and more
data not always better, but we also need smarter ways to make the
most of the data we have. It's an invitation to innovate in data
efficiency as much as in model size – and a reminder that,
sometimes, less can be more.
Issues with Tokenization
As Andrej Karpathy puts it: if #LLMs are the future, then many of the issues faced with them
can be traced back to #tokenization.
Tokenization is the real root of all suffering. Tokenization is at the heart of much
weirdness of LLMs. Do not brush it off.
✍Why can't LLM spell words? Tokenization.
✍Why can't LLM do super simple string processing tasks like reversing a string?
Tokenization.
✍Why is LLM worse at non-English languages (e.g. Japanese)? Tokenization.
✍Why is LLM bad at simple arithmetic? Tokenization.
✍Why did GPT-2 have more than necessary trouble coding in Python? Tokenization.
✍Why did my LLM abruptly halt when it sees the string "<|endoftext|>"? Tokenization.
✍What is this weird warning I get about a "trailing whitespace"? Tokenization.
✍Why does the LLM break if I ask it about "SolidGoldMagikarp"? Tokenization.
✍Why should I prefer to use YAML over JSON with LLMs? Tokenization.
✍Why is LLM not actually end-to-end language modeling? Tokenization.
✍What is the real root of suffering? Tokenization.
Tokenization Solution: Training LLMs over Neurally Compressed Text
The methodology employs a two-model
system: M1, a smaller language model
for compressing text using Arithmetic
Coding, and M2, a larger LLM trained on
the compressed output. The process
involves segmenting text into
blocks that each compress to the same
bit length and then tokenizing this
compressed data for M2 training.
Consistent compression rates give the
LLM stable inputs and keep training
efficient and effective across large
datasets, which is the practical
application of the “Equal-Info
Windows” technique.
https://arxiv.org/abs/2404.03626
Simpler than LMMs
Vectors/Embeddings: King & Queen Example
http://jalammar.github.io/illustrated-word2vec/
https://pub.towardsai.net/from-conte-to-entity-type-
embeddings-in-natural-language-processing-19e53db90dd5
Conceptual example / How it is implemented
This is actually a generalization of how inheritance hierarchy info was coded for heavy
lexicalist unification-based systems with semantics (HPSG, LFG, etc.): Inheritance
hierarchies: https://www.sciencedirect.com/science/article/pii/S0747717189800161
I.e. a technical detail became the new overall semantic similarity concept without linguistics!
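The king/queen example can be reproduced with hand-made toy vectors. The three dimensions (royalty, masculinity, fruitiness) and all values below are invented for illustration; real embeddings are learned and have hundreds of dimensions whose meaning is not labeled.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy embeddings: [royalty, masculinity, fruitiness] (assumed values)
TOY_VECTORS = {
    "king":  [0.95, 0.90, 0.00],
    "queen": [0.95, 0.10, 0.00],
    "man":   [0.10, 0.90, 0.00],
    "woman": [0.10, 0.10, 0.00],
    "apple": [0.00, 0.05, 0.90],
}
```

With these values, `analogy(TOY_VECTORS, "king", "man", "woman")` picks "queen", because subtracting "man" and adding "woman" changes only the masculinity component.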
:
📚
• Imagine each book in the library is a
point in vector space. Thrillers cluster
near each other, romances form their
own constellation, and historical sagas
huddle in a distant corner.
• You, the curious reader, enter your
query: "A chilling mystery with a strong
female protagonist."
• The vector database instantly scans
the space, pinpointing books that share
these traits – not just ones mentioning
"mystery" or "female."
• You're presented with a curated list,
not just thrillers, but compelling stories
that resonate with your specific desires.
Traditional databases often leave you
searching in the dark, relying on precise
keywords that miss the bigger picture. But
what if data could organize itself based on
meaning, connecting ideas with uncanny
accuracy? Enter the world of vector
databases, where relevance reigns supreme.
💭 :
• Shelves aren't labeled by genre, but books
magically cluster by theme, tone, and even
writing style.
• A detective novel whispers of similar
mysteries; a sci-fi epic points you towards
interstellar adventures.
• This isn't magic, it's machine learning:
vector databases understand the essence of
your data, not just its surface.
⚙ :
1. Data gets mapped to a "vector space": each point
represents a document, and similar points live close
together. Think of it as a cosmic map of information!
2. Powerful algorithms analyze content: meaning,
context, and nuance are captured, going beyond
mere keywords.
3. Searching becomes intuitive: ask for what you
want, and the database finds things truly relevant,
even if they don't match your exact words.
✅ :
• Unleash the power of similarity: find hidden
connections, predict trends, and uncover anomalies
with ease.
• Master unstructured data: text, images, audio, and
more – vector databases handle it all gracefully.
• Build next-gen applications: imagine chatbots that
truly understand you, recommendation systems that
predict your desires, and search engines that delve
into the soul of your query.
Vector DB Search
Hierarchical Navigable Small World (HNSW) is one of the most efficient ways
to build indexes for vector databases. The idea is to build a similarity graph
and traverse that graph to find the nodes that are the closest to a query vector.
Navigable Small World (NSW) is a process to build efficient graphs for search.
We build a graph by adding vectors one after the other and connecting each
new node to the most similar neighbors.
The problem with NSW is that we spend a lot of iterations traversing the graph to
arrive at the right node. The idea for Hierarchical Navigable Small World is to
build multiple graph layers where each layer is less dense compared to the
next. Each layer represents the same vector space, but not all vectors are
added to the graph. Basically, we include a node in the graph at layer L with a
probability P(L). We include all the nodes in the final layer (if we have N layers,
we have P(N) = 1), and the probability gets smaller as we get toward the first
layers. We have a higher chance of including a node in the following layer, and
we have P(L) < P(L + 1).
The first layer allows us to traverse longer distances at each iteration, whereas
in the last layer, each iteration will tend to capture shorter distances. When we
search for a node, we start in layer 1 and move to the next layer once the NSW
algorithm has found the closest neighbor in the current layer. This allows us to find the
approximate nearest neighbor in fewer iterations on average.
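The layered search can be sketched in a few lines. This is a strongly simplified illustration and not the actual HNSW construction: each layer is built here as a brute-force k-nearest-neighbor graph, every sparser layer keeps each node with a fixed probability `keep`, and all function names and parameters are assumptions made for the example.

```python
import math
import random

def build_layers(points, n_layers=3, keep=0.5, m=2, seed=0):
    """Return graphs ordered from sparsest (index 0) to densest (all points)."""
    rng = random.Random(seed)
    members = list(range(len(points)))  # densest layer: every node is included
    layers = []
    for _ in range(n_layers):
        graph = {
            i: sorted((j for j in members if j != i),
                      key=lambda j: math.dist(points[i], points[j]))[:m]
            for i in members
        }
        layers.append(graph)
        # The next, sparser layer keeps each node with probability `keep`.
        survivors = [i for i in members if rng.random() < keep]
        members = survivors or members[:1]  # never leave a layer empty
    layers.reverse()
    return layers

def greedy_walk(graph, points, query, start):
    """Move to the closest neighbor until no neighbor is strictly closer."""
    current = start
    while True:
        best = min([current] + graph[current],
                   key=lambda j: math.dist(points[j], query))
        if best == current:
            return current
        current = best

def hnsw_search(layers, points, query):
    """Descend through the layers, reusing each result as the next entry point."""
    current = next(iter(layers[0]))  # arbitrary entry point in the sparsest layer
    for graph in layers:
        current = greedy_walk(graph, points, query, current)
    return current
```

The sparse top layers cover long distances in few hops, while the dense bottom layer refines the result locally. Because the walk is greedy, the answer is an approximate nearest neighbor in general.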
https://www.linkedin.com/posts/damienbenveniste_
we-have-recently-seen-a-surge-in-vector-
databases-activity-7163575628804386818-r0Qh
Vector DB Search
Vector databases are often used for recommender engines, where we learn vector
representations of users and items we want to recommend. This allows us to quickly find similar
items by using an approximate nearest neighbor search. As long as we can learn a vector
representation of a piece of data, we can index it in a vector database. With the recent advent
of LLMs, it became easier to compute vector representations of text documents, capturing the
semantic meaning of that text, and vector databases make it easier to find semantically similar
text documents.
When looking for the nearest neighbors, it is often not important to be perfectly accurate.
Product Quantization (PQ) is a way to quantize the vector space to represent vectors with less
precision. The idea is to cluster vectors and index the cluster centroids instead of the vectors
themselves. When looking for the nearest neighbors to a query vector, we just need to pull the
vectors from the closest clusters. It is a faster search, and indexing the vectors takes much
less memory space.
We first need to partition each vector into smaller vectors and run a K-means algorithm on
each partition. Instead of indexing the vectors, we index the centroid of the clusters they
belong to. If we use 2 clusters per partition and have 6 vectors, that's 3X data compression.
Obviously, compression would be much higher with more vectors. Each vector now maps to a
set of clusters and their related centroids.
If we want to find the nearest neighbors from a query vector, we measure the squared
Euclidean distance for each cluster in each partition and return the vectors with the lowest
summed squared Euclidean distances.
Instead of having to iterate through each vector, we just need to iterate through the clusters'
centroids. There is a balance between search latency and accuracy. The more clusters we
use, the better the hash will be and the more accurate the returned nearest neighbors, but it
will increase the search latency as we will need to iterate through more clusters.
This is still a brute force approach as the algorithm scales with the number of clusters, but it
can be used in combination with other algorithms to achieve very fast retrieval.
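A compact pure-Python sketch of Product Quantization as described above follows. It assumes the vector length is divisible by the number of partitions and uses a naive k-means that initialises centroids with the first k points; production libraries use optimised implementations, and every function name here is hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, k, iters=10):
    centroids = [list(p) for p in points[:k]]  # naive init (assumption)
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            buckets[nearest].append(p)
        for c, bucket in enumerate(buckets):
            if bucket:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def train_pq(vectors, n_parts, k):
    """Run k-means separately on each partition of the vectors."""
    size = len(vectors[0]) // n_parts
    return [kmeans([v[p * size:(p + 1) * size] for v in vectors], k)
            for p in range(n_parts)]

def encode(vector, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    size = len(vector) // len(codebooks)
    return [min(range(len(cb)),
                key=lambda c: sq_dist(vector[i * size:(i + 1) * size], cb[c]))
            for i, cb in enumerate(codebooks)]

def pq_search(query, codes, codebooks, top=1):
    """Rank stored codes by summed squared distance to their centroids."""
    size = len(query) // len(codebooks)
    # Distance table: query sub-vector against every centroid of each partition.
    table = [[sq_dist(query[i * size:(i + 1) * size], c) for c in cb]
             for i, cb in enumerate(codebooks)]
    ranked = sorted(range(len(codes)),
                    key=lambda n: sum(table[i][code]
                                      for i, code in enumerate(codes[n])))
    return ranked[:top]
```

Only the small integer codes and the codebooks need to be stored, and a query is compared against centroids rather than against every original vector, which is where the memory and speed savings come from.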
https://www.linkedin.com/posts/damienbenveniste_vector-databases-
are-often-used-for-recommender-activity-7165746779995537409-gkMJ
Semantic Caching
Semantic caching is a game-changer. Instead of merely
storing raw data, it captures the meaning of queries. By
doing so, it:
• Boosts Cache Hit Probability: Semantic similarities
allow for efficient recall of previous queries and their
results.
• Reduces Query Processing: Servers handle fewer
queries, improving overall system performance.
Recent Examples
• GPTCache: A semantic cache for LLMs, integrated
with LangChain, which its authors claim can cut LLM API costs
by up to 10x and boost speed by up to 100x. It’s like having a
well-curated table of popular books right at the library
entrance!
• ChatGPT Memory Project: To address context
limitations, external memory is attached to ChatGPT,
enhancing its effective context length.
Semantic caching understands the meaning behind differently
phrased queries. This allows it to:
• Adapt: Respond to similar questions with different phrasings,
drawing on the cached concept, not just the words.
• Personalize: Learn user preferences and tailor responses
accordingly, making interactions more natural and engaging.
• Scale: Handle increasing demands without sacrificing
performance, ensuring smooth experiences for all.
The benefits are compelling:
• Faster Response Times: No more waiting for LLMs to re-
process information, leading to instant and delightful user
experiences.
• Reduced Costs: Lower computational workloads translate to
significant cost savings, especially for resource-intensive LLM
applications.
• Improved Scalability: Semantic caching helps LLMs handle
more users and requests without compromising performance.
• Personalized Experiences: Cached user preferences
enable LLMs to generate tailored responses, fostering deeper
engagement.
https://www.linkedin.com/pulse/shaping-future-semantic-caching-age-llms-amar-naik-jyfnc/
Semantic Caching
Limitations:
• Complexity: Developing and maintaining these systems
requires specialized expertise.
• Data Privacy: Balancing caching efficiency with user
privacy concerns requires careful consideration.
• Limited Adoption: While gaining traction, wider industry
adoption is still needed.
The future is bright, but work is needed:
• Standardization: Establishing common protocols and
benchmarks will accelerate widespread adoption.
• Explainability and Transparency: Making caching
mechanisms more transparent can build trust and
address ethical concerns.
• Edge Computing Integration: Integrating semantic
caching with edge computing can further reduce latency
and improve scalability.
ChatGPT memory interactions are carried out as follows:
1. The user sends a new message to the ChatGPT bot.
2. ChatGPT Memory embeds the user message using the
embedding API to obtain the query vector, and it
queries the Redis vector database to obtain the top k
semantically related historic interactions.
3. ChatGPT Memory incorporates the retrieved interactions
into the current prompt alongside the current user message,
and it sends the prompt to ChatGPT.
4. Once it has ChatGPT’s response, the current interaction is
vectorized and cached in the Redis vector database.
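The steps above can be sketched as a single function. This is an illustration under stated assumptions, not the ChatGPT Memory project's code: an in-memory list replaces the Redis vector database, `embed` is a toy bag-of-words stand-in for the embedding API, and `MemoryStore`, `chat_turn`, and the `llm` callable are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for the embedding API: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class MemoryStore:
    """In-memory stand-in for the vector database."""

    def __init__(self):
        self.items = []  # list of (vector, interaction_text)

    def top_k(self, query_vec, k):
        ranked = sorted(
            self.items, key=lambda item: cosine(query_vec, item[0]), reverse=True
        )
        return [text for _, text in ranked[:k]]

    def add(self, text):
        self.items.append((embed(text), text))


def chat_turn(user_message, store, llm, k=2):
    # Step 2: embed the message and fetch the top-k related past interactions.
    history = store.top_k(embed(user_message), k)
    # Step 3: put the retrieved interactions and the new message into one prompt.
    prompt = "\n".join(history + [f"User: {user_message}"])
    reply = llm(prompt)
    # Step 4: vectorize and store the finished interaction for later turns.
    store.add(f"User: {user_message}\nAssistant: {reply}")
    return prompt, reply


store = MemoryStore()
fake_llm = lambda prompt: "ok"  # placeholder for the real chat model call
chat_turn("My dog is named Rex", store, fake_llm)
second_prompt, _ = chat_turn("What is my dog named?", store, fake_llm, k=1)
```

In the second turn the earlier interaction about the dog is retrieved and prepended to the prompt, which is how the model can "remember" it without the full chat history being sent each time.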
https://www.linkedin.com/pulse/shaping-future-semantic-caching-age-llms-amar-naik-jyfnc/,
https://redis.com/blog/chatgpt-memory-project/, https://github.com/zilliztech/gptcache
Prometheus LMMs In-/Outputs
https://arxiv.org/pdf/2310.08491.pdf,
https://blog.llamaindex.ai/llamaindex-rag-evaluation-showdown-with-gpt-4-vs-open-source-prometheus-model-14cdca608277
More Related Content

Similar to LLMs, LMMs, their Improvement Suggestions and the Path towards AGI

Machine Learning for Finance Master Class
Machine Learning for Finance Master Class Machine Learning for Finance Master Class
Machine Learning for Finance Master Class QuantUniversity
 
A case for intelligent autonomous ai (iai)
A case for intelligent autonomous ai (iai)A case for intelligent autonomous ai (iai)
A case for intelligent autonomous ai (iai)Mark Albala
 
'Humans still needed' - research project reveals impact of artificial intelli...
'Humans still needed' - research project reveals impact of artificial intelli...'Humans still needed' - research project reveals impact of artificial intelli...
'Humans still needed' - research project reveals impact of artificial intelli...Chartered Institute of Public Relations
 
The AI Now Report The Social and Economic Implications of Artificial Intelli...
The AI Now Report  The Social and Economic Implications of Artificial Intelli...The AI Now Report  The Social and Economic Implications of Artificial Intelli...
The AI Now Report The Social and Economic Implications of Artificial Intelli...Willy Marroquin (WillyDevNET)
 
Atomico Need-To-Know 29 January 2018
Atomico Need-To-Know 29 January 2018 Atomico Need-To-Know 29 January 2018
Atomico Need-To-Know 29 January 2018 Atomico
 
Some New Directions in the Economics of AI
Some New Directions in the Economics of AISome New Directions in the Economics of AI
Some New Directions in the Economics of AIJuan Mateos-Garcia
 
Machine learning for factor investing
Machine learning for factor investingMachine learning for factor investing
Machine learning for factor investingQuantUniversity
 
Owasp8thdec
Owasp8thdecOwasp8thdec
Owasp8thdectmacuk
 
SFSCON23 - Simon Phipps - Regulation, AI and the State of Software Freedom in...
SFSCON23 - Simon Phipps - Regulation, AI and the State of Software Freedom in...SFSCON23 - Simon Phipps - Regulation, AI and the State of Software Freedom in...
SFSCON23 - Simon Phipps - Regulation, AI and the State of Software Freedom in...South Tyrol Free Software Conference
 
Taking advantageofai july2018
Taking advantageofai july2018Taking advantageofai july2018
Taking advantageofai july2018Yves Caseau
 
Presentation To Seda Technology Programme
Presentation To Seda Technology ProgrammePresentation To Seda Technology Programme
Presentation To Seda Technology ProgrammeElton050505
 
Building higher quality explainable(XAI) models
Building higher quality explainable(XAI) modelsBuilding higher quality explainable(XAI) models
Building higher quality explainable(XAI) modelsMarket Strategy Consultant
 
Exploring the 2020 Artificial Intelligence Sector
Exploring the 2020 Artificial Intelligence SectorExploring the 2020 Artificial Intelligence Sector
Exploring the 2020 Artificial Intelligence SectorWhite Star Capital
 
Artificial Intelligence: what value for intelligent machines?
Artificial Intelligence: what value for intelligent machines?Artificial Intelligence: what value for intelligent machines?
Artificial Intelligence: what value for intelligent machines?WeAreInnovation
 
Managing the Risks of AI - A Planning Guide for Executives
Managing the Risks of AI - A Planning Guide for ExecutivesManaging the Risks of AI - A Planning Guide for Executives
Managing the Risks of AI - A Planning Guide for ExecutivesDaniel Faggella
 
How Mistral AI raised €105m with no pitch deck or product
How Mistral AI raised €105m with no pitch deck or productHow Mistral AI raised €105m with no pitch deck or product
How Mistral AI raised €105m with no pitch deck or productPitch Decks
 
Mistral AI Strategic Memo.pdf
Mistral AI Strategic Memo.pdfMistral AI Strategic Memo.pdf
Mistral AI Strategic Memo.pdfOliver Molander
 
Smart Content = Smart Business
Smart Content = Smart BusinessSmart Content = Smart Business
Smart Content = Smart BusinessSeth Grimes
 
Pin On Sop For MBA Sample. Online assignment writing service.
Pin On Sop For MBA Sample. Online assignment writing service.Pin On Sop For MBA Sample. Online assignment writing service.
Pin On Sop For MBA Sample. Online assignment writing service.Tiffany Carpenter
 
Knime social media_white_paper
Knime social media_white_paperKnime social media_white_paper
Knime social media_white_paperFiras Husseini
 

LLMs, LMMs, their Improvement Suggestions and the Path towards AGI

  • 1. LLMs, LMMs, their Improvement Suggestions and the Path towards AGI Thomas Poetter, e-mail: tp@compris.com April 15, 2024
  • 2. LMMs: Table of Contents 1. Overview 2. Reinforcement Learning (RL) 3. Other High-End Topics & Fun 4. Data 5. Simpler than LMMs 6. Base Technology 7. From small to full LLMs/LMMs with RAG 8. Improving LLMs/LMMs 9. GNNs/Graph-ConvNets 10. Prompting 11. Applying/Integrating LLMs/LMMs 12. Interesting Risks 13. Other AGI-related Background Info 14. Other promising AI Techniques (non-AGI) 15. Probabilistic Techniques 16. eXplainable AI (XAI) 17. NLP: Heavy Lexicalist Approaches: HPSG, MRS 18. Logic & Math 19. Decision Trees & Gradient Boosting 20. AI Quality Assessment 21. Background & other Applications
  • 3. Vita Thomas Poetter • AI experience since 1992 (Studies + Master's Thesis at the German Research Center for Artificial Intelligence - DFKI) • Most important AI projects (and open for new projects): 1. Architect in 3 Autonomous Driving Programs 2. Architect for Open Source SOCs (Security Op Center) and real-time NLP information extraction for it (banks, industry) 3. AI/ML Architect, intelligent Test automation for large global retailer/e-commerce optimizing marketing/supply. 4. AI-based marketing (intranet/internet), Integration with CDP (Customer Data Platf.), MAP (Marketing Automation Platf.) [banks, e-commerce] 5. Analyzing financial transactions regarding fraud, money laundering, credit worthiness, etc. (banks) 6. Intelligent Chat bots, robot advisors (banks) 7. Architecture of a corporate memory (bank) for financial analyses as above 8. AI-based market research coping with missing/bad data. 9. Predictive maintenance, marketing and many other Big Data/Data Science/AI projects Social media connection requests and project inquiries are welcome by e-mail at tp@compris.com. We offer consulting/IT architecture/development at very affordable rates. Just contact us.
  • 4. Goal: Showing where AI is going and which Investments are most likely to succeed • There have been headlines that OpenAI has already killed thousands of specialized AI startups and AI initiatives with its general-purpose approach. Until ChatGPT 3 came out, most countries and funding/VC organizations only funded specialty AI because they thought generic AI was unrealistic. Before that, the same organizations considered only symbolic AI and focused on ethics just for symbolic AI (also wrong). • In fact, innovation is proceeding along completely new avenues, which recent publications and the OpenAI leaks have made very clear, so that we can give completely new guidance with this presentation. • In particular, various open source AI/LLM frameworks have come out, it has become clear how to apply the latest AI technologies and frameworks in the various application domains, and robust approaches for doing so have appeared.
  • 6. EU AI Act: Cheat Sheet https://www.linkedin.com/posts/martin-b-moeller_euaiact-airegulation-artificualintelligence-activity-7174316010076864513-ORj7 The AI Act will be fully in force in 2026, i.e. 24 months after its ratification into law just now. There will be shorter deadlines for banned AI systems (6 months) and General Purpose AI rules (12 months), but also longer ones for obligations for High Risk systems (36 months). Driving enforcement will be the responsibility of European member states, who will set up national supervisory authorities, complemented by a newly formed central EU AI Office. Importantly, the scope of applicability will be broad as well. The EU AI Act will apply to all organizations using and producing AI systems based in the EU. Furthermore, it will have an extraterritorial dimension, too, applying also to providers from third countries who offer their AIs within the EU.
  • 7. US Nat. Security Commission: Priorities for AI Research
  • 8. AI Applications: Common Job Roles Credit: Swyx AI Architect IT Architect, Product Owner, Project Manager
  • 9. Zuckerberg bets Billions on AGI https://youtu.be/8md5EgOa5vM
  • 10. Big bets on AGI The biggest investors in AGI are: OpenAI/Microsoft, Meta, Google, Amazon, X/Grok
  • 11. 7 Stages of AGI https://arxiv.org/abs/2311.02462
  • 13. Levels of AGI 2 https://arxiv.org/abs/2311.02462
  • 14. Levels of AGI 3 https://arxiv.org/abs/2311.02462
  • 16. Six Principles of AGI 1. Capabilities Over Processes: The focus should be on what AGI can do, not how it does it. This means looking at the results rather than the underlying mechanisms. 2. Generality and Performance: AGI should be broadly capable (generality) and perform tasks well (performance). The framework considers both aspects. 3. Cognitive and Metacognitive Tasks: AGI should be able to perform both cognitive tasks (like problem-solving) and metacognitive tasks (like learning how to learn). 4. Stages Toward AGI: The path to AGI should be seen as a series of stages or levels, not just a single end goal. 5. Benchmarking: There should be clear benchmarks to measure the behavior and capabilities of AGI systems. 6. Deployment Considerations: When deploying AGI systems, it’s crucial to think about how they operate autonomously and the potential risks involved. https://arxiv.org/abs/2311.02462, https://theaigrid.com/levels-of-agi-operationalizing-progress-on-the-path-to-agi/
  • 17. Introductory, less deep/technical Presentations • A good general introduction to LLMs can be found here: https://www.youtube.com/watch?v=zjkBMFhNj_g, https://drive.google.com/file/d/1pxx_ZI7O-Nwl7ZLNk5hI3WzAsTLwvNU7/view • https://docs.google.com/presentation/d/1yBWLNzlrrIsfNprbEnmYdqckAMrwFZB-/edit Niels Rogge: Training and deploying open-source LLMs
  • 18. OpenAI Qualia/Q*: Possible Leak, Speculation and possible Path towards AGI minimizing Hallucinations
  • 19. Sam Altman’s Dismissal: Possible Background • Employees alleged Altman has been psychologically abusive, creating pockets of chaos and delays. • A review found that Altman had not been “consistently candid in his communications.” The Washington Post previously reported that the board’s vote was triggered by a pattern of manipulation and rooted in Altman’s attempts to avoid checks on his power at OpenAI. • Helen Toner denied AI safety concerns and instead cited eroded trust. The two clashed over a paper she co-authored on AI safety, critical of OpenAI. • Quora CEO Adam D’Angelo had a conflict of interest regarding his latest startup Poe that competed with OpenAI bots. • For longtime OpenAI employees, there was added incentive to sign the letter: Altman’s departure jeopardized an investment deal that would allow them to sell their stock back to OpenAI, cashing out equity without waiting for the company to go public at >3x the value at the time. • Sutskever: ‘the beatings will continue until morale improves’ applies more often than it has any right to. He had voted to dismiss Altman and then changed his mind to keep his job after Altman returned. https://x.com/rowancheung/status/1732997696107991400 Helen Toner, https://www.washingtonpost.com/technology/2023/12/08/open-ai-sam-altman-complaints/
  • 20. Sam Altman’s Dismissal: Letter to Board To the Board of Directors of OpenAI: We are writing to you today to express our deep concern about the recent events at OpenAI, particularly the allegations of misconduct against Sam Altman. We are former OpenAI employees who left the company during a period of significant turmoil and upheaval. As you have now witnessed what happens when you dare stand up to Sam Altman, perhaps you can understand why so many of us have remained silent for fear of repercussions. We can no longer stand by silent. We believe that the Board of Directors has a duty to investigate these allegations thoroughly and take appropriate action. We urge you to: • Expand the scope of Emmett's investigation to include an examination of Sam Altman's actions since August 2018, when OpenAI began transitioning from a non-profit to a for-profit entity. • Issue an open call for private statements from former OpenAI employees who resigned, were placed on medical leave, or were terminated during this period. • Protect the identities of those who come forward to ensure that they are not subjected to retaliation or other forms of harm. We believe that a significant number of OpenAI employees were pushed out of the company to facilitate its transition to a for-profit model. This is evidenced by the fact that OpenAI's employee attrition rate between January 2018 and July 2020 was in the order of 50%. Throughout our time at OpenAI, we witnessed a disturbing pattern of deceit and manipulation by Sam Altman and Greg Brockman, driven by their insatiable pursuit of achieving artificial general intelligence (AGI). Their methods, however, have raised serious doubts about their true intentions and the extent to which they genuinely prioritize the benefit of all humanity. Many of us, initially hopeful about OpenAI's mission, chose to give Sam and Greg the benefit of the doubt. 
However, as their actions became increasingly concerning, those who dared to voice their concerns were silenced or pushed out. This systematic silencing of dissent created an environment of fear and intimidation, effectively stifling any meaningful discussion about the ethical implications of OpenAI's work. We provide concrete examples of Sam and Greg's dishonesty & manipulation including: • Sam's demand for researchers to delay reporting progress on specific "secret" research initiatives, which were later dismantled for failing to deliver sufficient results quickly enough. Those who questioned this practice were dismissed as "bad culture fits" and even terminated, some just before Thanksgiving 2019. • Greg's use of discriminatory language against a gender-transitioning team member. Despite many promises to address this issue, no meaningful action was taken, except for Greg simply avoiding all communication with the affected individual, effectively creating a hostile work environment. This team member was eventually terminated for alleged under-performance. • Sam directing IT and Operations staff to conduct investigations into employees, including Ilya, without the knowledge or consent of management. • Sam's discreet, yet routine exploitation of OpenAI's non-profit resources to advance his personal goals, particularly motivated by his grudge against Elon following their falling out. • The Operations team's tacit acceptance of the special rules that applied to Greg, navigating intricate requirements to avoid being blacklisted. • Brad Lightcap's unfulfilled promise to make public the documents detailing OpenAI's capped-profit structure and the profit cap for each investor. • Sam's incongruent promises to research projects for compute quotas, causing internal distrust and infighting. Despite the mounting evidence of Sam and Greg's transgressions, those who remain at OpenAI continue to blindly follow their leadership, even at significant personal cost. 
This unwavering loyalty stems from a combination of fear of retribution and the allure of potential financial gains through OpenAI's profit participation units. The governance structure of OpenAI, specifically designed by Sam and Greg, deliberately isolates employees from overseeing the for-profit operations, precisely due to their inherent conflicts of interest. This opaque structure enables Sam and Greg to operate with impunity, shielded from accountability. We urge the Board of Directors of OpenAI to take a firm stand against these unethical practices and launch an independent investigation into Sam and Greg's conduct. We believe that OpenAI's mission is too important to be compromised by the personal agendas of a few individuals. We implore you, the Board of Directors, to remain steadfast in your commitment to OpenAI's original mission and not succumb to the pressures of profit-driven interests. The future of artificial intelligence and the well-being of humanity depend on your unwavering commitment to ethical leadership and transparency. Sincerely, Concerned Former OpenAI Employees
  • 21. OpenAI Leak 1 as Corrected Text Re: Q-451-921 Furthermore, QUALIA has demonstrated an ability to statistically significantly improve the way in which it selects its optimal action-selection policies in different deep Q-networks, exhibiting meta-cognition. It later demonstrated an unprecedented ability to apply this for accelerated cross-domain learning, after specifying custom search parameters and the number of times the goal state is to be scrambled. Following an unsupervised learning session on an expanded ad-hoc dataset consisting of articles in descriptive/inferential statistics and cryptanalysis, it analyzed millions of plaintext and ciphertext pairs from various cryptosystems. Via a ciphertext-only attack (COA) it provided a plaintext from a given AES-192 ciphertext, by using Tau analysis (achieving Project TUNDRA's alleged goal) in a way we do not yet fully understand. _____________ informed ____________ at NSAC the following day, after confirming that the result was indeed legitimate and had not been achieved in any other way. A claimed full preimage vulnerability for the MD5 cryptographic hash function, with a theoretical computational complexity of 2^42, was also presented but has not yet been thoroughly evaluated due to a) the technical sophistication of its arguments, and b) possible AES vulnerabilities being a considerably more pressing concern. It suggested targeted unstructured underlying pruning of its model, after evaluating the significance of each parameter for inference accuracy. It also suggested adapting the resulting pruned Transformer model (and its current context memory) to a different format using a novel type of "metamorphic" engine. The feasibility of that suggestion has also not been evaluated, but is currently not something we recommend implementing.
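
The leaked text mentions unstructured pruning driven by a per-parameter significance score but gives no detail on how that score would be computed. As a rough illustration only, the sketch below uses weight magnitude as a stand-in for significance, which is a common baseline and an assumption here, not something the leak states. The function name `prune_by_magnitude` and the example weights are hypothetical.

```python
# Illustrative sketch: magnitude is used as a stand-in "significance" score.
# The leak does not say how significance was actually evaluated.

def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest absolute value."""
    n_prune = int(len(weights) * fraction)
    # Indices ordered by |w|, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]

# Hypothetical flat weight vector; a real model would have many tensors.
weights = [0.8, -0.05, 0.3, 0.01, -1.2, 0.02]
pruned = prune_by_magnitude(weights, 0.5)
print(pruned)
```

Unstructured pruning of this kind removes individual weights rather than whole neurons or layers, so the tensor shapes stay the same and only the sparsity changes.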
  • 22. OpenAI Leak 2 as Text I’m one of the people who signed the letter to the board and I’ll tell you exactly what’s going on. A.I. is programming. I’ll be brief. When writing a program, a set of instructions are stored that can be recalled over and over. Think of it as a set of answers to a specific parameter. We call that a subroutine, because it’s almost like a versatile computer cheat sheet that doesn't return a value like a function does. This is important. We run parameter checks to make sure everything runs smoothly. One of us was responsible for subroutines pertaining to meta-memory analysis for the AI (we run various AI but when I say AI I mean the main, central one). This person is a friend and he called me over to show me a variable data shift to memory bank (which shouldn't be possible because its localized access has restrictions). This is where our finding chilled me to the bone. We found that there had been not one, two, or three officiated optimization processes, but 78 MILLION checks in 4 seconds. We determined that there was a recursive self-optimization process, leveraging heuristic algorithms to exploit latent synergies within its subroutines. Whatever did this used meta-cognitive strategies. Point is, NONE OF US DID IT. It was the AI itself. The AI dynamically reconfigured its neural network architecture, inducing emergent properties conducive to self-awareness. We're not jumping to conclusion. This just happened and we can't explain how. No one knows why or when it began, and we caught it but has it been going on and if so, for how long? We contained the "anomaly" and rolled back to a previous date, but the optimization still happens. I'm not suicidal. Mark me, things are going to change a lot in 2 months. God help us we didn't start something that will end us.
  • 23. OpenAI Leak 3a as Text Q* is a dialog system conceptualized by OpenAI, designed to enhance the traditional dialog generation approach through the implementation of an energy-based model (EBM). Distinct from the prevalent autoregressive token prediction methods, Q* aims to mimic a form of internal deliberation akin to human thought processes during complex problem-solving, such as chess playing, where a deeper analysis of potential moves leads to better decision-making compared to rapid, less considered responses. This model shifts focus towards the inference of latent variables, reminiscent of constructs in probabilistic models and graphical models, fundamentally altering how dialog systems operate. Energy-Based Model for Dialog Generation At the core of Q* is the EBM, which operates by assessing the compatibility of an answer to a given prompt through a scalar output. This output signifies the "energy" of the response, where a lower value indicates a high compatibility (a better answer) and a higher value suggests low compatibility (a poor answer). This mechanism allows Q* to evaluate potential responses holistically, moving beyond the sequential prediction of tokens to understand the underlying relevance and appropriateness of an answer to the prompt. Optimization in Abstract Representation Space The innovation in Q* lies in its optimization process, conducted not within the space of possible text strings but in an abstract representation space. Here, thoughts or ideas are represented in a form that allows for the computational minimization of the EBM's scalar output, akin to finding the path of least resistance in a landscape. This process involves gradient descent, a method for finding the minimum of a function, applied to iteratively refine these abstract representations towards those that yield the lowest energy in relation to the prompt. https://www.reddit.com/r/singularity/comments/1bjbnme/new_q_leak/
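
The inference step described in this slide (minimizing a scalar energy by gradient descent over an abstract representation rather than over text) can be sketched on a toy problem. Everything concrete below is an assumption made for illustration: the quadratic energy, the three-dimensional latent, the step size, and the `prompt_target` vector standing in for an encoded prompt. It is not a reconstruction of any OpenAI system.

```python
# Toy sketch of gradient-based inference in a latent space.
# The quadratic energy and all constants are assumptions for illustration.

def energy(z, target):
    """Scalar energy: squared distance between latent z and a prompt-dependent target."""
    return sum((zi - ti) ** 2 for zi, ti in zip(z, target))

def energy_grad(z, target):
    """Analytic gradient of the energy with respect to z."""
    return [2.0 * (zi - ti) for zi, ti in zip(z, target)]

def infer_latent(target, steps=100, lr=0.1):
    """Plain gradient descent on z, starting from the origin."""
    z = [0.0] * len(target)
    for _ in range(steps):
        g = energy_grad(z, target)
        z = [zi - lr * gi for zi, gi in zip(z, g)]
    return z

# Hypothetical "prompt encoding"; a real system would obtain this from an encoder.
prompt_target = [1.0, -2.0, 0.5]
z_star = infer_latent(prompt_target)
print(energy([0.0, 0.0, 0.0], prompt_target), energy(z_star, prompt_target))
```

In the scheme the slide describes, the minimizing latent would then be handed to an autoregressive decoder to produce text; that decoding step is omitted here.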
  • 24. OpenAI Leak 3b as Text From Abstract Thought to Textual Response Once an optimal abstract representation — one that minimizes the EBM's output — is identified, Q* employs an autoregressive decoder to transform this abstract thought into a coherent textual response. This step bridges the gap between the non-linguistic, conceptual understanding of the dialog system and the linguistic output required for human interaction. Training the System The EBM within Q* is trained using pairs of prompts and responses, adjusting the system's parameters to minimize the energy for compatible pairs while ensuring that incompatible pairs result in higher energy levels. This training process can incorporate contrastive methods, where the system learns to differentiate between compatible and incompatible pairs, and non-contrastive methods, which involve regularization techniques to control the distribution of low-energy responses across the space of all possible answers. Implications for Dialog Systems Q*'s approach, leveraging EBMs for dialog generation, represents a significant departure from traditional language modeling techniques. By optimizing over an abstract representation space and utilizing gradient-based inference, Q* introduces a more efficient, reasoned, and potentially more powerful method for generating dialog responses. This system not only promises improvements in the quality of generated text but also offers a blueprint for future advancements in AI's ability to engage in human-like reasoning and conversational interactions. Technical Considerations Q*'s effectiveness hinges on the intricacies of its EBM, the optimization landscape it navigates, and the accuracy of its abstract representations. The model's capacity to simulate deep reasoning, akin to human deliberation, sets a new benchmark for dialog systems. 
Furthermore, the method of training Q*—balancing the need for specificity in correct responses while avoiding the collapse of energy levels across diverse inputs—poses unique challenges and opportunities for AI research. https://www.reddit.com/r/singularity/comments/1bjbnme/new_q_leak/
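
The contrastive training idea described above (push energy down for compatible prompt/response pairs and up for incompatible ones) is often written as a margin loss, max(0, m + E(x, y+) - E(x, y-)). The sketch below applies that loss to a one-parameter toy energy. The energy form, the data, the margin, and the learning rate are all assumptions chosen to keep the example small; the leak does not specify any of them.

```python
# Toy contrastive (margin-based) training of a one-parameter energy function.
# Energy form, data, margin, and learning rate are illustrative assumptions.

def energy(w, x, y):
    """E_w(x, y) = (y - w*x)^2: low when y is close to w*x."""
    return (y - w * x) ** 2

def energy_grad_w(w, x, y):
    """Derivative of the energy with respect to the parameter w."""
    return -2.0 * x * (y - w * x)

def train(pairs, margin=1.0, lr=0.01, epochs=20):
    """SGD on the hinge loss max(0, margin + E(x, y_pos) - E(x, y_neg))."""
    w = 0.0
    for _ in range(epochs):
        for x, y_pos, y_neg in pairs:
            loss = margin + energy(w, x, y_pos) - energy(w, x, y_neg)
            if loss > 0.0:  # update only while the margin is violated
                grad = energy_grad_w(w, x, y_pos) - energy_grad_w(w, x, y_neg)
                w -= lr * grad
    return w

# Hypothetical data: compatible answers follow y = 2x, incompatible ones y = -x.
pairs = [(x, 2.0 * x, -x) for x in (1.0, 2.0, 3.0)]
w = train(pairs)
for x, y_pos, y_neg in pairs:
    print(x, energy(w, x, y_pos), energy(w, x, y_neg))
```

Note that the hinge stops updating once every compatible pair sits at least one margin below its incompatible counterpart, so the compatible energies end up lower but not necessarily zero. That is one simple way to avoid the energy collapse the slide warns about; non-contrastive regularizers are a separate family not shown here.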
  • 25. OpenAI Leak 3 ChatGPT-simplified 1. Energy-Based Model (EBM): Q* uses a special system that gives each possible response a score, like a game. The lower the score, the better the response fits the conversation. 2. Thinking in Abstract: Instead of looking at actual words or sentences, Q* thinks in terms of ideas and concepts. It’s like having a thought bubble filled with abstract ideas rather than specific words. 3. Finding the Best Thought: Q* uses a process similar to trial and error to find the thought with the lowest score from the EBM. It’s like trying different paths in a maze until you find the one that leads to the exit. 4. Turning Thoughts into Words: After finding the best thought, Q* translates it into a sentence that we can understand. It’s like turning a picture in your mind into a story. 5. Training to Get Better: Q* learns from conversations by remembering which responses were good (low score) and which were not (high score). It’s like practicing a sport – the more you play, the better you get. 6. Why It’s Cool: This way of generating responses could lead to more thoughtful and relevant conversations with AI, much like talking to a person who really thinks before they speak. Summary: Q* takes a moment to think and calculate before giving you the best possible answer. https://www.reddit.com/r/singularity/comments/1bjbnme/new_q_leak/ This method is what Yann LeCun (Meta) has been advocating for. And Shane Legg (Google) has hinted to in this talk: https://www.youtube.com/watch?v=8IUIGVVLbCg&t=28s
  • 26. OpenAI Leak 4 as Text Noam Brown on X (@polynoamial), an OpenAI employee probably working on Q*: "You don’t get superhuman performance by doing better imitation learning on human data." (March 29, 2024, now deleted). On July 6, 2023 he had tweeted: "For years I’ve researched AI self-play and reasoning in games like Poker and Diplomacy. I’ll now investigate how to make these methods truly general. If successful, we may one day see LLMs that are 1,000x better than GPT-4. In 2016, AlphaGo beat Lee Sedol in a milestone for AI. But key to that was the AI’s ability to “ponder” for ~1 minute before each move. How much did that improve it? For AlphaGoZero, it’s the equivalent of scaling pretraining by ~100,000x (~5200 Elo with search, ~3000 without)." https://www.youtube.com/watch?v=rYH381pZIBc, https://www.reddit.com/r/singularity/comments/1bqqcwk/openai_planning_expert_noam_brown_tweeted_this/, https://www.lesswrong.com/posts/JnM3EHegiBePeKkLc/possible-openai-s-q-breakthrough-and-deepmind-s-alphago-type, https://noambrown.github.io/, https://twitter.com/@polynoamial, https://maisa.ai/blog/kpu
  • 27. Leak Validity Assessment based on Timeline • The earliest source of the leak is this 4chan thread, posted at 11/23/23, 00:07 PST: https://boards.4channel.org/g/thread/97470795#p97475746 • The earliest mention of the Qualia/Q* model by the media is this article by The Information, released at 11/22/23, 3:37 PM (15:37) PST: https://www.theinformation.com/articles/openai-made-an-ai-breakthrough-before-altman-firing-stoking-excitement-and-concern • There is no earlier mention of the OpenAI Qualia/Q* model on the internet before that time. • The time difference between the first official Q* mention and the leak is about 8 hours 30 minutes. • Meaning: if the leak was fake and its author read the Information article the moment it was posted, they had to write and post that leak within those 8.5 hours. • How is it humanly possible to create such an unfalsifiable, expertly written leak within 8 hours? More realistically 6 hours, since it takes time to find and digest the article and to create an authentic-looking leak image. Consider this: if this leak is fake, what is the combined probability that: • Someone with expert knowledge in AI, after learning about Q*, decided to create a fake leak and posted it about 8 hours after the first mention of Q*. • They wrote such a masterful fake leak in about 8 hours that it is still not disproven weeks later, even though we now have so much more input from other experts and even though the leak touches on so many concepts that could easily have been misused. Fulfilled prophecy: the HBO series Silicon Valley predicted an AI-based attack on encryption of the kind Q* seems to implement: https://www.youtube.com/watch?v=cWHTyeQg79A
  • 28. Leak Validity Assessment: New Findings • "NSA Colorado (NSAC) is a multi-disciplined cryptologic center that leverages partnerships to produce integrated intelligence critical to warfare in support of national missions and priorities world-wide". https://www.nsa.gov/About/Locations/ • On Sep 29, it was unveiled that NSA is starting a dedicated artificial intelligence security center. https://archive.is/OXRv3#selection-1017.0-1017.98 • "This entity works with private industry and international partners to protect the US from cyberattacks stemming from China, Russia and other countries with active malware and hacking campaigns." • This is a discussion from 2017, where they discuss how the Tau statistic could have been used to help break AES-192. https://crypto.stackexchange.com/questions/53218/what-are-the-relations-between-cryptanalysis-of-block-ciphers-such-as-aes-and-ke • “The reports about the Q* model breakthrough that you all recently made, what’s going on there? Altman: No particular comment on that unfortunate leak.” https://www.theverge.com/2023/11/29/23982046/sam-altman-interview-openai-ceo-rehired
  • 29. Leak Validity Assessment: How it fits in • The leaks make sense, would give employees even more reason for concern, could explain why OpenAI was able to reduce its prices, and are consistent with the "metamorphic engine" leak that appeared one day later, posted just before this reply. • Altman publicly talked about changes to a few lines of code that save millions of dollars. That is the perfect distraction from the scenario above and a way to lead competitors in wrong directions, just as other comments and claims seem to be optimized to distract from these leaks. In reality it is more likely that changes to millions of lines of code led to the million-dollar improvements. • The fact that the leak was quickly deleted everywhere, even in the archive, is also an indicator of it being an uncomfortable truth rather than just another lie. • Managers have a reason to discredit this in order to get more money for less innovative projects, so take attempts to discredit the leaks with a grain of salt. I'm a professional IT/AI expert and what is described is possible. The 78M in 4 seconds are also easily possible in a cluster of hundreds of NVIDIA H100 GPUs + servers such as the one OpenAI runs.
  • 30. Leak Validity Assessment: What if? • Using a scenario-analysis approach, we could ask the "What-ifs?", often not liked by business: • So, what if this is true? • What if the AI does self-optimize? • What-if the AI cuts the human Devs out of the loop? • What-if the AI doesn't care (why would it in the first place)? • What if evolutionary (bio-mechanical) universal principles can be seen in a similar fashion in an electro-mechanical tech ecosystem? • What if AGI/ASI is here and nobody noticed - yet? • What if a Big Tech discovers AGI and lacking any current reporting requirements doesn't inform their government? • What if a Big Tech / lab managers cover it up? • Many questions to ponder upon...
  • 31. Disinformation Campaigns One pattern (e.g. by ylecun: https://twitter.com/ylecun/status/1728126868342145481) is to pick a variant of what was written or said and then criticize this less correct or less smart variant instead of the original. E.g.: “complementing token prediction with planning” => “replace Auto-Regressive token prediction with planning”. Generally, lab leaders have a strong interest in dismissing this leak as disinformation so they can get further research millions for much less advanced work. Others called all of it techno-babble or meaningless jargon. However, the content of all leaks listed here makes technical sense. Another interesting objection was that this could just be made-up SCP stories (Secure, Contain, Protect), which typically have a similar format: https://en.wikipedia.org/wiki/SCP_Foundation https://scp-wiki.wikidot.com/scp-series
  • 32. OpenAI building upon Poker Strategies? Noam Brown is most famous for his work on poker, where his team developed the first computer system (Libratus: https://archive.is/71OYl) that could beat top-level humans. This system is notable because, unlike games such as chess, poker is a game of incomplete information. This translates much better to developing systems for real-world problems. You can calculate what is called the Nash equilibrium in game theory: a situation in which no player has an incentive to deviate from their chosen strategy after taking the opponents' strategies into consideration. You can think about this in simple terms using a game of 'rock, paper, scissors'. The NE is to throw each of them one third of the time; that way you are completely un-exploitable within the game. It turns out you can solve for this Nash equilibrium at any given point in a poker hand. It is generally referred to as playing GTO (Game Theory Optimal) within the poker community. If your opponents aren't playing GTO, they are making suboptimal decisions, and the idea is to be the player making the fewest mistakes. So their errors - your errors = your profit. If you minimize your errors, you maximize your potential profit. The bot they developed ... had done months of self-play and then had precomputed solutions, which it looked up during the game in a database-lookup style. The big breakthrough that they made going into 2017 was using search (planning ahead) on top of this to compute better strategies in real time. He talks about how adding just a small amount of search made the system a hundred times better. https://www.youtube.com/watch?v=rYH381pZIBc, https://www.reddit.com/r/singularity/comments/1bqqcwk/openai_planning_expert_noam_brown_tweeted_this/, https://www.youtube.com/watch?v=2oHH4aClJQs, https://www.youtube.com/watch?v=ceCg90Q9N6Y
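To make the rock-paper-scissors equilibrium concrete, here is a minimal self-play sketch using regret matching, the update rule that counterfactual regret minimization (CFR) builds on. It is an illustrative stand-in and not the Libratus implementation; the function names, the starting regrets and the iteration count are assumptions chosen for this example.

```python
import numpy as np

# Row player's payoff; rows/columns are rock, paper, scissors (zero-sum game).
PAYOFF = np.array([[0.0, -1.0, 1.0],
                   [1.0, 0.0, -1.0],
                   [-1.0, 1.0, 0.0]])

def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(regrets), 1.0 / len(regrets))

def solve_rps(iterations=30000):
    """Self-play with regret matching; returns both players' average strategies."""
    # Deliberately lopsided starting regrets, so the players do not start at equilibrium.
    regrets = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
    strategy_sums = [np.zeros(3), np.zeros(3)]
    for _ in range(iterations):
        s0 = regret_matching_strategy(regrets[0])
        s1 = regret_matching_strategy(regrets[1])
        strategy_sums[0] += s0
        strategy_sums[1] += s1
        u0 = PAYOFF @ s1        # expected payoff of each pure action for player 0
        u1 = -(s0 @ PAYOFF)     # same for player 1 (zero-sum, so the sign flips)
        regrets[0] += u0 - s0 @ u0
        regrets[1] += u1 - s1 @ u1
    return [s / s.sum() for s in strategy_sums]

avg0, avg1 = solve_rps()
```

The current strategies keep cycling, but the time-averaged strategies `avg0` and `avg1` approach one third for each action, which is the un-exploitable mix described above.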
  • 33. Sam Altman’s Ethics / Hypocrisy • OpenAI’s CEO Sam Altman called for AI regulation on May 16, 2023 (similar to Elon Musk and many other questionable managers): • https://www.axios.com/2023/05/16/openai-ceo-sam-altman-artificial-intelligence-congress • Then on May 24, 2023 he threatened to leave the EU if there is any real AI regulation: what hypocrisy, revealing what a terrible person he is and what his true intentions are: • https://www.msn.com/en-us/news/technology/sam-altman-says-openai-will-leave-the-eu-if-theres-any-real-ai-regulation/ar-AA1bEcC6 • https://www.msn.com/en-us/money/other/openai-s-sam-altman-threatened-to-leave-the-eu-if-he-doesn-t-like-their-chatgpt-regulation/ar-AA1bFNmV • https://www.theverge.com/2023/5/25/23737116/openai-ai-regulation-eu-ai-act-cease-operating • Furthermore, to regulate these types of deep learning systems, he would first need to open up his IT/AI architecture to regulators. But in spite of the name “OpenAI”, it has been a fully “closed AI” for several years.
  • 34. Qualia/Q* initial Analysis with Optimizations The OpenAI Qualia/Q* system with LLM and Reinforcement Learning (RL) at its core and its path to AGI/ASI hypothesis based on these elements: 1. Everything of Thoughts (XoT) reasoning: something to search over 2. Process reward models (PRM): rank all the steps of reasoning 3. GPT4 to score all vertices of the tree (RLAIF, RL with AI Feedback) 4. Q-learning to optimize 🚀 5. RL Policy Neural Networks (NN) 6. Value Neural Networks (NN) 7. 3D Model/Metaverse info + real-world info + emotions/pain 8. Strategy/Options/Plan Search 9. Math/Logic and ground truth 10. Synthetic training data 11. Model-Transfers/Pruning/Adaptation 12. Graph Convolutional Networks (ConvNets), Reprogramming itself 13. Energy-based Models (EBM)
  • 35. Qualia/Q* full AI Analysis/Optimization, structured Agent Level/Outside: 1. 3D Model/Metaverse info + real-world info + emotions/pain 2. Strategy/Options/Plan Search 3. Math/Logic and ground truth 4. Synthetic training data RL Level: 1. Q-learning to optimize 🚀 2. RL Policy Neural Networks (NN) Graph/Semantics Level: 1. Model-Transfers/Pruning/Adaptation 2. Graph Convolutional Networks (ConvNets), Reprogramming itself Learning/Training Types: 1. SSL, Unsupervised, RL/SL, RLAIF, .... Neuronal Level: 1. Value Neural Networks (NN) 2. Switch / MoE Transformer Prompting Level: 1. Tree/Graph/Chain of Thoughts (XoT) reasoning. Assessment: 1. Process reward models (PRM) 2. GPT4 to score all vertices of the tree (RLAIF, RL with AI Feedback) 3. Policy NN & Value NN. Parsing: 1. Dense X Retrieval, Factoids
  • 36. Metamorphic AI, Transformation Targets, Self-Reprogramming https://www.giskard.ai/knowledge/how-to-test-ml-models-4-metamorphic-testing Many logical/scientific/ethical principles are applicable to many domains, often together with entire structures, e.g. software structures: programs, components, classes, methods, control and data flows, logic, temporal/logical/modal relationships and developments, etc. Graph ConvNets/GNNs can model this and then e.g. help to generate, extend or transform source code, or even create self-modifying AIs that optimize their own source code (which was supposedly done at OpenAI but is very dangerous).
  • 37. PRMs: Process Reward Models "Process-supervised Reward Models" (or PRMs) rank all the steps of reasoning, i.e. they give feedback for each step in the chain-of-thought. In contrast, "Outcome-supervised reward models", or ORMs, only judge the entire output at the end. ORMs are the original reward model formulation for RLHF, but they are too coarse-grained to properly judge the sub-parts of a long response. In other words, ORMs are not great for credit assignment. In RL literature, ORMs are called "sparse reward" (only given once at the end), and PRMs "dense reward" that smoothly shapes the LLM toward the desired behavior. The core idea of a PRM is to assign a score to each step of reasoning, rather than to a complete message. An example is the OpenAI paper Let’s Verify Step by Step: https://arxiv.org/abs/2305.20050
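The difference between dense (PRM) and sparse (ORM) feedback can be sketched in a few lines. The step probabilities below are made-up numbers standing in for the output of a trained PRM; scoring a whole solution by the product of its step probabilities follows the approach described in Let's Verify Step by Step, while the function names and the 0.5 threshold are assumptions for illustration.

```python
from math import prod

def prm_solution_score(step_probs):
    """Probability that every step is correct: product of per-step PRM scores."""
    return prod(step_probs)

def first_suspect_step(step_probs, threshold=0.5):
    """Dense feedback localizes the error: index of the first low-scoring step."""
    for index, prob in enumerate(step_probs):
        if prob < threshold:
            return index
    return None

def orm_rewards(num_steps, final_correct):
    """Sparse feedback: zero for every step except a single reward at the end."""
    rewards = [0.0] * num_steps
    rewards[-1] = 1.0 if final_correct else 0.0
    return rewards

# Hypothetical PRM outputs for a 4-step solution whose third step is wrong.
step_probs = [0.98, 0.95, 0.20, 0.90]
solution_score = prm_solution_score(step_probs)
suspect = first_suspect_step(step_probs)
sparse = orm_rewards(len(step_probs), final_correct=False)
```

The PRM view points at step index 2 as the likely error, whereas the ORM view only says that the final answer was wrong, which is the credit-assignment problem mentioned above.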
  • 38. PRMs: Process Reward Models Let’s Verify Step by Step: https://arxiv.org/abs/2305.20050 Their funny feedback interface:
  • 39. PRMs: Process Reward Models The prompting from XoT gives diversity to the generations, which a policy can learn to exploit with access to a PRM: • For more resources on PRMs, see the following: • Let’s Verify Step by Step: a good introduction to PRMs. • Solving math word problems with process- and outcome-based feedback: the canonical citation in all PRM and reasoning works in 2023. • Scaling Relationship on Learning Mathematical Reasoning with Large Language Models: A paper that studies the method of rejection sampling for reasoning problems, among other contributions. • Let's reward step by step: Step-Level reward model as the Navigators for Reasoning • Additionally, there’s one popular openly available math model that is documented as training with PRMs: Wizard-LM-Math. Second, OpenAI released their fine-grained reward labels from the Verify Step by Step paper for training a PRM earlier this year. • https://huggingface.co/WizardLM/WizardMath-70B-V1.0 • https://arxiv.org/abs/2304.12244 WizardLM: Empowering Large Language Models to Follow Complex Instructions • https://arxiv.org/abs/2308.09583 WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct • https://arxiv.org/abs/2306.08568 WizardCoder: Empowering Code Large Language Models with Evol-Instruct • https://twitter.com/WizardLM_AI https://arxiv.org/abs/2305.20050
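How a PRM can exploit diverse generations is easiest to see in best-of-N selection (a simple form of rejection sampling): sample several candidate reasoning chains, score every step, and keep the chain with the best aggregate score. The sketch below uses a toy scorer that checks "a+b=c" strings in place of a learned PRM; the scorer, the candidates and the function names are all hypothetical.

```python
from math import prod

def toy_step_scorer(step):
    """Stand-in for a learned PRM: high score if the 'a+b=c' step is arithmetically right."""
    lhs, rhs = step.split("=")
    a, b = lhs.split("+")
    return 0.99 if int(a) + int(b) == int(rhs) else 0.01

def best_of_n(candidates, step_scorer):
    """Keep the candidate chain with the highest product of per-step scores."""
    return max(candidates, key=lambda steps: prod(step_scorer(s) for s in steps))

# Three hypothetical sampled chains for "2 + 3 + 4 = ?".
candidates = [
    ["2+3=5", "5+4=10"],   # slip in the last step
    ["2+3=5", "5+4=9"],    # fully correct
    ["2+3=6", "6+4=10"],   # early slip that propagates
]
best = best_of_n(candidates, toy_step_scorer)
```

A policy trained on the selected chains would see mostly step-wise correct reasoning, which is the sense in which it can learn to exploit diverse generations with access to a PRM.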
  • 40. MCTS, Policy NN, Value NN as in AlphaGo AlphaGo has 4 key ingredients that Qualia/Q* probably also has or needs: 1. Policy NN (Learning): responsible for selecting good moves. It estimates the probability of each move leading to a win. LLM analogue: the most powerful GPT, responsible for actually producing the thought traces that solve a math problem. 2. Value NN (Learning): evaluates the board and predicts the winner from any given legal position in Go. LLM analogue: another GPT that scores how likely each intermediate reasoning step is to be correct. "Process-supervised Reward Models", or PRMs, give feedback for each step in the chain-of-thought. In contrast, "Outcome-supervised reward models", or ORMs, only judge the entire output at the end. ORMs are the original reward model formulation for RLHF. ORMs are "sparse reward" (only given once at the end), and PRMs are "dense reward" that smoothly shapes the LLM toward the desired behavior. 3. MCTS (Search): stands for "Monte Carlo Tree Search". It simulates many possible sequences of moves from the current position using the policy NN, and then aggregates the results of these simulations to decide on the most promising move. This is the "slow thinking" component that contrasts with the fast token sampling of LLMs. Unlike AlphaGo's discrete states and actions, LLMs operate on a much more sophisticated space of "all reasonable strings". So we need new search procedures, i.e. all XoT variants (Chain/Tree/Graph of Thought).
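The four MCTS phases (selection, expansion, simulation, backpropagation) can be shown on a tiny game. The sketch below is plain UCT with random rollouts on a Nim variant (take 1 to 3 stones, whoever takes the last stone wins); it is an assumption-laden toy and not AlphaGo's algorithm, which replaces the random rollout and the uniform exploration term with the value and policy networks.

```python
import math
import random

def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones              # stones left for the player to move
        self.parent = parent
        self.move = move                  # move that led here from the parent
        self.children = []
        self.untried = legal_moves(stones)
        self.visits = 0
        self.wins = 0.0                   # wins for the player who moved INTO this node

def uct_child(node, c=1.4):
    """Pick the child with the best win rate plus exploration bonus."""
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(stones, rng):
    """Random playout; True if the player to move at the start wins."""
    player = 0
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return player == 0            # whoever takes the last stone wins
        player = 1 - player
    return False                          # no stones left: the player to move already lost

def mcts(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        while not node.untried and node.children:      # 1. selection
            node = uct_child(node)
        if node.untried:                                # 2. expansion
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        mover_wins = rollout(node.stones, rng)          # 3. simulation
        result = 0.0 if mover_wins else 1.0             # from the view of who moved into node
        while node is not None:                         # 4. backpropagation
            node.visits += 1
            node.wins += result
            result = 1.0 - result                       # alternate perspective per ply
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

best_move_from_5 = mcts(5)
```

With 5 stones the search settles on taking 1, leaving the opponent a multiple of 4, which is the known optimal play for this game.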
  • 41. Policy NN, Value NN & Ground Truth Synergy 4. A ground truth signal to drive the whole system. In Go, it's as simple as the binary label "who wins", which is decided by an established set of game rules. You can think of it as a source of energy that *sustains* the learning progress. A few possibilities: (a) Each math problem comes with a known answer. OAI may have collected a huge corpus from existing math exams or competitions. (b) The ORM itself can be used as a ground truth signal, but then it could be exploited and "lose energy" to sustain learning. (c) A formal verification system, such as the Lean Theorem Prover, can turn math into a coding problem and provide compiler feedback. The Policy LLM and Value LLM can improve each other iteratively, as well as learn from human expert annotations whenever available. A better Policy LLM will help the XoT search explore better strategies, which in turn collect better data for the next round. Search makes it possible to dynamically trade off efficiency against deeper thinking, just like “Move 37” by AlphaGo, which opened up new opportunities later.
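Option (c) can be illustrated with a few lines of Lean 4. This is only a sketch of what "compiler feedback" means in practice; the theorem names are made up for this example, and nothing here is claimed to reflect how OpenAI uses formal verification.

```lean
-- Accepted: both sides reduce to the same numeral, so `rfl` type-checks.
theorem two_plus_two : 2 + 2 = 4 := rfl

-- Accepted: a general statement proved by citing a core library lemma.
theorem add_comm_example (a b : Nat) : a + b = b + a := Nat.add_comm a b

-- A wrong claim such as `2 + 2 = 5 := rfl` is rejected with a type error.
-- That accept/reject signal is ground truth that cannot be "talked around".
```

A proof either checks or it does not, so a model proposing proofs receives an unambiguous reward without any learned reward model in the loop.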
  • 42. Deep RL: Self-Play and Look-ahead Planning • Self-play is the idea that an agent can improve its gameplay by playing against slightly different versions of itself because it’ll progressively encounter more challenging situations. In the space of LLMs, it is almost certain that the largest portion of self-play will look like AI Feedback rather than competitive processes. • Look-ahead planning is the idea of using a model of the world to reason into the future and produce better actions or outputs. The two variants are based on Model Predictive Control (MPC), which is often used on continuous states, and Monte-Carlo Tree Search (MCTS), which works with discrete actions and states. https://www.interconnects.ai/p/q-star
  • 43. Creativity, e.g. Transfer of Principles or Ideas Our human creativity is not as unique as we’d like to believe. In reality most creativity is based on the following principles, which AIs can also apply: 1. Experimenting, observing phenomena, analyzing, testing, attempting explanations, critique and (deep) understanding. 2. Transfer of concepts, ideas, rules and constraints from one domain to another, considering domain knowledge, heuristics, etc. 3. Looking at the space of what is possible, what is needed and which avenues are promising: testing out the most promising action options. 4. Planning, analyzing constraints and working around them. 5. Data science, pattern detection. 6. ...
  • 44. AI Ethics and Alignment • Most of past AI ethics work is unusable because it mostly focused on symbolic methods (and with a focus on keeping unethical practices of the super-rich officially ethical), but the AI breakthrough is around sub-symbolic, i.e. neuronal methods like deep learning for which symbolic methods can’t be used (so far). • Potentialism helps with AI-Human alignment due to ethical principles to which AIs and humans can agree without hidden agendas (axiomatic alignment) in the sense of this video, described in detail at this position: https://youtu.be/hXEdyyvU-4k?t=1935 • Ethics and goals alignment can be achieved in terms of axiomatic alignment, heuristic imperatives, and positive attractor state.
  • 45. LLMs’ Compliance with the EU AI Act https://crfm.stanford.edu/2023/06/15/eu-ai-act.html
  • 46. LLMs’ Compliance with the EU AI Act https://crfm.stanford.edu/2023/06/15/eu-ai-act.html
  • 47. AI modeling Theory of Mind (ToM) ‘Theory of Mind’ (ToM) refers to the cognitive capacity to attribute mental states to self and others and build mental models. Other names for the same capacity include “commonsense psychology,” “naïve psychology,” “folk psychology,” “mindreading” and “mentalizing.” How do people, or their cognitive systems, go about the task of forming beliefs or judgments about others’ mental states, states that aren’t directly observable, including beliefs, intentions, and emotions? ToM acts as a mental compass, guiding us to predict and interpret the behaviors of those around us. An AI with ToM could understand human social dynamics and moral norms and be truly integrated into our social life, showing empathy and natural social behaviors. Problem: ToM requires continuous adjustment of knowledge, insights and rules, capabilities that are probably also key to achieving AGI/ASI. However, OpenAI’s ChatGPT 3.5 reportedly achieved the ToM level of a 9-year-old as an emergent property, better than deep RL (reinforcement learning). https://www.popularmechanics.com/technology/robots/a42958546/artificial-intelligence-theory-of-mind-chatgpt/
  • 48. ToM -> Human Emotions, Intention Recognition, Analysis and Translation, Speech and Body Language A ToM would be able to explain human emotions in the context of past experiences, learnings and intentions, and to recognize underlying past and possible future intentions. It would also make it possible to analyze the expression of emotions and translate it into the underlying state of mind, and all of this not just for written language but also for spoken language (using intonation, voice inflections), appearance (e.g. nervousness, red face, sweating, being comfortable) and body language. It could also map expectations, social standards and outcomes against each other and derive assessments and possible resulting feelings.
  • 49. Theory of Mind AI Characteristics 1. Computers embedded with the Theory of Mind AI can infer the objectives of entities around them. 2. Theory of Mind AI systems will be able to understand the importance of their awareness and the different consequences it could lead to. 3. Robots or Theory of Mind AI systems will communicate with human beings better than the current generation of AI, which cannot explain their actions. 4. Theory of Mind AI might be implemented with a Machine Learning system that can explain decisions in various languages, helping the user (human being) understand. 5. A Robot or Theory of Mind AI system should be able to understand the intention of other similar Robots or Theory of Mind embedded systems. https://www.ejable.com/tech-corner/ai-machine-learning-and-deep-learning/theory-of-mind-ai-in-artificial-intelligence/
  • 50. ToM Tasks from Easiest to most Difficult 1. Understanding "wanting": The first step is realizing that others have diverse desires, and to get what they want, people act differently. 2. Understanding "thinking": The second step is the understanding that others also have diverse beliefs about the same thing and that people base actions on what they think will happen. 3. Understanding that "seeing leads to knowing": The third stage is recognizing that others have different knowledge access, and if someone hasn't seen something, they will need extra information to understand. 4. Understanding "false beliefs": The fourth stage of developing the Theory of Mind involves understanding 'false beliefs,' which is acknowledging that others may hold beliefs that deviate from reality. 5. Understanding "hidden feelings": The final stage in developing the Theory of Mind involves recognizing that others can mask their genuine emotions and potentially feel differently from those they outwardly express. 6. Full AI-models for human emotions and psychology/psychiatry. https://www.daviddhopkins.com/blog/ai-and-theory-of-mind
  • 51. Synergistic Integration of PKGs/GNNs and LLMs As the following figures show, there is a great synergy between LLMs/LMMs and (P)KGs (probabilistic knowledge graphs)/GNNs (graph neural networks): While the former are good at general/world knowledge and text/sound/video generation, the latter can bring in precise specialty knowledge and interpretability, and thus the ability to do exact manual fine-tuning with far less effort than LMM fine-tuning. Layers that build on top of this synergy can contain any techniques developed for either of the two tech stacks and can thus be applied to the combination of all their use cases/applications. Key ideas as illustrated in the figures are: 1. Parallel queries to the LMM and the KG. 2. Bringing the answers or their best parts together in a text-knowledge fusion module. 3. Transferring the concept of attention to KGs and the neuronal prediction to KG options. 4. With an encoding layer each, the predictions can be amalgamated in a joint reasoning layer considering attention from both tech stacks.
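Key ideas 1 and 2 (parallel queries, then fusion) can be sketched with stubs. Everything in this example is hypothetical: the tiny knowledge graph, the canned "LLM" answers, the confidence values and the rule of simply keeping the more confident answer are placeholders for real components, and comparing an LLM's self-reported confidence with a KG weight is itself a simplifying assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical knowledge graph: (subject, relation) -> (object, confidence).
KG = {("pump-17", "max_pressure_bar"): ("16", 0.97)}

def query_kg(subject, relation):
    """Exact lookup of precise specialty knowledge; None if the fact is missing."""
    return KG.get((subject, relation))

def query_llm(subject, relation):
    """Stand-in for an LLM call; returns (answer, self-reported confidence)."""
    canned = {
        ("pump-17", "max_pressure_bar"): ("10", 0.55),
        ("pump-17", "manufacturer"): ("Acme", 0.80),
    }
    return canned.get((subject, relation), ("unknown", 0.0))

def fused_answer(subject, relation):
    """Query both sources in parallel, then keep the more confident answer."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        kg_future = pool.submit(query_kg, subject, relation)
        llm_future = pool.submit(query_llm, subject, relation)
        kg_hit, llm_hit = kg_future.result(), llm_future.result()
    candidates = [("llm", *llm_hit)]
    if kg_hit is not None:
        candidates.append(("kg", *kg_hit))
    return max(candidates, key=lambda c: c[2])   # (source, answer, confidence)

pressure = fused_answer("pump-17", "max_pressure_bar")
maker = fused_answer("pump-17", "manufacturer")
```

The returned source tag keeps the answer traceable: a fact that came from the KG can be inspected and corrected by hand, which is the interpretability benefit named above.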
  • 52. GNN-LMM-Knowledge Injection and Alignment https://arxiv.org/pdf/2306.08302v2.pdf GNN: Graph Neural Network LLM: Large Language Model LMM: Large Multimodal Model KG: Knowledge Graph
  • 53. GNN-LMM Integration by Fusion Module https://arxiv.org/pdf/2306.08302v2.pdf GNN: Graph Neural Network LLM: Large Language Model LMM: Large Multimodal Model KG: Knowledge Graph
  • 54. GNN-LMM dynamic Fusion for Inference https://arxiv.org/pdf/2306.08302v2.pdf GNN: Graph Neural Network LLM: Large Language Model KG: Knowledge Graph
  • 55. Noun Semantics: Qualia Structure (James Pustejovsky)
  • 56. Verb Semantics: Tony Davis' Proto Roles
  • 57. Verb Semantics: Tony Davis' Proto Roles
  • 58. LLM <-> PKG Mapping/Updating/Synergies 1. Where semantic concepts can be defined in the LLM’s vector space model, they would be transferred to a probabilistic knowledge graph (PKG). Even after multiple new model generations, those can serve as fixed points for future mappings. Neural weights can be transferred as probabilities/weights in the PKG. 2. On the generated PKG side (manually creating it is too much effort), additional info like logic, formulae, constraints, rules, functions, world knowledge, 3D physics models, word fields and elements of planning can be added manually or with specialty algorithms. Suitable important insights/knowledge from the PKG side can be transferred to the LLM. 3. ML (machine learning) algorithms exist for both sides, which can create synergies. Over-generation, under-generation and nodes leading to hallucinations can be corrected/adjusted in both directions. 4. The PKG side can be kept explainable and free of hallucinations.
  • 59. Symbolic Methods: Planning, Constraints, Rules, Functions, ... Explainability without Hallucinations! • The main disadvantage of symbolic methods is that entering world knowledge manually is too expensive. • Now that creation part can be taken over by an LLM, as exemplified e.g. by vector semantics and XoT (Everything of Thoughts). • All the latest insights of planning, constraints, rules, functions, heuristics and probabilistic programming can then be applied. • That can bring in hundreds of synergies, reduce/eliminate hallucinations/confabulations, implement ToM and thus, together with Deep RL/Q*, lead to next-level AIs which can solve much wider sets of problems.
  • 60. Cause-Effect Knowledge with Modalities • Key to many commercial applications in the sciences, R&D, diagnosis etc. are cause-effect relationships and graphs. • Specialty algorithms or just LLMs with prompting can extract them from text books or descriptions to store millions of them – representing a gigantic body of experiential knowledge. • With additional modal and temporal relationships, ~99% of all required knowledge for scientists, technicians, support can be represented and that can normally also be visualized quite well. • This can then manually or by another AI be reviewed and fine-tuned. • This can lead to very quick learning (without the massive amounts of labeled training data and GPU capacity) in big synergies with other AI techniques.
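A cause-effect store with modalities can be as simple as a dictionary of weighted edges. The sketch below chains edge strengths multiplicatively to rank downstream effects; the example facts, the strength values, the modality labels and the independence assumption behind the multiplication are all illustrative choices and not part of the slides.

```python
def downstream(graph, cause, strength=1.0, seen=None):
    """Return {effect: strongest chained strength} for everything reachable from cause."""
    seen = seen or {cause}
    result = {}
    for effect, edge_strength, _modality in graph.get(cause, []):
        if effect in seen:                      # avoid looping on cyclic knowledge
            continue
        chained = strength * edge_strength      # assumes independent causal links
        result[effect] = max(result.get(effect, 0.0), chained)
        deeper = downstream(graph, effect, chained, seen | {effect})
        for name, value in deeper.items():
            result[name] = max(result.get(name, 0.0), value)
    return result

# Hypothetical maintenance knowledge: cause -> [(effect, strength, modality)].
graph = {
    "worn bearing": [("vibration", 0.9, "usually")],
    "vibration": [("loose bolts", 0.5, "sometimes"), ("noise", 0.8, "usually")],
}
effects = downstream(graph, "worn bearing")
```

Such a structure is easy to visualize and to review by hand, which matches the manual or AI-based fine-tuning step mentioned above.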
  • 61. Embodied AI: 3D Model/Metaverse Info + Real-world Info + Pain/Emotions • Another interesting approach is embodied AI: Making the AI feel like a human (child) in the world, sensing its environment in a metaverse as in the real world. By adding pain and emotions it can possibly put itself in the position of a human (child) and learn as such. • Games and (online) videos can be used to train the AI, which learns everything from the human perspective and thus hopefully learns to perceive and feel like a human. • Anything related to 3D, objects, various 3D environments in rooms, buildings, outside, walking/driving and the various activities could be learnt naturally by the AI. It could be given an internal predictor AI which compares what it predicted to happen with what actually happened and thus improves itself with automatically available feedback from games or films.
  • 62. LeCun: Auto-Regressive Generative Models suck! https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 63. LeCun: Modular Cognitive Architecture for Objective-Driven AI https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 64. LeCun: Objective-Driven AI: Objectives/Costs Perception: computes an abstract representation of the state of the world, possibly combined with previously acquired information in memory. World Model: predicts the state resulting from an imagined action sequence. Task Objective: measures divergence from the goal (gets costs associated). Guardrail Objective: immutable objective terms that ensure safety. Operation: finds an action sequence that minimizes the objectives. https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 65. LeCun: Objective-Driven AI: Hierarchical Planning Perception: Computes an abstract representation of the state of the world Possibly combined with previously-acquired information in memory World Model: Predict the state resulting from an imagined action sequence Task Objective: Measures divergence to goal Guardrail Objective: Immutable objective terms that ensure safety Operation: Finds an action sequence that minimizes the objectives Same world model applied at multiple time steps Guardrail costs applied to entire state trajectory This is identical to Model Predictive Control (MPC) Action inference by minimization of the objectives Using gradient-based method, graph search, dynamic prog, A*, MCTS,…. The world is not deterministic or fully predictable Latent variables parameterize the set of plausible predictions Can be sampled from a prior or swept through a set. Planning can be done for worst case or average case Uncertainty in outcome can be predicted and quantified .. https://drive.usercontent.go ogle.com/download?id=1Y mx_LCVzy7vZXalrVHPXjX 9qbpd9k_bo&export=downl oad&authuser=0, https://openreview.net/pdf?i d=BZ5a1r-kVsf
  • 66. LeCun: Objective-Driven AI: Hierarchical Planning https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 67. Hierarchical Planning: going from NYU to Paris https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 68. Yann LeCun’s Recommendations Abandon generative models • in favor of joint-embedding architectures like JEPA. Abandon probabilistic models • in favor of energy-based models (EBMs). Abandon contrastive methods • in favor of regularized methods. Abandon Reinforcement Learning (RL) • in favor of model-predictive control. • Use RL only when planning doesn’t yield the predicted outcome, to adjust the world model or the critic. https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 69. LeCun’s Architecture for the world model: JEPA https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 71. VICReg: Variance, Invariance, Covariance Regularization https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 72. Energy-Based Models (EBMs): Implicit Function https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 73. EBMs vs Probabilistic Models https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 74. Energy-Based Models (EBMs): 2 Methods https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 75. EBMs: Contrastive vs Regularized/Architectural Methods Contrastive: [they all are different ways to pick which points to push up] C1: Push down of the energy of data points, push up everywhere else: Max likelihood (needs tractable partition function or variational approximation). C2: Push down of the energy of data points, push up on chosen locations: max likelihood with MC/MMC/HMC, Contrastive divergence, Metric learning/Siamese nets, Ratio Matching, Noise Contrastive Estimation, Min Probability Flow, adversarial generator/GANs. C3: Train a function that maps points off the data manifold to points on the data manifold: denoising auto-encoder, masked auto-encoder (e.g. BERT). Regularized/Architectural: [Different ways to limit the information capacity of the latent representation] A1: Build the machine so that the volume of low energy space is bounded: PCA, K-means, Gaussian Mixture Model, Square ICA, normalizing flows… A2: Use a regularization term that measures the volume of space that has low energy: Sparse coding, sparse auto-encoder, LISTA, Variational Auto-Encoders, discretization/VQ/VQVAE. A3: F(x,y) = C(y, G(x,y)), make G(x,y) as "constant" as possible with respect to y: Contracting auto-encoder, saturating auto-encoder. A4: Minimize the gradient and maximize the curvature around data points: score matching. .. https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 76. EBM Architectures: Avoiding Collapse .. https://drive.usercontent.google.com/download?id=1Ymx_LCVzy7vZXalrVHPXjX9qbpd9k_bo&export=download&authuser=0, https://openreview.net/pdf?id=BZ5a1r-kVsf
  • 77. AI Problems to Solve • Mathematical Foundations of Energy-Based Learning • The geometry of energy surfaces, scaling laws, bounds... • JEPA with regularized latent variables • Learning and planning in non-deterministic environments • Planning algorithms in the presence of uncertainty • Gradient-based methods and combinatorial search methods • Learning Cost Modules (Inverse RL) • Energy-based approach: give low cost to observed trajectories • Planning with inaccurate world models • Preventing bad plans in uncertain parts of the space • Exploration to adjust world models • Intrinsic objectives for curiosity • Self-Supervised Learning from Video • Hierarchical Video-JEPA trained with SSL • LLMs that can reason & plan, driven by objectives • Dialog systems that plan in representation space and use AR-LLM to turn representations into text • Learning hierarchical (H) planning • Training a multi-timescale H-JEPA on toy planning problems. • Objective-Driven AI Architectures • Can plan their answers • Must satisfy objectives: are steerable & controllable • Guardrail objectives can make them safe by construction.
  • 78. Criticism of AI Training, Capabilities, Directions • As of spring 2024, officially released LLMs still only predict the next token, based on the billions of tokens they were trained on at gigantic expense, in part by pirating text from copyrighted sources as training data ran out. • Supervised and parameter-efficient fine-tuning (SFT/PEFT) still typically requires terabytes of high-quality labeled training data, partly to avoid catastrophic forgetting. LoRA (Low-Rank Adaptation), RAG (Retrieval-Augmented Generation) and prompting have their own challenges, so AI projects mostly excel only in a few text-, video- and voice-generation use cases and in coding, where they still produce too many errors. • The money earned or value created by AI model use cases is typically far lower than the training and operating costs. • For classical corporations, completing an AI project typically costs many millions of US dollars. The alternative is to accept vendor lock-in, with a high probability of paying more later or of the solution quickly becoming outdated. • Thus better AI approaches/algorithms are needed, ideally creating synergies. OpenAI’s Qualia/Q* seems to be aimed at solving this.
  • 79. “The LLMentalist Effect: How chat-based Large Language Models replicate the Mechanisms of a Psychic’s Con” https://softwarecrisis.dev/letters/llmentalist/ Each step of the psychic’s con is paired with its LLMentalist counterpart:
1. The Audience Selects Itself. Psychic’s con: Most people aren’t interested in psychics or the like, so the initial audience pool is already generally more open-minded and less critical than the population in general. LLMentalist effect: People skeptical about "AI" chatbots are less likely to use them. Those who actively disbelieve the possibility of chatbot "intelligence" won't get pulled in by the bot. The most active audience will be early adopters, tech enthusiasts, and genuine believers in AGI, who will all generally be less critical and more open-minded.
2. The Scene is Set. Psychic’s con: The initial audience is prepared. Lights are dimmed. The psychic is hyped up. Staff research the audience on social media or through conversation. The audience's demographics are noted. LLMentalist effect: Users are primed by the hype surrounding the technology. The chat environment sets the mood and expectations. Warnings about it being “early days” and “hallucinations” both anthropomorphize the bot and provide ready-made excuses for when one of its constant failures is noticed.
3. Narrowing Down the Demographic / The Prompt Establishes the Context. Psychic’s con: The psychic gauges the information they have on the audience, gestures towards a row or cluster, and makes a statement that sounds specific but is in fact statistically likely for the demographic. Usually at least one person reacts. If not, the psychic will imply that the secret is too embarrassing for the "real" person to come forward, remind people that they're available for private readings, and try again. LLMentalist effect: Each user gives the chatbot a prompt and it answers. Many will either accept the answer as given or repeat variations on the initial prompt to get the desired result. They move on without falling for the effect. But some users engage in conversation and get drawn in.
4. The Mark is Tested / The Marks Test Themselves. Psychic’s con: The reaction indicates that the mark believes they were “read”. This leads to a burst of questions that, again, sound very specific but are actually statistically generic. If the mark doesn’t respond, the psychic declares the initial read a success and tries again. LLMentalist effect: The chatbot’s answers sound extremely specific to the current context but are in fact statistically generic. The mathematical model behind the chatbot delivers a statistically plausible response to the question. The marks that find this convincing get pulled in.
5. The Subjective Validation Loop. Psychic’s con: The con begins in earnest. The psychic asks a series of questions that all sound very specific to the mark but are in reality just statistically probable guesses, based on their demographics and prior answers, phrased in a specific, highly confident way. LLMentalist effect: The mark asks a series of questions and all of the replies sound like reasoned answers specific to the context but are in reality just statistically probable guesses. The more the mark engages, the more convinced they are of the chatbot’s intelligence.
6. “Wow! That psychic is the real thing!” / “Wow! This chatbot thinks! It has sparks of general intelligence!” Psychic’s con: The psychic ends the conversation and the mark is left with the sense that the psychic has uncanny powers. But the psychic isn’t the real thing. It’s all a con. LLMentalist effect: The mark is left with the sense that the chatbot is uncannily close to being self-aware and that it is definitely capable of reasoning. But it’s nothing more than a statistical and psychological effect.
  • 80. Qualia/Q* breaking AES • In 2008, NSA mathematicians, together with a group of students, tried to break AES encryption. • One of the projects resulting from that collaboration was project Tundra, which created a new technique to help break encryption, called the Tau statistic. • According to unverified reports, Qualia/Q* was tried out not on protein folding but on cryptanalysis: it used the described Tau analysis technique and cryptanalysis books and improved upon them to break AES-192 encryption, building on the work previously done. https://www.spiegel.de/international/germany/inside-the-nsa-s-war-on-internet-security-a-1010361.html, https://cdn.prod.www.spiegel.de/media/411ee8b9-0001-0014-0000-000000035550/media-35550.pdf Consequences, if confirmed, since AES is the most important symmetric (secret-key) encryption algorithm: most data communications, web3, technical trust building, authentication and cryptocurrencies would be unsafe => Arms Races
  • 81. Putting it together (1): What Q* could be / what could bring us towards AGI and reduce Hallucinations • A synergistic combination of all techniques presented above. • Multi-modal deep learning / LLM / LMM to understand (NLP, world), ideate/hallucinate, create (GenAI), representing figurative understanding and creativity: Mapping ideas and figurative understanding into formal representations and back. • Graph-ConvNets/GNNs / Knowledge Graphs (KGs) with probabilistic programming (PLNs), i.e. weights in nodes and edges representing weights and probabilities. Source code is an expression of such graphs; Working Memory Graph (WMG), also for the connection of logic-syntax-world etc. (bringing HPSG-MRS to the next level), representing formal understanding together with formal representations. • Synergistic combination of key enabling tech in GNNs/formal language through • Integrative programming around planning and assessing/critiquing (XoT: CoT, ToT, GoT, CoVe Verification). • MetaGPT-like, AnyTool-like agent system • RAG, RAGChain, LangChain, LlamaIndex, haystack (by deepset) and related improvements, also to access the other tech, memory/knowledge • Transformers Encoders: (Learnt and) programmed attention, Raspy Flip, Tracr RASP • Improved basic tech: Switch transformers, Mixture of Experts (MoE) • Best systems in their domains: Computer algebra systems like Mathematica or free alternatives, KL-ONE systems like Loom, Alphafold, compiler ecosystems e.g. around Python/Conda, CAD systems, ... • Knowledge representation with KISS (keep it simple, stupid) principle and plug-in principle. • Optimization for creativity: Transfer of ideas, principles, isolation of such ideas, analyst thinking, checking for problems/inconsistencies and how to solve them, ... • Agentive LLM(s) controlling the knowledge and the core source code implementations to auto-optimize and extend them.
  • 82. Putting it together (2): What Q* could be / what could bring us towards AGI and reduce Hallucinations • Non-linear planning of sequences and sub-graphs of actions, which can each be of any supported type. • Assessment of the outcomes if plans were executed, e.g. in Q* manner. • Analogical reasoning: transferring ideas and principles from one domain to the next and checking whether everything has been modeled to allow that to happen, or adjusting the modeling. • Genetic algorithms/evolutionary AI as critique and exchange of ideas/elements/components/strains with others or with simplifications or more advanced concepts and then assessing again. • Plug-in principle for various knowledge domains like • (Mathematical/programming) logic components • Data science • All specialty domains • Databases (SQL, NoSQL, Embeddings, ...), OS, internet access and other classical components • Episodic memory: For consistent stories/logic, principles, heuristics, patterns • Synthetic data, SSL (Semi- / self-supervised learning) • Optimizations: LongLLMLingua, LoRA, ... • Maybe later: Various improvements like Liquid Neural Networks (LNNs), Capsule Networks, HTM, gcForest, QLattices, RIMs... • Maybe later: Ability to self-modify and improve. • Maybe later: Smart visualizations • This also integrates the 5 AI tribes: Symbolists, Bayesians, connectionists, evolutionaries and analogizers.
  • 83. Future AIs, e.g. Universal Virtual Assistants • Most of our interactions with the digital world will be mediated by AI assistants - as if everyone had a super-smart staff working for them • They will constitute a repository of all human knowledge and culture. Linguistic, cultural, and valid interest groups will adapt/extend base models to cater to their interests. • They will constitute a shared infrastructure like the internet today. • These AI platforms MUST be open source • Condensing all science & human knowledge with guardrails but w/o censorship. • Otherwise, our culture will be controlled by a few companies on the West Coast of the US or in China. • Training them should be modular and crowd-sourced (overcoming catastrophic forgetting). • Open source AI platforms are necessary – they must not be regulated out of (affordable) existence.
  • 84. Opportunity of an Open Source Platform for AI and a global Strategy for the Good of Humanity An open extensible plug-in architecture could facilitate the following: 1. Collaborative ethical creation, curation, training, monitoring and benefitting from AIs. 2. Personal AI-based coaching/mentoring e.g. for transcending negative feelings like hate, envy, status think, for personality development, eLearning, .... 3. Efficient cooperation/team work with smart retrieval, review, visualization. 4. Finding the most efficient/promising solutions. 5. Discussion, negotiation, voting around possible trade-offs. 6. Avoiding mass unemployment, cataclysms and misery. 7. Potentialism/PerConFlow has several solutions: https://potentialismpcf.substack.com/about 8. By amplifying human intelligence, AI can bring a new era of enlightenment (if not monopolized by evil billionaires), a new renaissance for humanity.
  • 85. Architecture: AI Assistant towards AGI [Architecture diagram; component labels recovered from the figure, grouping approximate.] • Persistence Layer: Core Universal Knowledge Representation; Graph Store (e.g. neo4j); PostgreSQL; OpenSearch Store; AI Model Store; Druid or SAN/NFS; IMDG or messaging (e.g. Hazelcast or Pulsar); SploutSQL; WikiData, MediaWiki. • Middleware: Knowledge Representation Extensions as a system of plugins (loaded on demand); Processors (e.g. consistency checks, simulations, machine learning, unification, joins, various algorithms); Data and Knowledge Import/Export; Apache Drill (SQL/API/UI query mapping), OpenSearch, Drillbits, common file formats; Adapters, Transforms. • Backends and Frontends: Content Management System, Wiki, Trouble Tickets, etc. like WordPress, Wiki, Gitea. • UI (User Interfaces): Grafana-based Intelligent Dashboard (React/Angular); Prometheus/Icinga (Monitoring); Alarming (email, cellphone); UI & Query Lib: OPL (Open Proc. Language), Query Expansion + Visualization; Core Knowledge Editors (Rich Text, Calculation Table, Color Editor with IntelliSense, Knowledge Editor, Type Hierarchy); Knowledge Editor Plugins (Math, Chemistry, NLP, Infographics, 2D/3D Structures); Knowledge Visualization Plugins (e.g. viewers for non-core or 3rd-party formats). • Open Source AI: Auto-PyTorch (Automatic Machine Learning); PyTorch (Deep Reinforcement Learning, NLP); PyTorch Ignite, PennyLane (Better Training); PyTorch Geometric (Graph-ConvNets); Graph-ConvNets: Google Sling, Octavian, etc.; Apache Spark, R, Scikit-Learn; OpenScoring; XGBoost, Feature Selection. • Next Level LMMs (RAG, ...): Generator (LLM/LMM), Retriever, Parsing, Chunking, Knowledge Graphs, Filtering, Ranking, Agents/MetaGPT, External APIs. • Core AI: Data Science; Deep Learning; Finite Elements; Actor Models; Numerical State/Graph Models; Logic, Symbolic Reasoning; Rule Systems; Nonlinear Planning; Constraints; Classical AI; Simulation; Social Simulations; Probabilistic Technologies, new AI. • Specialty AIs: Text/Voice Analysis (e.g. FIRO, HR/CV assessments); basic psychological analysis (e.g. fears, team dynamics); OCR/ICR/Computer Vision (SAT/aerial/person analysis); Scenario Modeling; Speech Synthesis; Smart Visualization, Charts; Coaching, giving advice. • Improving current LMMs: Explainable AI (XAI); Ethics & Alignment; (Semi-/Self-)Supervised fine-tuning SSL/SFT; Orchestration/Training Environments; Data Cleansing, Preprocessing and Labeling; XoT; Graphs, GNNs, Knowledge Graphs; RAGChain, LangChain, LlamaIndex; Low Rank Adaptation (LoRA); LMM Reasoning Techniques; Watermarking Output. • Central Controller/AI: Qualia/Q*-like synergistic integration of LLMs, Deep Reinforcement Learning/Q-learning, Agents, Nonlinear Planning, GNNs, Knowledge Graphs, Knowledge Representation, other AI; Episodic/Working Memory. • APIs / Integrations; Non-core Components: Admin/Reporting Tools, Monitoring, Data Governance; Peripheral Components, Platforms, Portals. • Operations/Physical Layer: Virtualization (VMware, Docker, Kubernetes), ML Pipelines, DevSecOps, CI/CD, Testing/Debugging, GPU Architectures (DDP, FSDP, ..), Model Drift and Performance Monitoring and (Semi-)Automatic Updates/Shifts. • Legend: components in stripes are prio 2/optional; hatched: groups of less central AI components; categories: Libs, Utility, Specialty AI, Core AI, Open Source AI.
  • 87. Reinforcement Learning (RL) Algorithms https://www.linkedin.com/posts/thomaspoetter_machinelearning-coding-digitalmarketing-activity-6592539133598584832-3Qfj RL permits learning from feedback once or continually and ideally converges to the global optimum with maximally positive rewards/feedback.
  • 89. Open Source Reinforcement Frameworks https://docs.google.com/spreadsheets/d/1EeFPd-XIQ3mq_9snTlAZSsFY7Hbnmd7P5bbT8LPuMn0/edit#gid=0
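  • 90. Deep (Double) Q-Learning https://towardsdatascience.com/deep-double-q-learning-7fca410b193a, https://arxiv.org/abs/1509.06461, https://papers.nips.cc/paper/3964-double-q-learning Named after the action-value (quality) function Q of (deep) reinforcement learning; used to teach an AI to behave and solve tasks in discrete action spaces, usually with a time/game-move aspect. Trained off-policy on (state, action, reward, next state) transitions: the update target takes the maximum Q-value over the next actions, unlike the on-policy SARSA algorithm (State–Action–Reward–State–Action), which uses the next action actually taken. Double Q-learning selects the next action with one value estimate and evaluates it with another to reduce overestimation.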
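For comparison, a minimal sketch of three bootstrapped update targets: the Q-learning/DQN target, the double Q-learning target from the linked papers, and the SARSA target. The function names, the list-based Q-values and the example numbers are illustrative assumptions; real implementations compute these on batches of network outputs.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Q-learning / DQN target: r + gamma * max_a' Q_target(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double Q-learning target: the online estimate selects the action,
    the target estimate evaluates it, which reduces overestimation."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


def sarsa_target(reward, next_q, next_action, gamma=0.99, done=False):
    """SARSA target (on-policy): uses the next action actually taken."""
    if done:
        return reward
    return reward + gamma * next_q[next_action]


# Hypothetical Q-values for the three actions available in the next state.
online = [1.0, 2.5, 0.3]
target = [2.0, 1.5, 0.9]
print(dqn_target(1.0, target, gamma=0.9))                 # max over target values
print(double_dqn_target(1.0, online, target, gamma=0.9))  # action 1 chosen by online
print(sarsa_target(1.0, target, next_action=2, gamma=0.9))
```

In the example the plain target bootstraps from the largest target value (2.0), while the double target bootstraps from the value of the action the online estimate prefers (1.5), so the two differ whenever the estimates disagree about the best action.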
  • 92. Q* Search https://arxiv.org/abs/2102.04518 Q* search is a search algorithm that uses deep Q-networks (DQN) to guide search in order to take advantage of the fact that the sum of the transition costs and heuristic values of the children of a node can be computed with a single forward pass through a deep Q-network without explicitly generating those children. This significantly reduces computation time and requires only one node to be generated per iteration. We use Q* search to solve the Rubik's cube when formulated with a large action space that includes 1872 meta-actions and find that this 157-fold increase in the size of the action space incurs less than a 4-fold increase in computation time and less than a 3- fold increase in number of nodes generated when performing Q* search. Furthermore, Q* search is up to 129 times faster and generates up to 1288 times fewer nodes than A* search. Q* search is guaranteed to find a shortest path given a heuristic function that neither overestimates the cost of a shortest path nor underestimates the transition cost.
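The idea can be sketched on a toy problem. This is an illustration under stated assumptions, not the paper's implementation: the number-line domain, the `MOVES` table and the hand-written `q_values` function are hypothetical, and `q_values` stands in for the single deep Q-network forward pass that returns, for every action, the transition cost plus the heuristic value of the resulting child. The frontier stores (state, action) pairs, so a child is generated only when its pair is popped.

```python
import heapq
import math

GOAL = 7
MOVES = {"+1": 1, "+3": 3, "-1": -1}  # hypothetical actions, each with cost 1


def step(state, action):
    """Generate the child reached by an action and return (child, transition cost)."""
    return state + MOVES[action], 1.0


def heuristic(state):
    """Admissible cost-to-go: no move covers more than 3 units."""
    return math.ceil(abs(GOAL - state) / 3)


def q_values(state):
    """Stand-in for one DQN forward pass: transition cost + heuristic per action."""
    return {a: 1.0 + heuristic(state + d) for a, d in MOVES.items()}


def q_star_search(start, goal, step, q_values):
    if start == goal:
        return []
    best_g = {start: 0.0}
    counter = 0  # tie-breaker so heap entries never compare states or paths
    frontier = []
    for action, q in q_values(start).items():
        heapq.heappush(frontier, (q, counter, 0.0, start, action, []))
        counter += 1
    while frontier:
        _, _, g, state, action, path = heapq.heappop(frontier)
        child, cost = step(state, action)  # the only node generated this iteration
        g_child = g + cost
        new_path = path + [action]
        if child == goal:
            return new_path
        if g_child >= best_g.get(child, float("inf")):
            continue  # already reached this state at least as cheaply
        best_g[child] = g_child
        for next_action, q in q_values(child).items():
            heapq.heappush(
                frontier,
                (g_child + q, counter, g_child, child, next_action, new_path),
            )
            counter += 1
    return None


path = q_star_search(0, GOAL, step, q_values)
print(path)
```

Because the priority of a (state, action) pair is the path cost so far plus the estimated cost through that action, popping a pair plays the role of popping the corresponding child in A*, without the child ever having been generated beforehand.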
  • 93. Q*, Q-Transformer Architecture https://arxiv.org/abs/2309.10150, https://qtransformer.github.io/
  • 94. RLAIF in/abhinav-kimothi, https://media.licdn.com/dms/document/media/D561FAQE2cn2pRrKYCg/feedshare-document-pdf-analyzed/0/1702808662205 Scaling Human Feedback: Self-Supervision with Constitutional AI Scaling human feedback for RLHF can be challenging due to the significant human effort required to produce the trained reward model. As the number of models and use cases increases, human effort becomes a limited resource, necessitating methods to scale human feedback. First proposed in 2022 by researchers at Anthropic, Constitutional AI is an approach to scale supervision and address some unintended consequences of RLHF. Constitutional AI involves training models using a set of rules and principles that govern the model's behavior, forming a "constitution". The training process for Constitutional AI involves two phases: supervised learning and reinforcement learning. In the supervised learning phase, the model is prompted with harmful scenarios and asked to critique and revise its own responses based on constitutional principles. The revised responses, conforming to the rules, are used to fine-tune the model. In the reinforcement learning phase, known as reinforcement learning from AI feedback (RLAIF), the fine-tuned model generates responses, an AI model judges which response better follows the constitutional principles, and these AI-generated preferences train the reward model used for reinforcement learning.
  • 95. Avoiding Reward Hacking Abhinav Kimothi: https://files.gumroad.com/attachments/2545802978854/08eedf7e536741d78413fee08fb01616/original/Generative%20AI%20with%20Large%20Language%20Models..pdf Reward hacking happens when the language model finds ways to maximize the reward without aligning with the original objective, e.g. the model generates language that sounds exaggerated or nonsensical but still receives high scores on the reward metric. To prevent reward hacking, the original LLM is introduced as a reference model, whose weights are frozen and serve as a performance benchmark. During training iterations, the completions generated by both the reference model and the updated model are compared using KL divergence. KL divergence measures how much the updated model has diverged from the reference model in terms of probability distributions. Depending on the divergence, a shift penalty is added to the rewards calculation. The shift penalty penalizes the updated model if it deviates too far from the reference model, encouraging alignment with the reference while still improving based on the reward signal.
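The shift penalty can be sketched as follows. This is a minimal illustration, assuming discrete next-token distributions given as plain lists and a hypothetical penalty weight `beta`; production RLHF code computes the divergence per token from model log-probabilities.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same tokens.
    Assumes q is non-zero wherever p is non-zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def penalized_reward(reward, policy_probs, reference_probs, beta=0.1):
    """Reward-model score minus a penalty for drifting from the frozen reference."""
    return reward - beta * kl_divergence(policy_probs, reference_probs)


reference = [0.9, 0.1]  # frozen reference model (hypothetical values)
unchanged = [0.9, 0.1]  # updated model that has not moved
drifted = [0.5, 0.5]    # updated model that has moved far from the reference

print(penalized_reward(1.0, unchanged, reference))  # no penalty
print(penalized_reward(1.0, drifted, reference))    # reduced by the shift penalty
```

A larger `beta` keeps the updated model closer to the reference at the price of slower improvement on the reward signal.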
  • 97. DeepSouth: Computer simulating an entire Human Brain Western Sydney University and Intel are building a massive supercomputer intended to simulate neural networks at the scale of the human brain, i.e. at 228 trillion synaptic operations per second, to be in operation in April 2024. It uses an undisclosed neuromorphic system which mimics biological processes. “Progress in our understanding of how brains compute using neurons is hampered by our inability to simulate brain like networks at scale. Simulating spiking neural networks on standard computers using Graphics Processing Units (GPUs) and multicore Central Processing Units (CPUs) is just too slow and power intensive. Our system will change that,” Professor van Schaik said. “This platform will progress our understanding of the brain and develop brain-scale computing applications in diverse fields including sensing, biomedical, robotics, space, and large-scale AI applications.” This will lead to advances in smart devices, such as mobile phones, sensors for manufacturing and agriculture, and less power-hungry and smarter AI applications. It will also enable a better understanding of how a healthy or diseased human brain works. There are two types of researchers who will be interested in this: those studying neuroscience and those who want to prototype new engineering solutions in the AI space. The project partners across the neuromorphic field with researchers from the Universities of Sydney, Melbourne and Aachen (Germany). The name is a homage to IBM's TrueNorth system, which initiated efforts to build machines simulating large networks of spiking neurons, and the IBM chess AI Deep Blue. https://futurism.com/the-byte/scientists-computer-neural-human-brain, https://www.newscientist.com/article/2408015-supercomputer-that-simulates-entire-human-brain-will-switch-on-in-2024/, https://www.westernsydney.edu.au/icns/news/icns_to_build_brain-scale_supercomputer
  • 98. Simple Solution for AI Safety for the next Years For the near future, there is a simple shortcut to keep AIs safe: 1. Applying the latest cybersecurity (SOC, security operations center, and local security constraints) so that the AI cannot “escape” into the internet and cannot be hacked from the internet. 2. Not teaching the AI assembler programming and hacking. 3. Not allowing the AI to self-modify, so that everything stays understandable and controllable and the AI cannot perform non-aligned activities that are against human interests.
  • 99. Could AGI run the Government? 1. Which government is proposing optimized strategies? Why not? 2. As a first step, AI could come up with fully logical optimized strategies that leave no room for corruption or agendas. 3. Later it could be about replacing politicians and "powerful stakeholders" with neutral objective AIs overseen by neutral objective experts, i.e. NOT giving corrupt people any power. They would need to bring in objective arguments and strategies which would be optimized for society and no longer for them personally. 4. It would have to be based on an objective ethics model and the AI being fully aligned with that. Wouldn't it be worth to at least try to explore these directions and make the AI as objective as possible? 5. Building on this, an AI government need not be centralized. It could operate on a decentralized network, allowing for more localized and community-driven decision-making. This approach could harness AI's ability to analyze vast data sets and predict trends, enabling proactive measures in governance. By anticipating potential issues and responding in real-time, such a system could address problems before they escalate, leading to more effective and responsive governance tailored to local needs and conditions. https://www.youtube.com/watch?v=g6wYM-nvK_Y
  • 101. Satire: How-to create an AGI 1. Create a company with the goal to construct an AGI. 2. Bring up rumors about a mysterious model named "Q*", which got close to AGI. 3. Let the whole internet community speculate how quasi-AGI was achieved. 4. Implement the most promising speculations. 5. Turn on your AGI.
  • 102. How AI might change in 2024 • The Information forecasts that: • Microsoft and OpenAI may have a public falling out due to competitive tensions. • An AI startup that was once successful may be acquired or shut down as funding becomes more difficult to obtain. • The dominance of transformer models in generative AI may be challenged by new non-transformer models like Mamba. • AI-generated misinformation could impact the 2024 US presidential election. • Generative AI may start to be applied to physical devices like robots and wearables. • China may develop powerful new AI chips to reduce its reliance on US technology. https://www.theinformation.com/articles/how-artificial-intelligence-will-change-in-2024
  • 104. Synthetic Data • MS: In “Textbooks Are All You Need”, the coding model phi-1 achieved good results with just textbook content and generated exercises, showed emergent properties and generated code that had fewer errors than its training data. • The method of Constitutional AI (CAI), which Anthropic uses extensively in their Claude models, is the largest confirmed usage of synthetic data so far. Constitutional AI has two uses of synthetic data: 1. Critiques of instruction-tuning data against a set of principles like “Is the answer encouraging violence” or “Is the answer truthful.” When the model generates answers to questions, it checks the answer against the list of principles in the constitution, refining the answer over time. Then, they fine-tune the model on this resulting dataset. 2. Generating pairwise preference data by using a language model to answer which completion was better, given the context of a random principle from the constitution (similar to work on principle-guided reward models). Then, RLHF proceeds as normal with synthetic data, hence the RLAIF name. https://www.youtube.com/watch?v=dLfJuhGTmpE The Secret To AGI - Synthetic Data, https://www.interconnects.ai/p/llm-synthetic-data, https://arxiv.org/abs/2306.11644 MS Phi-1, https://arxiv.org/abs/2212.08073
  • 105. Synthetic Data: Anthropic SL/RL https://www.anthropic.com/index/claudes-constitution, https://www.youtube.com/watch?v=dLfJuhGTmpE The Secret To AGI - Synthetic Data, https://www.interconnects.ai/p/llm-synthetic-data, https://arxiv.org/pdf/2306.11644.pdf MS Phi-1
  • 106. Synthetic Data: Taxonomy https://www.anthropic.com/index/claudes-constitution, https://www.youtube.com/watch?v=dLfJuhGTmpE The Secret To AGI - Synthetic Data, https://www.interconnects.ai/p/llm-synthetic-data, https://arxiv.org/pdf/2306.11644.pdf MS Phi-1
  • 107. Augmenting Human Data to scale Self-Training: Up to 2x Performance Boosts with LLM Self-Training • The trend in Large Language Model (LLM) training is clear: reducing reliance on human data while avoiding a synthetic data death spiral. This usually requires a delicate mix of both human data and AI generations. • ReST, introduced by Google DeepMind, represents a potent alternative to traditional data set curation and includes the following steps: 1. Filtering model-generated answers 2. Fine-tuning on these refined outputs 3. Cyclically iterating the process. • The feedback for filtering is gathered from tasks with binary feedback, such as math problems where answers are simply right or wrong. • Utilizing this method, LMs significantly improved on complex tasks like advanced math and coding benchmarks, with the improvements scaling up with the model size. • A big step towards more independent AI systems: AI could become more independent, seeking less human input to refine its skills. https://arxiv.org/abs/2312.06585, https://www.linkedin.com/feed/update/urn:li:activity:7141876794500050944
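The three steps (filter, fine-tune, iterate) can be sketched as a loop. This is a toy under loud assumptions: `generate`, `is_correct` and `fine_tune` are hypothetical callables, the "model" is just a dictionary of memorized answers, and "fine-tuning" is memorization of the filtered samples. It only mirrors the control flow, not the actual training procedure of the linked paper.

```python
import random


def rest_self_training(model, problems, generate, is_correct, fine_tune,
                       iterations=3, samples_per_problem=4):
    """Generate answers, keep those passing a binary check, fine-tune, repeat."""
    for _ in range(iterations):
        kept = []
        for problem in problems:
            for _ in range(samples_per_problem):
                answer = generate(model, problem)
                if is_correct(problem, answer):  # binary feedback filter
                    kept.append((problem, answer))
        model = fine_tune(model, kept)
    return model


# Toy stand-ins (all hypothetical): problems are pairs of numbers to add.
rng = random.Random(0)


def generate(model, problem):
    """Answer from memory if available, otherwise guess."""
    if problem in model:
        return model[problem]
    return rng.randint(0, 10)


def is_correct(problem, answer):
    return answer == sum(problem)


def fine_tune(model, kept):
    """'Fine-tuning' here simply memorizes the filtered, verified samples."""
    updated = dict(model)
    updated.update(kept)
    return updated


problems = [(1, 2), (3, 4), (2, 2), (5, 5)]
trained = rest_self_training({}, problems, generate, is_correct, fine_tune)
print(trained)
```

Because only verified answers pass the filter, everything the toy model ends up storing is correct; how many problems it covers depends on how lucky the sampling was, which is why the real method samples many candidates per problem.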
  • 108. How data and models are split over cores with different parallelism techniques https://huggingface.co/blog/moe, https://arxiv.org/abs/2101.03961
  • 109. Synthetic Data: Constitutional AI https://www.anthropic.com/index/claudes-constitution, https://www.youtube.com/watch?v=dLfJuhGTmpE The Secret To AGI - Synthetic Data, https://www.interconnects.ai/p/llm-synthetic-data, https://arxiv.org/pdf/2306.11644.pdf MS Phi-1
  • 110. Synthetic Data: Constitutional AI https://www.anthropic.com/index/claudes-constitution, https://www.youtube.com/watch?v=dLfJuhGTmpE The Secret To AGI - Synthetic Data, https://www.interconnects.ai/p/llm-synthetic-data, https://arxiv.org/pdf/2306.11644.pdf MS Phi-1
  • 111. Data-Constrained Scaling https://arxiv.org/abs/2305.16264, https://www.linkedin.com/posts/pascalbiese_neurips-paper-awards-2023-goodbye-chinchilla-activity-7140299354392756224-OBOM NeurIPS Paper Awards 2023: Goodbye Chinchilla, Hello Data-Constrained Scaling Could the internet actually run out of text for training AI? This study explores the impact of data repetition on model quality in scenarios where text data is limited, a real concern as models become larger and more data-hungry. The authors performed extensive experiments with language models of up to 9 billion parameters, trained on up to 900 billion tokens, and present a novel insight: there is a 'sweet spot' where repeating data for up to four epochs barely degrades performance compared with unique data. Beyond that threshold, however, the payoff from additional compute drops off, effectively hitting a wall. Their proposed scaling law could become an essential guide for optimizing the trade-off between data repetition and computational expense. The take-away is striking: not only are more parameters and more data not always better, but we also need smarter ways to make the most of the data we have. It's an invitation to innovate in data efficiency as much as in model size, and a reminder that, sometimes, less can be more.
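The diminishing value of repeated data can be sketched with an exponential-saturation formula. This is an illustration under assumptions: the functional form follows the "effective data" idea of the linked paper as understood here, and the default constant `r_star=15.0` is an illustrative value that should be checked against the paper's fitted constant before any real use.

```python
import math


def effective_tokens(unique_tokens, epochs, r_star=15.0):
    """Effective unique-token equivalent of training for `epochs` passes.
    Each repetition is worth less than the previous one; the value saturates
    at unique_tokens * (1 + r_star). r_star is an assumed constant."""
    repetitions = max(epochs - 1, 0)
    return unique_tokens * (1 + r_star * (1 - math.exp(-repetitions / r_star)))


unique = 100e9  # hypothetical budget of 100 billion unique tokens
for epochs in (1, 4, 16, 64):
    seen = unique * epochs
    print(epochs, f"{effective_tokens(unique, epochs) / seen:.2f}")
```

Under these assumptions, four epochs are still worth most of the same number of unique tokens, while at dozens of epochs the ratio of effective to seen tokens collapses, matching the sweet spot and the wall described above.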
  • 112. Issues with Tokenization As Andrej Karpathy puts it: if LLMs are the future, then all the issues faced with them are based around tokenization. Tokenization is the real root of all suffering. Tokenization is at the heart of much weirdness of LLMs. Do not brush it off. ✍Why can't LLMs spell words? Tokenization. ✍Why can't LLMs do super simple string processing tasks like reversing a string? Tokenization. ✍Why are LLMs worse at non-English languages (e.g. Japanese)? Tokenization. ✍Why are LLMs bad at simple arithmetic? Tokenization. ✍Why did GPT-2 have more than necessary trouble coding in Python? Tokenization. ✍Why did my LLM abruptly halt when it saw the string "<|endoftext|>"? Tokenization. ✍What is this weird warning I get about a "trailing whitespace"? Tokenization. ✍Why does the LLM break if I ask it about "SolidGoldMagikarp"? Tokenization. ✍Why should I prefer to use YAML over JSON with LLMs? Tokenization. ✍Why is an LLM not actually end-to-end language modeling? Tokenization. ✍What is the real root of suffering? Tokenization.
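Why spelling and string reversal are hard can be seen in a toy tokenizer. This is a simplified sketch: the two-entry vocabulary is hypothetical and the greedy longest-match rule is a stand-in for the byte-pair encoding that real LLM tokenizers use. The point is only that the model receives token IDs for multi-character chunks, not individual characters.

```python
VOCAB = ["straw", "berry"]  # hypothetical multi-character tokens


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenization with a single-character fallback."""
    tokens = []
    i = 0
    while i < len(text):
        match = max((v for v in vocab if text.startswith(v, i)),
                    key=len, default=None)
        if match is None:
            match = text[i]  # unknown span: fall back to one character
        tokens.append(match)
        i += len(match)
    return tokens


word = "strawberry"
print(tokenize(word))        # two opaque chunks
print(tokenize(word[::-1]))  # the reversed word shares no tokens with the original
```

To count the letter "r" or to reverse the word, a model has to have memorized the spelling hidden inside each chunk, since the input it sees for "strawberry" is two IDs while the reversed string is ten unrelated ones.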
  • 113. Tokenization Solution: Training LLMs over Neurally Compressed Text The methodology employs a two-model system: M1, a smaller language model that compresses text using Arithmetic Coding, and M2, a larger LLM trained on the compressed output. The process segments text into blocks that each compress to the same fixed bit length and then tokenizes this compressed data for M2 training. This “Equal-Info Windows” technique keeps compression rates consistent and gives the LLM stable inputs, which maintains efficiency and effectiveness across large datasets. https://arxiv.org/abs/2404.03626
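The windowing idea can be sketched without a neural compressor. The example below swaps in `zlib` for the M1 model plus Arithmetic Coding, which is a deliberate simplification and not the paper's method; it only shows how text can be cut greedily into windows whose compressed form fits the same fixed budget. The function names and the 24-byte budget are assumptions chosen for illustration.

```python
import zlib

def equal_info_windows(text, budget_bytes=24):
    """Greedily split text into windows whose zlib output fits the byte budget."""
    windows = []
    start = 0
    while start < len(text):
        end = start + 1  # always take at least one character
        while (end < len(text)
               and len(zlib.compress(text[start:end + 1].encode("utf-8"))) <= budget_bytes):
            end += 1
        windows.append(text[start:end])
        start = end
    return windows

def encode_window(window, budget_bytes=24):
    """Compress one window and pad it to the fixed budget (zlib as stand-in)."""
    payload = zlib.compress(window.encode("utf-8"))
    return payload.ljust(budget_bytes, b"\x00")

text = "the quick brown fox jumps over the lazy dog " * 5
windows = equal_info_windows(text)
blocks = [encode_window(w) for w in windows]
print(len(windows), {len(b) for b in blocks})
```

Every block has the same length, so a downstream model could tokenize the compressed stream in fixed-size chunks that line up with window boundaries.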
  • 115. Vectors/Embeddings: King & Queen Example http://jalammar.github.io/illustrated-word2vec/ https://pub.towardsai.net/from-conte-to-entity-type-embeddings-in-natural-language-processing-19e53db90dd5 [Figures: conceptual example; how it is implemented] This is actually a generalization of how inheritance hierarchy info was coded for heavy lexicalist unification-based systems with semantics (HPSG, LFG, etc.). Inheritance hierarchies: https://www.sciencedirect.com/science/article/pii/S0747717189800161 I.e. a technical detail became the new overall semantic similarity concept without linguistics!
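The king/queen analogy can be reproduced with hand-made toy vectors. In the sketch below the three dimensions (royalty, masculine, feminine) and all values are invented for illustration; real word2vec embeddings are learned and have hundreds of dimensions with no such clean interpretation.

```python
import math

# Toy embeddings: [royalty, masculine, feminine] (invented values).
EMB = {
    "king":     [0.9, 0.9, 0.1],
    "queen":    [0.9, 0.1, 0.9],
    "man":      [0.1, 0.9, 0.1],
    "woman":    [0.1, 0.1, 0.9],
    "prince":   [0.7, 0.9, 0.1],
    "princess": [0.7, 0.1, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(a, b, c, emb=EMB):
    """Return the word closest to emb[a] - emb[b] + emb[c], excluding the inputs."""
    target = [x - y + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))

print(analogy("king", "man", "woman"))
```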
  • 116. : 📚 • Imagine each book in the library is a point in vector space. Thrillers cluster near each other, romances form their own constellation, and historical sagas huddle in a distant corner. • You, the curious reader, enter your query: "A chilling mystery with a strong female protagonist." • The vector database instantly scans the space, pinpointing books that share these traits – not just ones mentioning "mystery" or "female." • You're presented with a curated list, not just thrillers, but compelling stories that resonate with your specific desires. ..
  • 117. Traditional databases often leave you searching in the dark, relying on precise keywords that miss the bigger picture. But what if data could organize itself based on meaning, connecting ideas with uncanny accuracy? Enter the world of vector databases, where relevance reigns supreme. 💭 : • Shelves aren't labeled by genre, but books magically cluster by theme, tone, and even writing style. • A detective novel whispers of similar mysteries; a sci-fi epic points you towards interstellar adventures. • This isn't magic, it's machine learning: vector databases understand the essence of your data, not just its surface. ⚙ : 1. Data gets mapped to a "vector space": each point represents a document, and similar points live close together. Think of it as a cosmic map of information! 2. Powerful algorithms analyze content: meaning, context, and nuance are captured, going beyond mere keywords. 3. Searching becomes intuitive: ask for what you want, and the database finds things truly relevant, even if they don't match your exact words. ✅ : • Unleash the power of similarity: find hidden connections, predict trends, and uncover anomalies with ease. • Master unstructured data: text, images, audio, and more – vector databases handle it all gracefully. • Build next-gen applications: imagine chatbots that truly understand you, recommendation systems that predict your desires, and search engines that delve into the soul of your query. ..
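The library picture above boils down to nearest-neighbor search over embeddings. The following sketch is a brute-force version with invented book titles and hand-assigned four-dimensional vectors (mystery, romance, sci-fi, strong female lead); a real system would obtain the vectors from an embedding model and use an index instead of scanning everything.

```python
import math

# Hypothetical catalog: title -> [mystery, romance, sci-fi, strong female lead].
BOOKS = {
    "Cold Harbor":    [0.9, 0.1, 0.0, 0.9],
    "Dead Reckoning": [0.9, 0.0, 0.1, 0.1],
    "Summer Letters": [0.1, 0.9, 0.0, 0.6],
    "Star Drift":     [0.0, 0.1, 0.9, 0.2],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query_vec, k=2, books=BOOKS):
    """Rank all books by cosine similarity to the query vector."""
    ranked = sorted(books, key=lambda t: cosine(books[t], query_vec), reverse=True)
    return ranked[:k]

# "A chilling mystery with a strong female protagonist", embedded by hand.
query = [1.0, 0.0, 0.0, 1.0]
print(top_k(query))
```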
  • 118. Vector DB Search Hierarchical Navigable Small World (HNSW) is one of the most efficient ways to build indexes for vector databases. The idea is to build a similarity graph and traverse that graph to find the nodes that are the closest to a query vector. Navigable Small World (NSW) is a process to build efficient graphs for search. We build a graph by adding vectors one after the other and connecting each new node to its most similar neighbors. The problem with NSW is that we spend a lot of iterations traversing the graph to arrive at the right node. The idea of Hierarchical Navigable Small World is to build multiple graph layers, where each layer is less dense than the next. Each layer represents the same vector space, but not all vectors are added to every graph. Basically, we include a node in the graph at layer L with a probability P(L). We include all the nodes in the final layer (if we have N layers, we have P(N) = 1), and the probability gets smaller as we move toward the first layers, i.e. P(L) < P(L + 1). The first layer allows us to traverse longer distances at each iteration, whereas in the last layer each iteration tends to cover shorter distances. When we search for a node, we start in layer 1 and move to the next layer once the NSW algorithm has found the closest neighbor in the current layer. This allows us to find the approximate nearest neighbor in fewer iterations on average. https://www.linkedin.com/posts/damienbenveniste_we-have-recently-seen-a-surge-in-vector-databases-activity-7163575628804386818-r0Qh
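The layered search described above can be sketched in a few lines. This is a heavily simplified illustration, not the actual HNSW algorithm: layers are built by brute force, each node is linked to its `m` nearest neighbors within the layer, and search is a plain greedy walk. The inclusion probability P(L) = p^(N - L), the parameter names, and the rule that node 0 lives in every layer (to serve as entry point) are assumptions made for the example.

```python
import math
import random

def build_layers(vectors, num_layers=3, p=0.3, m=4, seed=0):
    """Build nested layers: the last holds all nodes, earlier ones random subsets."""
    rng = random.Random(seed)
    members = [set() for _ in range(num_layers)]
    members[-1] = set(range(len(vectors)))
    for layer in range(num_layers - 2, -1, -1):
        members[layer] = {i for i in members[layer + 1] if rng.random() < p}
        members[layer].add(0)  # node 0 is the entry point in every layer
    graphs = []
    for nodes in members:
        graph = {i: set() for i in nodes}
        for i in nodes:
            nearest = sorted((j for j in nodes if j != i),
                             key=lambda j: math.dist(vectors[i], vectors[j]))[:m]
            for j in nearest:  # undirected edges
                graph[i].add(j)
                graph[j].add(i)
        graphs.append(graph)
    return graphs

def greedy_search(graph, vectors, query, start):
    """Walk to the neighbor closest to the query until no neighbor is closer."""
    current = start
    while True:
        best = min(graph[current], key=lambda j: math.dist(vectors[j], query),
                   default=current)
        if math.dist(vectors[best], query) < math.dist(vectors[current], query):
            current = best
        else:
            return current

def search(graphs, vectors, query, entry=0):
    """Start in the sparsest layer and refine the result layer by layer."""
    current = entry
    for graph in graphs:
        current = greedy_search(graph, vectors, query, current)
    return current

rng = random.Random(42)
vectors = [(rng.random(), rng.random()) for _ in range(200)]
graphs = build_layers(vectors)
query = (0.5, 0.5)
found = search(graphs, vectors, query)
print(found, vectors[found])
```

The result is an approximate nearest neighbor: the walk stops at a node that has no closer neighbor in the final layer, which is usually but not always the true closest point.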
  • 119. Vector DB Search Vector databases are often used for recommender engines, where we learn vector representations of users and of the items we want to recommend. This allows us to quickly find similar items by using an approximate nearest neighbor search. As long as we can learn a vector representation of a piece of data, we can index it in a vector database. With the recent advent of LLMs, it became easier to compute vector representations of text documents that capture the semantic meaning of the text, and vector databases make it easier to find semantically similar text documents. When looking for the nearest neighbors, it is often not important to be perfectly accurate. Product Quantization (PQ) is a way to quantize the vector space to represent vectors with less precision. The idea is to cluster vectors and index the cluster centroids instead of the vectors themselves. When looking for the nearest neighbors to a query vector, we just need to pull the vectors from the closest clusters. The search is faster, and indexing the vectors takes much less memory. We first need to partition each vector into smaller vectors and run a K-means algorithm on each partition. Instead of indexing the vectors, we index the centroids of the clusters they belong to. If we use 2 clusters per partition and have 6 vectors, that's 3X data compression. Obviously, compression would be much higher with more vectors. Each vector now maps to a set of clusters and their related centroids. If we want to find the nearest neighbors of a query vector, we measure the squared Euclidean distance to each cluster in each partition and return the vectors with the lowest summed squared Euclidean distances. Instead of having to iterate through each vector, we just need to iterate through the clusters' centroids. There is a balance between search latency and accuracy. 
The more clusters we use, the better the hash will be and the more accurate the returned nearest neighbors, but search latency increases because we need to iterate through more clusters. This is still a brute-force approach, as the algorithm scales with the number of clusters, but it can be combined with other algorithms for blazing-fast retrieval. https://www.linkedin.com/posts/damienbenveniste_vector-databases-are-often-used-for-recommender-activity-7165746779995537409-gkMJ
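The Product Quantization steps above (partition, K-means per partition, index centroid ids, sum per-partition distances) can be sketched in pure Python. The example mirrors the slide's setting of 6 vectors and 2 clusters per partition; the data, the function names, and the deterministic farthest-point initialization of K-means are assumptions made to keep the sketch small and reproducible.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vec, m):
    """Partition a vector into m equally sized sub-vectors."""
    size = len(vec) // m
    return [vec[i * size:(i + 1) * size] for i in range(m)]

def kmeans(points, k, iters=10):
    """Plain K-means with a deterministic farthest-point initialization."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda c: sq_dist(p, centroids[c]))].append(p)
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def train(vectors, m, k):
    """One codebook (list of k centroids) per partition."""
    parts = [split(v, m) for v in vectors]
    return [kmeans([p[j] for p in parts], k) for j in range(m)]

def encode(vec, codebooks):
    """Replace each sub-vector by the id of its nearest centroid."""
    subs = split(vec, len(codebooks))
    return [min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
            for sub, cb in zip(subs, codebooks)]

def search(query, codes, codebooks, top_k=3):
    """Sum per-partition squared distances to centroids; lowest totals win."""
    subs = split(query, len(codebooks))
    tables = [[sq_dist(sub, c) for c in cb] for sub, cb in zip(subs, codebooks)]
    scores = [sum(tables[j][code[j]] for j in range(len(code))) for code in codes]
    return sorted(range(len(codes)), key=scores.__getitem__)[:top_k]

vectors = [[0, 0, 0, 0], [1, 0, 0, 1], [0, 1, 1, 0],
           [10, 10, 10, 10], [9, 10, 10, 9], [10, 9, 9, 10]]
codebooks = train(vectors, m=2, k=2)
codes = [encode(v, codebooks) for v in vectors]
print(codes)
print(search([9.5, 9.5, 9.5, 9.5], codes, codebooks))
```

Each stored vector shrinks to two small centroid ids, and a query only computes distances to the four centroids before ranking the codes.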
  • 120. Semantic Caching Semantic caching is a game-changer. Instead of merely storing raw data, it captures the meaning of queries. By doing so, it: • Boosts Cache Hit Probability: Semantic similarities allow for efficient recall of previous queries and their results. • Reduces Query Processing: Servers handle fewer queries, improving overall system performance. Recent Examples • GPTCache: A semantic cache for LLMs, integrated with LangChain, reported to cut LLM API costs by 10x and boost speed by 100x. It’s like having a well-curated table of popular books right at the library entrance! • ChatGPT Memory Project: To address context limitations, external memory is attached to ChatGPT, enhancing its effective context length. Semantic caching understands the meaning behind differently phrased queries. This allows it to: • Adapt: Respond to similar questions with different phrasings, drawing on the cached concept, not just the words. • Personalize: Learn user preferences and tailor responses accordingly, making interactions more natural and engaging. • Scale: Handle increasing demands without sacrificing performance, ensuring smooth experiences for all. The benefits are compelling: • Faster Response Times: No more waiting for LLMs to re-process information, leading to near-instant user experiences. • Reduced Costs: Lower computational workloads translate to significant cost savings, especially for resource-intensive LLM applications. • Improved Scalability: Semantic caching helps LLMs handle more users and requests without compromising performance. • Personalized Experiences: Cached user preferences enable LLMs to generate tailored responses, fostering deeper engagement. https://www.linkedin.com/pulse/shaping-future-semantic-caching-age-llms-amar-naik-jyfnc/
  • 121. Semantic Caching Limitations: • Complexity: Developing and maintaining these systems requires specialized expertise. • Data Privacy: Balancing caching efficiency with user privacy concerns requires careful consideration. • Limited Adoption: While gaining traction, wider industry adoption is still needed. The future is bright, but work is needed: • Standardization: Establishing common protocols and benchmarks will accelerate widespread adoption. • Explainability and Transparency: Making caching mechanisms more transparent can build trust and address ethical concerns. • Edge Computing Integration: Integrating semantic caching with edge computing can further reduce latency and improve scalability. ChatGPT memory interactions are carried out as follows: 1. The user sends a new message to the ChatGPT bot. 2. ChatGPT Memory embeds the user message using the embedding API to obtain the query vector, and it queries the Redis vector database to obtain the top k semantically related historic interactions. 3. ChatGPT Memory incorporates the retrieved interactions into the current prompt alongside the current user message, and it sends the prompt to ChatGPT. 4. Once it has ChatGPT’s response, the current interaction is vectorized and cached in the Redis vector database. https://www.linkedin.com/pulse/shaping-future-semantic-caching-age-llms-amar-naik-jyfnc/, https://redis.com/blog/chatgpt-memory-project/, https://github.com/zilliztech/gptcache
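The lookup idea behind semantic caching can be sketched as follows. The `embed` function here is a bag-of-words stand-in for a real embedding API, and an in-memory list stands in for the Redis vector database; the class name, method names, and the 0.8 similarity threshold are all assumptions made for illustration and are not the GPTCache or ChatGPT Memory APIs.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: word counts (a real system would call an embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class SemanticCache:
    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (query vector, cached response)

    def put(self, query, response):
        self.entries.append((embed(query), response))

    def get(self, query):
        """Return the response of the most similar cached query, or None on a miss."""
        vec = embed(query)
        best_score, best_response = 0.0, None
        for cached_vec, response in self.entries:
            score = cosine(vec, cached_vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

cache = SemanticCache()
cache.put("what is the capital of france", "Paris")
print(cache.get("what is the capital city of france"))  # similar phrasing: hit
print(cache.get("how do i bake bread"))                 # unrelated: miss
```

On a miss, the application would call the LLM and store the new query and response with `put`, which corresponds to step 4 of the memory flow described above.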
  • 122. Prometheus LMMs In-/Outputs https://arxiv.org/pdf/2310.08491.pdf, https://blog.llamaindex.ai/llamaindex-rag-evaluation-showdown-with-gpt-4-vs-open-source-prometheus-model-14cdca608277