This session offers a balanced view of the technical development and business application of retrieval-augmented products built on generative AI models. We move from requirements engineering through prototyping to user acceptance testing, highlighting the critical role of well-tuned vectorizers for smart search within a business ecosystem. A substantial part is devoted to deploying these models on Azure infrastructure for scalable, efficient solutions. We also cover feedback mechanisms that continuously improve answer quality and keep the product aligned with evolving business goals and user needs, ultimately supporting better decision-making and business operations.
3. Generative AI (and GPT models)
Generative AI is basically a very smart time-series forecasting machine, except that instead of a timeline we forecast over the order of words (tokens)
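The forecasting analogy can be made concrete with a toy example: a bigram model that, given the current token, predicts the most frequent next token from a tiny corpus. This is a minimal sketch for intuition only; real GPT models use learned neural networks over huge vocabularies, not raw counts.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each token, which tokens follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most likely next token - a one-step 'forecast'."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]

# Invented toy corpus, purely for illustration.
corpus = [
    "the model predicts the next token",
    "the model generates text token by token",
]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "model" follows "the" most often here
```

A real LLM does the same thing in spirit at every generation step: score all possible next tokens and sample from that distribution.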
4. GenAI in Business Use Cases
Two broad areas of use for GenAI:
1. Content creation: summarization, drafting general documents
2. Research: asking questions against a body of knowledge
• In general, GAI models are good at 1 and not good at 2
• 1 already delivers good value for companies, but the real value (a treasure, even) lies in 2
5. Known problems of GenAI models for research and question answering
• Hallucinations
• The GPT simply makes the answer up based on likelihood
• No way to check whether the answer is true or which source it came from
• This has broader consequences: you also cannot directly access the relevant sources and learn more on your own
• Example: I ask a Risk Management GAI model whether I may mention that we (PwC) cooperate with OpenAI. I get an answer, but I cannot check whether it is true or which directive/guideline the answer comes from
7. PwC
How can an LLM* get access to domain-specific data?
First option: we could fine-tune the model, BUT
– It is difficult to prevent hallucinations; there is no clear distinction between "general" and "specific" knowledge
– It might be costly (certainly in GPUs)
– The model would have to be retrained each time the knowledge base changes
[Diagram: a big LLM training set pre-trains the LLM; fine-tuning on domain-specific data results in a fine-tuned LLM, which we ask a question and it generates an answer]
* LLM – Large Language Model, e.g. GPT, BERT, etc.
8. How can an LLM get access to domain-specific data?
Second option: use Retrieval-Augmented Generation (RAG)
– Clear indication of the source upon which the answer was based
– Very unlikely to hallucinate: precise, fact-based solutions
– When the knowledge base changes, the smart search will automatically adapt to reflect those changes
[Diagram: the question goes to a smart search, which looks up relevant documents in the domain-specific data; the question plus the relevant documents are passed to the LLM, which generates the answer]
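The RAG loop in the diagram above can be illustrated in a few lines of Python. This is a minimal sketch under toy assumptions: `retrieve` scores documents by plain word overlap as a stand-in for a real vector search, and the document ids (`gdpr-37`, `survey-23`) are invented for illustration.

```python
def retrieve(question, documents, k=2):
    """Score each document by word overlap with the question (a stand-in
    for a real vector search) and return the top-k hits with their ids."""
    q_words = set(question.lower().split())
    scored = []
    for doc_id, text in documents.items():
        overlap = len(q_words & set(text.lower().split()))
        scored.append((overlap, doc_id, text))
    scored.sort(reverse=True)
    return [(doc_id, text) for overlap, doc_id, text in scored[:k] if overlap > 0]

def build_prompt(question, documents):
    """Assemble the augmented prompt: the retrieved sources, each tagged
    with its id so the answer can cite it, followed by the question."""
    hits = retrieve(question, documents)
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in hits)
    return f"Answer using only the sources below and cite their ids.\n{context}\nQuestion: {question}"

docs = {
    "gdpr-37": "Companies must appoint a data protection officer.",
    "survey-23": "CEOs expect generative AI to change their business.",
}
print(build_prompt("Who must appoint a data protection officer?", docs))
```

The key point is the last step: the LLM never answers from its own weights alone; it answers from the retrieved passages, which is why the source can always be shown to the user.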
10. PoC Customization: CEO Surveys Chatbot (GPT-4, GURU)
Proof-of-concept study with the last 10 editions of the PwC Global CEO Survey
Solution for retrieving enterprise data from a knowledge base:
14. The results were still quite optimistic
But the employee HR topics are more complex, and the answers were not detailed
Idea: with more data, the answers will get better
16. The business idea
• Armed with our new knowledge and positive experience with RAG GAI solutions, we decided to pursue a business opportunity with PwC Germany and a German publisher of data-privacy legal articles (Daten Schutz Berater)
• Context: according to GDPR, each company with more than 20 employees needs to appoint a "Data Protection Officer", who does not have to be a lawyer but needs to ensure compliance with GDPR (this duty can in theory be outsourced to an external provider)
18. PrivAID = Privacy Aid
• The idea is that we can simply transform the Data Privacy Advisor (essentially magazines and books on data-privacy law) into a chatbot
• The target audience is non-lawyer, layman data protection officers who are already subscribing to one of these magazines anyway
• Idea, following up on HR GURU: if we put all these articles, opinions, and books into a RAG GAI product, it would be able to answer all your questions
22. So did this work out?
→ Largely, but not completely
Extensive testing with data-privacy lawyers is necessary, which we have been performing in multiple iterations over the last months
We found that for some questions,
• first doing the vector search and then the keyword search yields the best result
• and for other questions vice versa
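The two orderings above can be sketched as a two-stage pipeline: shortlist with one scorer, then re-rank the shortlist with the other. Both scorers here are toy stand-ins (bag-of-words cosine instead of real embeddings, plain word overlap instead of a production keyword index like BM25), and the documents are invented.

```python
import math

def keyword_score(query, doc):
    """Exact word overlap between query and document."""
    return len(set(query.split()) & set(doc.split()))

def vector_score(query, doc):
    """Cosine similarity of bag-of-words count vectors
    (a stand-in for real embedding vectors)."""
    vocab = sorted(set(query.split()) | set(doc.split()))
    q = [query.split().count(w) for w in vocab]
    d = [doc.split().count(w) for w in vocab]
    dot = sum(a * b for a, b in zip(q, d))
    norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in d))
    return dot / norm if norm else 0.0

def search(query, docs, first, second, k=2):
    """Two-stage retrieval: shortlist k docs with `first`,
    then re-rank the shortlist with `second`."""
    shortlist = sorted(docs, key=lambda d: first(query, d), reverse=True)[:k]
    return sorted(shortlist, key=lambda d: second(query, d), reverse=True)

docs = [
    "data protection officer duties",
    "privacy law for companies",
    "officer training courses",
]
query = "data protection officer"
print(search(query, docs, vector_score, keyword_score))  # vector first
print(search(query, docs, keyword_score, vector_score))  # keyword first
```

Which ordering wins depends on the question: exact legal terms favor the keyword stage first, while paraphrased questions favor the vector stage first, which matches what we observed in testing.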
Possible reasons
• not enough data
• low quality of data (?)
• limitations of RAG itself
• likely the reason: when there is too much retrievable data, it just does not work that well
• context-length limitations
• you can see this now with Bing Chat and the new ChatGPT-4 with the search function (premium account)
23. Current idea
In the beginning, I said there are two approaches:
• fine-tuning, which is costly and difficult to update
• RAG, which is good but does not seem to work that well for large amounts of data
Now we are trying to combine both
• The idea is to fine-tune the model on the specific knowledge of data protection
• and at the same time use RAG to reference the exact sources
• so the answer can still be checked and trusted, and users can learn further from the sources (gamification of the learning process)
24. How to fine-tune a model?
Going with LLaMA 2, an open-source model
• First, you need question-answer pairs, ideally tens of thousands of them
• These we do not have, so we use GPT to read through the content/articles and generate QA pairs
• The problem is that you cannot simply retrain the model on these QAs, because all of the 7/13/70 billion model weights would get shifted just to fit these "few" QA pairs, and the LLM's generative power would disappear (catastrophic forgetting)
• The solution we are testing now: LoRA, Low-Rank Adaptation of Large Language Models
• this way the original model parameters are kept the same (not spoiled)
• LoRA modifies the network by adding small trainable low-rank matrices alongside the frozen weights (trained on our QAs)
• it outperforms classic fine-tuning that customizes just the last layer
• It seems that by grounding the GPT in the context of data-privacy laws and still adding RAG, the tool improves the quality and, most importantly, the consistency of the answers
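The LoRA idea above can be sketched with NumPy: the pretrained weight matrix `W` stays frozen, and only two small low-rank matrices `A` and `B` are trained, with their product added to the original path as `(alpha/r) * B(Ax)`. The sizes and scaling factor below are illustrative, not the ones we actually use.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 1024, 8                           # hidden size, LoRA rank (r << d)
W = rng.standard_normal((d, d))          # frozen pretrained weight matrix
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, starts at 0
alpha = 16                               # LoRA scaling factor

def lora_forward(x):
    """Original path plus the low-rank update: W x + (alpha/r) * B(A x).
    Because B starts at zero, the adapted model initially equals the base model."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d)
assert np.allclose(lora_forward(x), W @ x)  # identical before any training

full = d * d        # parameters touched by full fine-tuning of this matrix
lora = 2 * d * r    # parameters LoRA actually trains (A and B)
print(f"trainable params: {lora} vs {full} ({lora / full:.1%})")
```

This is why catastrophic forgetting is avoided: gradient updates only ever touch `A` and `B`, so the billions of pretrained weights, and with them the general generative ability, remain intact.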
26. Tailored GAI products vs out-of-the-box solutions
Tailored products
• Benefits: answers grounded in the domain knowledge
• Cons: costly; needs initial setup and fine-tuning
• Use case: scalable apps for many clients; risk- and reputation-sensitive use cases
Off-the-rack products with RAG
• Benefits: (very) cheap; fast to create a customized tool (minutes)
• Cons: general LLM (GPT) knowledge only; can backfire terribly in many scenarios
• Examples:
• MS Copilot
• Custom GPTs (OpenAI)
• products by startups that have not yet shut down following the Custom GPTs announcement
27. © 2023 PwC Österreich GmbH Wirtschaftsprüfungsgesellschaft. All rights reserved. In this document, "PwC Österreich" refers to PwC Österreich GmbH Wirtschaftsprüfungsgesellschaft or one of its affiliated companies, each of which is a separate legal entity. More information at pwc.at/impressum.
"PwC" refers to the PwC network and/or one or more of its member firms, each of which is a separate legal entity. Further information at pwc.
Thank you for your attention!
Marcel Tkacik
Data Science Manager, GAI Lead
Digital Factory - Innovation & AI Team
marcel.tkacik@pwc.com