LLMs in Production: Tooling, Process, and Team Structure
The document outlines a webinar on managing large language models (LLMs) in production, scheduled for December 6, 2023, in which panelists discuss LLM operations, tooling, and prototyping techniques. Key topics include aligning processes for building LLM applications, using industry-standard tools such as LangSmith, and strategies for improving model performance and user experience. The webinar also highlights the importance of retrieval-augmented generation and practical approaches for engineers and data scientists in the evolving AI landscape.
Have a question or comment for our
panelists?
Use this QR code to engage with our
speakers, or visit the link in the chat!
Having an audio issue?
Try dialing in by phone!
Dial: +1 312 626 6799
Webinar ID: 819 1469 6007
Passcode: 385318
Closed Captioning is available
for this webinar!
Our Panelists
Tony Karrer
Founder & CEO TechEmpower,
Founder & CTO Aggregage
Greg Loughnane
Founder & CEO of
AI Makerspace
Chris Alexiuk
Co-Founder & CTO at
AI Makerspace
BY THE END OF TODAY...
Understand processes for building and
improving production LLM applications
Overview of industry-standard tooling
How to leverage LangSmith
OVERVIEW
LLM Ops, LLMOS, and “The New Stack”
Leading Tooling
Meet LangSmith
Conclusions, Q&A
OUR LLM OPS CURRICULUM
1. 🧑‍💻 Building LLM Applications in Pure Python
2. 🔗 LangChain Powered RAG and Advanced Retrieval
3. 🦙 Open-Source Production RAG with LlamaIndex
4. 🕴️ Agents, 🧑‍💻 Hackathon, and 🧑‍🏫 Demo Day!
🧩 3 EASY PIECES TO RETRIEVAL
1. Ask a question
2. Search database for stuff similar to question
3. Return the stuff
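The three pieces above can be sketched in pure Python. This is a minimal illustration rather than the panelists' implementation: the `retrieve` function, the sample `database`, and the use of Jaccard word overlap in place of embedding similarity are all assumptions made for the example.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of punctuation-free words."""
    return {w.strip(".,!?").lower() for w in text.split()} - {""}


def overlap(a: str, b: str) -> float:
    """Jaccard word overlap: a crude stand-in for embedding similarity."""
    ta, tb = tokenize(a), tokenize(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(question: str, database: list[str], k: int = 1) -> list[str]:
    # Piece 2: search the database for stuff similar to the question.
    ranked = sorted(database, key=lambda doc: overlap(question, doc), reverse=True)
    # Piece 3: return the stuff.
    return ranked[:k]


# Hypothetical sample database, invented for this sketch.
database = [
    "LangSmith helps trace and evaluate LLM applications.",
    "LlamaIndex is a data framework for retrieval.",
    "Vector stores hold embeddings for chunks of documents.",
]

# Piece 1: ask a question.
question = "What does LangSmith help with?"
top = retrieve(question, database, k=1)
print(top[0])
```

Word overlap only matches exact words, which is why the later slides replace it with embeddings and nearest-neighbour search.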
📇 INDEX (THE DATABASE)
1. Split docs into chunks
2. Create embeddings for each chunk
3. Store embeddings in vector store index
[Figure: raw source documents are split into chunked documents, each chunk is embedded as a vector such as [0.1, 0.4, -0.6, ...], and the embeddings are stored in the vector store index.]
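The indexing steps (split, embed, store) might look like the sketch below. Everything named here is hypothetical: `split_into_chunks`, `toy_embed`, and `build_index` are illustrative helpers, the chunk size and overlap are arbitrary, and the hashed bag-of-words vector only stands in for a real embedding model.

```python
import math
import zlib


def split_into_chunks(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into chunks of `chunk_size` words that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[i:i + chunk_size]) for i in starts]


def toy_embed(text: str, dim: int = 16) -> list[float]:
    """Stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # crc32 is deterministic across runs, unlike the built-in hash().
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def build_index(docs: list[str]) -> list[dict]:
    """Store every chunk with its embedding; a plain list acts as the vector store."""
    index = []
    for doc_id, doc in enumerate(docs):
        for chunk in split_into_chunks(doc):
            index.append({"doc_id": doc_id, "text": chunk, "embedding": toy_embed(chunk)})
    return index


# Hypothetical sample documents, invented for this sketch.
docs = [
    "Retrieval augmented generation pairs a retriever with a chat model "
    "so that answers can cite context from your own documents",
    "Embeddings map text to vectors",
]
index = build_index(docs)
for entry in index:
    print(entry["doc_id"], repr(entry["text"]))
```

Overlapping chunks are one common way to avoid cutting a sentence's meaning in half at a chunk boundary; the right sizes depend on the documents and the embedding model.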
🐕 RETRIEVERS
[Figure: the input query is embedded as a vector such as [0.1, 0.4, -0.6, ...]; the retriever finds its nearest neighbours in the vector store index and returns the matching chunks as context (from source 1, from source 2, and so on).]
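[Figure: the input query, shown with the sample text "Ryan was ...", passes through the embedding model to become a vector such as [0.1, 0.4, -0.6, ...]; app logic then finds its nearest neighbours in the vector database by cosine similarity.]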
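Nearest-neighbour search by cosine similarity can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the vectors reuse the first three components shown on the slides (the real embeddings are longer and elided there), and the chunk labels and the `nearest_neighbours` helper are invented for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbours(query_vec, store, k=2):
    """Score every stored vector against the query and keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), label) for label, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Truncated three-component vectors, assumed for illustration only.
store = [
    ("chunk A", [0.1, 0.4, -0.6]),
    ("chunk B", [0.2, 0.3, -0.4]),
    ("chunk C", [0.8, 0.3, -0.1]),
]
query_vec = [0.1, 0.4, -0.6]

results = nearest_neighbours(query_vec, store, k=2)
for score, label in results:
    print(f"{label}: {score:.3f}")
```

A linear scan like this is fine for small collections; vector databases typically use approximate nearest-neighbour indexes to keep the search fast at scale.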
[Figure: the same flow with a chat model and prompt templates added. The input query is embedded as a vector such as [0.1, 0.4, -0.6, ...], its nearest neighbours are found in the vector database by cosine similarity, and app logic fills the prompt template below.]
Use the provided context to answer the user's query.
You may not answer the user's query unless there is specific
context in the following text.
If you do not know the answer, or cannot answer, please respond
with "I don't know".
Context:
{context}
User Query:
{user_query}
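Filling the prompt template is plain string formatting. The sketch below copies the template text from the slide; the `build_prompt` helper and the sample context strings are assumptions added for illustration.

```python
RAG_PROMPT_TEMPLATE = """Use the provided context to answer the user's query.
You may not answer the user's query unless there is specific
context in the following text.
If you do not know the answer, or cannot answer, please respond
with "I don't know".
Context:
{context}
User Query:
{user_query}"""


def build_prompt(contexts: list[str], user_query: str) -> str:
    """Join the retrieved chunks and substitute both template placeholders."""
    context = "\n".join(contexts)
    return RAG_PROMPT_TEMPLATE.format(context=context, user_query=user_query)


# Hypothetical retrieved chunks, invented for this sketch.
prompt = build_prompt(["ref 1 text", "ref 2 text"], "What is RAG?")
print(prompt)
```

Because `str.format` only parses the template, braces inside the retrieved chunks or the user's query are inserted verbatim and do not need escaping.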
[Figure: full retrieval flow. The embedding model turns the input query into a vector such as [0.1, 0.4, -0.6, ...]; the vector store finds the nearest neighbours by cosine similarity and returns their documents; app logic places them (Context: ref 1 to ref 4) and the query into the same prompt template for the chat model.]
[Figure: the same flow, now with the chat model returning an Answer as OUTPUT.]
[Figure: the same flow annotated in two stages. Dense Vector Retrieval covers embedding the query and returning nearest-neighbour documents from the vector store; In-Context Learning covers filling the prompt template with that context and querying the chat model for the Answer.]
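Put together, the two stages could be wired up as in the sketch below. It is a toy under clear assumptions: word counts over a small vocabulary stand in for a learned dense embedding model, `stub_chat_model` is a hypothetical placeholder for a real chat model call, and the sample chunks are invented.

```python
import math
from typing import Callable

PROMPT = """Use the provided context to answer the user's query.
You may not answer the user's query unless there is specific
context in the following text.
If you do not know the answer, or cannot answer, please respond
with "I don't know".
Context:
{context}
User Query:
{user_query}"""


def words(text: str) -> list[str]:
    """Lowercased words with surrounding punctuation removed."""
    cleaned = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in cleaned if w]


def embed(text: str, vocab: list[str]) -> list[float]:
    """Toy stand-in for an embedding model: word counts over a fixed vocabulary."""
    tokens = words(text)
    return [float(tokens.count(term)) for term in vocab]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def rag_answer(query: str, chunks: list[str],
               chat_model: Callable[[str], str], k: int = 2) -> str:
    vocab = sorted({w for chunk in chunks for w in words(chunk)})
    query_vec = embed(query, vocab)
    # Stage 1, retrieval: rank chunks by cosine similarity to the query.
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c, vocab)), reverse=True)
    # Stage 2, in-context learning: put the top-k chunks into the prompt.
    prompt = PROMPT.format(context="\n".join(ranked[:k]), user_query=query)
    return chat_model(prompt)


def stub_chat_model(prompt: str) -> str:
    """Hypothetical placeholder for a chat model: echoes the first context line."""
    context = prompt.split("Context:\n", 1)[1].split("\nUser Query:", 1)[0]
    return context.splitlines()[0] if context.strip() else "I don't know"


chunks = [
    "The index stores one embedding per chunk.",
    "The retriever finds the nearest neighbours of the query.",
    "The chat model answers using the retrieved context.",
]
answer = rag_answer("What does the retriever find?", chunks, stub_chat_model)
print(answer)
```

Passing the chat model in as a callable keeps the retrieval logic testable without network access; a real model client would replace `stub_chat_model`.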
OpenAI RAG Flow
1. Search OpenAI blog for top k resources, rerank
2. Ask specific questions related to content
3. Return answers to questions with sources
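A two-stage "search top k, then rerank, and keep the sources" step might be sketched as follows. The slides do not say which search or reranking method the demo used, so both scoring functions here are assumptions, as are the `Resource` type, the helper names, and the placeholder example.com URLs. Answer generation by a chat model is left out.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    url: str
    text: str


def tokens(text: str) -> set[str]:
    return {w.strip(".,!?").lower() for w in text.split()} - {""}


def search_top_k(query: str, resources: list[Resource], k: int) -> list[Resource]:
    """Stage 1 (assumed): cheap recall by counting words shared with the query."""
    q = tokens(query)
    ranked = sorted(resources, key=lambda r: len(q & tokens(r.text)), reverse=True)
    return ranked[:k]


def rerank(query: str, candidates: list[Resource]) -> list[Resource]:
    """Stage 2 (assumed): reorder by the share of each resource's words that match."""
    q = tokens(query)

    def precision(r: Resource) -> float:
        t = tokens(r.text)
        return len(q & t) / len(t) if t else 0.0

    return sorted(candidates, key=precision, reverse=True)


def retrieve_with_sources(query: str, resources: list[Resource], k: int = 2) -> dict:
    """Return the reranked context together with the URLs it came from."""
    ranked = rerank(query, search_top_k(query, resources, k))
    return {"context": [r.text for r in ranked], "sources": [r.url for r in ranked]}


# Hypothetical resources with placeholder URLs, invented for this sketch.
resources = [
    Resource("https://example.com/post-a",
             "new embeddings models and pricing update with many other product announcements"),
    Resource("https://example.com/post-b", "new embeddings models"),
    Resource("https://example.com/post-c", "safety research overview"),
]
result = retrieve_with_sources("new embeddings models", resources, k=2)
print(result["sources"])
```

Carrying the URL alongside each chunk from the start is what makes it possible to return answers with sources at the end of the flow.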
THE AGE OF THE AI ENGINEER
“A wide range of AI tasks that used to
take 5 years and a research team to
accomplish in 2013, now just require API
docs and a spare afternoon in 2023.”
It is now possible to build what used
to take months in a single day!
CONCLUSIONS
Best-practice tools are out there!
LangSmith-like tooling is the most comprehensive
Building
Prompt Engineering, RAG, Fine-Tuning
Improvement
Depends on Building!
Eval varies
Lots of work for data scientists and AI engineers!
Q&A
Tony Karrer
Founder & CEO TechEmpower,
Founder & CTO Aggregage
Dr. Greg Loughnane
Founder & CEO of
AI Makerspace
Chris Alexiuk
Co-Founder & CTO at
AI Makerspace
Tara Dwyer
Webinar Manager
/in/tonykarrer/
aggregage.com
/in/gregloughnane/
aimakerspace.io
/in/csalexiuk/
aimakerspace.io
/in/taradwyer/
artificialintelligencezone.com
JOIN THE GENERATIVE AI FOR TECHNOLOGY LEADERS LINKEDIN GROUP
FOR THOUGHTFUL DISCUSSION AND Q&A! VISIT THE LINK OR SCAN THE QR CODE!
bit.ly/genaitechleaders