When Java meets GenAI at JChateau
JF James, 2024
Context
I'm neither a data scientist nor an AI specialist
Just a Java dev and software architect
Wondering how to leverage LLMs' impressive capabilities in our apps
Experimentation
LET’S EXPERIMENT
THE QUARKUS-LANGCHAIN4J EXTENSION
WITH A SIMPLE CAR BOOKING APP
FOCUS ON RAG AND FUNCTION CALLS
USING AZURE GPT 3.5 & 4
How to
• Basic REST/HTTP
• Specific SDK: OpenAI
• Framework: langchain
• Low/No Code: FlowiseAI
• Orchestration tool: RAGNA
LangChain
• A popular framework for developing applications powered by language models
• Assemblages of components for accomplishing higher-level tasks
• Connects various building blocks: large language models, document loaders, text splitters, output parsers, vector stores to store text embeddings, tools, and prompts
• Supports Python and JavaScript
• Launched at the end of 2022 (October, shortly before the ChatGPT release)
langchain4j
• The “Java version” of langchain
• Simplify the integration of AI/LLM capabilities into your
Java application
• Launched in 2023
• Latest release: 0.27.1 (6 March 2024)
Quarkus-langchain4j
• Seamless integration between Quarkus and LangChain4j
• Easy incorporation of LLMs into your Quarkus applications
• Launched at the end of 2023
• Latest release: 0.9.0 (6 March 2024), based on langchain4j 0.27.1
A fast pace of change
• 2017: Transformer
• 2018: GPT-1
• 2022: langchain
• 2022: ChatGPT
• 2023: langchain4j
• 2023: quarkus-langchain4j
Defining an AI service
Defining an AI interface
@RegisterAiService
public interface CustomerSupportAgent {

    // Free chat method, unstructured user message
    @SystemMessage("You are a customer support agent of a car rental company …")
    String chat(String userMessage);

    // Structured fraud detection method with parameters
    @SystemMessage("You are a car booking fraud detection AI…")
    @UserMessage("Your task is to detect if a fraud was committed for the customer {{name}} {{surname}} …")
    String detectFraudForCustomer(String name, String surname);
}
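The `{{name}}` and `{{surname}}` placeholders above are filled from the method arguments before the message is sent. A minimal sketch of that kind of substitution in plain Java follows. It is not the langchain4j implementation; the class `PromptTemplateSketch` and its `render` method are hypothetical names used only for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of langchain4j: shows how {{variable}}
// placeholders in a prompt template can be replaced by argument values.
final class PromptTemplateSketch {

    // Matches {{identifier}} and captures the identifier.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Replaces each {{key}} with its value; fails fast on a missing value.
    static String render(String template, Map<String, String> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = variables.get(key);
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "Detect if a fraud was committed for the customer {{name}} {{surname}}";
        System.out.println(render(template, Map.of("name", "James", "surname", "Bond")));
    }
}
```

Failing fast on a missing value is a design choice of this sketch: a prompt sent with an unresolved placeholder is hard to spot afterwards.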
LLM configuration
# Connection configuration to Azure OpenAI instance
quarkus.langchain4j.azure-openai.api-key=…
quarkus.langchain4j.azure-openai.resource-name=…
quarkus.langchain4j.azure-openai.deployment-name=…
quarkus.langchain4j.azure-openai.endpoint=…
# Warning: function calls support depends on the api-version
quarkus.langchain4j.azure-openai.api-version=2023-12-01-preview
quarkus.langchain4j.azure-openai.max-retries=2
quarkus.langchain4j.azure-openai.timeout=60S
# Set the model temperature for deterministic (non-creative) behavior (between 0 and 2)
quarkus.langchain4j.azure-openai.chat-model.temperature=0.1
# An alternative (or a complement?) to temperature: 0.1 means only the most probable tokens making up the top 10% of probability mass are considered
quarkus.langchain4j.azure-openai.chat-model.top-p=0.1
# Logging requests and responses in dev mode
%dev.quarkus.langchain4j.azure-openai.log-requests=true
%dev.quarkus.langchain4j.azure-openai.log-responses=true
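To make the temperature and top-p settings above more concrete, here is a toy sketch of how they typically act on a token distribution: temperature rescales the scores before softmax, and top-p keeps only the most probable tokens until their cumulative probability reaches the threshold. The class `SamplingSketch`, the token names and the scores are made up for illustration; the actual sampling happens inside the model service and may differ in detail.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: not how Azure OpenAI is implemented.
final class SamplingSketch {

    // Softmax with temperature: a lower temperature sharpens the distribution.
    static Map<String, Double> softmax(Map<String, Double> logits, double temperature) {
        if (temperature <= 0) {
            throw new IllegalArgumentException("temperature must be > 0 in this sketch");
        }
        double max = logits.values().stream().mapToDouble(Double::doubleValue).max().orElse(0.0);
        Map<String, Double> probs = new LinkedHashMap<>();
        double sum = 0.0;
        for (Map.Entry<String, Double> e : logits.entrySet()) {
            double weight = Math.exp((e.getValue() - max) / temperature);
            probs.put(e.getKey(), weight);
            sum += weight;
        }
        final double total = sum;
        probs.replaceAll((token, weight) -> weight / total);
        return probs;
    }

    // Nucleus (top-p) filtering: keep the most probable tokens until
    // their cumulative probability reaches topP.
    static List<String> nucleus(Map<String, Double> probs, double topP) {
        List<Map.Entry<String, Double>> sorted = new ArrayList<>(probs.entrySet());
        sorted.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (Map.Entry<String, Double> e : sorted) {
            kept.add(e.getKey());
            cumulative += e.getValue();
            if (cumulative >= topP) {
                break;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> logits = new LinkedHashMap<>();
        logits.put("cancelled", 2.0);
        logits.put("canceled", 1.0);
        logits.put("deleted", 0.0);
        System.out.println("T=1.0: " + softmax(logits, 1.0));
        System.out.println("T=0.1: " + softmax(logits, 0.1));
        System.out.println("top-p=0.1 keeps: " + nucleus(softmax(logits, 1.0), 0.1));
        System.out.println("top-p=0.9 keeps: " + nucleus(softmax(logits, 1.0), 0.9));
    }
}
```

With both values set low, the candidate set usually shrinks to the single most probable token, which is why the behavior becomes close to deterministic.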
Retrieval Augmented Generation
Principles
• Augment the LLM with specific knowledge
• From different data sources and formats: text, PDF, CSV …
• First, the input text is turned into a vector representation (an embedding)
• Each request is then augmented with the relevant retrieved data
• Vector databases: InMemory, PgVector, Redis, Chroma …
• In-process embedding models: all-minilm-l6-v2-q, bge-small-en, bge-small-zh …
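The retrieval step described above can be sketched in plain Java: score each stored segment against the query vector, drop anything below a minimum score, and keep the best few. This is a toy illustration with hand-written two-dimensional vectors, not the langchain4j code; `RetrievalSketch`, `Scored` and `retrieve` are hypothetical names, and real stores may normalize scores differently than raw cosine similarity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy in-memory retrieval: cosine similarity, minimum score, maximum results.
final class RetrievalSketch {

    // A text segment together with its similarity to the query.
    record Scored(String text, double score) {}

    // Cosine similarity between two vectors of the same length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns at most maxResults segments whose score is >= minScore, best first.
    static List<Scored> retrieve(double[] query, List<String> texts, List<double[]> vectors,
                                 int maxResults, double minScore) {
        List<Scored> hits = new ArrayList<>();
        for (int i = 0; i < texts.size(); i++) {
            double score = cosine(query, vectors.get(i));
            if (score >= minScore) {
                hits.add(new Scored(texts.get(i), score));
            }
        }
        hits.sort(Comparator.comparingDouble(Scored::score).reversed());
        return List.copyOf(hits.subList(0, Math.min(maxResults, hits.size())));
    }

    public static void main(String[] args) {
        List<String> texts = List.of("cancellation policy", "opening hours", "car models");
        // Hand-written vectors standing in for real embeddings.
        List<double[]> vectors = List.of(
                new double[] {1.0, 0.0}, new double[] {0.0, 1.0}, new double[] {0.8, 0.6});
        double[] query = {1.0, 0.0};
        retrieve(query, texts, vectors, 3, 0.7)
                .forEach(hit -> System.out.println(hit.text() + " -> " + hit.score()));
    }
}
```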
Ingesting documents
public void ingest(@Observes StartupEvent evt) throws Exception {
    DocumentSplitter splitter = DocumentSplitters.recursive(500, 0);
    EmbeddingStoreIngestor ingestor = EmbeddingStoreIngestor
            .builder()
            .embeddingStore(embeddingStore)
            .embeddingModel(embeddingModel)
            .documentSplitter(splitter)
            .build();
    List<Document> docs = loadDocs();
    ingestor.ingest(docs);
}
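The two arguments of `DocumentSplitters.recursive(500, 0)` are a maximum segment size and an overlap between consecutive segments. A simplified sketch of what such parameters mean, assuming sizes counted in characters, is shown below. The real recursive splitter is more elaborate (it presumably prefers natural boundaries such as paragraphs and sentences); `SplitterSketch` is a hypothetical fixed-window stand-in.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified fixed-window splitter, for illustration only.
final class SplitterSketch {

    // Cuts text into segments of at most maxSegmentSize characters,
    // each sharing `overlap` characters with the previous one.
    static List<String> split(String text, int maxSegmentSize, int overlap) {
        if (maxSegmentSize <= 0 || overlap < 0 || overlap >= maxSegmentSize) {
            throw new IllegalArgumentException("need 0 <= overlap < maxSegmentSize");
        }
        List<String> segments = new ArrayList<>();
        int step = maxSegmentSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxSegmentSize, text.length());
            segments.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(split("abcdefghij", 4, 0));
        System.out.println(split("abcdefghij", 4, 1));
    }
}
```

An overlap of 0, as in the slide, yields disjoint segments; a positive overlap trades some redundancy for a lower risk of cutting a relevant sentence in two.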
Retrieving relevant contents
public class DocRetriever implements ContentRetriever {
    …
    // From 0 (low selectivity) to 1 (high selectivity)
    private static final double MIN_SCORE = 0.7;

    @Inject
    public DocRetriever(EmbeddingStore<TextSegment> store, EmbeddingModel model) {
        this.retriever = EmbeddingStoreContentRetriever
                .builder()
                .embeddingModel(model)
                .embeddingStore(store)
                .maxResults(MAX_RESULTS)
                .minScore(MIN_SCORE)
                .build();
    }

    @Override
    public List<Content> retrieve(Query query) {
        return retriever.retrieve(query);
    }
}
Binding an AI service to a document retriever
// Binding is defined with the RegisterAiService annotation
@RegisterAiService(retrievalAugmentor = DocRagAugmentor.class)
public interface CustomerSupportAgent { … }

// DocRagAugmentor is an intermediate class supplying the retriever
public class DocRagAugmentor implements Supplier<RetrievalAugmentor> {
    @Override
    public RetrievalAugmentor get() { … }
}
RAG configuration
# Local Embedding Model for RAG
quarkus.langchain4j.embedding-model.provider=dev.langchain4j…AllMiniLmL6V2EmbeddingModel
# Local directory for RAG documents
app.local-data-for-rag.dir=data-for-rag
Function calls
Stephan Pirson, 2023
Basic principles
1. Instruct the LLM to call App functions
2. A function is a Java method annotated with @Tool
3. Function descriptors are sent with the requests
4. The LLM decides whether it’s relevant to call a function
5. A description of the function call is provided in the response
6. quarkus-langchain4j automatically calls the @Tool method
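The principles above (annotated methods, descriptors derived from them, automatic invocation by name) can be illustrated with plain Java reflection. This is only a sketch of the general mechanism, not how quarkus-langchain4j works internally; the `DemoTool` annotation, `BookingTools` class and helper methods are hypothetical stand-ins for `@Tool` and the application's service.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Illustration of annotation-driven tool discovery and dispatch.
final class ToolDiscoverySketch {

    // Hypothetical stand-in for the framework's @Tool annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface DemoTool {
        String value();
    }

    // Hypothetical business service exposing two tools.
    static final class BookingTools {
        @DemoTool("Get booking details for a booking number")
        public String getBookingDetails(String bookingNumber) {
            return "details of " + bookingNumber;
        }

        @DemoTool("Cancel a booking")
        public String cancelBooking(String bookingNumber) {
            return "cancelled " + bookingNumber;
        }

        public String notATool() {
            return "ignored";
        }
    }

    // Builds name -> description descriptors from the annotated methods.
    static Map<String, String> describe(Class<?> type) {
        Map<String, String> descriptors = new TreeMap<>();
        for (Method method : type.getDeclaredMethods()) {
            DemoTool tool = method.getAnnotation(DemoTool.class);
            if (tool != null) {
                descriptors.put(method.getName(), tool.value());
            }
        }
        return descriptors;
    }

    // Invokes the annotated method whose name the model asked for.
    static Object invoke(Object target, String toolName, String argument) {
        for (Method method : target.getClass().getDeclaredMethods()) {
            if (method.isAnnotationPresent(DemoTool.class) && method.getName().equals(toolName)) {
                try {
                    return method.invoke(target, argument);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Tool call failed: " + toolName, e);
                }
            }
        }
        throw new IllegalArgumentException("Unknown tool: " + toolName);
    }

    public static void main(String[] args) {
        System.out.println(describe(BookingTools.class));
        System.out.println(invoke(new BookingTools(), "cancelBooking", "456-789"));
    }
}
```

Only annotated methods are exposed, so `notATool` can never be called on the model's request; that allow-list is the main safety property of this style of binding.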
Perspective
Use the LLM as a “workflow engine”
The LLM is entrusted with the decision to call business logic
Both powerful and dangerous
Trustable? Reliable?
Defining a function
@Tool("Get booking details for booking number {bookingNumber} and customer {name} {surname}")
public Booking getBookingDetails(String bookingNumber, String name, String surname) {
    Log.info("DEMO: Calling Tool-getBookingDetails: " + bookingNumber + " and customer: "
            + name + " " + surname);
    return checkBookingExists(bookingNumber, name, surname);
}
Binding the functions to an AI interface
@RegisterAiService(tools = BookingService.class)
public interface CustomerSupportAgent { … }
LLM initial request
"functions":[
  {
    "name":"getBookingDetails",
    "description":"Get booking details for {bookingNumber} and customer {firstName} {lastName}",
    "parameters":{
      "type":"object",
      "properties":{
        "firstName":{ "type":"string" },
        "lastName":{ "type":"string" },
        "bookingNumber":{ "type":"string" }
      },
      "required":[ "bookingNumber", "firstName", "lastName" ]
    }
  }, …]
LLM intermediate response
"choices":[
  {
    "finish_reason":"function_call",
    "index":0,
    "message":{
      "role":"assistant",
      "function_call":{
        "name":"getBookingDetails",
        "arguments":"{\"firstName\":\"James\",\"lastName\":\"Bond\",\"bookingNumber\":\"456-789\"}"
      }
    },
    …
  }
]
LLM intermediate request
{
  "role":"function",
  "name":"getBookingDetails",
  "content":"{\"bookingNumber\":\"456-789\",\"customer\":{\"firstName\":\"James\",\"lastName\":\"Bond\"},\"startDate\":\"2024-03-01\",\"endDate\":\"2024-03-09\",\"carModel\":\"Volvo\",\"cancelled\":false}"
}
Example of a booking cancellation
Sequence diagram between User, Application and LLM:
1. Prompt (User to Application): “I'm James Bond, can you cancel my booking 456-789”
2. Initial request: POST to the LLM, stateless request processing, the LLM answers "call getBookingDetails"
3. Local execution of getBookingDetails in the Application
4. Second request: POST getBookingDetails result, stateless request processing, the LLM answers "call cancelBooking"
5. Local execution of cancelBooking in the Application
6. Third request: POST cancelBooking result, stateless request processing, final response (finish_reason=stop)
7. Response (Application to User): “Your booking 456-789 has been successfully cancelled, Mr. Bond.”
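The cancellation exchange above boils down to a loop on the application side: send the full conversation, execute the requested function locally, append its result, and repeat until the model returns a final answer. The sketch below simulates that loop with a scripted, stateless stand-in for the LLM. All names (`FunctionCallLoopSketch`, `LlmReply`, `Llm`, `scriptedLlm`) are hypothetical, and the round-trip cap is an added precaution of this sketch, not something taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Simulated function-call loop; no real LLM or HTTP call is involved.
final class FunctionCallLoopSketch {

    // Either a function call (functionName set) or a final answer (finalText set).
    record LlmReply(String functionName, String argument, String finalText) {
        boolean isFunctionCall() {
            return functionName != null;
        }
    }

    // Stand-in for the remote model: stateless, so it gets the whole history each time.
    interface Llm {
        LlmReply complete(List<String> messages);
    }

    static String run(String prompt, Llm llm, Map<String, Function<String, String>> tools,
                      int maxRoundTrips) {
        List<String> messages = new ArrayList<>();
        messages.add("user: " + prompt);
        for (int i = 0; i < maxRoundTrips; i++) {
            LlmReply reply = llm.complete(List.copyOf(messages)); // one POST per iteration
            if (!reply.isFunctionCall()) {
                return reply.finalText(); // corresponds to finish_reason=stop
            }
            Function<String, String> tool = tools.get(reply.functionName());
            if (tool == null) {
                throw new IllegalStateException("Unknown tool: " + reply.functionName());
            }
            String result = tool.apply(reply.argument()); // local execution
            messages.add("assistant: call " + reply.functionName() + "(" + reply.argument() + ")");
            messages.add("function " + reply.functionName() + ": " + result);
        }
        throw new IllegalStateException("Too many round-trips");
    }

    // Scripted model reproducing the three requests of the booking example.
    static Llm scriptedLlm() {
        return messages -> {
            long results = messages.stream().filter(m -> m.startsWith("function ")).count();
            if (results == 0) {
                return new LlmReply("getBookingDetails", "456-789", null);
            }
            if (results == 1) {
                return new LlmReply("cancelBooking", "456-789", null);
            }
            return new LlmReply(null, null,
                    "Your booking 456-789 has been successfully cancelled, Mr. Bond.");
        };
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> tools = Map.of(
                "getBookingDetails", number -> "{\"bookingNumber\":\"" + number + "\"}",
                "cancelBooking", number -> "{\"cancelled\":true}");
        System.out.println(run("I'm James Bond, can you cancel my booking 456-789",
                scriptedLlm(), tools, 5));
    }
}
```

Because the model is stateless, every iteration resends the growing message list, which is where the verbosity and the extra round-trips come from.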
Lessons learnt
• Overall interesting results:
  • quarkus-langchain4j makes GenAI really easy!
  • Even a generic LLM such as GPT proves helpful in a specific domain context
  • GPT-4 is more precise but significantly slower in this example:
    • GPT-4: >= 5 sec
    • GPT-3.5: >= 2 sec
• RAG:
  • Be selective: set min_score appropriately for your context when retrieving text segments
  • The request message can be verbose: the selected text segments are added to the user message
• Function calls:
  • Not supported by all LLMs
  • Powerful and dangerous
  • Hard to debug
  • Potentially verbose: 1 round-trip per function call
  • Many requests under the covers, similar to the JPA N+1 queries problem
  • Non-deterministic behavior, but acceptable with temperature and seed set to minimum
  • To be used with care on critical functions: payment, cancellation
Next steps
• Testability
• Auditability
• Observability
• Security
• Production readiness
• Real use cases beyond the fun
Code available on GitHub
https://github.com/jefrajames/car-booking
