© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Unraveling Multimodality with LLMs
Alex Coqueiro
OPEN AI + DATA FORUM
Director of Solutions Architecture for Canada Public Sector
AWS
Multimodality
refers to the use of multiple methods of communication or representation
Multimodal Intelligence as a Service
Question
Answer
Medical Staff
Foundation Models (FM) as the heart of LLM
FM approach: Unlabeled data → Pretrain → FM → Adapt → Tasks
Traditional ML approach: Labeled data … → Train → ML models … → Deploy → Tasks
Tasks: Text generation, Summarization, Information extraction, Q&A, Chatbot
Finding the best FM based on your use case
Precision
Model   Evaluation Score   HIL/LLM Feedback
FM1     5/5                <Feedback summary>
FM2     4/5                <Feedback summary>
FM3     3/5                <Feedback summary>
Cost
Model   Cost
FM1     $$$$
FM2     $
FM3     $$$
Speed
Model   Speed
FM1     ⚡⚡
FM2     ⚡
FM3     ⚡
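One way to turn the three comparison tables into a decision is a weighted score per model. The sketch below is only an illustration of that idea: the numeric encoding of the `$` and `⚡` symbols, the assumption that more `⚡` means faster, and the weights themselves are all hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    precision: float  # evaluation score out of 5 (Precision table)
    cost: int         # number of "$" symbols (more = more expensive)
    speed: int        # number of "⚡" symbols (assumed: more = faster)


# Values transcribed from the Precision, Cost and Speed tables.
CANDIDATES = [
    Candidate("FM1", precision=5, cost=4, speed=2),
    Candidate("FM2", precision=4, cost=1, speed=1),
    Candidate("FM3", precision=3, cost=3, speed=1),
]


def score(c: Candidate, w_precision: float, w_cost: float, w_speed: float) -> float:
    """Weighted sum of criteria, each scaled to at most 1.0; higher is better."""
    precision = c.precision / 5        # 5/5 maps to 1.0
    cheapness = 1 - (c.cost - 1) / 3   # "$" maps to 1.0, "$$$$" maps to 0.0
    speed = c.speed / 2                # "⚡⚡" maps to 1.0
    return w_precision * precision + w_cost * cheapness + w_speed * speed


def best(candidates, **weights) -> Candidate:
    """Return the candidate with the highest weighted score."""
    return max(candidates, key=lambda c: score(c, **weights))


# Hypothetical weightings for two different use cases.
quality_first = best(CANDIDATES, w_precision=0.8, w_cost=0.1, w_speed=0.1)
budget_first = best(CANDIDATES, w_precision=0.3, w_cost=0.6, w_speed=0.1)
print(quality_first.name, budget_first.name)
```

With these assumed weights, a quality-driven use case selects FM1 and a cost-driven one selects FM2, which is the point of the slide: the "best" FM depends on the use case.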
Our Application (v1)
Foundational Model
Question + Context
Answer
Medical Staff
Could you suggest ways to prevent allergies?
Meta Llama
Source: https://ai.meta.com/resources/models-and-libraries/llama/
… New Requirements …
Adding Multimodal Capabilities to FMs
Unlabeled data → Pretrain → FM → Adapt → Tasks
Capability: text2text, text2image, image2text, text2audio, text2video
Multimodality Business Use Case
Same Product
Image
Title
Title
Multimodal Application Examples
• Advertising: tailors ads through a deep understanding of user preferences, using multimodal query representation
• Search Engine: searches based on multimodal query understanding
• Recommendation System: recommends effectively based on diverse data sources
• Robotics: enhances robotics with natural language understanding and decision-making
• Assistant (e.g. Chat): enhances assistant capability by adding visual analysis
• Query Suggestion: guides the user on the best way to explore image content, based on multimodal query suggestion
Visual Instruction Tuning (LLaVA)
• Employs the LLM's reasoning capability for vision-based tasks
• Addresses VQA (Visual Question Answering)
• Instruction tuning for images with pre-trained image captions
Hugging Face – llava-v1.6-mistral-7b
Our Application (v2) – Multimodal Retrieval
COLLECT DATA FROM MULTIMODAL DOCUMENTS
LLaVA (Mistral 7B)
Question + Context
Answer
Medical Staff
Could you summarize the main points of these data?
Demo
TEXT EXTRACT
Demo
MULTIMODAL UNDERSTANDING
Demo
REASONING
… Increasing Complexity …
Latent Diffusion Models
https://www.amazon.science/blog/virtual-try-all-visualizing-any-product-in-any-personal-setting
https://arxiv.org/abs/2401.13795
• Enriching Image Conditioned Inpainting in Latent Diffusion Models
• Multimodal retrieval tasks vs. multimodal generation (a different problem)
• E.g. Virtual try-all: Visualizing any product in any personal setting
Brief Overview of Diffusion Models
- “destroy” the data by gradually adding small amounts of Gaussian noise
- “create” data by gradually denoising a noisy code from a stationary distribution
Animations from https://yang-song.github.io/blog/2021/score/
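The "destroy" half of the process can be sketched in a few lines. This is a minimal illustration, not the formulation used by any particular model: the constant noise level `beta`, the step count, and the toy data are assumptions. The "create" half needs a trained denoising network and is not shown.

```python
import math
import random


def forward_diffusion(x0, num_steps=1000, beta=0.01, seed=0):
    """Gradually 'destroy' a data vector by adding small amounts of Gaussian noise.

    Each step applies x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * eps,
    with eps ~ N(0, 1). The constant beta is an illustrative assumption.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(num_steps):
        x = [math.sqrt(1 - beta) * v + math.sqrt(beta) * rng.gauss(0, 1) for v in x]
    return x


def remaining_signal(num_steps, beta=0.01):
    """Factor multiplying the original data after num_steps: (1 - beta) ** (t / 2)."""
    return (1 - beta) ** (num_steps / 2)


data = [5.0] * 1000            # toy "data": 1000 copies of the value 5.0
noisy = forward_diffusion(data)

mean = sum(noisy) / len(noisy)
var = sum((v - mean) ** 2 for v in noisy) / len(noisy)
print(f"signal left: {remaining_signal(1000):.4f}")
print(f"mean={mean:.2f}, variance={var:.2f}")  # expected near 0 and 1
```

After enough steps almost none of the original value survives and the samples look like draws from a standard normal, which is the stationary distribution that generation starts from.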
Stable Diffusion 2.0 with Fine-tuning
Depth to Image Model (Stable Diffusion 2.0)
UIs / Plug-Ins for Photoshop, GIMP etc
https://twitter.com/wbuchw/status/1563162131024920576
https://github.com/lkwq007/stablediffusion-infinity
Our Application (v3) – Multimodal Generation
GENERATE IMAGE WITH STABLE DIFFUSION
Question + Context
Answer
Medical Staff
Create an image about the patient's journey from admission to discharge for my clinical report
Stable Diffusion – SDXL (or SD 2.1)
… Balancing Multiple Tasks …
LLM Multimodel
LangChain
A framework to simplify building applications that use an LLM
Provides a common way of accessing APIs of different LLMs
Helps with learning how to use LLMs but may be too restrictive for some use cases
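The idea of "a common way of accessing APIs of different LLMs" can be illustrated without the framework itself. The sketch below is not LangChain's actual API: the `TextModel` interface, the two stand-in model classes, and the `answer` helper are hypothetical names used only to show how application code can stay independent of the provider.

```python
from typing import Protocol


class TextModel(Protocol):
    """Hypothetical common interface; not LangChain's actual API."""

    def invoke(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in for provider A (a real adapter would call that provider's SDK)."""

    def invoke(self, prompt: str) -> str:
        return f"[echo] {prompt}"


class UpperModel:
    """Stand-in for provider B, with a different native behaviour."""

    def invoke(self, prompt: str) -> str:
        return prompt.upper()


def answer(model: TextModel, question: str, context: str) -> str:
    """Application code depends only on the shared interface."""
    prompt = f"Context: {context}\nQuestion: {question}"
    return model.invoke(prompt)


# Swapping the model does not change the application code.
for model in (EchoModel(), UpperModel()):
    print(answer(model, "What is Joan's insurance?", "Joan is insured by ACME."))
```

The trade-off named on the slide shows up here too: a single `invoke(prompt)` entry point is easy to learn, but provider-specific features that do not fit the shared interface become harder to reach.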
Agents can take Actions
Instructions: “You are a health assistant, helping nurses learn about patients”
Health Assistant
What is Joan’s insurance?
Insurance ACME
Vector DB
Patient
Medical records
Actions
Booking Procedure
In: name, procedure
Out: protocol number
Please book the patient’s anesthesia
Approved and protocol is 12345
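The flow on this slide can be sketched as an agent that holds instructions and a set of callable actions. Everything below is a simplified illustration under stated assumptions: the dictionary stands in for the vector DB, the keyword rule stands in for the LLM's choice of action, the protocol number is hard-coded, and the booking is assumed to be for Joan. All function and class names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Toy stand-in for the vector DB of patient medical records.
PATIENT_RECORDS = {"Joan": {"insurance": "ACME"}}


def lookup_insurance(name: str) -> str:
    """Retrieval step: a real system would query the vector DB instead."""
    return f"Insurance {PATIENT_RECORDS[name]['insurance']}"


def book_procedure(name: str, procedure: str) -> str:
    """Action. In: name, procedure. Out: protocol number (hard-coded here)."""
    protocol_number = 12345
    return f"Approved and protocol is {protocol_number}"


@dataclass
class Agent:
    instructions: str
    actions: Dict[str, Callable[..., str]]

    def choose_action(self, request: str) -> str:
        """Placeholder for the LLM's decision; a keyword rule is used here."""
        return "book_procedure" if "book" in request.lower() else "lookup_insurance"

    def run(self, request: str, **arguments: str) -> str:
        """Pick an action for the request and call it with the given arguments."""
        return self.actions[self.choose_action(request)](**arguments)


agent = Agent(
    instructions="You are a health assistant, helping nurses learn about patients",
    actions={"lookup_insurance": lookup_insurance, "book_procedure": book_procedure},
)

print(agent.run("What is Joan's insurance?", name="Joan"))
print(agent.run("Please book the patient's anesthesia", name="Joan", procedure="anesthesia"))
```

In a real agent the model would also extract the arguments (`name`, `procedure`) from the request; they are passed explicitly here to keep the sketch short.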
Agent Orchestration
Task → LangChain → Final response
Unimodel
Multimodal Multimodel
Closings
Thank you!
Alex Coqueiro
Director of Solutions Architecture for Canada Public Sector
AWS

The Zero-ETL Approach: Enhancing Data Agility and Insight
 

Unraveling Multimodality with Large Language Models.pdf

  • 1. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Unraveling Multimodality with LLMs. OPEN AI + DATA FORUM. Alex Coqueiro, Director of Solutions Architecture for Canada Public Sector, AWS
  • 4. Multimodality refers to a concept that utilizes multiple methods of communication or representation
  • 5. Multimodal Intelligence as a service. Medical Staff: Question, Answer
  • 6. Foundation Models (FM) as the heart of LLM. FM path: Unlabeled data → Pretrain → FM → Adapt → Tasks (text generation, summarization, information extraction, Q&A, chatbot). ML models path: Labeled data … → Train → ML models … → Deploy → Tasks (text generation, summarization, information extraction, Q&A, chatbot)
  • 8. Finding the best FM based on your use case. Precision (evaluation score, each with an HIL/LLM <Feedback summary>): FM1 5/5, FM2 4/5, FM3 3/5. Cost: FM1 $$$$, FM2 $, FM3 $$$. Speed: FM1 ⚡⚡, FM2 ⚡, FM3 ⚡
  • 9. Our Application (v1). Medical Staff send a Question + Context to the Foundation Model and receive an Answer. Example question: "Could you suggest ways to prevent allergies?"
  • 10. Meta Llama. Source: https://ai.meta.com/resources/models-and-libraries/llama/
  • 13. … New Requirements …
  • 14. Adding Multimodal Capabilities to FMs. Unlabeled data → Pretrain → FM → Adapt → Tasks. Capabilities: text2text, text2image, image2text, text2audio, text2video
  • 15. Multimodality Business Use Case. Same Product: Image, Title, Title
  • 16. Multimodal Application Examples. Advertising: tailors ads through a deep understanding of user preferences, using multimodal query representation. Search Engine: based on multimodal query understanding. Recommendation System: recommends effectively from diverse data sources. Robotics: enhanced with natural language understanding and decision-making. Assistant (e.g. chat): adds visual analysis to the assistant's capabilities. Query Suggestion: guides the user on the best way to explore image content through multimodal query suggestion
  • 17. Visual Instruction Tuning (LLaVA). Employs the LLM's reasoning capability for vision-based tasks; addresses VQA (Visual Question Answering); instruction tuning for images with pre-trained image captions
  • 18. Hugging Face: llava-v1.6-mistral-7b
  • 19. Our Application (v2): Multimodal Retrieval. Collect data from multimodal documents. Medical Staff send a Question + Context to LLaVA Mistral 7B and receive an Answer. Example question: "Could you summarize the main points of these data?"
  • 20. Demo: Text Extract
  • 21. Demo: Multimodal Understanding
  • 22. Demo: Reasoning
  • 23. … Increasing Complexity …
  • 24. Latent Diffusion Models. Enriching Image Conditioned Inpainting in Latent Diffusion Models; multimodal retrieval tasks vs. multimodal generation (a different problem); e.g. Virtual try-all: Visualizing any product in any personal setting. Sources: https://www.amazon.science/blog/virtual-try-all-visualizing-any-product-in-any-personal-setting and https://arxiv.org/abs/2401.13795
  • 25. Brief Overview of Diffusion Models. "Destroy" the data by gradually adding small amounts of Gaussian noise; "create" data by gradually denoising a noisy code from a stationary distribution. Animations from https://yang-song.github.io/blog/2021/score/
  • 26. Stable Diffusion 2.0 with Fine-tuning
  • 27. Depth to Image Model (Stable Diffusion 2.0)
  • 28. UIs / Plug-Ins for Photoshop, GIMP etc. https://twitter.com/wbuchw/status/1563162131024920576 https://github.com/lkwq007/stablediffusion-infinity
  • 29. Our Application (v3): Multimodal Generation. Generate image with Stable Diffusion. Medical Staff send a Question + Context to Stable Diffusion, SDXL (or SD 2.1), and receive an Answer. Example request: "Create an image about the patient's journey from admission to discharge for my clinical report"
  • 31. … Balancing Multiple Tasks …
  • 32. LLM Multimodel
  • 33. LangChain. A framework to simplify applications using an LLM. Provides a common way of accessing APIs of different LLMs. Helps with learning how to use LLMs but may be too restrictive for some use cases
  • 35. Agents can take Actions. Instructions: "You are a health assistant, helping nurses understand their patients." Health Assistant. Question: "What is Joan's insurance?" Answer: "Insurance ACME" (Vector DB: patient medical records). Actions: Booking Procedure (In: name, procedure; Out: protocol number). Request: "Please book the patient's anesthesia." Response: "Approved and protocol is 12345"
  • 36. Agent Orchestration: Task, LangChain, Final response
  • 37. Closings: Unimodel, Multimodal, Multimodel
  • 38. Thank you! Alex Coqueiro, Director of Solutions Architecture for Canada Public Sector, AWS
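
The trade-off on slide 8 (precision, cost, speed) can be illustrated with a small weighted ranking. This is only a sketch: the numeric values are read off the symbols on the slide (score out of 5, count of "$", count of "⚡"), and the normalization and weights are assumptions chosen for illustration, not a method from the talk.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    name: str
    precision: float  # evaluation score out of 5, as shown on slide 8
    cost: int         # number of "$" symbols on slide 8 (higher = more expensive)
    speed: int        # number of lightning symbols on slide 8 (higher = faster)


# Values transcribed from slide 8; FM1..FM3 are the slide's placeholder names.
CANDIDATES = [
    Candidate("FM1", precision=5, cost=4, speed=2),
    Candidate("FM2", precision=4, cost=1, speed=1),
    Candidate("FM3", precision=3, cost=3, speed=1),
]


def score(c: Candidate, w_precision: float, w_cost: float, w_speed: float) -> float:
    """Weighted sum of normalized criteria (assumed scales: 5, 4 and 2)."""
    precision_term = c.precision / 5   # 1.0 is the best evaluation score
    cost_term = 1 - c.cost / 4         # cheaper models score higher
    speed_term = c.speed / 2           # faster models score higher
    return w_precision * precision_term + w_cost * cost_term + w_speed * speed_term


def best(candidates: Iterable[Candidate], w_precision: float = 1.0,
         w_cost: float = 1.0, w_speed: float = 1.0) -> Candidate:
    """Return the candidate with the highest weighted score."""
    return max(candidates, key=lambda c: score(c, w_precision, w_cost, w_speed))
```

With equal weights the cheap model comes out ahead, while tripling the precision weight favors the most accurate one, which mirrors the slide's point that the best FM depends on the use case.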
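
The "destroy" half of slide 25 can be sketched in a few lines. The update rule below is the common DDPM-style forward step, x_t = sqrt(1 - beta) * x_(t-1) + sqrt(beta) * noise; the slide does not name a specific formulation, so the rule, the constant noise schedule, and the function names are assumptions. The "create" half needs a trained denoising model and is not shown.

```python
import math
import random
from typing import List, Sequence


def forward_diffuse(x0: Sequence[float], betas: Sequence[float],
                    rng: random.Random) -> List[float]:
    """Apply one Gaussian noising step per beta and return the final sample.

    Each step shrinks the signal by sqrt(1 - beta) and adds noise with
    variance beta, so the data drifts toward a standard normal distribution.
    """
    x = list(x0)
    for beta in betas:
        keep = math.sqrt(1.0 - beta)
        noise_scale = math.sqrt(beta)
        x = [keep * v + noise_scale * rng.gauss(0.0, 1.0) for v in x]
    return x


def signal_fraction(betas: Sequence[float]) -> float:
    """Fraction of the original signal amplitude that survives all steps."""
    return math.prod(math.sqrt(1.0 - b) for b in betas)
```

Starting from a constant signal and running a few hundred steps with a small beta leaves almost none of the original value: the samples end up close to zero mean and unit variance, which is the stationary distribution the slide refers to.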
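
The agent on slide 35 pairs instructions with a set of callable actions. The sketch below shows only that dispatch pattern in plain Python. It is not LangChain's API: the class, the function names, and the in-memory record store standing in for the vector DB are all hypothetical, and in a real agent an LLM would choose the action and its arguments.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical stand-in for the vector DB of patient medical records.
PATIENT_RECORDS = {"Joan": {"insurance": "ACME"}}


def lookup_insurance(name: str) -> str:
    """Answer an insurance question from the (stand-in) patient records."""
    record = PATIENT_RECORDS.get(name)
    if record is None:
        return f"No record found for {name}"
    return f"Insurance {record['insurance']}"


def make_booking_action(first_protocol: int = 12345) -> Callable[[str, str], str]:
    """Build the booking action. In: name, procedure. Out: protocol number."""
    protocol_numbers = itertools.count(first_protocol)

    def book_procedure(name: str, procedure: str) -> str:
        # A real action would call a booking system; here we only issue a number.
        return f"Approved and protocol is {next(protocol_numbers)}"

    return book_procedure


@dataclass
class HealthAssistant:
    instructions: str
    actions: Dict[str, Callable[..., str]]

    def run(self, action: str, **arguments: str) -> str:
        """Dispatch a structured action request to the matching function."""
        if action not in self.actions:
            raise KeyError(f"Unknown action: {action}")
        return self.actions[action](**arguments)


def build_assistant() -> HealthAssistant:
    return HealthAssistant(
        instructions="You are a health assistant, helping nurses understand their patients.",
        actions={
            "lookup_insurance": lookup_insurance,
            "book_procedure": make_booking_action(),
        },
    )
```

Calling `build_assistant().run("lookup_insurance", name="Joan")` reproduces the question-and-answer exchange on the slide, and `run("book_procedure", name="Joan", procedure="anesthesia")` reproduces the booking exchange.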