SlideShare a Scribd company logo
1 of 38
Download to read offline
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Unraveling Multimodality with LLMs
Alex Coqueiro
O P E N A I + D A T A F O R U M
Director of Solutions Architecture for Canada Public Sector
AWS
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Multimodality
refers to a concept that utilizes multiple methods of
communication or representation
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Multimodal
Intelligence as a
service
Question
Answer
Medical
Staff
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Foundation Models (FM) as the heart of LLM
Text generation
Summarization
Information
extraction
Q&A
Chatbot
Pretrain Adapt
Tasks
Unlabeled
data
FM
Text generation
Summarization
Information
extraction
Q&A
Chatbot
Train Deploy
Tasks
ML
models
…
…
…
…
Labeled
data
…
…
…
…
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Precision
Speed
Cost
Model Evaluation Score HIL/LLM Feedback
FM1 5/5 <Feedback summary>
FM2 4/5 <Feedback summary>
FM3 3/5 <Feedback summary>
Model Cost
FM1 $$$$
FM2 $
FM3 $$$
Model Speed
FM1 ⚡⚡
FM2 ⚡
FM3 ⚡
Finding the best FM based on your use case
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Our Application (v1)
Foundational
Model
Question + Context
Answer
Medical
Staff
Could you suggest me ways to prevent allergy?
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Meta Llama
Source: https://ai.meta.com/resources/models-and-libraries/llama/
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
… New Requirements …
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Adding Multimodal Capabilities to FMs
text2text
text2image
image2text
text2audio
text2video
Pretrain Adapt
Tasks Capability
Unlabeled
data
FM
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Multimodality Business Use Case
Same Product
Image
Title
Title
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advertising
tailors ads with deep
understanding of user
preferences with multimodal
query representation
Search Engine
Based on multimodal query
understanding
Recommendation
System
It recommends based on
diverse data sources effectively
Robotics
Enhances robotics with natural
language understanding and
decision-making
Assistant (E.g. Chat)
Enhance assistant capability
adding visual analysis
Query Suggestion
Guidance on the best way to
explore the image content
based on Multimodal query
suggestion
Multimodal Application Examples
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Visual Instruction Tuning (Llava)
• Employ LLM's reasoning capability for vision based tasks
• Addressing VQA (Visual Question Answering)
• Instruction-tuning for images with pre-trained image captions
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Hugging Face – llava-v1.6-mistral-7b
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Our Application (v2) – Multimodal Retrieval
C O L L E C T D A T A F R O M M U L T I M O D A L D O C U M E N T S
Llava
Mistral 7B
Question + Context
Answer
Medical
Staff
Could you summarize the main points
of these data?
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Demo
T E X T E X T R A C T
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Demo
M U L T I M O D A L U N D E R S T A N D I N G
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Demo
R E A S O N I N G
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
… Increasing Complexity …
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Latent Diffusion Models
24
https://www.amazon.science/blog/virtual-try-all-
visualizing-any-product-in-any-personal-setting
https://arxiv.org/abs/2401.13795
• Enriching Image Conditioned Inpainting in Latent
Diffusion Models
• Multimodal retrieval tasks x Multimodal generation
(different problem)
• E.g. Virtual try-all: Visualizing any product in any
personal setting
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Brief Overview of Diffusion Models
- “destroy” the data by gradually adding
small amounts of gaussian noise
- “create” data by gradually denoising a
noisy code from a stationary
distribution
Animations from https://yang-song.github.io/blog/2021/score/
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Stable Diffusion 2.0 with Fine-tuning
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Depth to Image Model (Stable Diffusion 2.0)
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
UIs / Plug-Ins for Photoshop, GIMP etc
28
https://twitter.com/wbuchw/status/1563162131024920576
https://github.com/lkwq007/stablediffusion-infinity
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Our Application (v3) – Multimodal Generation
G E N E R A T E I M A G E W I T H S T A B L E D I F F U S I O N
Question + Context
Answer
Medical
Staff
Create an image about the patient's journey from admission to discharge for my
clinical report
Stable Difusion – SDXL (or SD 2.1)
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
… Balancing Multiple Tasks …
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
LLM Multimodel
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
LangChain
A framework to simplify applications using
an LLM
Provides a common way of accessing APIs
of different LLMs
Helps with learning how to use LLMs but
may be too restrictive for some use cases
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agents can take Actions
Instructions: “you are an health assistant, helping
nurses understand about patients”
Health Assistant
What’s the Joan’s
insurance?
Insurance ACME
Vector DB
Patient
Medical records
Actions
Booking Procedure
In: name, procedure
Out: protocol number
Please book
Patients’s anesthesia
Approved and
protocol is 12345
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agent Orchestration
Task Langchain
Final
response
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Unimodel
Multimodal Multimodel
Closings
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Thank you!
Alex Coqueiro
Director of Solutions Architecture for Canada Public Sector
AWS

More Related Content

Similar to Unraveling Multimodality with Large Language Models.pdf

Unlock the Full Potential of Your Media Assets, ft. Fox Entertainment Group (...
Unlock the Full Potential of Your Media Assets, ft. Fox Entertainment Group (...Unlock the Full Potential of Your Media Assets, ft. Fox Entertainment Group (...
Unlock the Full Potential of Your Media Assets, ft. Fox Entertainment Group (...Amazon Web Services
 
Ripping off the Bandage: Re-Architecting Traditional Three-Tier Monoliths to ...
Ripping off the Bandage: Re-Architecting Traditional Three-Tier Monoliths to ...Ripping off the Bandage: Re-Architecting Traditional Three-Tier Monoliths to ...
Ripping off the Bandage: Re-Architecting Traditional Three-Tier Monoliths to ...Amazon Web Services
 
2018 re:Invent - Safeguard the Integrity of Your Code for Fast and Secure Dep...
2018 re:Invent - Safeguard the Integrity of Your Code for Fast and Secure Dep...2018 re:Invent - Safeguard the Integrity of Your Code for Fast and Secure Dep...
2018 re:Invent - Safeguard the Integrity of Your Code for Fast and Secure Dep...Martin Klie
 
Aws cloud computing conference
Aws cloud computing conferenceAws cloud computing conference
Aws cloud computing conferenceAnjani Phuyal
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
현대백화점 리테일테크랩과 AWS Prototyping 팀 개발자가 들려주는 인공 지능 무인 스토어 개발 여정 - 최권열 AWS 프로토타이핑...
현대백화점 리테일테크랩과 AWS Prototyping 팀 개발자가 들려주는 인공 지능 무인 스토어 개발 여정 - 최권열 AWS 프로토타이핑...현대백화점 리테일테크랩과 AWS Prototyping 팀 개발자가 들려주는 인공 지능 무인 스토어 개발 여정 - 최권열 AWS 프로토타이핑...
현대백화점 리테일테크랩과 AWS Prototyping 팀 개발자가 들려주는 인공 지능 무인 스토어 개발 여정 - 최권열 AWS 프로토타이핑...Amazon Web Services Korea
 
Nuvem Híbrida - EBC on the road Brazil Edition [Portuguese]
Nuvem Híbrida - EBC on the road Brazil Edition [Portuguese]Nuvem Híbrida - EBC on the road Brazil Edition [Portuguese]
Nuvem Híbrida - EBC on the road Brazil Edition [Portuguese]Amazon Web Services
 
20201013 - Serverless Architecture Conference - How to migrate your existing ...
20201013 - Serverless Architecture Conference - How to migrate your existing ...20201013 - Serverless Architecture Conference - How to migrate your existing ...
20201013 - Serverless Architecture Conference - How to migrate your existing ...Marcia Villalba
 
Monolithic to Heterogenous - AWS FS Cloud Symposium Apr 2019.pdf
Monolithic to Heterogenous - AWS FS Cloud Symposium Apr 2019.pdfMonolithic to Heterogenous - AWS FS Cloud Symposium Apr 2019.pdf
Monolithic to Heterogenous - AWS FS Cloud Symposium Apr 2019.pdfAmazon Web Services
 
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...Amazon Web Services
 
Transforming Enterprise IT - AWS Transformation Day Boston 2018
Transforming Enterprise IT - AWS Transformation Day Boston 2018Transforming Enterprise IT - AWS Transformation Day Boston 2018
Transforming Enterprise IT - AWS Transformation Day Boston 2018Amazon Web Services
 
AWS Systems Manager: Bridging Operational Models - SRV212 - Chicago AWS Summit
AWS Systems Manager: Bridging Operational Models - SRV212 - Chicago AWS SummitAWS Systems Manager: Bridging Operational Models - SRV212 - Chicago AWS Summit
AWS Systems Manager: Bridging Operational Models - SRV212 - Chicago AWS SummitAmazon Web Services
 
How to Enhance Your Application Security Strategy with F5 on AWS
 How to Enhance Your Application Security Strategy with F5 on AWS How to Enhance Your Application Security Strategy with F5 on AWS
How to Enhance Your Application Security Strategy with F5 on AWSAmazon Web Services
 
How Different Large Organizations are Approaching Cloud Adoption
How Different Large Organizations are Approaching Cloud AdoptionHow Different Large Organizations are Approaching Cloud Adoption
How Different Large Organizations are Approaching Cloud AdoptionAmazon Web Services
 
AWS Partner Data Analytics on AWS_Handout.pdf
AWS Partner Data Analytics on AWS_Handout.pdfAWS Partner Data Analytics on AWS_Handout.pdf
AWS Partner Data Analytics on AWS_Handout.pdfSrinjoySaha12
 
DevConZM - Modern Applications Development in the Cloud
DevConZM - Modern Applications Development in the CloudDevConZM - Modern Applications Development in the Cloud
DevConZM - Modern Applications Development in the CloudCobus Bernard
 
Get More from your Data: Accelerate Time-to-Value and Reduce TCO with Conflue...
Get More from your Data: Accelerate Time-to-Value and Reduce TCO with Conflue...Get More from your Data: Accelerate Time-to-Value and Reduce TCO with Conflue...
Get More from your Data: Accelerate Time-to-Value and Reduce TCO with Conflue...HostedbyConfluent
 
AI Services for Developers | AWS Floor28
AI Services for Developers | AWS Floor28AI Services for Developers | AWS Floor28
AI Services for Developers | AWS Floor28Amazon Web Services
 
AI Services for Developers - Floor28
AI Services for Developers - Floor28AI Services for Developers - Floor28
AI Services for Developers - Floor28Boaz Ziniman
 
The Future of AI on AWS
The Future of AI on AWSThe Future of AI on AWS
The Future of AI on AWSBoaz Ziniman
 

Similar to Unraveling Multimodality with Large Language Models.pdf (20)

Unlock the Full Potential of Your Media Assets, ft. Fox Entertainment Group (...
Unlock the Full Potential of Your Media Assets, ft. Fox Entertainment Group (...Unlock the Full Potential of Your Media Assets, ft. Fox Entertainment Group (...
Unlock the Full Potential of Your Media Assets, ft. Fox Entertainment Group (...
 
Ripping off the Bandage: Re-Architecting Traditional Three-Tier Monoliths to ...
Ripping off the Bandage: Re-Architecting Traditional Three-Tier Monoliths to ...Ripping off the Bandage: Re-Architecting Traditional Three-Tier Monoliths to ...
Ripping off the Bandage: Re-Architecting Traditional Three-Tier Monoliths to ...
 
2018 re:Invent - Safeguard the Integrity of Your Code for Fast and Secure Dep...
2018 re:Invent - Safeguard the Integrity of Your Code for Fast and Secure Dep...2018 re:Invent - Safeguard the Integrity of Your Code for Fast and Secure Dep...
2018 re:Invent - Safeguard the Integrity of Your Code for Fast and Secure Dep...
 
Aws cloud computing conference
Aws cloud computing conferenceAws cloud computing conference
Aws cloud computing conference
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
현대백화점 리테일테크랩과 AWS Prototyping 팀 개발자가 들려주는 인공 지능 무인 스토어 개발 여정 - 최권열 AWS 프로토타이핑...
현대백화점 리테일테크랩과 AWS Prototyping 팀 개발자가 들려주는 인공 지능 무인 스토어 개발 여정 - 최권열 AWS 프로토타이핑...현대백화점 리테일테크랩과 AWS Prototyping 팀 개발자가 들려주는 인공 지능 무인 스토어 개발 여정 - 최권열 AWS 프로토타이핑...
현대백화점 리테일테크랩과 AWS Prototyping 팀 개발자가 들려주는 인공 지능 무인 스토어 개발 여정 - 최권열 AWS 프로토타이핑...
 
Nuvem Híbrida - EBC on the road Brazil Edition [Portuguese]
Nuvem Híbrida - EBC on the road Brazil Edition [Portuguese]Nuvem Híbrida - EBC on the road Brazil Edition [Portuguese]
Nuvem Híbrida - EBC on the road Brazil Edition [Portuguese]
 
20201013 - Serverless Architecture Conference - How to migrate your existing ...
20201013 - Serverless Architecture Conference - How to migrate your existing ...20201013 - Serverless Architecture Conference - How to migrate your existing ...
20201013 - Serverless Architecture Conference - How to migrate your existing ...
 
Monolithic to Heterogenous - AWS FS Cloud Symposium Apr 2019.pdf
Monolithic to Heterogenous - AWS FS Cloud Symposium Apr 2019.pdfMonolithic to Heterogenous - AWS FS Cloud Symposium Apr 2019.pdf
Monolithic to Heterogenous - AWS FS Cloud Symposium Apr 2019.pdf
 
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
Empowering Every Brain! How Brain Power is using AWS-Powered AI in their Miss...
 
Transforming Enterprise IT - AWS Transformation Day Boston 2018
Transforming Enterprise IT - AWS Transformation Day Boston 2018Transforming Enterprise IT - AWS Transformation Day Boston 2018
Transforming Enterprise IT - AWS Transformation Day Boston 2018
 
AWS Systems Manager: Bridging Operational Models - SRV212 - Chicago AWS Summit
AWS Systems Manager: Bridging Operational Models - SRV212 - Chicago AWS SummitAWS Systems Manager: Bridging Operational Models - SRV212 - Chicago AWS Summit
AWS Systems Manager: Bridging Operational Models - SRV212 - Chicago AWS Summit
 
How to Enhance Your Application Security Strategy with F5 on AWS
 How to Enhance Your Application Security Strategy with F5 on AWS How to Enhance Your Application Security Strategy with F5 on AWS
How to Enhance Your Application Security Strategy with F5 on AWS
 
How Different Large Organizations are Approaching Cloud Adoption
How Different Large Organizations are Approaching Cloud AdoptionHow Different Large Organizations are Approaching Cloud Adoption
How Different Large Organizations are Approaching Cloud Adoption
 
AWS Partner Data Analytics on AWS_Handout.pdf
AWS Partner Data Analytics on AWS_Handout.pdfAWS Partner Data Analytics on AWS_Handout.pdf
AWS Partner Data Analytics on AWS_Handout.pdf
 
DevConZM - Modern Applications Development in the Cloud
DevConZM - Modern Applications Development in the CloudDevConZM - Modern Applications Development in the Cloud
DevConZM - Modern Applications Development in the Cloud
 
Get More from your Data: Accelerate Time-to-Value and Reduce TCO with Conflue...
Get More from your Data: Accelerate Time-to-Value and Reduce TCO with Conflue...Get More from your Data: Accelerate Time-to-Value and Reduce TCO with Conflue...
Get More from your Data: Accelerate Time-to-Value and Reduce TCO with Conflue...
 
AI Services for Developers | AWS Floor28
AI Services for Developers | AWS Floor28AI Services for Developers | AWS Floor28
AI Services for Developers | AWS Floor28
 
AI Services for Developers - Floor28
AI Services for Developers - Floor28AI Services for Developers - Floor28
AI Services for Developers - Floor28
 
The Future of AI on AWS
The Future of AI on AWSThe Future of AI on AWS
The Future of AI on AWS
 

More from Alex Barbosa Coqueiro

Generative Artificial Intelligence for Macro-Fiscal Risks.pdf
Generative Artificial Intelligencefor Macro-Fiscal Risks.pdfGenerative Artificial Intelligencefor Macro-Fiscal Risks.pdf
Generative Artificial Intelligence for Macro-Fiscal Risks.pdfAlex Barbosa Coqueiro
 
Unlocking the Power of Quantum Computing dist.pdf
Unlocking the Power of Quantum Computing dist.pdfUnlocking the Power of Quantum Computing dist.pdf
Unlocking the Power of Quantum Computing dist.pdfAlex Barbosa Coqueiro
 
Building Robotics Application at Scale using OpenSource from Zero to Hero
Building Robotics Application at Scale using OpenSource from Zero to HeroBuilding Robotics Application at Scale using OpenSource from Zero to Hero
Building Robotics Application at Scale using OpenSource from Zero to HeroAlex Barbosa Coqueiro
 
Building Your Robot using AWS Robomaker
Building Your Robot using AWS RobomakerBuilding Your Robot using AWS Robomaker
Building Your Robot using AWS RobomakerAlex Barbosa Coqueiro
 
Desafios da transição de estado em um mundo serverless
Desafios da transição de estado em um mundo serverlessDesafios da transição de estado em um mundo serverless
Desafios da transição de estado em um mundo serverlessAlex Barbosa Coqueiro
 
Reinforcement Learning with Sagemaker, DeepRacer and Robomaker
Reinforcement Learning with Sagemaker, DeepRacer and RobomakerReinforcement Learning with Sagemaker, DeepRacer and Robomaker
Reinforcement Learning with Sagemaker, DeepRacer and RobomakerAlex Barbosa Coqueiro
 
A maturidade dos sistemas tecnológicos e a migração para a nuvem. Como lidar?
A maturidade dos sistemas tecnológicos e a migração para a nuvem. Como lidar?A maturidade dos sistemas tecnológicos e a migração para a nuvem. Como lidar?
A maturidade dos sistemas tecnológicos e a migração para a nuvem. Como lidar?Alex Barbosa Coqueiro
 
Deploying Bigdata from Zero to Million of records in Amazon Web Services
Deploying Bigdata from Zero to Million of records in Amazon Web ServicesDeploying Bigdata from Zero to Million of records in Amazon Web Services
Deploying Bigdata from Zero to Million of records in Amazon Web ServicesAlex Barbosa Coqueiro
 
Migração do seu website para a AWS
Migração do seu website para a AWSMigração do seu website para a AWS
Migração do seu website para a AWSAlex Barbosa Coqueiro
 
Seminario de Cloud Computing na UFRRJ
Seminario de Cloud Computing na UFRRJSeminario de Cloud Computing na UFRRJ
Seminario de Cloud Computing na UFRRJAlex Barbosa Coqueiro
 
IBM Mobile Platform: Desenvolvimento de Aplicações Mobile
IBM Mobile Platform: Desenvolvimento de Aplicações MobileIBM Mobile Platform: Desenvolvimento de Aplicações Mobile
IBM Mobile Platform: Desenvolvimento de Aplicações MobileAlex Barbosa Coqueiro
 
Webcast WebSphere Portal Performance
Webcast WebSphere Portal PerformanceWebcast WebSphere Portal Performance
Webcast WebSphere Portal PerformanceAlex Barbosa Coqueiro
 

More from Alex Barbosa Coqueiro (15)

Generative Artificial Intelligence for Macro-Fiscal Risks.pdf
Generative Artificial Intelligencefor Macro-Fiscal Risks.pdfGenerative Artificial Intelligencefor Macro-Fiscal Risks.pdf
Generative Artificial Intelligence for Macro-Fiscal Risks.pdf
 
Unlocking the Power of Quantum Computing dist.pdf
Unlocking the Power of Quantum Computing dist.pdfUnlocking the Power of Quantum Computing dist.pdf
Unlocking the Power of Quantum Computing dist.pdf
 
Building Robotics Application at Scale using OpenSource from Zero to Hero
Building Robotics Application at Scale using OpenSource from Zero to HeroBuilding Robotics Application at Scale using OpenSource from Zero to Hero
Building Robotics Application at Scale using OpenSource from Zero to Hero
 
Building Your Robot using AWS Robomaker
Building Your Robot using AWS RobomakerBuilding Your Robot using AWS Robomaker
Building Your Robot using AWS Robomaker
 
Desafios da transição de estado em um mundo serverless
Desafios da transição de estado em um mundo serverlessDesafios da transição de estado em um mundo serverless
Desafios da transição de estado em um mundo serverless
 
Reinforcement Learning with Sagemaker, DeepRacer and Robomaker
Reinforcement Learning with Sagemaker, DeepRacer and RobomakerReinforcement Learning with Sagemaker, DeepRacer and Robomaker
Reinforcement Learning with Sagemaker, DeepRacer and Robomaker
 
Webinar de Dados Abertos na AWS
Webinar de Dados Abertos na AWSWebinar de Dados Abertos na AWS
Webinar de Dados Abertos na AWS
 
A maturidade dos sistemas tecnológicos e a migração para a nuvem. Como lidar?
A maturidade dos sistemas tecnológicos e a migração para a nuvem. Como lidar?A maturidade dos sistemas tecnológicos e a migração para a nuvem. Como lidar?
A maturidade dos sistemas tecnológicos e a migração para a nuvem. Como lidar?
 
Deploying Bigdata from Zero to Million of records in Amazon Web Services
Deploying Bigdata from Zero to Million of records in Amazon Web ServicesDeploying Bigdata from Zero to Million of records in Amazon Web Services
Deploying Bigdata from Zero to Million of records in Amazon Web Services
 
HPC in AWS - Technical Workshop
HPC in AWS - Technical WorkshopHPC in AWS - Technical Workshop
HPC in AWS - Technical Workshop
 
Migração do seu website para a AWS
Migração do seu website para a AWSMigração do seu website para a AWS
Migração do seu website para a AWS
 
Seminario de Cloud Computing na UFRRJ
Seminario de Cloud Computing na UFRRJSeminario de Cloud Computing na UFRRJ
Seminario de Cloud Computing na UFRRJ
 
IBM Mobile Platform: Desenvolvimento de Aplicações Mobile
IBM Mobile Platform: Desenvolvimento de Aplicações MobileIBM Mobile Platform: Desenvolvimento de Aplicações Mobile
IBM Mobile Platform: Desenvolvimento de Aplicações Mobile
 
Just java 2011
Just java   2011Just java   2011
Just java 2011
 
Webcast WebSphere Portal Performance
Webcast WebSphere Portal PerformanceWebcast WebSphere Portal Performance
Webcast WebSphere Portal Performance
 

Recently uploaded

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Unraveling Multimodality with Large Language Models.pdf

  • 1. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Unraveling Multimodality with LLMs Alex Coqueiro O P E N A I + D A T A F O R U M Director of Solutions Architecture for Canada Public Sector AWS
  • 2. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 3. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 4. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Multimodality refers to a concept that utilizes multiple methods of communication or representation
  • 5. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Multimodal Intelligence as a service Question Answer Medical Staff
  • 6. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Foundation Models (FM) as the heart of LLM Text generation Summarization Information extraction Q&A Chatbot Pretrain Adapt Tasks Unlabeled data FM Text generation Summarization Information extraction Q&A Chatbot Train Deploy Tasks ML models … … … … Labeled data … … … …
  • 7. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 8. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Precision Speed Cost Model Evaluation Score HIL/LLM Feedback FM1 5/5 <Feedback summary> FM2 4/5 <Feedback summary> FM3 3/5 <Feedback summary> Model Cost FM1 $$$$ FM2 $ FM3 $$$ Model Speed FM1 ⚡⚡ FM2 ⚡ FM3 ⚡ Finding the best FM based on your use case
  • 9. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Our Application (v1) Foundational Model Question + Context Answer Medical Staff Could you suggest me ways to prevent allergy?
  • 10. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Meta Llama Source: https://ai.meta.com/resources/models-and-libraries/llama/
  • 11. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 12. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 13. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. … New Requirements …
  • 14. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Adding Multimodal Capabilities to FMs text2text text2image image2text text2audio text2video Pretrain Adapt Tasks Capability Unlabeled data FM
  • 15. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Multimodality Business Use Case Same Product Image Title Title
  • 16. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Advertising tailors ads with deep understanding of user preferences with multimodal query representation Search Engine Based on multimodal query understanding Recommendation System It recommends based on diverse data sources effectively Robotics Enhances robotics with natural language understanding and decision-making Assistant (E.g. Chat) Enhance assistant capability adding visual analysis Query Suggestion Guidance on the best way to explore the image content based on Multimodal query suggestion Multimodal Application Examples
  • 17. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Visual Instruction Tuning (Llava) • Employ LLM's reasoning capability for vision based tasks • Addressing VQA (Visual Question Answering) • Instruction-tuning for images with pre-trained image captions
  • 18. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Hugging Face – llava-v1.6-mistral-7b
  • 19. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Our Application (v2) – Multimodal Retrieval C O L L E C T D A T A F R O M M U L T I M O D A L D O C U M E N T S Llava Mistral 7B Question + Context Answer Medical Staff Could you summarize the main points of these data?
  • 20. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Demo T E X T E X T R A C T
  • 21. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Demo M U L T I M O D A L U N D E R S T A N D I N G
  • 22. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Demo R E A S O N I N G
  • 23. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. © 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved. … Increasing Complexity …
  • 24. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Latent Diffusion Models 24 https://www.amazon.science/blog/virtual-try-all- visualizing-any-product-in-any-personal-setting https://arxiv.org/abs/2401.13795 • Enriching Image Conditioned Inpainting in Latent Diffusion Models • Multimodal retrieval tasks x Multimodal generation (different problem) • E.g. Virtual try-all: Visualizing any product in any personal setting
  • 25. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Brief Overview of Diffusion Models - “destroy” the data by gradually adding small amounts of gaussian noise - “create” data by gradually denoising a noisy code from a stationary distribution Animations from https://yang-song.github.io/blog/2021/score/
  • 26. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Stable Diffusion 2.0 with Fine-tuning
  • 27. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Depth to Image Model (Stable Diffusion 2.0)
  • 28. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. UIs / Plug-Ins for Photoshop, GIMP etc 28 https://twitter.com/wbuchw/status/1563162131024920576 https://github.com/lkwq007/stablediffusion-infinity
  • 29. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Our Application (v3) – Multimodal Generation G E N E R A T E I M A G E W I T H S T A B L E D I F F U S I O N Question + Context Answer Medical Staff Create an image about the patient's journey from admission to discharge for my clinical report Stable Difusion – SDXL (or SD 2.1)
  • 30. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 31. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. … Balancing Multiple Tasks …
  • 32. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. LLM Multimodel
  • 33. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. LangChain A framework to simplify applications using an LLM Provides a common way of accessing APIs of different LLMs Helps with learning how to use LLMs but may be too restrictive for some use cases
  • 34. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 35. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Agents can take Actions Instructions: “you are an health assistant, helping nurses understand about patients” Health Assistant What’s the Joan’s insurance? Insurance ACME Vector DB Patient Medical records Actions Booking Procedure In: name, procedure Out: protocol number Please book Patients’s anesthesia Approved and protocol is 12345
  • 36. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Agent Orchestration Task Langchain Final response
  • 37. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Unimodel Multimodal Multimodel Closings
  • 38. © 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved. Thank you! Alex Coqueiro Director of Solutions Architecture for Canada Public Sector AWS