VISVESVARAYA TECHNOLOGICAL UNIVERSITY
"Jnana Sangama", Belgaum: 590 018
H.K.E Society’s
SIR M VISVESVARAYA COLLEGE OF ENGINEERING
(Affiliated to VTU - Belagavi, Approved by AICTE, Accredited by NAAC)
Yeramarus Camp, Raichur-584135, Karnataka
2023-2024
TECHNICAL SEMINAR PRESENTATION
ON
“MULTIMODAL AI”
UNDER THE GUIDANCE
OF
DR. SHARAN KUMAR
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
MULTIMODAL AI
POWERING THE NEXT CHAPTER IN GENERATIVE AI
PRESENTED
BY
B CHANDANA
3SL20EC003
CONTENTS
• Introduction
• Literature survey
• Block diagram
• Applications
• Future scope
• Benefits and challenges
• Conclusion
• References
Introduction
• Multimodal AI is a form of
artificial intelligence that can
analyze and interpret several
modes of data (such as text,
images, and audio) at the same
time, which allows it to generate
more accurate and more
human-like responses.
Literature survey
• The release of ChatGPT in November 2022, a conversation-focused
model that follows human instructions, further underscored the
feasibility of AGI in practical applications (Liu et al., 2023a). This
development has had a wide-ranging impact across various sectors,
including journalism (Liu et al., 2023c), education (Zhai, 2023; Liu
et al., 2023b), healthcare (Li et al., 2023; Liu et al., [n. d.]; Holmes
et al., 2023), industry (Dou et al., 2023), agriculture (Rezayi
et al., 2023), law (Bubeck et al., 2023), gaming (Bubeck et al., 2023),
and finance (Wu et al., 2023c), catalyzing a wave of popular interest in
AI (Liu et al., 2023a, g, h).
• Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran
Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg,
Antoine Bosselut, Emma Brunskill, et al. 2021. On the opportunities
and risks of foundation models. arXiv preprint
arXiv:2108.07258 (2021).
Sensory Inputs
Sensory inputs are the forms of data, corresponding to senses such as
vision, hearing, touch, and smell, that a multimodal AI system collects
and processes.
Data Fusion
Data fusion combines information from multiple modalities, such as text,
images, and video, so that the AI system is more accurate and more robust
than it would be with any single modality alone.
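As a rough illustration of what "combining information" can mean in practice, the sketch below contrasts two common fusion styles: early fusion (joining feature vectors before a model sees them) and late fusion (combining per-modality scores afterwards). The function names, feature values, and weights are hypothetical and chosen only for this example; they are not taken from the presentation.

```python
from typing import Dict, List


def early_fusion(features: Dict[str, List[float]]) -> List[float]:
    """Concatenate per-modality feature vectors in a fixed (sorted) order."""
    fused: List[float] = []
    for modality in sorted(features):
        fused.extend(features[modality])
    return fused


def late_fusion(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-modality scores.

    Weights are renormalised over the modalities that are actually present,
    so a missing modality does not drag the result towards zero.
    """
    total = sum(weights[m] for m in scores)
    if total == 0:
        raise ValueError("weights of the available modalities sum to zero")
    return sum(scores[m] * weights[m] for m in scores) / total


# Hypothetical feature vectors and scores, for illustration only.
features = {"text": [0.2, 0.7], "image": [0.9, 0.1, 0.4]}
fused_vector = early_fusion(features)

scores = {"text": 0.8, "image": 0.6}
weights = {"text": 0.5, "image": 0.5, "audio": 0.0}
fused_score = late_fusion(scores, weights)
```

Early fusion lets a single model learn interactions between modalities, while late fusion keeps each modality's model independent and tolerates a missing input more easily.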
Machine Learning Algorithms
Machine learning algorithms are central to multimodal AI: they learn to
analyze and interpret data from multiple sources, such as text, images,
and audio.
Natural Language Processing
Natural Language Processing is a crucial component of multimodal AI. It
allows the system to analyze and understand human language in
combination with other modalities such as images or videos.
Computer Vision
Computer Vision is a key component of multimodal AI. It processes visual
data so that this data can be integrated with other modes of information
to improve overall system performance.
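One way language and vision components are often brought together is by mapping text and images into a shared embedding space and comparing them with cosine similarity. The sketch below assumes such embeddings already exist; the three-dimensional vectors are hand-written placeholders rather than the output of any real model, and the helper names are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_caption(image_embedding: Sequence[float],
                 caption_embeddings: Dict[str, Sequence[float]]) -> str:
    """Return the caption whose embedding is closest to the image embedding."""
    return max(
        caption_embeddings,
        key=lambda caption: cosine_similarity(
            image_embedding, caption_embeddings[caption]
        ),
    )


# Placeholder embeddings; a real system would obtain these from trained encoders.
image_embedding = [0.9, 0.1, 0.0]
caption_embeddings = {
    "a photo of a dog": [1.0, 0.0, 0.0],
    "a photo of a car": [0.0, 1.0, 0.0],
}
match = best_caption(image_embedding, caption_embeddings)
```

The same comparison works in the other direction, for example retrieving the image that best matches a text query.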
Applications
• Social media content moderation: Multimodal AI can be used to analyze text, images, and audio to
identify and moderate harmful content on social media platforms. For instance, it can detect hate
speech, violence, and bullying.
• Virtual assistants: Smart assistants like Google Assistant and Amazon Alexa are powered by
multimodal AI. They can understand and respond to natural language commands, both spoken and
typed.
• Healthcare imaging: In healthcare, multimodal AI can analyze medical images (X-rays, MRIs) along
with text reports and patient history data to improve diagnostics. This can lead to more accurate
diagnoses and better patient outcomes.
• Autonomous vehicles: Self-driving cars rely heavily on multimodal AI. They use a variety of sensors,
including cameras, radar, and LiDAR, to perceive their surroundings and navigate safely.
• E-commerce product recommendations: Many e-commerce websites use multimodal AI to
personalize product recommendations for customers. By considering both the product image and
description, the AI can recommend items that are more likely to interest the customer.
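To make the autonomous-vehicle example above more concrete, here is a minimal sketch of one classical way to merge distance estimates from several sensors: an inverse-variance weighted mean, in which more precise sensors receive more weight. The readings and variances are invented for illustration, and this is not a claim about how any production self-driving system works.

```python
from typing import Dict, Tuple


def fuse_estimates(readings: Dict[str, Tuple[float, float]]) -> Tuple[float, float]:
    """Inverse-variance weighted mean of independent sensor estimates.

    `readings` maps a sensor name to (estimate, variance).
    Returns (fused_estimate, fused_variance).
    """
    if not readings:
        raise ValueError("at least one reading is required")
    inverse_variance_sum = 0.0
    weighted_sum = 0.0
    for estimate, variance in readings.values():
        if variance <= 0:
            raise ValueError("variance must be positive")
        inverse_variance_sum += 1.0 / variance
        weighted_sum += estimate / variance
    return weighted_sum / inverse_variance_sum, 1.0 / inverse_variance_sum


# Hypothetical distance-to-obstacle readings in metres: (estimate, variance).
readings = {
    "camera": (10.4, 1.0),
    "radar": (10.0, 0.25),
    "lidar": (10.1, 0.04),
}
fused_distance, fused_variance = fuse_estimates(readings)
```

Under the assumption that the sensor errors are independent, the fused variance is smaller than that of the best single sensor, which is one reason combining sensors can improve robustness.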
Conclusion
• The future of AI is not just about seeing or hearing; it
is about understanding. Multimodal AI holds the
key to a new level of human-computer
interaction, with applications that can bridge
communication gaps, enhance our understanding of
the world, and help us solve complex
challenges in new ways. The potential for
positive impact spans many fields.
References
• Rania Abdelghani, Yen-Hsiang Wang, Xingdi Yuan, Tong Wang,
Pauline Lucas, Hélène Sauzéon, and Pierre-Yves Oudeyer.
2023. GPT-3-driven pedagogical agents for training children’s
curious question-asking skills. International Journal of Artificial
Intelligence in Education 167, 3 (2023), 102887.
• Hang Bao, Wen Wang, Li Dong, Qianru Liu, Ola K. Mohammed,
Kirti Aggarwal, and Fang Wei. 2022. VLMo: Unified vision-language
pre-training with mixture-of-modality-experts. In Advances in
Neural Information Processing Systems (NeurIPS), Vol. 35.
32897–32912.
