TOPIC 1
INTRODUCTION TO BIG DATA
COMP6725 - Big Data Technologies
LEARNING OUTCOMES
At the end of this session, students will be able to:
o LO1: Describe big data architecture layers and processing concepts
OUTCOMES
Students are able to describe big data architecture layers and processing concepts
OUTLINE
1. Business Motivations and Drivers for Big Data Adoption
2. Intro to Big Data
3. Big Data Characteristics
4. Big Data Technology
5. Big Data Life Cycle
6. Challenges Faced by Big Data Technology
7. Big Data Examples
BUSINESS MOTIVATIONS AND DRIVERS FOR BIG DATA ADOPTION
CURRENT SITUATIONS
Pic 1.1. The state of information in the world.
Source : https://www.datasciencecentral.com/ (2013)
INFORMATION FROM INTERNET OF THINGS
Pic 1.2. Information from Internet of Things.
Source : https://www.datasciencecentral.com/ (2013)
STORAGE GROWTH AND DIGITIZATION
Pic 1.3. Storage Growth.
Source : https://www.datasciencecentral.com/ (2013)
INTRO TO BIG DATA
EVOLUTION OF BIG DATA
Pic 1.4. Evolution of Big Data.
Source : Big Data Concepts, Technology, and Architecture. 2021
FAILURE OF TRADITIONAL DATABASES IN HANDLING BIG DATA
The limitations of traditional databases in handling big data:
o The exponential increase in data volume, which now scales to terabytes and petabytes, has become a challenge for the RDBMS.
o To address this issue, RDBMS deployments added more processors and memory units, which in turn increased the cost.
o Almost 80% of the data fetched was in semi-structured or unstructured format, which the RDBMS could not deal with.
o The RDBMS could not capture data coming in at high velocity.
RDBMS VS. BIG DATA
ATTRIBUTES     | RDBMS                  | BIG DATA
Data volume    | Gigabytes to terabytes | Petabytes to zettabytes
Organization   | Centralized            | Distributed
Data type      | Structured             | Unstructured and semi-structured
Hardware type  | High-end model         | Commodity hardware
Updates        | Read/write many times  | Write once, read many times
Schema         | Static                 | Dynamic
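The "Schema" row can be illustrated with a small sketch. It assumes a handful of hypothetical JSON records (the field names are invented for illustration) and contrasts a fixed set of columns with a schema that is applied only when the data is read:

```python
import json

# Hypothetical semi-structured records; field names are illustrative only.
RAW_RECORDS = [
    '{"user": "a01", "clicks": 3}',
    '{"user": "b02", "clicks": 5, "device": "mobile"}',
    '{"user": "c03", "location": {"city": "Jakarta"}}',
]

FIXED_COLUMNS = ("user", "clicks")  # a static, RDBMS-style schema


def fits_static_schema(record: dict) -> bool:
    """True only when the record has exactly the fixed columns."""
    return set(record) == set(FIXED_COLUMNS)


def read_with_dynamic_schema(record: dict, field: str, default=None):
    """Schema-on-read: look the field up at query time and tolerate absence."""
    return record.get(field, default)


records = [json.loads(line) for line in RAW_RECORDS]
static_ok = [r["user"] for r in records if fits_static_schema(r)]
clicks = [read_with_dynamic_schema(r, "clicks", 0) for r in records]
```

Only the first record fits the fixed columns, while the dynamic read still returns a click count (or a default) for every record.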
DATA MINING VS. BIG DATA
No | DATA MINING | BIG DATA
1 | Data mining is the process of discovering the underlying knowledge from the data sets. | Big data refers to a massive volume of data characterized by volume, velocity, and variety.
2 | Structured data retrieved from spreadsheets, relational databases, etc. | Structured, unstructured, or semi-structured data retrieved from non-relational databases, such as NoSQL.
3 | Data mining is capable of processing large data sets, but the data processing costs are high. | Big data tools and technologies are capable of storing and processing large volumes of data at a comparatively lower cost.
4 | Data mining can process only data sets that range from gigabytes to terabytes. | Big data technology is capable of storing and processing data that range from petabytes to zettabytes.
WHAT IS BIG DATA?
o Big data is defined as collections of datasets whose volume, velocity or variety is so large
that it is difficult to store, manage, process and analyze the data using traditional databases
and data processing tools.
o Big data analytics deals with the collection, storage, processing, and analysis of this massive-scale data.
o Specialized tools and frameworks are required for big data analysis when:
1. The volume of data involved is so large that it is difficult to store, process and analyze
data on a single machine,
2. The velocity of data is very high and the data needs to be analyzed in real-time,
3. There is a variety of data involved, which can be structured, unstructured, or semi-structured, and is collected from multiple data sources,
4. Various types of analytics need to be performed to extract value from the data such as
descriptive, diagnostic, predictive and prescriptive analytics.
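A rough back-of-envelope calculation shows why volume alone pushes work off a single machine. The read throughput and the cluster size below are assumptions chosen for illustration, not measurements:

```python
# Back-of-envelope estimate; the throughput and node count are assumptions.
PETABYTE = 10**15               # bytes (decimal definition)
DISK_THROUGHPUT = 100 * 10**6   # assumed 100 MB/s sequential read per machine


def scan_hours(data_bytes: int, machines: int) -> float:
    """Hours to read the data once if it is split evenly across machines."""
    seconds = data_bytes / (DISK_THROUGHPUT * machines)
    return seconds / 3600


single = scan_hours(PETABYTE, machines=1)
cluster = scan_hours(PETABYTE, machines=1000)
```

Under these assumptions a single machine needs about 2,778 hours (roughly 116 days) just to read one petabyte once, while 1,000 machines reading in parallel need under 3 hours.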
WHAT IS BIG DATA? (CONT)
o Big data analytics involves several steps, including data cleansing, data munging (or wrangling), data processing, and visualization.
o The big data analytics life cycle starts with the collection of data from multiple data sources.
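The steps named above can be sketched as a tiny pipeline. This is a minimal single-machine illustration on made-up sensor readings; the function names and the data are hypothetical, and a real system would run each stage on a distributed framework:

```python
from statistics import mean

# Hypothetical raw sensor readings; the values and field names are made up.
raw = [
    {"sensor": "s1", "temp_c": "21.5"},
    {"sensor": "s1", "temp_c": "n/a"},   # faulty reading
    {"sensor": "s2", "temp_c": "19.0"},
    {"sensor": "s2", "temp_c": "20.0"},
]


def cleanse(rows):
    """Drop rows whose temperature cannot be parsed as a number."""
    clean = []
    for row in rows:
        try:
            float(row["temp_c"])
        except ValueError:
            continue
        clean.append(row)
    return clean


def wrangle(rows):
    """Convert fields to the types the analysis step expects."""
    return [{"sensor": r["sensor"], "temp_c": float(r["temp_c"])} for r in rows]


def process(rows):
    """Aggregate: mean temperature per sensor."""
    by_sensor = {}
    for r in rows:
        by_sensor.setdefault(r["sensor"], []).append(r["temp_c"])
    return {s: mean(v) for s, v in by_sensor.items()}


def visualize(summary):
    """Stand-in for a chart: one text bar per sensor."""
    return [f"{s}: {'#' * round(t)} {t:.1f}" for s, t in sorted(summary.items())]


summary = process(wrangle(cleanse(raw)))
```

Calling `visualize(summary)` returns one text line per sensor, standing in for the chart a real visualization step would produce.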
BIG DATA CHARACTERISTICS
CHARACTERISTICS OF BIG DATA
Volume
• The data is too large to fit on a single machine.
• Specialized tools and frameworks are required to store, process, and analyze it.
Velocity
• How fast the data is generated.
• Specialized tools are required to ingest such high-velocity data into the big data infrastructure and analyze it in real time.
Variety
• The forms the data takes.
• Consists of structured, unstructured, or semi-structured data, including text, image, audio, video, and sensor data.
Veracity
• How accurate the data is.
• Cleansing the data is important so that incorrect and faulty data can be filtered out.
Value
• The usefulness of the data for its intended purpose.
• The end goal of any big data analytics system is to extract value from the data.
• For some applications, value also depends on how fast the data can be processed.
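To make the velocity characteristic concrete, here is a minimal sketch of analyzing events as they arrive instead of storing them first. The class name and the window size are hypothetical, and the sketch assumes timestamps arrive in increasing order:

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp: float) -> int:
        """Record one event and return the count within the window."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Evict events that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


counter = SlidingWindowCounter(window_seconds=10)
counts = [counter.add(t) for t in (0, 2, 5, 11, 12, 30)]
```

Each call does a small, bounded amount of work per event, which is the property that stream-ingestion tools rely on to keep up with fast data.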
10V’S OF BIG DATA
Pic 1.5. 10V’s of Big Data
Source : http://www.datasciencecentral.com/
BIG DATA TECHNOLOGY
o The core components of big data technologies are the tools and technologies that provide
the capacity to store, process, and analyze the data.
o The key technologies include
• Hadoop
• HDFS
• MapReduce
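The MapReduce programming model can be sketched in a few lines. The following is a single-process illustration of the map, shuffle, and reduce phases using word count; it is not Hadoop's API, and the function names are chosen for this example only:

```python
from collections import defaultdict


def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of input splits."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


result = word_count(["big data is big", "data is everywhere"])
```

In a real cluster the map and reduce functions run in parallel on many machines, with the input splits stored in a distributed file system such as HDFS; here everything runs in one process to keep the idea visible.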
BIG DATA LIFE CYCLE
Pic 1.6. Big Data Life Cycle
Source : Big Data Concepts, Technology, and Architecture. 2021
CHALLENGES FACED BY BIG DATA TECHNOLOGY
o Dealing with the data poses many challenges: some data is structured and can be stored in traditional databases, while other data, such as videos, pictures, and documents, may be unstructured or semi-structured and is generated by sensors, social media, satellites, business transactions, and much more.
o The real challenge is making sense of the data by integrating disparate data from diverse sources:
• Heterogeneity and incompleteness
• Volume and velocity of the data
• Data storage
• Data privacy
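Heterogeneity and incompleteness can be illustrated with a small integration sketch. The two sources, their field names, and the records below are hypothetical; the point is only that each source must be mapped to a common shape and that missing fields have to be represented explicitly:

```python
# Hypothetical records from two sources with different field names.
crm_rows = [{"CustomerID": "17", "Email": "ana@example.com"}]
web_rows = [{"uid": 17, "last_page": "/pricing"}, {"uid": 42, "last_page": "/"}]


def normalize_crm(row):
    """Map a CRM record to the common field names."""
    return {"customer_id": int(row["CustomerID"]), "email": row.get("Email")}


def normalize_web(row):
    """Map a web-log record to the common field names."""
    return {"customer_id": int(row["uid"]), "last_page": row.get("last_page")}


def integrate(*sources):
    """Merge normalized records by customer_id; missing fields stay None."""
    merged = {}
    for rows in sources:
        for row in rows:
            record = merged.setdefault(
                row["customer_id"],
                {"customer_id": row["customer_id"], "email": None, "last_page": None},
            )
            record.update({k: v for k, v in row.items() if v is not None})
    return merged


customers = integrate(
    [normalize_crm(r) for r in crm_rows],
    [normalize_web(r) for r in web_rows],
)
```

Customer 17 ends up with fields from both sources, while customer 42 keeps `email` as `None` because no source supplied it.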
BIG DATA EXAMPLES
o Web
• Web Analytics
• Performance Monitoring
• Ad Targeting & Analytics
• Content Recommendation
o Financial
• Credit Risk Modeling
• Fraud Detection
o Healthcare
• Epidemiological Surveillance
• Patient Similarity-based Decision Intelligence Application
• Adverse Drug Events Prediction
• Detecting Claim Anomalies
• Evidence-based Medicine
• Real-time health monitoring
o Internet of Things
• Intrusion Detection
• Smart Parkings
• Smart Roads
• Structural Health Monitoring
• Smart Irrigation
o Environment
• Weather Monitoring
• Air Pollution Monitoring
• Noise Pollution Monitoring
• Forest Fire Detection
• River Floods Detection
• Water Quality Monitoring
BIG DATA IN EDUCATION INDUSTRY
Pic 1.7. Big Data in Education Industry
Source : https://intellipaat.com/
BIG DATA IN HEALTHCARE
Pic 1.8. Big Data in Healthcare
Source : https://intellipaat.com/
BIG DATA IN GOVERNMENT SECTOR
Pic 1.9. Big Data in Government Sector
Source : https://intellipaat.com/
BIG DATA IN BANKING SECTOR
Pic 1.10. Big Data in Banking Sector
Source : https://intellipaat.com/
BIG DATA IN WEATHER PATTERNS
Pic 1.11. Big Data in Weather Patterns
Source : https://intellipaat.com/
THANK YOU
SUMMARY
o Big data is a term for data sets that are so large or complex that traditional data
processing application software is inadequate to deal with them.
o Challenges include capture, storage, analysis, search, sharing, transfer, visualization,
querying, updating and information privacy.
o The term "big data" often refers simply to the use of predictive analytics, user
behaviour analytics, or certain other advanced data analytics methods that extract
value from data, and seldom to a particular size of data set.
o The characteristics of big data are Volume, Velocity, Variety, Veracity, and Value.
REFERENCES
o Balusamy, B., Abirami, N. R., Kadry, S., & Gandomi, A. H. (2021). Big Data Concepts, Technology, and Architecture. 1st ed. Wiley. ISBN 978-1-119-70182-8. Chapter 1.
o Sawant, N., & Shah, H. (2013). Big Data Application Architecture Q&A: A Problem-Solution Approach. Apress, Springer Science. ISBN 978-1-4302-6292-3. Chapter 1.
o https://www.youtube.com/watch?v=aC2CmTTZTVU
o http://www.datasciencecentral.com/
o https://www.irjet.net/archives/V4/i9/IRJET-V4I957.pdf
o http://www.martinhilbert.net/WorldInfoCapacity.html
o http://www.cse.unsw.edu.au/~cs9313/
o https://intellipaat.com/


Editor's Notes

  1. Source: https://www.datasciencecentral.com/profiles/blogs/basic-understanding-of-big-data-what-is-this-and-how-it-is-going
  2. Source: http://www.martinhilbert.net/WorldInfoCapacity.html https://ipsrsolutions.com/library/uploads/2015/07/cloud-computing.png
  3. https://www.irjet.net/archives/V4/i9/IRJET-V4I957.pdf
  4. Source: https://intellipaat.com/blog/wp-content/uploads/2016/07/BigData-02.jpg
  5. Source: https://intellipaat.com/blog/wp-content/uploads/2016/07/BigData-03.jpg
  6. Source: https://intellipaat.com/blog/wp-content/uploads/2016/07/BigData-04.jpg
  7. Source: https://intellipaat.com/blog/wp-content/uploads/2016/07/Big-Data-in-Banking-Sector-1.png
  8. Source: https://intellipaat.com/blog/wp-content/uploads/2016/07/BigData.png