Introduction to Big Data Architecture
Big data architecture is the framework for processing, managing, and
analyzing large and complex data sets. It involves various tools,
techniques, and infrastructure to handle the volume, velocity, and variety
of data in an efficient and cost-effective manner.
Key Components of Big Data Architecture
Data Nodes
Data nodes refer to individual
servers or machines that
store and process data.
These nodes work together in
a cluster to manage and
analyse large datasets. Each
node typically has its own
local storage and
computational resources.
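
One way to picture how records end up on particular nodes is hash partitioning. The sketch below is illustrative only: the function names, the choice of SHA-256, and the three-node cluster are assumptions, not something the slides specify.

```python
import hashlib
from collections import defaultdict


def node_for_key(key: str, num_nodes: int) -> int:
    """Map a record key to a node index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition(records, num_nodes):
    """Group (key, value) records by the node that would store them."""
    nodes = defaultdict(list)
    for key, value in records:
        nodes[node_for_key(key, num_nodes)].append((key, value))
    return dict(nodes)


if __name__ == "__main__":
    # Hypothetical records spread over a hypothetical 3-node cluster.
    records = [(f"user-{i}", i * 10) for i in range(8)]
    for node_id, local_records in sorted(partition(records, 3).items()):
        print(node_id, local_records)
```

Each node then works only on its local list, which is the property that lets a cluster process the dataset in parallel.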
Data Streams
Data streams provide efficient
data transfer and real-time
processing, enabling the
capture of large-scale,
continuously generated data.
Data stream processing deals
with data as it is generated,
allowing for faster insights
and rapid response to
changing conditions.
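
Processing data as it is generated can be sketched with a tumbling window that emits a result as soon as each window closes. This is a minimal illustration, assuming events arrive as (timestamp, value) pairs in timestamp order; the function name and the window size are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def tumbling_window_avg(
    events: Iterable[Tuple[float, float]], window_seconds: float
) -> Iterator[Tuple[float, float]]:
    """Yield (window_start, average) each time a window closes."""
    window_start = None
    total = 0.0
    count = 0
    for timestamp, value in events:
        start = (timestamp // window_seconds) * window_seconds
        if window_start is None:
            window_start = start
        if start != window_start:
            # The previous window is complete, so emit it immediately.
            yield window_start, total / count
            window_start, total, count = start, 0.0, 0
        total += value
        count += 1
    if count:
        # Flush the last, still-open window when the stream ends.
        yield window_start, total / count
```

Because the function is a generator, a consumer sees each window's average without waiting for the whole stream to finish.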
Processing Frameworks
Frameworks that enable
distributed processing for
handling massive amounts of
data efficiently and effectively.
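
The slides do not name a specific framework. As one common pattern, a map, shuffle, and reduce word count can be sketched in plain Python; here each chunk stands in for the data held by one node, and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit (word, 1) for every word in one chunk of lines."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Group emitted values by key across all chunks."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the grouped values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    return reduce_phase(shuffle(map_phase(chunk) for chunk in chunks))
```

In a real framework the map calls would run on separate machines; this single-process version only shows the data flow.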
Data Ingestion and Collection
1 Data Sources
Diverse sources of data including
databases, IoT devices,
applications, sensors, and APIs.
2 Data Pipelines
Efficient and reliable data pipelines
to streamline the collection process
and ensure data quality and integrity.
3 Real-time Processing
Systems capable of real-time processing to handle high-velocity data streams and
make data available immediately.
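
The collection steps above can be sketched as a small pipeline that pulls records from several sources and applies a quality check before accepting them. The field names `device_id` and `reading` and the validation rules are hypothetical, chosen only to make the example concrete.

```python
def validate(record):
    """Return a cleaned record, or None if it fails the quality checks."""
    if not record.get("device_id"):
        return None
    try:
        reading = float(record["reading"])
    except (KeyError, TypeError, ValueError):
        return None
    return {"device_id": str(record["device_id"]), "reading": reading}


def ingest(sources):
    """Collect records from every source, separating valid from invalid."""
    accepted, rejected = [], []
    for source in sources:
        for record in source:
            cleaned = validate(record)
            if cleaned is None:
                rejected.append(record)
            else:
                accepted.append(cleaned)
    return accepted, rejected
```

Keeping the rejected records, rather than dropping them silently, makes it possible to audit data quality later.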
Data Storage and Management
Distributed Storage
Utilization of distributed storage
systems for cost-effective and scalable
storage of massive volumes of data.
Data Security
Implementation of robust security
measures to protect data from
unauthorized access and ensure
compliance with data protection
regulations.
Data Governance
Establishment of governance frameworks and policies for data classification, retention,
and access control.
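
Classification, retention, and access control can be expressed as a small policy table that code consults before serving or keeping a dataset. The sketch below is an assumption-laden illustration: the classification labels, role names, and retention periods are invented and do not come from the slides or from any regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical policy table: classification -> retention period and allowed roles.
POLICIES = {
    "public": {"retention_days": 3650, "roles": {"analyst", "engineer", "admin"}},
    "internal": {"retention_days": 1825, "roles": {"engineer", "admin"}},
    "restricted": {"retention_days": 365, "roles": {"admin"}},
}


@dataclass
class Dataset:
    name: str
    classification: str
    created: date


def can_read(role: str, dataset: Dataset) -> bool:
    """Access control: is this role allowed to read the dataset?"""
    return role in POLICIES[dataset.classification]["roles"]


def is_expired(dataset: Dataset, today: date) -> bool:
    """Retention: has the dataset outlived its retention period?"""
    days = POLICIES[dataset.classification]["retention_days"]
    return today > dataset.created + timedelta(days=days)
```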
Data Processing and Analysis
Data Exploration
Uncover patterns, trends, and insights within large volumes of data.
Data Transformation
Prepare and cleanse raw data for analysis and modeling purposes.
Modeling & Analytics
Application of statistical and machine learning models for predictive and
prescriptive analytics.
Examples
Data Exploration:
Example: Analysing large volumes of social media data to understand global trends and sentiments. This
involves exploring massive datasets containing tweets, posts, and comments to identify patterns, popular
topics, and emerging discussions.
Data Transformation:
Example: Processing and transforming raw sensor data from Internet of Things (IoT) devices in a smart
city. Converting unstructured sensor data into a structured format, aggregating information, and handling
data from diverse sources for further analysis.
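
A minimal sketch of the transformation example: raw sensor lines are parsed into structured rows, malformed lines are skipped, and readings are averaged per sensor and metric. The `sensor|metric|value` line format is an assumption made for illustration.

```python
from collections import defaultdict
from statistics import mean


def parse_line(line):
    """Turn 'sensor|metric|value' into a structured row, or None if malformed."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != 3:
        return None
    sensor_id, metric, raw_value = parts
    try:
        value = float(raw_value)
    except ValueError:
        return None
    return {"sensor_id": sensor_id, "metric": metric.lower(), "value": value}


def aggregate(lines):
    """Average the readings for each (sensor_id, metric) pair."""
    grouped = defaultdict(list)
    for line in lines:
        row = parse_line(line)
        if row is not None:
            grouped[(row["sensor_id"], row["metric"])].append(row["value"])
    return {key: mean(values) for key, values in grouped.items()}
```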
Examples
Data Modelling:
Example: Creating a recommendation system for an e-commerce platform based on extensive user
behaviour and purchase history. Implementing machine learning algorithms on large datasets to
personalise product recommendations for individual users.
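
As a rough illustration of the recommendation example, the sketch below scores items a user has not bought by how much their buyers' purchase histories overlap with that user's. This simple co-purchase heuristic is a stand-in for the machine learning algorithms the slide mentions, and the data layout (user mapped to a set of items) is an assumption.

```python
from collections import Counter


def recommend(purchases, user, top_n=3):
    """Rank unseen items by overlap with users who share purchases."""
    owned = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            continue
        for item in items - owned:
            scores[item] += overlap
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]
```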
Data Analytics:
Example: Analysing healthcare data from multiple sources, including electronic health records, wearable
devices, and genomic data. Using advanced analytics to identify correlations, predict disease patterns,
and enhance personalised medicine.
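
Identifying correlations between two measurements can be sketched with a Pearson correlation coefficient. The function below is a plain-Python illustration; any inputs passed to it in practice, such as wearable readings paired with record values, would need the cleaning and joining described earlier.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)
```

A value near 1 or -1 indicates a strong linear relationship; it does not by itself show that one variable causes the other.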
Data Visualization and Reporting
Data Visualization
Transform complex data into visually appealing and easy-to-understand
charts, graphs, and dashboards.
Reporting Automation
Automate the generation of reports to provide insights and support
decision-making processes.
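
Report automation can be as simple as a function that turns computed metrics into a formatted summary on a schedule. The sketch below builds a plain-text report; the function name, layout, and sample labels are assumptions for illustration.

```python
def build_report(title, rows):
    """Render (label, numeric value) rows as an aligned text report with a total."""
    width = max([len("Total")] + [len(label) for label, _ in rows])
    lines = [title, "=" * len(title)]
    for label, value in rows:
        lines.append(f"{label.ljust(width)}  {value:>10,.2f}")
    total = sum(value for _, value in rows)
    lines.append(f"{'Total'.ljust(width)}  {total:>10,.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical daily ingest volumes per source.
    print(build_report("Daily ingest (GB)", [("api", 12.5), ("sensors", 1200.0)]))
```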
Thank you

Big Data Architecture Intro and its implementation in the insutry.pptx
