A BEGINNER'S GUIDE TO BIG DATA
Contents:
> Introduction
> What is Big Data?
> Big Data as a technology
> 5 V’s of Big Data
> Big Data Technology
> Benefits of Big Data
Introduction
“Data is the new science, Big Data holds the answers”
500 million tweets are sent every day
4 petabytes of data are created on Facebook
4 terabytes of data are created from each connected car
65 billion messages are sent on WhatsApp
5 billion searches are made
294 billion emails are sent
What is BIG DATA?
A collection of large and complex datasets that is difficult to store and
process with traditional database and data processing tools is considered
big data. Big data is collected from both traditional and digital sources
and, when refined properly, can be used for research and analysis.
Everything around us generates big data continuously. Social media
websites and other digital sources are responsible for producing such
huge amounts of data.
Where does BIG DATA come from?
The bulk of big data generated comes from three primary sources: social
data, machine data and transactional data.
Social data
comes from the Likes, Tweets & Retweets, Comments, Video Uploads, and
general media that are uploaded and shared via the world's favorite
social media platforms.
Machine data
is information generated by industrial equipment, sensors installed in
machinery, and even web logs that track user behavior.
Transactional data
is generated from all the daily transactions that take place both online
and offline. Invoices, payment orders, storage records and delivery
receipts are all characterized as transactional data.
What are the different types of BIG DATA?
Structured Data
•Data which has a defined format and is organized in a predefined
schema is called structured data.
•Example - Data coming from traditional databases and repositories
like Mainframes, SQL Server, Oracle, DB2, Sybase, Access, Excel,
Teradata, etc.
Unstructured Data
•Data which is unorganized and not easy to interpret using traditional
databases or data models is called unstructured data.
•Example - Data coming from social media like Chatter, text analytics,
blogs, Tweets, comments, clicks, tags, etc.
Multi-Structured Data
•Data which is un-modelled and needs to be organized, although there
might be a schema, is called multi-structured data.
•Example - Data coming from emerging market data, e-commerce, and
other third-party data like weather, currency conversion, demographic,
panel, etc.
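The difference between the three types can be sketched in a few lines of Python. This is only an illustration: the column names, the sample post and the sample values are all made up, and real systems would read such data from databases, feeds or files rather than from string literals.

```python
import csv
import io
import json

# Structured: every row follows the same predefined schema.
# (Hypothetical columns and values, for illustration only.)
structured_csv = "order_id,customer,amount\n1001,Asha,250.00\n1002,Ravi,99.50\n"
orders = list(csv.DictReader(io.StringIO(structured_csv)))


def total_amount(rows):
    """Sum the amount column; this works only because the schema is known."""
    return sum(float(row["amount"]) for row in rows)


# Unstructured: free text with no schema; meaning has to be extracted.
unstructured_post = "Loving the new phone! Battery lasts all day #happy"
hashtags = [word for word in unstructured_post.split() if word.startswith("#")]

# Multi-structured: each record has some structure, but the fields
# differ from one source to the next (made-up sample values).
multi_structured = [
    json.loads('{"source": "weather", "city": "Delhi", "temp_c": 31}'),
    json.loads('{"source": "currency", "pair": "USD/INR", "rate": 83.2}'),
]
fields_per_record = [sorted(record.keys()) for record in multi_structured]

print(total_amount(orders))
print(hashtags)
print(fields_per_record)
```

The structured rows can be summed directly, the post needs text processing before it yields anything, and the multi-structured records must be inspected one by one because their fields differ.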
What are the 5 V's of BIG DATA?
Characteristics of BIG DATA
VOLUME
VALUE
VELOCITY
VERACITY
VARIETY
What is VOLUME in BIG DATA?
It refers to the size of Big Data. Whether data is considered Big Data
or not depends on its volume. The rapid increase in data volume is due
to cloud-computing traffic, IoT, mobile traffic, etc.
What is VELOCITY in BIG DATA?
It refers to the speed at which data accumulates. This speed is mainly
driven by IoT devices, mobile data, social media, etc.
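To get a feel for velocity, a daily total can be turned into a per-second rate. The short Python sketch below uses the figure of 500 million tweets per day quoted in the introduction and assumes, as a simplification, that tweets arrive evenly throughout the day. The function name per_second is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(events_per_day: float) -> float:
    """Average arrival rate, assuming events are spread evenly over a day."""
    return events_per_day / SECONDS_PER_DAY


tweets_per_day = 500_000_000  # figure quoted in the introduction
tweets_per_second = per_second(tweets_per_day)

print(f"{tweets_per_second:,.0f} tweets per second on average")
# prints "5,787 tweets per second on average"
```

Real traffic is bursty, so peak rates would be higher than this average.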
What is VARIETY in BIG DATA?
It refers to collecting data from multiple sources to understand a
problem and make smarter, more informed decisions. Clear,
uncomplicated access to an extensive variety of data is also the key to
creating platforms that boost innovation and efficiency.
What is VERACITY in BIG DATA?
It is the level of accuracy or trustworthiness of the collected data.
With regard to the veracity of big data, it is not only the nature of
the data itself that matters, but also how dependable the processing,
type, and source of the data are.
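One simple way to put a number on veracity is to measure how many records are complete and how many come from a source you trust. The Python sketch below is only one possible approach. The function veracity_report, the field names and the source labels are hypothetical, and real data-quality checks usually go much further.

```python
def veracity_report(records, required_fields, trusted_sources):
    """Return the share of records that are complete and from trusted sources."""
    total = len(records)
    if total == 0:
        return {"complete_ratio": 0.0, "trusted_ratio": 0.0}
    # A record is complete when every required field has a non-empty value.
    complete = sum(
        all(record.get(field) not in (None, "") for field in required_fields)
        for record in records
    )
    # A record is trusted when its source is on the trusted list.
    trusted = sum(record.get("source") in trusted_sources for record in records)
    return {"complete_ratio": complete / total, "trusted_ratio": trusted / total}


# Made-up sensor readings for illustration.
readings = [
    {"sensor_id": "s1", "temp_c": 21.5, "source": "factory_gateway"},
    {"sensor_id": "s2", "temp_c": None, "source": "factory_gateway"},
    {"sensor_id": "s3", "temp_c": 22.1, "source": "unknown_upload"},
    {"sensor_id": "s4", "temp_c": 20.9, "source": "unknown_upload"},
]

report = veracity_report(readings, ["sensor_id", "temp_c"], {"factory_gateway"})
print(report)
```

Here three of the four readings are complete and two come from the trusted source, so the report shows a completeness ratio of 0.75 and a trusted ratio of 0.5.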
What is VALUE in BIG DATA?
This is indeed the holy grail of Big Data and what we are all looking for.
One has to demonstrate the value that can be extracted from big or small
data in order to justify the investment, whether in Big Data or in
traditional analytics, data warehouse or business intelligence tools.
What is BIG DATA TECHNOLOGY?
Big data technology is primarily designed to analyze, process and extract
information from very large data sets with extremely complex structures,
which traditional data processing software finds very difficult to
handle. Big data technology is broadly integrated with many other
technologies such as deep learning, machine learning, artificial
intelligence (AI), and the Internet of Things (IoT), which are expanding
at scale. Combined with these technologies, big data technology focuses
on the analysis and processing of large amounts of real-time and batch
data.
We can categorize the leading big data technologies into the
following four sections:
> Data Storage
> Data Mining
> Data Analytics
> Data Visualization
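The four categories can be pictured as stages of one small pipeline. The Python sketch below is a toy illustration, not a real big data stack. An in-memory SQLite table stands in for distributed storage, a simple count of products bought together stands in for data mining, and a text bar chart stands in for a visualization tool. The table name, the sample purchases and the bar_chart helper are all assumptions made for this example.

```python
import sqlite3
from collections import Counter

# Made-up purchase events: (customer, product, amount).
events = [
    ("asha", "phone", 300.0),
    ("ravi", "case", 20.0),
    ("asha", "case", 25.0),
    ("meera", "phone", 310.0),
    ("ravi", "phone", 290.0),
]

# 1. Data storage: keep the raw events in a table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", events)

# 2. Data mining: look for a pattern, here products bought by the same customer.
baskets = {}
for customer, product in conn.execute("SELECT customer, product FROM purchases"):
    baskets.setdefault(customer, set()).add(product)
pair_counts = Counter(
    tuple(sorted(basket)) for basket in baskets.values() if len(basket) > 1
)

# 3. Data analytics: summarize the data, here revenue per product.
revenue = dict(
    conn.execute(
        "SELECT product, SUM(amount) FROM purchases GROUP BY product ORDER BY product"
    )
)


# 4. Data visualization: present the summary, here as a text bar chart.
def bar_chart(values, width=30):
    """Render one line per label, with a bar scaled to the largest value."""
    peak = max(values.values())
    lines = []
    for label, value in values.items():
        bar = "#" * max(1, round(width * value / peak))
        lines.append(f"{label:<6} {bar} {value:.0f}")
    return "\n".join(lines)


print(pair_counts)
print(bar_chart(revenue))
```

Each stage consumes what the previous one produced, which is why the four categories are usually treated as layers of a single workflow rather than as separate tools.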
What are the types of BIG DATA Technology?
BIG DATA TECHNOLOGIES
DATA STORAGE
DATA MINING
What are the benefits of BIG DATA?
BENEFITS
Fraud Detection
Increasing Brand Loyalty
Helps in Decision Making
Financial Risk Analysis
Helps predict Future Trends
Increases Website Optimization
THANK YOU
https://www.cetpainfotech.com/
9212172602
QUERY@CETPAINFOTECH.COM
