A BEGINNER’S GUIDE
Contents:
> Introduction
> What is Big Data?
> Big Data as a technology
> 5 V’s of Big Data
> Big Data Technology
> Benefits of Big Data
Introduction
“Data is the new science, Big Data holds the answers”
> 500 million tweets are sent every day
> 4 petabytes of data are created on Facebook
> 4 terabytes of data are created from each connected car
> 65 billion messages are sent on WhatsApp
> 5 billion searches are made
> 294 billion emails are sent
What is BIG DATA?
A collection of large and complex datasets that are difficult to store
and process using traditional database and data processing tools is
considered big data. Big data is collected from traditional and digital
sources and, when refined properly, can be used for research and
analysis.
Almost everything around us generates data continuously. Social media
websites and other digital sources are responsible for producing such
huge amounts of data.
Where does BIG DATA come from?
The bulk of big data comes from three primary sources: social data,
machine data and transactional data.
> Social data comes from the likes, tweets and retweets, comments,
video uploads, and general media that are uploaded and shared via the
world’s favorite social media platforms.
> Machine data is information generated by industrial equipment, by
sensors installed in machinery, and even by web logs that track user
behavior.
> Transactional data is generated from all the daily transactions that
take place both online and offline. Invoices, payment orders, storage
records and delivery receipts are all characterized as transactional
data.
What are the different types of BIG DATA?
Structured Data
> Data which has a defined format and is organized in a predefined
schema is called structured data.
> Example: data coming from traditional databases and repositories
like mainframes, SQL Server, Oracle, DB2, Sybase, Access, Excel,
Teradata, etc.
Unstructured Data
> Data which is unorganized and is not easy to interpret using
traditional databases or data models.
> Example: data coming from social media like Chatter, text analytics,
blogs, tweets, comments, clicks, tags, etc.
Multi-Structured Data
> Data which is un-modelled and needs to be organized, although there
might be a schema.
> Example: emerging market data, e-commerce data, and other third-party
data like weather, currency conversion, demographic, panel, etc.
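The difference between the three types can be sketched in a few lines of Python. This is only an illustration: the sample values, field names and sensor name below are made up for the example and do not come from any real system.

```python
import csv
import io
import json

# Structured: every row follows the same predefined schema (the column names).
structured_csv = "order_id,amount,currency\n1001,25.50,USD\n1002,13.00,EUR\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Unstructured: free text with no schema, so only generic operations
# (such as counting words) apply until the text is analyzed further.
unstructured_text = "Loved the new phone, but the battery drains too fast!"
word_count = len(unstructured_text.split())

# Multi-structured: each record has some structure, but the fields
# differ from record to record and still need to be organized.
multi_structured = [
    json.loads('{"city": "Delhi", "temp_c": 31}'),
    json.loads('{"city": "Pune", "humidity": 0.62, "source": "sensor-7"}'),
]
all_fields = sorted({key for record in multi_structured for key in record})

print(rows[0]["amount"])   # a field can be looked up by its column name
print(word_count)
print(all_fields)          # the union of fields seen across all records
```

Structured rows can be queried by column straight away, while the multi-structured records first need a pass like the one above to discover which fields exist.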
What are the 5V’s of BIG DATA?
Characteristics of BIG DATA
> VOLUME
> VELOCITY
> VARIETY
> VERACITY
> VALUE
What is VOLUME in BIG DATA?
It refers to the size of the data. Whether data can be considered Big
Data or not depends on its volume. The rapid increase in data volume is
due to cloud-computing traffic, IoT, mobile traffic, etc.
What is VELOCITY in BIG DATA?
It refers to the speed at which data accumulates. This speed is mainly
driven by IoT devices, mobile data, social media, etc.
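To get a feel for velocity, a daily total can be converted into a per-second rate. The sketch below uses the figure of 500 million tweets per day quoted in the introduction; the function name `per_second` is just a helper chosen for this example, and the result is an average that ignores peaks and quiet periods.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(events_per_day: float) -> float:
    """Convert a daily event count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


tweets_per_day = 500_000_000
print(f"{per_second(tweets_per_day):,.0f} tweets per second")  # about 5,787
```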
What is VARIETY in BIG DATA?
It refers to collecting data from multiple sources to understand a
problem and make smarter, more informed decisions. Clear,
uncomplicated access to an extensive variety of data is also the key to
creating platforms that boost innovation and efficiency.
What is VERACITY in BIG DATA?
It is the level of accuracy or trustworthiness of the collected data.
With regard to the veracity of big data, it is not only the quality of
the data itself that is significant, but also how dependable the
processing, type, and source of the data are.
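A very simple veracity check might test each record's source, type and value before the record is used. The sketch below is an assumption-laden illustration: the field names `source` and `temp_c`, the sensor names, and the plausible temperature range are all hypothetical choices made for this example.

```python
def is_trustworthy(record: dict, trusted_sources: set) -> bool:
    """Return True only if the record passes three basic checks."""
    reading = record.get("temp_c")
    return (
        record.get("source") in trusted_sources   # dependable source
        and isinstance(reading, (int, float))     # expected type, not missing
        and -90.0 <= reading <= 60.0              # plausible value (assumed range)
    )


records = [
    {"source": "sensor-1", "temp_c": 21.5},
    {"source": "sensor-1", "temp_c": 999.0},  # implausible reading
    {"source": "unknown", "temp_c": 20.0},    # untrusted source
    {"source": "sensor-2", "temp_c": None},   # missing value
]
trusted = {"sensor-1", "sensor-2"}

clean = [r for r in records if is_trustworthy(r, trusted)]
veracity_ratio = len(clean) / len(records)
print(veracity_ratio)  # share of records that passed every check
```

Real pipelines use far richer rules, but the idea is the same: data that fails the checks is filtered out or flagged before analysis.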
What is VALUE in BIG DATA?
This is indeed the holy grail of Big Data and what we are all looking for.
One has to demonstrate the value that can be extracted from big or small
data in order to justify the investment, whether in Big Data or in
traditional analytics, data warehouse or business intelligence tools.
What is BIG DATA TECHNOLOGY?
Big data technology is primarily designed to analyze, process and extract
information from very large data sets with extremely complex structures,
which traditional data processing software finds very difficult to deal
with. Big data technology is broadly integrated with many other
technologies such as deep learning, machine learning, artificial
intelligence (AI), and the Internet of Things (IoT), which are themselves
expanding at scale. Combined with these technologies, big data technology
focuses on the analysis and processing of large amounts of real-time and
batch data.
We can categorize the leading big data technologies into the following
four sections:
> Data Storage
> Data Mining
> Data Analytics
> Data Visualization
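The distinction between batch and real-time processing mentioned above can be sketched with plain Python. This is a toy illustration, not how any particular big data product works: the function names and sample amounts are invented for the example.

```python
from typing import Iterable, Iterator, List


def batch_total(amounts: List[float]) -> float:
    """Batch style: the whole data set is available before work starts."""
    return sum(amounts)


def running_totals(stream: Iterable[float]) -> Iterator[float]:
    """Real-time style: update the result as each value arrives."""
    total = 0.0
    for amount in stream:
        total += amount
        yield total


amounts = [10.0, 5.0, 2.5]
print(batch_total(amounts))           # one answer once all data is in
print(list(running_totals(amounts)))  # an updated answer after every event
```

Both approaches reach the same final total; the real-time version simply makes an intermediate result available after every new value.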
What are the types of BIG DATA Technology?
BIG DATA TECHNOLOGIES
> DATA STORAGE
> DATA MINING
What are the benefits of BIG DATA?
BENEFITS
> Fraud Detection
> Increasing Brand Loyalty
> Helps in Decision Making
> Financial Risk Analysis
> Helps Predict Future Trends
> Increases Website Optimization
THANK YOU
https://www.cetpainfotech.com/
9212172602
QUERY@CETPAINFOTECH.COM

Big Data Guide for beginners - pdf
