This document provides an overview of big data, including definitions of key terms like data, big data, and examples of big data. It describes why big data is important, how big data analytics works, and the benefits it provides. It outlines different types of big data like structured, unstructured, and semi-structured data. It also discusses characteristics of big data like volume, velocity, variety, and veracity. Additionally, it identifies primary sources of big data and examples of big data tools and software. Finally, it briefly discusses how big data and machine learning are related and how AI can be used to enhance big data analytics.
2. CONTENT
What is Data?
What is Big Data?
What is an example of Big Data?
Why Is Big Data Important?
Big Data Analytics
Benefits of Big Data Analytics
Types of Big Data
Characteristics of Big Data
Primary Source of Big Data
Big Data Tools and Software
Big Data Mining
3. WHAT IS DATA?
A collection of raw facts and figures is called Data.
It is collected for different purposes.
4. WHAT IS BIG DATA?
Big data primarily refers to data sets that are too large or complex to be dealt with by traditional
data-processing application software.
Big data is a type of data that is extremely large in size.
5. WHAT IS AN EXAMPLE OF BIG DATA?
The following are some Big Data examples:
Big Data is widely used in cyber security, where engineers protect networks and data from unauthorized access.
Spotify, an on-demand music platform, uses Big Data Analytics to collect data from all its users
around the globe, and then uses the analyzed data to give informed music recommendations and
suggestions to every individual user.
Amazon Prime, which offers videos, music, and Kindle books in a one-stop shop, also relies heavily on big
data.
6. WHY IS BIG DATA IMPORTANT?
In the energy industry, big data helps oil and gas companies identify potential drilling locations and
monitor pipeline operations; likewise, utilities use it to track electrical grids.
Financial services firms use big data systems for risk management and real-time analysis of market
data.
Manufacturers and transportation companies rely on big data to manage their supply chains and
optimize delivery routes.
Government uses include emergency response, crime prevention, and smart city initiatives.
7. BIG DATA ANALYTICS
Big data analytics refers to collecting, processing, cleaning, and analyzing large datasets to help
organizations operationalize their big data.
1. Collect Data
2. Process Data
3. Clean Data
4. Analyze Data
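The four steps above can be sketched as a small pipeline. This is only an illustrative sketch: the sample records, the field names (`user`, `amount`), and the function names are hypothetical and are not taken from any particular big data tool.

```python
from statistics import mean

# Hypothetical raw records standing in for data gathered from several sources.
RAW_RECORDS = [
    {"user": "a", "amount": "10.5"},
    {"user": "b", "amount": "not-a-number"},  # dirty value
    {"user": "a", "amount": "4.5"},
    {"user": "c", "amount": None},            # missing value
    {"user": "b", "amount": "20"},
]


def collect():
    """Step 1: collect data (here, simply return the in-memory sample)."""
    return list(RAW_RECORDS)


def process(records):
    """Step 2: process data into a uniform shape (amount as float, or None)."""
    processed = []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except (TypeError, ValueError):
            amount = None  # could not be parsed
        processed.append({"user": rec["user"], "amount": amount})
    return processed


def clean(records):
    """Step 3: clean data by dropping records whose amount could not be parsed."""
    return [rec for rec in records if rec["amount"] is not None]


def analyze(records):
    """Step 4: analyze data, here the average amount per user."""
    by_user = {}
    for rec in records:
        by_user.setdefault(rec["user"], []).append(rec["amount"])
    return {user: mean(amounts) for user, amounts in by_user.items()}


result = analyze(clean(process(collect())))
print(result)
```

In a real deployment each step would typically run on a distributed system rather than on an in-memory list, but the order of the steps stays the same.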
8. BENEFITS OF BIG DATA ANALYTICS
Some benefits of big data analytics include:
Cost savings. Helping organizations identify ways to do business more efficiently
Product development. Providing a better understanding of customer needs
Market insights. Tracking purchase behavior and market trends
9. TYPES OF BIG DATA
Following are the types of Big Data:
Structured
Unstructured
Semi-structured
10. STRUCTURED
Structured data refers to data that is already stored in databases in an ordered
manner.
There are two sources of structured data:
Human-Generated
Machine-Generated
12. UNSTRUCTURED
Unstructured data is defined as any data with an unknown form or structure.
A typical example of unstructured data is a heterogeneous data source containing a combination of
simple text files, images, videos, etc.
14. SEMI-STRUCTURED
Semi-structured data can contain both structured and unstructured information.
Semi-structured data appears to be structured, but it is not defined by a fixed schema in the way that a table
definition in a relational DBMS is.
A data representation in an XML file is an example of semi-structured data.
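As a rough illustration of the XML example, the sketch below parses a small, made-up XML document with Python's standard `xml.etree.ElementTree` module. The tag names (`people`, `person`, `name`, `age`, `email`) are hypothetical. The point is that each record is tagged, yet the records do not all share the same fields, which a relational table definition would require.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: both records are tagged, but their fields differ.
XML_TEXT = """
<people>
  <person><name>Asha</name><age>31</age></person>
  <person><name>Ben</name><email>ben@example.com</email></person>
</people>
""".strip()

root = ET.fromstring(XML_TEXT)

# Turn each <person> element into a dict of its child tags and text values.
records = [
    {child.tag: child.text for child in person}
    for person in root.findall("person")
]

# The union of all fields seen; no single record is required to have them all.
fields = sorted({key for rec in records for key in rec})

print(records)
print(fields)
```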
17. VOLUME
The name Big Data itself refers to a size that is enormous.
The size of data plays a crucial role in determining the value that can be drawn from it. Whether a particular
data set can be considered Big Data also depends on its volume.
Hence, volume is one characteristic that needs to be considered when dealing with Big Data
solutions.
For example:
Organizational data
Social media data
18. VELOCITY
The term ‘velocity’ refers to the speed at which data is generated.
How fast data is generated and processed to meet demand determines the real potential of
the data.
Big Data velocity deals with the speed at which data flows in from sources like business processes,
application logs, networks, social media sites, sensors, mobile devices, etc.
The flow of data is massive and continuous.
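One simple way to put a number on velocity is to count how many events arrive in each second. The sketch below assumes a hypothetical list of event timestamps (in seconds); real systems would read these from a stream rather than a list.

```python
from collections import Counter

# Hypothetical arrival times of events, in seconds since the start of observation.
timestamps = [0.1, 0.4, 0.9, 1.2, 1.3, 1.8, 1.9, 3.5]


def events_per_second(stamps):
    """Count events in each one-second bucket (bucket 0 covers 0.0 to 0.999...)."""
    counts = Counter(int(t) for t in stamps)
    return dict(sorted(counts.items()))


rates = events_per_second(timestamps)
peak = max(rates.values())  # the busiest second

print(rates)
print(peak)
```

Seconds with no events do not appear in the result, so a gap in the keys indicates a quiet period.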
19. VERACITY
When we are dealing with a high volume, velocity, and variety of data, it is not possible for all of
the data to be 100% correct; there will be dirty data.
The quality of the data being captured can vary greatly.
The accuracy of an analysis depends on the veracity of the source data.
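A basic veracity check can be sketched as the share of records that pass a validity rule. The sensor readings, the `temp_c` field, and the plausible range of -50 to 60 degrees Celsius used below are all assumptions made for illustration.

```python
def is_valid(reading):
    """Treat a reading as valid if it has a numeric temperature in a plausible range."""
    temp = reading.get("temp_c")
    return isinstance(temp, (int, float)) and -50 <= temp <= 60


# Hypothetical sensor readings, two of which are dirty.
readings = [
    {"sensor": "s1", "temp_c": 21.5},
    {"sensor": "s2", "temp_c": None},  # missing value
    {"sensor": "s3", "temp_c": 999},   # implausible value
    {"sensor": "s4", "temp_c": 19.0},
]

valid = [r for r in readings if is_valid(r)]
quality = len(valid) / len(readings)  # fraction of readings that can be trusted

print(quality)
```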
20. VALUE
Value is the most important aspect of big data.
The potential value of big data is huge.
It is all well and good having access to big data, but unless we can turn it into value it is
useless.
It is very costly to implement IT infrastructure systems to store big data, and businesses are
going to require a return on investment.
21. VARIETY
Big data is not always structured data and it is not always easy to put big data into a relational
database.
This means that the category to which Big Data belongs is also an essential fact that needs to
be known by the data analysts.
Dealing with a variety of structured and unstructured data greatly increases the complexity of both
storing and analyzing Big Data.
An estimated 90% of the data generated is in unstructured form.
22. PRIMARY SOURCE OF BIG DATA
Primary sources of Big Data are:
Social Data
Machine Data
Transactional Data
23. SOCIAL DATA
Social data comes from the Likes, Tweets & Retweets, Comments, Video Uploads, and general
media that are uploaded and shared via the world’s favorite social media platforms.
This kind of data provides invaluable insights into consumer behavior and sentiment and can be
enormously influential in marketing analytics.
The public web is another good source of social data, and tools like Google Trends can be used to
good effect to increase the volume of big data.
24. MACHINE DATA
Machine data is defined as information which is generated by industrial equipment, sensors that are
installed in machinery, and even web logs which track user behavior.
This type of data is expected to grow exponentially as the Internet of Things grows ever more
universal and expands around the world.
Sources such as medical devices, smart meters, road cameras, satellites, games, and the rapidly
growing Internet of Things will deliver high velocity, value, volume, and variety of data in the very
near future.
25. TRANSACTIONAL DATA
Transactional data is generated from all the daily transactions that take place both online and
offline.
Invoices, payment orders, storage records, and delivery receipts are all characterized as transactional
data. Yet data alone is almost meaningless, and most organizations struggle to make sense of the
data that they are generating and how it can be put to good use.
26. BIG DATA TOOLS AND SOFTWARE
• Hadoop
• Atlas.ti
• HPCC
• Storm
• Cassandra
• Kaggle
• CouchDB
• Pentaho
27. BIG DATA MINING
Big data mining refers to the collective data mining or extraction techniques that are
performed on large sets or volumes of data, that is, on big data.
Big data mining is primarily done to extract and retrieve desired information or patterns from
huge quantities of data.
Big data mining works on data searching, refinement, extraction, and comparison algorithms.
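As a toy illustration of pattern extraction, the sketch below counts how often pairs of items occur together and keeps the pairs that meet a minimum count. This is a simplified stand-in chosen for illustration, not a specific algorithm named in these slides; the shopping baskets and the `min_support` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set holds the items bought together.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "milk", "eggs"},
]


def frequent_pairs(transactions, min_support):
    """Return item pairs that appear together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Sorting gives each pair a single canonical order, e.g. ("bread", "milk").
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


patterns = frequent_pairs(baskets, min_support=2)
print(patterns)
```

On truly large data sets the counting would be distributed across machines, but the search, extraction, and comparison steps follow the same idea.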
28. BIG DATA AND MACHINE LEARNING
Big Data and Machine Learning have become the reason behind the success of various industries. Both
technologies are becoming more popular day by day among data scientists and professionals. Big
data is a term used to describe large, hard-to-manage, voluminous data, both structured and unstructured.
Machine learning, in contrast, is a subfield of Artificial Intelligence that enables machines to
automatically learn and improve from experience and past data.
Most companies use machine learning and big data technologies together because it
becomes difficult for them to manage, store, and process the collected data efficiently; in such
cases, machine learning helps them.
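To make "learning from past data" concrete, here is a minimal sketch that fits a straight line to a few past observations using ordinary least squares and then predicts a new value. The data points are made up, and real machine learning on big data would normally rely on a dedicated library and far more data.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to the points by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past data: the model "learns" the relationship from these points.
past_x = [1, 2, 3, 4]
past_y = [2, 4, 6, 8]

slope, intercept = fit_line(past_x, past_y)
prediction = slope * 5 + intercept  # predict y for an unseen x = 5

print(slope, intercept, prediction)
```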
29. RELATIONSHIP BETWEEN AI AND BIG DATA
By bringing together big data and AI technology, companies can improve business performance
and efficiency by:
Anticipating and capitalizing on emerging industry and market trends.
Analyzing consumer behavior and automating customer segmentation
Personalizing and optimizing the performance of digital marketing campaigns
Using intelligent decision support systems fueled by big data, AI, and predictive analytics
30. BIG DATA AND AI
Big data is being used in AI.
31. HOW IS AI USED WITH BIG DATA?
AI makes big data analytics simpler by automating and enhancing data preparation, data
visualization, predictive modeling, and other complex analytical tasks that would otherwise be
labor-intensive and time-consuming. AI helps users work with, manipulate, and surface actionable
insights faster from large, complex datasets.
32. FUTURE DIRECTION
Big data will be further collected to improve our AI models.
Big data will help us in our future projects as well.