2. Big data is a term used to describe the collection,
processing and availability of huge volumes of
streaming data in real-time.
There are some things so big they have implications
for everyone, whether we want it or not.
Big Data is one of those things and is completely
transforming the way that we do business and is
impacting most other parts of our lives.
The basic idea behind the phrase Big Data is that
everything we do is increasingly leaving a digital trace (or
data) which we (and others) can use and analyse.
WHAT IS BIG DATA?
3. Big Data is everywhere and inevitable; it ranges from
suggesting what movie to watch on Netflix/YouTube
to predicting natural disasters.
Big Data surrounds us everywhere, ultimately
influencing decisions that we make every day. For
instance, when shopping on Amazon, products are
recommended to us based on our shopping patterns.
It is used for weather predictions, including whether
or not we will get cyclones.
4. From the dawn of civilization until
2003, humankind generated 5
exabytes of data. Now we produce
5 exabytes every two days…. and
the pace is accelerating.
Eric Schmidt
Executive Chairman, Google.
5. SOME BIG DATA STATS
• Walmart handles more than 1 million customer
transactions every hour
• Facebook handles 40 billion photos from its user
base.
• Decoding the human genome originally took
10 years to process; now it can be achieved in one
week.
• The analyst firm Gartner says that by 2020 there
will be over 26 billion connected devices.
• We sent men to the moon in 1969 on a tiny fraction
of the data that's in the average laptop 1.
1. HTTP://WWW.BUSINESSINSIDER.COM/MIND-BLOWING-GROWTH-AND-POWER-OF-BIG-DATA-2015-6
6. CONTINUED…. (IN ONE INTERNET MINUTE)
• 701,389 logins on Facebook
• 1.8 million likes per minute on Facebook
• 350GB of data generated
• 69,444 hours watched on Netflix
• 51,000 app downloads on Apple’s App Store
• 347,222 tweets on Twitter
• 28,194 new posts to Instagram
• 38,052 hours of music listened to on Spotify
• 2.78 million video views on YouTube
• 2,083,333 minutes used on Skype Calls
7.
8. With the datafication comes big data,
which is often described using the four
Vs:
•Volume
•Velocity
•Variety
•Veracity
9. VOLUME
• Refers to the huge amounts of data generated every
second.
• Not talking Terabytes but Zettabytes or Brontobytes.
• If we take all the data generated in the world between
the beginning of time and 2000, the same amount of data
will soon be generated every minute.
• A typical PC might have had 10 gigabytes of storage in
2000.
• Today Facebook ingests 500 terabytes of new data every
day.
• A Boeing 737 will generate 240 terabytes of flight data
10. VELOCITY
• Refers to the speed at which new data is generated
and the speed at which data moves around.
• Examples: High-frequency stock trading algorithms
reflect market changes within microseconds.
• Online gaming systems support millions of
concurrent users, each producing multiple inputs
per second.
• 50,000GB/Second is the estimated rate of global
internet traffic by 2018.
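As a rough back-of-the-envelope check, the per-second traffic estimate can be scaled up to a daily volume. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 exabyte = 10^18 bytes); the function name `daily_exabytes` is illustrative only.

```python
# Scale a per-second traffic estimate up to a daily volume.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 EB = 10**18 bytes).

BYTES_PER_GB = 10**9
BYTES_PER_EB = 10**18
SECONDS_PER_DAY = 24 * 60 * 60


def daily_exabytes(gb_per_second):
    """Convert a rate in GB/second into exabytes per day."""
    bytes_per_day = gb_per_second * BYTES_PER_GB * SECONDS_PER_DAY
    return bytes_per_day / BYTES_PER_EB


if __name__ == "__main__":
    # 50,000 GB/second is the estimate quoted on the slide.
    print(f"{daily_exabytes(50_000):.2f} exabytes per day")
```

Under these assumptions, 50,000 GB/second works out to roughly 4.3 exabytes per day.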
11. VARIETY
• Refers to the different types of data we can now use.
• In the past we only focused on structured data that
neatly fitted into tables or relational databases, such as
financial data, but now 80% of the world’s data is
unstructured (3D data, images, video, voice, etc.).
• With data technology we can now analyse and bring
together data of different types such as messages,
social media conversations, photos, sensor data, video
or voice recordings.
• 1 in 3 business leaders don’t trust the information they
use to make decisions
• $3.1 trillion is the estimated amount of money that poor
data quality costs the US economy per year.
12. VERACITY
• Refers to the messiness or trustworthiness of the
data.
• With many forms of big data, quality and accuracy
are less controllable (just think of Twitter posts with
hash tags, abbreviations, typos and colloquial
speech, as well as the reliability and accuracy of
content), but technology now allows us to work with
this type of data.
13. DIFFERENT TYPES OF DATA
ACTIVITY DATA
Digital music players and eBooks collect data on our activities.
Your smart phone collects data on how you use it and your web
browser collects information on what you are searching for.
Your credit card company collects data on where you shop and
your shop collects data on what you buy. It is hard to imagine
any activity that does not generate data.
PHOTO AND VIDEO IMAGE DATA
Just think about all the pictures we take on our smart phones
or digital cameras. We upload and share hundreds of thousands
of them on social media sites every second.
An increasing number of CCTV cameras capture video
images, and we also upload hundreds of hours of video to
YouTube and other sites every minute.
14. SENSOR DATA
We are increasingly surrounded by sensors that
collect and share data.
Take your smart phone, for instance: it contains a
GPS sensor to track exactly where you are every
second of the day, and an accelerometer to track the
speed and direction at which you are travelling.
15. BIG DATA SOURCES
1. Users: creating data via Facebook, Twitter, internal
company systems etc
2. Applications: Automatically create logs of who has
changed/accessed what within the system and more.
3. Systems: Monitoring systems on aircraft generate
gigabytes of data each flight monitoring different parts
of the plane every second of the flight.
4. Sensors: Door entries, temperature controls etc.
16. DATA GENERATION
EXAMPLES
• Mobile Devices [Phones/Tablets/Ebook readers….]
• Readers/Scanners
• Science Facilities/Programs/Software
• Social Media
• Cameras [Geo-tagging of Photos]
17. THE STRUCTURE OF BIG DATA
• Structured: Most traditional data sources.
• Semi-structured: Many sources of big data.
• Unstructured: Video data, audio data.
• 90% of generated data is “unstructured”. This
includes tweets, photos, customer purchase
history and customer service calls.
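To make the three categories more concrete, here is a minimal sketch of how each kind of data might be handled in Python. All sample values (customer IDs, user names, the review text) are made up for illustration, and real unstructured data such as video or audio would need far more than a word count.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields.
structured = io.StringIO("customer_id,amount\n101,25.50\n102,12.00\n")
rows = list(csv.DictReader(structured))
total = sum(float(row["amount"]) for row in rows)

# Semi-structured: self-describing, but fields can vary between records.
semi_structured = '[{"user": "ana", "tags": ["sale"]}, {"user": "ben"}]'
records = json.loads(semi_structured)
tag_counts = [len(record.get("tags", [])) for record in records]

# Unstructured: free text with no schema; here we only count words.
unstructured = "Loved the new store layout, but checkout was slow!"
word_count = len(unstructured.split())

print(total, tag_counts, word_count)
```

The structured rows can be summed directly, the semi-structured records need a default for missing fields, and the unstructured text has no fields at all until some analysis imposes them.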
18. BENEFITS – 8 DIMENSIONS TO
MEASURE VALUE FROM YOUR
DATA
• What makes data valuable to an organization?
• Data with the following qualities can fuel big
returns, but most organizations are struggling to
make this a reality.
19.
20.
21. BIG DATA = BIG OPPORTUNITIES
Today many organizations have initiated projects and
a subset have truly innovated. But many will fail to
convert these projects to knowledge and business
value.
22.
23. THE INTERNET OF
THINGS
We now have smart TVs that are able to collect and
process data, as well as smart watches and smart alarms. The
Internet of Things connects these devices so that, in
future, traffic sensors in the road will be able to send data
to your alarm clock, which will wake you up earlier than
planned because a blocked road means you will have to
leave earlier to make your 9am meeting.
24. LAWS ARE ALREADY BEING PASSED
TO DEAL WITH THE IOT &
INTERCONNECTIVITY
The US has ruled that cars must be able to talk to
each other. The National Highway Traffic Safety
Administration has formally proposed a rule requiring
a uniform industry-wide system that would be put in
all new cars. If the rule is approved, NHTSA said, it
would take two to four years for the technology to be
in all new cars. Even with human drivers, the
technology could help avoid 80% of crashes involving
sober drivers, according to NHTSA1.
1. HTTP://MONEY.CNN.COM/2016/12/13/TECHNOLOGY/NHTSA-VEHICLE-TO-VEHICLE-COMMUNICATION-RULE/INDEX.HTML?SECTION=MONEY_TOPSTORIES
25. ADVANTAGES OF BIG DATA &
IOT
• Supply chain or delivery route optimization - using data
from geographic positioning and radio frequency identification
sensors.
Allows cities to optimize traffic flows based on real-time traffic
information, and to operate so as to minimize jams.
• Improving Health - Ability to monitor and predict epidemics
and disease outbreaks.
• Police forces use big data tools to catch criminals and even
predict criminal activity, and credit card companies use big
data analytics to detect fraudulent transactions.
• Improving Sports Performance - track athletes while playing
on the field, as well as use fitness trackers to track activity
and sleep.
26. MAIN BENEFITS
1. Safety, Comfort, Efficiency
Now imagine monotonous tasks being automated and done by machines.
For example, smart assembly lines could report misconfigurations and
errors in real time, producing higher yields and less downtime.
The result is more time for productive and rewarding work. This would drive
higher employee satisfaction and retention, while dramatically improving
profit margins.
2. Better Decision Making
If you can analyze larger trends from empirical data, you can make smarter
decisions. This takes assumptions out of the equation, giving you data-
backed visibility into every aspect of your business. For example, testing
cycles would radically shorten—lowering the costs to optimize a process.
Additionally, the visibility into system behaviors can yield new insights and
ideas, guiding your business like never before.
3. Revenue Generation
At first, the above benefits from the IoT will impact your bottom line simply
by reducing expenses and improving efficiency.
27. DISADVANTAGES OF BIG DATA
1. Security and Privacy
1. Companies being hacked: for example, in the US, 91% of all healthcare organizations
have had at least one data breach in the last two years, and the US federal government
had 61,000 cyber-security breaches in 2014 alone.
2. Identities stolen through stolen SSNs, credit cards, etc.
3. Default device settings are often wide open, and even when access controls are
present, many organizations don’t have strong security protocols in place. This is the
IoT equivalent of having a username/password combo of “admin” and “password”.
2. Data and Complexity
• The IoT generates countless bytes of data—but business value is measured not in
bytes, but in the analysis of trends and patterns
• Now, imagine the complexity of thousands of sensors collecting data each hour
across a single organization. If you don’t have a plan to process and analyze these
huge quantities of data, you won’t be able to translate any of these findings to
better business practices.
3. Business and IT Buy-in
Given the above concerns about security and complexity, persuading stakeholders to
buy into the IoT can be difficult. The perceived costs and risks to simply lay a
foundation or run a single experiment can hold back progress.
THREATS
28. REAL WORLD EXAMPLES
Peddamail
Peddamail gives an example of a grocery team struggling to
understand why sales of a particular produce were unexpectedly
declining. Once their data was in the hands of the Cafe analysts, it was
established very quickly that the decline was directly attributable to a
pricing error. The error was immediately rectified and sales recovered
within days2.
Sales across different stores in different geographical areas can also be
monitored in real-time. One Halloween, Peddamail recalls, sales figures
of novelty cookies were being monitored, when analysts saw that there
were several locations where they weren’t selling at all. This enabled
them to trigger an alert to the merchandising teams responsible for
those stores, who quickly realized that the products hadn’t even been
put on the shelves. Not exactly a complex algorithm, but it wouldn’t
have been possible without real-time analytics.
2. HTTP://WWW.FORBES.COM/SITES/BERNARDMARR/2016/08/25/THE-MOST-PRACTICAL-BIG-DATA-USE-CASES-OF-2016/#59B5BDCA7533
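The cookie example boils down to a very simple check: which stores that should be selling a product have recorded no sales of it? A minimal sketch of that check follows. The store IDs, event format and function name are hypothetical and are not taken from the system described in the article.

```python
from collections import defaultdict

# Hypothetical sales events: (store_id, product, units_sold).
SALES_EVENTS = [
    ("store_a", "novelty cookies", 12),
    ("store_b", "novelty cookies", 0),
    ("store_a", "novelty cookies", 7),
    ("store_c", "novelty cookies", 3),
]

# Hypothetical list of stores expected to stock the product.
STOCKING_STORES = ["store_a", "store_b", "store_c", "store_d"]


def stores_with_no_sales(events, stores, product):
    """Return the stores that have sold zero units of the product."""
    units = defaultdict(int)
    for store_id, sold_product, units_sold in events:
        if sold_product == product:
            units[store_id] += units_sold
    # Stores with no events at all default to 0 and are flagged too.
    return [store for store in stores if units[store] == 0]


if __name__ == "__main__":
    for store in stores_with_no_sales(
        SALES_EVENTS, STOCKING_STORES, "novelty cookies"
    ):
        print(f"ALERT: no sales of novelty cookies at {store}")
```

In a real-time setting the same check would run continuously over a stream of transactions rather than over a fixed list.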
29. Rolls-Royce
Rolls-Royce put Big Data processes to use in three key areas of
their operations: design, manufacture and after-sales support.
Design: “We generate tens of terabytes of data on each simulation of
one of our jet engines. We then have to use some pretty
sophisticated computer techniques to look into that massive
dataset and visualize whether that particular product we’ve
designed is good or bad.”
Manufacture: manufacturing systems are increasingly becoming
networked and communicate with each other in the drive towards a
networked, Internet of Things (IoT) industrial environment, such as
linking their manufacturing plants in the UK, in Rotherham and
Sunderland.
After-Sales Support: In terms of after-sales support, Rolls-Royce
engines and propulsion systems are all fitted with hundreds of
sensors that record every tiny detail about their operation and
report any changes in data in real time to engineers, who then
3. HTTP://WWW.FORBES.COM/SITES/BERNARDMARR/2016/08/25/THE-MOST-PRACTICAL-BIG-DATA-USE-CASES-OF-2016/2/#7B8B0545F431
30. WHY SHOULD WE CARE?
• We already have sources of large amounts of data that
we can leverage, such as our in-house databases, payroll
and electronic bills from vendors [Vodafone/Telephone]. By
asking the right questions and analyzing the data, we could
make big financial savings.
• Find out which months and times see the most Mondayitis
and sick leave, and when overtime peaks, so we can manage
staff and overtime better and identify whether we need as
many staff in certain locations.
• Can be used to perform data analytics on customers, for
example which types of accounts regularly go out to 60 or 90
days, and whether it is customers in certain industries or at
certain times of the year. This can help us plan our cash flow
better and possibly renegotiate agreements with some
customers.
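The sick-leave question above is a good first candidate because it needs nothing more than a list of dates. The sketch below counts leave days by weekday and by month. The dates are invented sample data standing in for a payroll export, and the function names are illustrative.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical sick-leave dates, e.g. exported from a payroll system.
SICK_LEAVE_DATES = [
    date(2016, 2, 1),   # Monday
    date(2016, 2, 8),   # Monday
    date(2016, 2, 12),  # Friday
    date(2016, 7, 4),   # Monday
    date(2016, 7, 20),  # Wednesday
]


def leave_by_weekday(dates):
    """Count sick-leave days per weekday name."""
    return Counter(WEEKDAYS[d.weekday()] for d in dates)


def leave_by_month(dates):
    """Count sick-leave days per calendar month (1-12)."""
    return Counter(d.month for d in dates)


if __name__ == "__main__":
    print(leave_by_weekday(SICK_LEAVE_DATES).most_common(1))
    print(leave_by_month(SICK_LEAVE_DATES).most_common(1))
```

The same counting approach could be applied to overtime hours or overdue accounts by swapping in a different key, such as location or customer industry.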
31. FUTURE POSSIBILITIES
Vehicle Tracking
• Prices for monitoring equipment are getting lower all the
time. We can monitor fuel costs, vehicle maintenance and
spare parts, and see which vehicles cost less to
maintain.
Asset Tracking
• Ability to optimize purchasing procedures/costs of other
items like stationery/vehicle spare parts across branches
in an effort to cut down on costs and possible
double/triple handling by different people.
• The ability is already there to monitor toner usage from
printers. The data can be used to analyze which brands
are more reliable or have a better cost per page in terms of
printing and support.
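As a rough illustration of the toner idea, cost per page is simply the total spent on cartridges divided by the total pages printed for each brand. The brand names, prices and page counts below are made up, and the `TonerRecord` structure is an assumption about what printer monitoring data might contain.

```python
from dataclasses import dataclass


@dataclass
class TonerRecord:
    brand: str
    cartridge_cost: float  # price paid for one cartridge
    pages_printed: int     # pages printed before it ran out


# Hypothetical usage records; brand names and numbers are made up.
RECORDS = [
    TonerRecord("Brand A", 90.0, 3000),
    TonerRecord("Brand A", 90.0, 2600),
    TonerRecord("Brand B", 60.0, 1500),
]


def cost_per_page(records):
    """Return total cartridge cost divided by total pages, per brand."""
    cost = {}
    pages = {}
    for r in records:
        cost[r.brand] = cost.get(r.brand, 0.0) + r.cartridge_cost
        pages[r.brand] = pages.get(r.brand, 0) + r.pages_printed
    # Skip brands with no pages recorded to avoid dividing by zero.
    return {b: cost[b] / pages[b] for b in cost if pages[b] > 0}


if __name__ == "__main__":
    for brand, value in sorted(cost_per_page(RECORDS).items()):
        print(f"{brand}: {value:.4f} per page")
```

A cheaper cartridge is not necessarily the cheaper option: in this sample data the more expensive brand has the lower cost per page.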
32. IN CONCLUSION…..
The rate of data growth will continue to increase, and
from an increasing variety of sources, whether we pay
attention to it or not….
The main question should be – Can we use some of
this data and leverage it so that we can work
smarter, more efficiently, and cut down costs or make
more money?
33. FURTHER READING &
RESOURCES
Big Data – What is It?
http://www.slideshare.net/BernardMarr/140228-big-data-slide-share
2016 UPDATE: WHAT HAPPENS IN ONE INTERNET MINUTE?
http://www.excelacom.com/resources/blog/2016-update-what-happens-in-one-internet-minute
What is Big Data? What are the Benefits of Big Data?
https://marketingtechblog.com/benefits-of-big-data/
3 Threats and 3 Benefits of the Internet of Things
https://www.atlanticbt.com/blog/3-threats-and-3-benefits-of-the-internet-of-things/
Identity Theft: The Risky Side of Big Data
https://dzone.com/articles/identity-theft-the-risky-side-of-big-data
Identity Theft + Big Data = Identity Reconstruction
http://blogging.avnet.com/ts/advantage/2015/12/identity-theft-big-data-identity-reconstruction/
Apache Hadoop
http://hadoop.apache.org/
Apache Spark
http://spark.apache.org/
RapidMiner