Overview of bigdata

BIG DATA
B . Abinaya Bharathi,
II-M.Sc Cs&IT.,
Nadar Saraswathi college Of Arts and Science, Theni.

SYNOPSIS
 What is big data?
 How big it is...?
 Data generated by us
 Real time example
 5 V of big data
 Technology
 Application
 Conclusion

WHAT IS BIG DATA?
 Big data is defined first of all by the size of the data.
 It is data with a very large volume.
 It is a collection of data sets so large that it is
difficult to process.

HOW BIG IS IT?
Byte - one seed
Kilobyte - a cup of seeds
Megabyte - 8 bags of seeds
Gigabyte - 3 trucks of seeds
Terabyte - 2 ships of seeds
Petabyte - the whole volume of India
Exabyte - the volume of the Asian continent
Zettabyte - fills the Indian Ocean
Yottabyte - the volume of the whole Earth
Scale labels from the slide graphic: a text file, desktop, Internet, big data, future.
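The unit ladder can also be written as plain arithmetic. The sketch below assumes decimal (SI) prefixes, where each step is a factor of 1000; some tools use binary steps of 1024 instead. The names UNITS and unit_size are made up for this illustration.

```python
# Decimal (SI) storage units: each step up the ladder is a factor of 1000.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]


def unit_size(name: str) -> int:
    """Return the number of bytes in one unit, using powers of 1000."""
    return 1000 ** UNITS.index(name)


for position, name in enumerate(UNITS):
    # 1000 ** position is the same as 10 ** (3 * position)
    print(f"1 {name} = 10^{3 * position} bytes")
```

Seen this way, a yottabyte is 10^24 bytes, a million million times larger than a terabyte.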

REAL TIME EXAMPLES
Facebook Google

DATA GENERATED BY US
 About 2.5 quintillion bytes of data are created each day.
 Google now processes more than 40,000 searches every
second (about 3.5 billion searches per day).
 Five new Facebook profiles are created every
second.
 Every minute, 510,000 comments are posted and
293,000 statuses are updated on Facebook.
 95 million photos and videos are uploaded to Facebook
per day.
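The per-second and per-day figures above can be cross-checked with simple arithmetic. This is a small sketch using only the numbers quoted on the slide; the variable names are made up for illustration, and "quintillion" is taken in its short-scale sense of 10^18.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day

# 40,000 searches per second, scaled up to a full day
searches_per_second = 40_000
searches_per_day = searches_per_second * SECONDS_PER_DAY
print(f"{searches_per_day:,} searches per day")

# 2.5 quintillion bytes (2.5 x 10^18) expressed in exabytes (10^18 bytes)
bytes_per_day = 2.5e18
exabytes_per_day = bytes_per_day / 1e18
print(f"{exabytes_per_day} exabytes per day")

# Five new profiles per second, scaled up to a full day
profiles_per_day = 5 * SECONDS_PER_DAY
print(f"{profiles_per_day:,} new profiles per day")
```

The search figure works out to 3,456,000,000 per day, which is where the rounded "3.5 billion" comes from.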

TECHNOLOGY
 Big data always brings a number of challenges:
 About 80% of data is unstructured.
 How to give that data structure.
 How to analyze and store the data.
 The top technologies used to store and analyze big data are:
 Hadoop
 NoSQL
 Hive
 Sqoop

HADOOP
 Developed by the Apache Software Foundation.
 It is a framework written in Java.
 The framework runs on a cluster and allows us to
process data across all of its nodes.
 Hadoop Distributed File System (HDFS) - the storage system of
Hadoop.
 HDFS splits the data and distributes it among the different
nodes in the cluster.
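The idea of splitting data into blocks and spreading them over nodes can be sketched in a few lines. This is a toy simulation, not HDFS code: the function names, node names, tiny block size, and round-robin placement are all assumptions for illustration. Real HDFS uses much larger blocks (128 MB by default in recent versions) and also keeps replicated copies of each block, which this sketch leaves out.

```python
from collections import defaultdict


def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut the data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def distribute(blocks: list[bytes], nodes: list[str]) -> dict[str, list[tuple[int, bytes]]]:
    """Assign blocks to nodes round-robin (a simplifying assumption)."""
    placement = defaultdict(list)
    for index, block in enumerate(blocks):
        node = nodes[index % len(nodes)]
        placement[node].append((index, block))
    return dict(placement)


data = b"big data is split into blocks and spread across the cluster"
blocks = split_into_blocks(data, block_size=16)
placement = distribute(blocks, ["node-1", "node-2", "node-3"])

for node, stored in placement.items():
    print(node, [index for index, _ in stored])

# Reading the file back means collecting the blocks in index order.
all_blocks = [pair for stored in placement.values() for pair in stored]
restored = b"".join(block for _, block in sorted(all_blocks))
```

Because each node holds only part of the file, the nodes can work on their own blocks at the same time, which is what lets a cluster process data in parallel.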

NOSQL
 NoSQL stands for "Not Only SQL".
 NoSQL databases are used to handle unstructured data.
 They store unstructured data with no particular schema.
 They can give better performance when storing very large amounts of data.
 Free, open source NoSQL databases include:
 MongoDB
 CouchDB
 HBase
 Perst
 Cassandra
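Storing data "with no particular schema" means each record can carry its own set of fields. The sketch below imitates that with plain Python dictionaries; it does not use any real NoSQL database, and the collection, field names, and find helper are hypothetical.

```python
import json

# Each document has its own fields; no table definition is required up front.
collection = [
    {"_id": 1, "name": "Asha", "email": "asha@example.com"},
    {"_id": 2, "name": "Ravi", "tags": ["sports", "music"]},
    {"_id": 3, "title": "Photo upload", "size_kb": 2048, "owner": "Asha"},
]


def find(docs, **criteria):
    """Return the documents whose fields match every given key/value pair."""
    return [doc for doc in docs
            if all(doc.get(key) == value for key, value in criteria.items())]


# Documents that lack the queried field are simply not matched.
print(json.dumps(find(collection, name="Asha"), indent=2))
```

In a relational table the three records above would need a shared set of columns, with many of them left empty.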

HIVE
 Hive is a distributed data management system for Hadoop.
 It offers an SQL-like query language, HiveQL (HQL), to access big data.
 It is used primarily for data mining.
 It runs on top of Hadoop.
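To give a feel for what an SQL-like query looks like, here is a small sketch. It runs ordinary SQL on Python's built-in sqlite3 module as a stand-in, not on Hive itself; the page_views table and its data are invented for illustration. A simple grouping query of this shape is the kind of statement that an SQL-like language such as Hive's is designed to express.

```python
import sqlite3

# In-memory database used only as a stand-in for a real Hive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (user_name TEXT, page TEXT)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("asha", "home"), ("ravi", "home"), ("asha", "photos")],
)

# Count the views per page, most viewed first.
query = """
    SELECT page, COUNT(*) AS views
    FROM page_views
    GROUP BY page
    ORDER BY views DESC, page
"""
rows = conn.execute(query).fetchall()
for page, views in rows:
    print(page, views)
conn.close()
```

The point is that an analyst writes a declarative query and leaves the work of scanning the data to the system underneath.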
SQOOP
 Sqoop connects Hadoop with various relational databases to
transfer data.
 It is used to transfer structured data to Hadoop or Hive.

5 V OF BIG DATA
 Volume
 The size of the data content generated that needs to be analyzed.
 Velocity
 The speed at which new data is generated, and the speed at which
data moves.
 Value
 Meaningful output.
 The worth of the data being extracted.
 Having endless amounts of data is one thing, but unless it can be
turned into value it is useless.

 Variety
 The types of data that can be analyzed. Previously we used an RDBMS,
which holds structured data, so the data was easy to analyze. Nowadays
about 80% of data is unstructured. Big data technology now
allows structured and unstructured data to be collected, stored,
and used simultaneously.
 Veracity
 The trustworthiness of the data: just how accurate is all this data?

CONCLUSION
 Companies are turning to big data in order to expand into new
markets and improve customer relations.
 The use of analytics can improve analysts' knowledge of their
industry.
 There is a huge demand for big data analytics in different fields
and industries.
 So big data plays a highly valued role in the present IT world.
