What is big data and its characteristics


The term “big data” has become a buzzword, part technical and part marketing. Edd Dumbill, principal analyst for O’Reilly Radar, defined it in simple terms: big data is data that becomes large enough that it cannot be processed using conventional methods. The size at which data counts as big data keeps changing, and newer tools are continuously being developed to handle it.


Transcript

© VH Education Services Pvt. Ltd. http://venturehire.co

What is Big Data and its Characteristics?

Did you know that 90% of the data in the world today was created in the last two years? Data will keep growing as the cost of storage decreases.

[Figure: growth of data from 2005 to 2015 (forecast). Source: IDC Research]

What is Big Data?

The term “big data” has become a buzzword, part technical and part marketing. Edd Dumbill, principal analyst for O’Reilly Radar, defined it in simple terms: big data is data that becomes large enough that it cannot be processed using conventional methods. The size at which data counts as big data keeps changing, and newer tools are continuously being developed to handle it.

How much data is Big Data and how fast is it growing?

For some historical context on how systems and technology evolved, remember that technology is put to its best use where it solves a problem or pain point. For example, Ken Krugler of Bixo Labs, in his presentation “A Very Short History of Big Data”, described how the US census used Hollerith tabulating systems in 1890 to tabulate millions of pages of data, work that had previously been done by hand. Hollerith’s tabulating company was later combined with three other companies to form the Computing-Tabulating-Recording Company, which is now International Business Machines (IBM).

First, how do we measure data? The byte (1 byte equals 8 bits) is a unit of digital information. It is important to understand the units below before looking at big data. Big data is not itself a measurement, but we should understand how much data is considered “big”. Among the units below, the terabyte marks the start of what is referred to as big data.

Definitions and estimations (Source: IBM)

  • Gigabyte: 1024 megabytes
      4.7 gigabytes: a single DVD
  • Terabyte: 1024 gigabytes
      1 terabyte: about two years' worth of non-stop MP3s (assuming one megabyte per minute of music)
      10 terabytes: the printed collection of the U.S. Library of Congress
  • Petabyte: 1024 terabytes
      1 petabyte: the amount of data stored on a stack of CDs about 2 miles high, or 13 years of HD-TV video
      20 petabytes: the storage capacity of all hard disk drives created in 1995
  • Exabyte: 1024 petabytes
      1 exabyte: about one billion gigabytes
      5 exabytes: all words ever spoken by mankind
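The unit definitions above are plain arithmetic, so a few lines of Python can check the estimates. This is only a sketch: the function names are made up for illustration, and it assumes 1024 bytes per kilobyte and 1024 kilobytes per megabyte, which the list above does not state but which matches its 1024-based steps.

```python
# Unit names in increasing size; each step is a factor of 1024.
# "kilobyte" and "megabyte" steps are assumed to follow the same 1024 rule.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte",
         "terabyte", "petabyte", "exabyte"]


def unit_in_bytes(unit):
    """Return the number of bytes in one `unit`, using powers of 1024."""
    return 1024 ** UNITS.index(unit)


def convert(value, from_unit, to_unit):
    """Convert `value` from one unit to another."""
    return value * unit_in_bytes(from_unit) / unit_in_bytes(to_unit)


def years_of_mp3(terabytes, mb_per_minute=1.0):
    """Years of non-stop music that fit in `terabytes`,
    at the stated rate of one megabyte per minute."""
    megabytes = convert(terabytes, "terabyte", "megabyte")
    minutes = megabytes / mb_per_minute
    return minutes / (60 * 24 * 365)


if __name__ == "__main__":
    # 1 exabyte expressed in gigabytes: 1024**3, slightly over one billion.
    print(convert(1, "exabyte", "gigabyte"))
    # 1 terabyte of MP3s comes out just under two years.
    print(years_of_mp3(1))
```

With these definitions, one exabyte is 1024 × 1024 × 1024 gigabytes, a little more than the round "one billion" figure, and one terabyte at one megabyte per minute lasts just under two years, which agrees with the estimate in the list.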
What are the characteristics of Big Data?

Big Data can be described by the following characteristics:

(i) Volume: The quantity of data generated is central here. The size of the data determines its value and potential, and whether it can be considered Big Data at all. The name ‘Big Data’ itself refers to size, hence this characteristic.

(ii) Variety: The category to which the data belongs is also essential for data analysts to know. It helps the people who analyze the data closely to use it effectively to their advantage.

(iii) Velocity: In this context, ‘velocity’ refers to how fast the data is generated and processed to meet the demands and challenges of growth and development.

(iv) Variability: This refers to the inconsistency the data can show at times. It can be a problem for those who analyze the data, because it hampers handling and managing the data effectively.

(v) Complexity: Data management can become very complex, especially when large volumes of data come from multiple sources. The data needs to be linked, connected and correlated before the information it carries can be understood. This situation is termed the ‘complexity’ of Big Data.

Examples of Big Data

Data comes mainly in two forms:

1. Structured
2. Unstructured

There is also semi-structured data, for example XML. Structured data has explicit meaning attached to it through a predefined layout, such as named columns, whereas unstructured data has no predefined structure, so its meaning has to be extracted. The growth in data referred to here is mostly unstructured data.
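To make the three forms concrete, here is a minimal Python sketch using only the standard library. The sample records and function names are hypothetical and invented for illustration. They are not taken from any real data set.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for this illustration.
STRUCTURED_CSV = "customer_id,amount,currency\n101,250.00,INR\n102,99.50,INR\n"
SEMI_STRUCTURED_XML = "<tweet user='asha'><text>Loving the new phone!</text></tweet>"
UNSTRUCTURED_TEXT = "Called support today, the agent fixed my billing issue."


def read_structured(text):
    """Structured: fixed columns, so every value is addressable by field name."""
    return list(csv.DictReader(io.StringIO(text)))


def read_semi_structured(text):
    """Semi-structured: tags label the data, but there is no fixed table layout."""
    root = ET.fromstring(text)
    return {"user": root.get("user"), "text": root.findtext("text")}


def read_unstructured(text):
    """Unstructured: no fields at all. Without further analysis,
    about all we can do is split the text into words."""
    return text.lower().replace(",", " ").replace(".", " ").split()


if __name__ == "__main__":
    print(read_structured(STRUCTURED_CSV)[0]["amount"])
    print(read_semi_structured(SEMI_STRUCTURED_XML))
    print(read_unstructured(UNSTRUCTURED_TEXT))
```

The structured reader can answer "what was the amount?" directly by column name, the XML reader relies on tags embedded in the data, and the free-text reader has to fall back on word splitting, which is why unstructured data needs extra processing before it yields information.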
Below are a few examples of unstructured data:

1. Calls, text messages, tweets, web browsing and the messages people exchange by various means each day.
2. Social media usage by several million people exchanging data in various forms, which also forms a part of Big Data.
3. Card transactions for payments, made in large numbers every second across the world, which also count as Big Data.

Hope this post gave you enough information about Big Data. In future posts, we will look at applications of Big Data (Big Data Analytics), careers in Big Data (from Software Engineer to Data Scientist), and Hadoop and its applications.