BIGDATA

Decisions! Delivered!!

SARAVANAN. M
SALES MANAGER
21ST NOV 2013

Index:
1. What is BIGDATA?
2. BIGDATA Analytics
3. Characteristics of BIGDATA
4. Attributes of BIGDATA
5. Examples of BIGDATA
6. Size of BIGDATA
7. BIGDATA landscape
8. Industries using BIGDATA
9. Technologies Used
10. HADOOP
11. When should we go for HADOOP
12. Advantages of BIGDATA
13. Risks of BIGDATA
WHAT IS BIGDATA?
➢ Big data is a term for large volumes of high-velocity, complex, and variable data that require advanced techniques and technologies to capture, store, distribute, manage, and analyze.
➢ Big data is data that exceeds the processing capacity of conventional database systems.
➢ The data is too big, moves too fast, or doesn't fit the structures of your database architecture.
➢ To gain value from this data, you must choose an alternative way to process it.
BIGDATA ANALYTICS:
➢ Big data analytics is the process of examining and interrogating big data assets to derive insights of value for decision making.
CHARACTERISTICS OF BIGDATA
➢ The word "big" in big data is not just about volume. It is about all three V's:
  ▪ Volume
  ▪ Velocity
  ▪ Variety
ATTRIBUTES OF BIGDATA:
➢ Volume is the huge amount of digital data created by all sources (companies, individuals, and devices). What constitutes "big" varies by perspective and will certainly change over time.
➢ Velocity is the speed at which data is created, which in turn drives interest in real-time analytics and automated decision-making.
➢ Variety comes from the increasing types of data: some structured, as in databases; much of it unstructured, such as text or video; and some semi-structured, such as social media data, location-based data, and log-file data.
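
The three kinds of data named under Variety can be illustrated with a minimal Python sketch. The sample records below are hypothetical and exist only to show how each kind is typically handled: structured data has fixed columns, semi-structured data carries its own field names, and unstructured data has no schema at all.

```python
import csv
import io
import json
from collections import Counter

# Structured: fixed columns, as in a database table (hypothetical sample row).
structured = "customer_id,amount\n1001,250.00\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing fields that may differ from record to record
# (hypothetical log entry).
semi_structured = '{"user": "alice", "event": "login", "geo": {"lat": 13.08, "lon": 80.27}}'
event = json.loads(semi_structured)

# Unstructured: free text with no schema; here we can only count words.
unstructured = "Great service, great price, will buy again"
word_counts = Counter(unstructured.lower().replace(",", "").split())

print(rows[0]["amount"])     # value looked up by column name
print(event["geo"]["lat"])   # value looked up by nested field name
print(word_counts["great"])  # value derived by scanning the text
```

The structured and semi-structured records can be queried by name, while the free text has to be scanned before it yields anything, which is one reason variety calls for different processing techniques.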
EXAMPLES OF BIGDATA
➢ Sensor networks
➢ Social networks
➢ Internet search index
➢ Astronomy
➢ Internet text and documents
➢ Large-scale e-commerce
➢ Weblogs and video archives
➢ Medical records, call detail records, etc.
SIZE OF BIGDATA
➢ Google:
  ▪ 24 PB of data processed daily
➢ Facebook:
  ▪ 750 million users
  ▪ 12 TB of daily content
  ▪ 2.7 billion "likes" and "comments"
➢ Twitter:
  ▪ 340 million daily tweets
  ▪ 1.6 billion search queries
  ▪ 7 TB added daily
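
To relate these daily totals to velocity, the short Python sketch below converts them into average per-second rates. It assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and an even spread over the day; real traffic peaks will be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Assumption: decimal units, not binary (TiB/PiB).
TB = 10**12
PB = 10**15

def per_second(daily_total):
    """Average per-second rate for a quantity accumulated over one day."""
    return daily_total / SECONDS_PER_DAY

google_bytes_per_s = per_second(24 * PB)    # 24 PB processed daily
facebook_bytes_per_s = per_second(12 * TB)  # 12 TB of daily content
twitter_bytes_per_s = per_second(7 * TB)    # 7 TB added daily
tweets_per_s = per_second(340_000_000)      # 340 million daily tweets

print(f"Google:   {google_bytes_per_s / 10**9:.1f} GB/s")
print(f"Facebook: {facebook_bytes_per_s / 10**6:.1f} MB/s")
print(f"Twitter:  {twitter_bytes_per_s / 10**6:.1f} MB/s, {tweets_per_s:.0f} tweets/s")
```

Under these assumptions, 24 PB per day averages out to roughly 278 GB every second, and 340 million tweets per day to nearly 4,000 tweets every second.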
BIGDATA LANDSCAPE:
[Figure: big data landscape diagram; image not reproduced in this text]
INDUSTRIES THAT ARE USING BIGDATA:
▪ Banking
  ➢ Risk and fraud management
  ➢ Customer analytics
▪ Telecommunications
  ➢ Call detail record processing
  ➢ Customer profiling
▪ Health care
  ➢ Medical record text analytics
  ➢ Genomic analysis
▪ Digital media
  ➢ Real-time ad targeting
  ➢ Website analysis
▪ Government
  ➢ Abuse and fraud management
  ➢ Customer analytics
TECHNOLOGY:
▪ Big data is driven mainly by open source initiatives such as:
  ➢ Apache™ Hadoop project
  ➢ Apache™ Cassandra project
  ➢ Apache™ HBase project
  ➢ Apache™ Hive project
  ➢ Apache™ Solr project
HADOOP:
➢ What is Hadoop?
  ▪ A flexible infrastructure for large-scale computation and data processing on a network of commodity hardware.
  ▪ Hadoop is written mainly in Java.
  ▪ Hadoop is open source and is distributed under the Apache license.
➢ Hadoop is not:
  ▪ Simply a file system or a database (it bundles a distributed file system, HDFS, with a processing framework).
  ▪ A replacement for existing data warehouse systems, or for all programming logic.
  ▪ An online transaction processing (OLTP) system.
WHEN SHOULD WE GO FOR HADOOP?
➢ When the data is too large for conventional systems
➢ When the processing tasks are independent of one another
➢ For online analytical processing (OLAP)
➢ For better scalability
➢ For unstructured data
➢ For parallelism
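
The points about independent processes and parallelism can be illustrated with a small map-and-reduce style word count in plain Python. This is a sketch of the general idea only, not Hadoop's API: the function names and sample chunks are hypothetical, and threads on one machine stand in for the nodes of a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def map_count(chunk):
    """Map step: count words in one chunk, independently of all other chunks."""
    return Counter(chunk.lower().split())

def reduce_counts(partials):
    """Reduce step: merge the per-chunk counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

def word_count(chunks, workers=4):
    # Threads stand in for cluster nodes; each chunk is processed on its own.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_count, chunks))
    return reduce_counts(partials)

# Hypothetical input, already split into chunks the way a cluster splits a file.
chunks = [
    "big data big insights",
    "data moves fast",
    "big decisions delivered",
]
totals = word_count(chunks)
print(totals.most_common(2))
```

Because no chunk needs the result of any other chunk, adding workers (or, on a cluster, machines) lets more chunks be processed at once. Work whose steps depend on each other does not split this cleanly, which is why independence appears in the list above.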
-- BACK TO BIGDATA --
ADVANTAGES:
➢ One of the largest and fastest-growing markets
➢ Leveraging big data for insights can enhance productivity and competitiveness for companies
➢ Harnessing big data will enable businesses to improve market intelligence
➢ Latest trend for IT professionals in the area of data analytics
RISKS OF BIGDATA:
➢ Being overwhelmed by the data
  ▪ Need the right people solving the right problems
➢ Technological considerations
  ▪ Open source
  ▪ Scalability and performance issues
➢ Privacy: many sources of big data contain private information
  ▪ Self-regulation
  ▪ Legal regulation
THANK YOU!

Decisions! Delivered!!
