BIG DATA
Done by,
B.SENTHIL PANDI(17MC040)
II YEAR MECHATRONICS
SNS COLLEGE OF TECHNOLOGY
OUTLINE :
• INTRODUCTION
• CHARACTERISTICS OF BIG DATA
• PROCESSES IN BIG DATA
• BIG DATA SOURCES
• TYPES OF TOOLS
• BIG DATA STATISTICS
• APPLICATIONS OF BIG DATA
• ADVANTAGES AND DISADVANTAGES
• CONCLUSION
INTRODUCTION :
• Big data refers to data sets that are too large or complex
for traditional data-processing application software to
adequately deal with.
• Firms like Google, eBay, LinkedIn and Facebook were
built around big data from the beginning.
• Big data generates value by storing and processing very
large quantities of digital information that cannot be
analyzed with traditional computer systems.
CHARACTERISTICS OF BIG DATA :
VOLUME :
• A typical PC might have had 10 GB of storage in 2000.
• Today, Facebook ingests 500 terabytes of new
data every day.
• A Boeing 737 generates 240 terabytes of flight
data during a single flight across the US.
• Thus big data sources are very large in
volume.
VELOCITY :
• Clickstreams and ad impressions capture user behaviour at
millions of events per second.
• Machine-to-machine processes exchange data between
billions of devices.
• Infrastructure and sensors generate massive log data in real
time.
• Online gaming systems support millions of concurrent
users, each producing multiple inputs per second.
VARIETY :
• Big data is not just numbers but also a collection of
3D data, audio and video, and unstructured text,
including log files and social media.
• Traditional database systems are designed to
store only small amounts of information.
• Big data analysis includes different types of data.
PROCESSES IN BIG DATA :
I. INTEGRATING DISPARATE DATA STORES :
• Mapping data to the programming framework.
• Connecting and extracting data from storage.
• Transforming data for processing.
• Subdividing data in preparation for Hadoop MapReduce.
II. EMPLOYING HADOOP MAPREDUCE :
• Creating the components of Hadoop MapReduce jobs.
• Distributing data processing across server farms.
• Executing Hadoop MapReduce jobs.
• Monitoring the progress of job flows.
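The steps above can be illustrated with a small word-count sketch. This is not Hadoop code: it is a single-process Python imitation of the split, map, shuffle and reduce stages, and every function name here (split_input, map_phase, shuffle, reduce_phase, word_count) is a hypothetical name chosen for illustration.

```python
from collections import defaultdict


def split_input(lines, num_splits):
    """Subdivide the input into roughly equal splits, one per mapper."""
    return [lines[i::num_splits] for i in range(num_splits)]


def map_phase(split):
    """Emit a (word, 1) pair for every word in one split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(mapped_pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines, num_splits=2):
    """Run the whole pipeline in one process (assumption: no real cluster)."""
    mapped = []
    for split in split_input(lines, num_splits):
        # On a real cluster each split could be mapped on a different server.
        mapped.extend(map_phase(split))
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
```

In a real Hadoop deployment the framework handles the splitting, shuffling and job monitoring; only the map and reduce logic is written by the developer.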
BIG DATA SOURCES :
DATA GENERATION POINTS :
• Mobile devices
• Microphones
• Readers/Scanners
• Science facilities
• Programs/Software
• Social media
• Cameras
TYPES OF TOOLS :
• Where is the data processing hosted?
Distributed servers/cloud (e.g. Amazon EC2).
• Where is the data stored?
Distributed storage (e.g. Amazon S3).
• How is the data stored and indexed?
High-performance schema-free databases (e.g. MongoDB).
• What operations are performed on the data?
Analytic/Semantic processing.
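To make "schema-free" and "indexed" concrete, here is a minimal in-memory sketch. It does not use MongoDB or any real database API; the DocumentStore class and its methods are hypothetical, and it assumes indexed field values are hashable.

```python
from collections import defaultdict


class DocumentStore:
    """Toy schema-free store: each document is a dict with any fields."""

    def __init__(self):
        self.documents = []
        self.indexes = {}  # field name -> {value -> [document positions]}

    def create_index(self, field):
        """Build a lookup table from field value to document positions."""
        index = defaultdict(list)
        for position, document in enumerate(self.documents):
            if field in document:
                index[document[field]].append(position)
        self.indexes[field] = index

    def insert(self, document):
        """Store a document and keep existing indexes up to date."""
        position = len(self.documents)
        self.documents.append(document)
        for field, index in self.indexes.items():
            if field in document:
                index[document[field]].append(position)

    def find(self, field, value):
        """Use the index when one exists, otherwise scan every document."""
        if field in self.indexes:
            return [self.documents[p] for p in self.indexes[field].get(value, [])]
        return [d for d in self.documents if d.get(field) == value]


if __name__ == "__main__":
    store = DocumentStore()
    store.create_index("source")
    # Documents need not share the same fields.
    store.insert({"source": "camera", "resolution": "1080p"})
    store.insert({"source": "social media", "text": "hello", "likes": 3})
    print(store.find("source", "camera"))
```

The index trades extra memory for faster lookups, which is the same trade-off real document databases make at much larger scale.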
BIG DATA STATISTICS :
• Facebook generates 10 TB of data daily.
• Twitter generates 7 TB of data daily.
• IBM claims 90% of today’s stored data was
generated in the last two years.
• Every day we create 2.5 quintillion bytes of data.
• Walmart handles more than 1 million customer
transactions every hour.
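To put these figures on a common scale, the short sketch below converts them into familiar units. It assumes decimal (SI) units, where 1 TB is 10^12 bytes and 1 EB is 10^18 bytes; the helper names are hypothetical.

```python
BYTES_PER_TERABYTE = 10**12  # decimal (SI) terabyte, an assumption
BYTES_PER_EXABYTE = 10**18   # decimal (SI) exabyte, an assumption
SECONDS_PER_HOUR = 3600


def bytes_to_terabytes(num_bytes):
    """Convert a byte count to terabytes."""
    return num_bytes / BYTES_PER_TERABYTE


def bytes_to_exabytes(num_bytes):
    """Convert a byte count to exabytes."""
    return num_bytes / BYTES_PER_EXABYTE


def per_second(events_per_hour):
    """Convert an hourly rate to a per-second rate."""
    return events_per_hour / SECONDS_PER_HOUR


if __name__ == "__main__":
    daily_bytes = 2_500_000_000_000_000_000  # 2.5 quintillion bytes per day
    print("Exabytes per day:", bytes_to_exabytes(daily_bytes))
    print("Terabytes per day:", bytes_to_terabytes(daily_bytes))
    # Walmart figure from the slide: 1 million transactions per hour.
    print("Transactions per second:", round(per_second(1_000_000), 2))
```

In these units, 2.5 quintillion bytes is 2.5 exabytes, or 2.5 million terabytes, per day, and 1 million transactions per hour is roughly 278 transactions per second.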
APPLICATIONS OF BIG DATA :
ADVANTAGES OF BIG DATA :
• It helps improve science and research. It improves
healthcare and public health through the availability of
patient records.
• It helps in financial trading, sports, polling, security/law
enforcement etc.
• Anyone can access vast information via surveys and
get an answer to any query.
• New data is added every second.
• One platform carries unlimited information.
DISADVANTAGES OF BIG DATA :
➨Traditional storage can cost a lot of money to store
big data.
➨A lot of big data is unstructured.
➨Big data analysis can violate principles of privacy.
➨Big data analysis results are sometimes
misleading.
➨Rapid updates in big data can cause mismatches with real
figures.
CONCLUSION :
• Big data brings new and existing
opportunities to companies that use the
platforms available.
• In this information era, big data technology
has its own importance for businesses.
• It offers a lot of opportunities in the
upcoming days.
THANK YOU
