Big data ecosystem
Big data ecosystem Presentation Transcript

  • Mariusz Gil BIG data ecosystem
  • / ABOUT ME /
  • This talk is about BIG DATA
  • What is... BIG DATA?
  • VOLUME large amounts of data
  • VELOCITY needs to be analyzed quickly
  • VARIETY different types of structured and unstructured data
  • Big Data is data that is too large, complex and dynamic for any conventional data tools to capture, store, manage and analyze.
  • 30 billion pieces of content were added in the past month
  • more than 2 billion videos were watched yesterday
  • more than 58 million messages were sent yesterday
  • / MAIN QUESTIONS /
  • WHY?
  • 49% IMPROVED RISK MANAGEMENT; 32% INCREASED SALES FIGURES; 40% IMPROVED MANAGEMENT CONTROL; 36% IT ANALYSIS; 43% MARKET-ORIENTED PRODUCT DEVELOPMENT; 27% FINANCES AND ECONOMICS
  • 690-node Hadoop cluster for predictions and analytics
  • HOW?
  • HDFS: HADOOP DISTRIBUTED FILE SYSTEM
    YARN / MapReduce v2: DISTRIBUTED PROCESSING FRAMEWORK
    HBASE: COLUMNAR STORAGE
    HIVE: SQL DATA WAREHOUSE ENGINE
    AVRO: DATA SERIALIZATION
    MAHOUT: SCALABLE MACHINE LEARNING
    PIG: SCRIPTING FOR LARGE DATA SETS
    OOZIE: WORKFLOW ORCHESTRATION
    SQOOP: DATA EXCHANGE
    ZOOKEEPER: DISTRIBUTED COORDINATION SERVICE
    FLUME: LOG COLLECTOR
    AMBARI: PROVISIONING, MANAGING AND MONITORING CLUSTERS
    WHIRR: RUNNING CLOUD SERVICES
  • We can choose from multiple VENDORS like Cloudera, Hortonworks or Amazon
  • Even from...
  • Can we get results FASTER?
  • Cloudera Impala Storm Apache Drill
  • thanks
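
The HOW? slide names MapReduce as the distributed processing framework but does not show what a job looks like. As a rough illustration only, here is a minimal single-process sketch of the map, shuffle and reduce steps using a word count. It does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made up for this example, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a real framework this grouping happens between the map and
    reduce steps; here it is a plain in-memory dictionary.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the talk.
    lines = ["big data is big", "data needs tools"]
    print(word_count(lines))
```

Each step only sees its own slice of the data (one line in the map step, one key in the reduce step), which is what lets a framework such as Hadoop spread the work over many machines.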