Mariusz Gil BIG data ecosystem overview and solutions
- 9. Big Data is data that is too large,
complex and dynamic for any conventional data tools
to capture, store, manage and analyze.
- 20. HDFS: Hadoop Distributed File System
YARN / MapReduce v2: distributed processing framework
HBase: columnar storage
Hive: SQL data warehouse engine
Avro: data serialization
Mahout: scalable machine learning
Pig: scripting for large data sets
Oozie: workflow orchestration
Ambari: provisioning, managing and monitoring clusters
Sqoop: data exchange
ZooKeeper: distributed coordination service
Flume: log collector
Whirr: running cloud services
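
The distributed processing framework named on this slide (MapReduce) is built around three steps: map, shuffle, and reduce. The following is a minimal single-process sketch of that pattern in plain Python, not Hadoop code. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are illustrative assumptions and do not come from the slides.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

In a real cluster the map and reduce steps would run in parallel on many nodes, with the framework handling the shuffle between them; here everything runs in one process only to show the data flow.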
- 22. We can choose from multiple vendors, such as Cloudera, Hortonworks or Amazon.