Big Data in the Microsoft Platform

  3. ETL Tools, BI Reporting, RDBMS; ZooKeeper (Coordination); Pig (Data Flow); Hive (SQL); Sqoop; Avro (Serialization); MapReduce (Job Scheduling/Execution System); HBase (key-value store); (Streaming/Pipes APIs); HDFS (Hadoop Distributed File System)
  4. Block Size = 64 MB; Replication Factor = 3; Cost/GB is a few ¢/month vs $/month
  5. HDFS Demo
  7. MapReduce Demo
  8. Hacking with Hive
  10. Rocking Data Processing with Pig
  11. Bulk Data Loading Using Sqoop
  12. HDInsight Service Overview
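
The HDFS figures on slide 4 (64 MB blocks, replication factor 3) can be turned into a quick sizing calculation. The following is a minimal sketch, not taken from the deck: the function name `hdfs_footprint` and the 1000 MB example file are assumptions made for illustration. It relies on the fact that HDFS does not pad the final block of a file, so raw storage is the file size times the replication factor.

```python
import math

BLOCK_SIZE_MB = 64       # block size quoted on slide 4
REPLICATION_FACTOR = 3   # replication factor quoted on slide 4


def hdfs_footprint(file_size_mb):
    """Return (blocks, block_replicas, raw_storage_mb) for one file.

    Hypothetical helper for illustration only; it is not part of any
    Hadoop API.
    """
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / BLOCK_SIZE_MB)
    # Every block is stored REPLICATION_FACTOR times across the cluster.
    block_replicas = blocks * REPLICATION_FACTOR
    # A partial final block only occupies its actual size on disk.
    raw_storage_mb = file_size_mb * REPLICATION_FACTOR
    return blocks, block_replicas, raw_storage_mb


if __name__ == "__main__":
    # Assumed example: a 1000 MB file.
    print(hdfs_footprint(1000))
```

With these settings a file consumes three times its logical size in raw disk, which is why the low cost per GB noted on the same slide matters for this storage model.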
