Intro to Big Data

Introduction to Big Data.

Published in: Technology

Transcript

  • 1. Presented by: Jon Bloom, Senior Consultant, Agile Bay, Inc.
  • 2. Customers & Partners Jon Bloom Blog: http://www.bloomconsultingbi.com Twitter: @sqljon Linked-in: http://www.linkedin.com/in/BloomConsultingBI Email: jbloom@AgileBay.com
  • 3. www.agilebay.com
  • 4. Session Agenda
      • What is Big Data?
      • What is Hadoop?
      • BI vs. Hadoop
      • Demo:
  • 5. Terms and Acronyms
      • Hadoop: Apache open source project to develop software for reliable, scalable, distributed computing.
      • Cluster: A group of computers (nodes) linked together to perform highly available, compute-intensive work.
      • HDFS: A distributed file system that provides high-throughput access to application data.
      • YARN: A framework for job scheduling and cluster resource management.
      • MapReduce: A system for parallel processing of large data sets.
  • 6. What is Big Data?  Volume, Velocity, Variety
  • 7. What is Hadoop?
      • Apache open source project
      • Batch-oriented parallel processing across commodity servers
      • Ecosystem: Ambari, HBase, Avro, Cassandra, Chukwa, Hive, Mahout, Pig, ZooKeeper
  • 8. Distributed Computing & MapReduce: Mapper, Reducer
  • 9. BI vs. Hadoop
      • Hadoop is not a replacement for BI
      • Extends BI capabilities
      • BI = scales up to 100s of gigabytes
      • Hadoop = from 100s of gigabytes to terabytes (1,000s of gigabytes) and petabytes (1,000,000 gigabytes)
  • 10. Blog: www.bloomconsultingbi.com Twitter: @sqljon Linked-in: http://www.linkedin.com/in/BloomConsultingBI Email: jbloom@agilebay.com
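
Slide 8 names the Mapper and Reducer roles without showing them in action. As a rough illustration, here is a minimal single-process sketch of the MapReduce pattern applied to word counting. It is not Hadoop code and not taken from the presentation: the function names (`mapper`, `shuffle`, `reducer`, `word_count`) and the sample input are assumptions made for this example, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def mapper(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key; stands in for the framework's shuffle/sort phase."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all counts for one word into a single total."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce sequentially in one process."""
    pairs = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(pairs)
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, not from the slides.
    print(word_count(["big data", "big clusters", "data"]))
```

The point of the split is that each `mapper` call only sees one line and each `reducer` call only sees one key, so both steps can be spread across many commodity servers without sharing state.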