×
  • Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
 

Introducing Apache Hadoop: The Modern Data Operating System - Stanford EE380

by CTO, Founder at Cloudera, Inc on Nov 23, 2011

  • 5,741 views

Sophisticated data instrumentation and collection technologies are leading to unprecedented growth. Data-driven organizations need to be able to scalably store data and perform complex data processing ...

Sophisticated data instrumentation and collection technologies are leading to unprecedented growth. Data-driven organizations need to be able to scalably store data and perform complex data processing on the collected data(i.e. not just "queries"). Given the unstructured nature of the source data, and the need to stay agile, organizations also need to be able to change their schemas dynamically (at read-time vs write-time). Apache Hadoop is an open-source distributed fault-tolerant system that leverages commodity hardware to achieve large-scale agile data storage and processing. In this presentation, Dr. Amr Awadallah will introduce the design principles behind Apache Hadoop and explain the architecture of its core sub-systems (the Hadoop Distributed File System and MapReduce). Amr will also contrast Hadoop to relational database systems and illustrate how they truly complement each other. Finally, Amr will cover the Hadoop ecosystem at large which includes a number of projects that together form a cohesive Data Operating System for the modern data center.

Statistics

Views

Total Views
5,741
Views on SlideShare
4,208
Embed Views
1,533

Actions

Likes
8
Downloads
188
Comments
0

10 Embeds 1,533

http://www.scoop.it 1427
http://www.bigdatacloud.com 54
http://www.linkedin.com 21
http://paper.li 12
https://twitter.com 7
http://a0.twimg.com 7
http://webcache.googleusercontent.com 2
http://tweetedtimes.com 1
http://translate.googleusercontent.com 1
https://www.linkedin.com 1
More...

Accessibility

Categories

Upload Details

Uploaded via SlideShare as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
Post Comment
Edit your comment

Introducing Apache Hadoop: The Modern Data Operating System  - Stanford EE380 Introducing Apache Hadoop: The Modern Data Operating System - Stanford EE380 Presentation Transcript