Introduction to Hadoop at Data-360 Conference

A short introduction to Hadoop mostly with live industry examples and scenarios.

Published in: Technology, Business

  • Slides are for reference only; we will learn more from live examples and live discussion. Let's see who is who in the room. How many coders? Program managers? Any Hadoop stories? And where is Hadoop's headquarters?
  • Introduction to Hadoop at Data-360 Conference

    1. avkash@bigdataperspective.com
    2. http://www.packtpub.com/using-cloudera-impala/book http://www.amazon.com/Simplifying-Windows-Azure-HDInsight-Service/dp/0735673802 https://www.linkedin.com/in/avkashchauhan
    3. Hadoop is an open source, Java-based, scalable, fault-tolerant platform for storing and processing large amounts of unstructured data, distributed across machines.
    4. Flexibility: a single repository for storing and analyzing any kind of data, not bound by a schema. Scalability: a scale-out architecture divides the workload across multiple nodes using a flexible distributed file system. Low cost: deployed on commodity hardware and an open source platform. Fault tolerant: continues working even if node(s) go down.
    5. A system that moves computation to where the data is.
    6. Hadoop Common, HDFS, Map/Reduce
    7. Hadoop Common, HDFS, MapReduce
    8. Cloudera Impala, Hortonworks Tez. Impala uses C++-based in-memory processing of HDFS data through SQL-like statements to expedite data processing. Use cases include user collaborative filtering, user recommendations, clustering and classification.
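The MapReduce model named on slides 6 and 7 can be sketched, purely for illustration, as a word count in plain Python. This is a single-process toy and not Hadoop's Java API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would run the map and reduce steps on the nodes that hold the data blocks.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, mimicking the step between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

Each call to `map_phase` depends only on its own line, which is what would let a framework run many such calls in parallel on different machines and only move the much smaller (word, count) pairs across the network.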