Hadoop hive presentation
 

Hadoop seminar topic, Hadoop CSE, Hadoop PPT

    Presentation Transcript

    • Hadoop
    • Agenda
      • Problems with traditional large-scale systems
      • Requirements for new approaches
      • What is Hadoop?
      • Why Hadoop?
      • Overview of Hadoop
      • HDFS
      • MapReduce
      • Applications
      • Conclusion
    • Problems with traditional large-scale systems
      • Data volume grows day by day
      • Network failures
      • Server failures
      • Loss of data
      • High cost
      • Distributed computing requires manual processing
    • Requirements for new approaches
      • Data should be stored in a distributed manner and processed in parallel
      • High performance at low cost
      • Should be scalable
      • Should be simple to access and process
      • Should be fault tolerant
    • What is Hadoop?
      • An open-source framework
      • Processes large amounts of data
    • Why Hadoop?
      • Accessible
      • Scalable
      • Robust
      • Simple
    • Overview of Hadoop
      • It handles 3 types of data
        • Structured
        • Semi-structured
        • Unstructured
      • Analyzes and processes large amounts of data (petabytes)
    • Comparison with traditional databases
      • RDBMS
        • Stores gigabytes of data
        • Supports both batch and interactive processing
        • Allows updates
        • Schemas must be defined
        • Handles only structured data
      • Hadoop
        • Stores petabytes of data
        • Supports only batch processing
        • Does not allow updates; it follows WORM (write once, read many)
        • Schemas are not required
        • Supports all 3 types of data
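
The WORM (write once, read many) behaviour in the comparison above can be illustrated with a toy in-memory store. This is only a sketch of the idea; `WriteOnceStore` and the path used are hypothetical names and are not part of any Hadoop API.

```python
class WriteOnceStore:
    """Toy in-memory store: each path can be written once, then only read."""

    def __init__(self):
        self._files = {}

    def write(self, path, data):
        # A second write to the same path is rejected instead of updating it.
        if path in self._files:
            raise PermissionError(f"{path} already exists; updates are not allowed")
        self._files[path] = bytes(data)

    def read(self, path):
        return self._files[path]


store = WriteOnceStore()
store.write("/logs/day1", b"first version")
try:
    store.write("/logs/day1", b"second version")
    overwritten = True
except PermissionError:
    overwritten = False
```

Reads can be repeated as often as needed, while the rejected second write leaves the original contents untouched.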
    • Components
      • Hadoop can be divided into 2 parts
        1. HDFS (Hadoop Distributed File System)
        2. MapReduce programming model
    • Hadoop Distributed File System
      • It is a distributed file system
      • Runs on commodity hardware
      • Provides high-throughput access to application data
      • Suitable for applications that have large data sets
      • Designed to store very large amounts of data (terabytes or petabytes)
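
To make "distributed file system" more concrete, the following sketch shows one way a large file could be cut into fixed-size blocks, with each block copied to several machines. The block size, the replication factor, the node names, and the round-robin placement are all assumptions made for illustration; they are not taken from the slides, and real HDFS placement is more involved.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size in bytes
REPLICATION = 3                  # assumed number of copies per block


def plan_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return, for each block of the file, the list of nodes holding a copy."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = math.ceil(file_size / block_size)
    plan = []
    for block_id in range(num_blocks):
        # Simple round-robin placement (an assumption for this sketch).
        holders = [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        plan.append(holders)
    return plan


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical machine names
plan = plan_blocks(file_size=300 * 1024 * 1024, nodes=nodes)
```

A 300 MB file needs three blocks under these assumptions, and every block ends up on three different machines, so the loss of one machine does not lose any block.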
    • Core Architectural Goal of HDFS
      • An HDFS instance may consist of thousands of server machines
      • Faults must be detected and recovered from quickly and automatically
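
Automated fault detection and recovery can be sketched as two steps: find machines that have stopped reporting in, then copy their blocks to healthy machines. The heartbeat timestamps, the timeout, and the function names below are hypothetical and only illustrate the idea, not the actual HDFS mechanism.

```python
def find_dead_nodes(last_heartbeat, now, timeout):
    """Nodes whose last heartbeat is older than `timeout` seconds."""
    return {node for node, seen in last_heartbeat.items() if now - seen > timeout}


def re_replicate(block_map, dead, live_nodes, replication):
    """Drop dead holders and add live ones until each block has `replication` copies."""
    repaired = {}
    for block, holders in block_map.items():
        alive = [n for n in holders if n not in dead]
        candidates = [n for n in live_nodes if n not in alive]
        needed = max(0, replication - len(alive))
        repaired[block] = alive + candidates[:needed]
    return repaired


last_heartbeat = {"node1": 100.0, "node2": 70.0, "node3": 99.0, "node4": 98.0}
dead = find_dead_nodes(last_heartbeat, now=101.0, timeout=10.0)
live = sorted(set(last_heartbeat) - dead)
block_map = {"blk_0": ["node1", "node2"], "blk_1": ["node3", "node4"]}
repaired = re_replicate(block_map, dead, live, replication=2)
```

Here `node2` has been silent for too long, so `blk_0` gets a new copy on a live node while `blk_1` is left unchanged.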
    • MapReduce Programming Model
      • MapReduce applies a divide-and-conquer approach to the data
      • Schedules execution across a set of machines
      • Manages inter-process communication
      • The reducer processes the output of all mappers and produces the final output
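
The divide-and-conquer flow can be sketched on a single machine: the input is split into chunks, each chunk is handled by its own mapper, and one reducer combines the partial results. In this sketch threads stand in for machines, and the sum-of-squares task is an assumption chosen only to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide the input into roughly equal chunks."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def mapper(chunk):
    """Each mapper works on its own chunk only."""
    return sum(x * x for x in chunk)


def reducer(partials):
    """The reducer combines the output of all mappers."""
    return sum(partials)


data = list(range(1, 11))
chunks = split(data, parts=3)
with ThreadPoolExecutor(max_workers=3) as pool:  # threads stand in for machines
    partials = list(pool.map(mapper, chunks))
total = reducer(partials)
```

The result equals what a single sequential pass over `data` would produce, which is the property that lets the work be spread across machines.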
    • MapReduce Programming Model
      • Map
        • A map() function processes a key/value pair to generate a set of intermediate key/value pairs
      • Reduce
        • A reduce() function merges all intermediate values associated with the same intermediate key
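
A word count is a common way to illustrate the two functions. The sketch below runs in plain Python rather than on Hadoop; `map_fn`, `reduce_fn`, and `run_job` are hypothetical names, and the grouping step inside `run_job` stands in for the work the framework does between the map and reduce phases.

```python
from collections import defaultdict


def map_fn(key, value):
    """key: line number, value: line text -> list of (word, 1) pairs."""
    return [(word.lower(), 1) for word in value.split()]


def reduce_fn(key, values):
    """key: word, values: all counts emitted for that word -> (word, total)."""
    return key, sum(values)


def run_job(records):
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            # Group intermediate values by their intermediate key.
            groups[out_key].append(out_value)
    return dict(reduce_fn(k, v) for k, v in groups.items())


lines = [(0, "Hadoop stores data"), (1, "Hadoop processes data")]
counts = run_job(lines)
```

Each call to `reduce_fn` sees only the values for one key, so different keys could be reduced on different machines.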
    • Applications
    • Conclusion