Hadoop Summit 2010 Frameworks Panel: Elephant Bird
Published in: Technology

  • Transcript

    • 1. Hadoop Frameworks
      • Kevin Weil (@kevinweil), Twitter
    • 2. Elephant Bird
      • A framework for working with structured data within the Hadoop ecosystem
    • 3. Elephant Bird
      • A framework for working with structured data within the Hadoop ecosystem
        • Protocol Buffers
        • Thrift
        • JSON
        • W3C Logs
    • 4. Elephant Bird
      • A framework for working with structured data within the Hadoop ecosystem
        • InputFormats
        • OutputFormats
        • Hadoop Writables
        • Pig LoadFuncs
        • Pig StoreFuncs
        • HBase LoadFuncs
    • 5. Elephant Bird
      • A framework for working with structured data within the Hadoop ecosystem… plus:
        • LZO Compression
        • Code Generation
        • Hadoop Counter Utilities
        • Misc Pig UDFs
    • 6. Why?
      • You should only need to specify the data schema
    • 7. Why?
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
    • 8. Why?
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
      • Everything else can be codegen’d.
    • 9. Why?
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
      • Everything else can be codegen’d.
      • Less Code. Efficient Storage. Focus on the Data.
    • 10. Why?
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
      • Everything else can be codegen’d.
      • Less Code. Efficient Storage. Focus on the Data.
      • Underlies 20,000 Hadoop jobs at Twitter every day.
    • 11. Why?
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
      • Everything else can be codegen’d.
      • Less Code. Efficient Storage. Focus on the Data.
      • Underlies 20,000 Hadoop jobs at Twitter every day.
      • http://github.com/kevinweil/elephant-bird: contributors welcome!
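
The "specify only the schema, codegen everything else" idea from the final slides can be illustrated with a toy sketch. This is not Elephant Bird's generator and does not use its API: the schema format (a plain dict), the names `STATUS_SCHEMA` and `generate_record_class`, and the JSON-lines encoding are all assumptions made for illustration. In the talk's setting the schema would be a Protocol Buffers or Thrift definition, and the generated pieces would be InputFormats, Writables, and Pig LoadFuncs.

```python
import json
from dataclasses import asdict, make_dataclass

# Hypothetical schema: field name -> Python type. This stands in for a
# Protocol Buffers or Thrift definition; it is not a real Elephant Bird input.
STATUS_SCHEMA = {"id": int, "user": str, "text": str}


def generate_record_class(name, schema):
    """Derive a record class and its JSON-line reader/writer from a schema alone."""
    # The record type is built at runtime from the schema, so no
    # hand-written class is needed for each new kind of record.
    cls = make_dataclass(name, list(schema.items()))

    def from_json_line(line):
        raw = json.loads(line)
        values = {}
        for field_name, field_type in schema.items():
            if field_name in raw:
                values[field_name] = field_type(raw[field_name])
            else:
                # Missing field: fall back to the type's zero value, so old
                # data stays readable after the schema gains a field.
                values[field_name] = field_type()
        # Fields not in the schema are dropped, so data from newer writers
        # does not break older readers.
        return cls(**values)

    def to_json_line(record):
        return json.dumps(asdict(record), sort_keys=True)

    return cls, from_json_line, to_json_line


Status, read_status, write_status = generate_record_class("Status", STATUS_SCHEMA)
record = read_status('{"id": 1, "user": "kevinweil", "text": "hi", "extra": true}')
print(write_status(record))
```

Adding a new record type under this sketch means writing one more schema dict; the reader, writer, and tolerance for missing or unknown fields come for free, which is the spirit of the "Less Code" point above.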