Hadoop Summit 2010 Frameworks Panel: Elephant Bird
Published in: Technology

  • Transcript

    • 1. Hadoop Frameworks
      • Kevin Weil @kevinweil
      Twitter
    • 2.
      • A framework for working with structured data within the Hadoop ecosystem
      Elephant Bird
    • 3.
      • A framework for working with structured data within the Hadoop ecosystem
        • Protocol Buffers
        • Thrift
        • JSON
        • W3C Logs
      Elephant Bird
    • 4.
      • A framework for working with structured data within the Hadoop ecosystem
        • InputFormats
        • OutputFormats
        • Hadoop Writables
        • Pig LoadFuncs
        • Pig StoreFuncs
        • HBase LoadFuncs
      Elephant Bird
    • 5.
      • A framework for working with structured data within the Hadoop ecosystem… plus:
        • LZO Compression
        • Code Generation
        • Hadoop Counter Utilities
        • Misc Pig UDFs
      Elephant Bird
    • 6.
      • You should only need to specify the data schema
      Why?
    • 7.
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
      Why?
    • 8.
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
      • Everything else can be codegen’d.
      Why?
    • 9.
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
      • Everything else can be codegen’d.
      • Less Code. Efficient Storage. Focus on the Data.
      Why?
    • 10.
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
      • Everything else can be codegen’d.
      • Less Code. Efficient Storage. Focus on the Data.
      • Underlies 20,000 Hadoop jobs at Twitter every day.
      Why?
    • 11.
      • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
      • Everything else can be codegen’d.
      • Less Code. Efficient Storage. Focus on the Data.
      • Underlies 20,000 Hadoop jobs at Twitter every day.
      • http://github.com/kevinweil/elephant-bird : contributors welcome!
      Why?
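
The "specify only the schema, codegen everything else" idea from the slides above can be illustrated with a minimal Python sketch. This is not Elephant Bird's API (Elephant Bird is a Java library that works with Protocol Buffers and Thrift definitions). The schema layout, the `generate_record_class` helper, and the `Status` record are all hypothetical names chosen for illustration, and JSON lines stand in for a real wire format.

```python
import json
from dataclasses import asdict, field, make_dataclass

# Hypothetical schema: field name -> (type, default value).
# This layout is an assumption for the sketch, not an Elephant Bird format.
STATUS_SCHEMA = {
    "id": (int, 0),
    "user": (str, ""),
    "text": (str, ""),
}


def generate_record_class(name, schema):
    """Build a record class whose reader and writer derive from the schema alone."""
    fields = [
        (fname, ftype, field(default=default))
        for fname, (ftype, default) in schema.items()
    ]
    cls = make_dataclass(name, fields)

    def to_line(self):
        # Writer: serialize every schema field as one JSON line.
        return json.dumps(asdict(self), sort_keys=True)

    def from_line(line):
        # Reader: fields unknown to this schema are dropped (data written by
        # a newer schema), and missing fields fall back to their defaults
        # (data written by an older schema).
        raw = json.loads(line)
        known = {k: schema[k][0](v) for k, v in raw.items() if k in schema}
        return cls(**known)

    cls.to_line = to_line
    cls.from_line = staticmethod(from_line)
    return cls


# Everything below the schema is generated; no hand-written parsing code.
Status = generate_record_class("Status", STATUS_SCHEMA)
```

In this sketch, a record written with an extra field or with fields missing still loads, which is the kind of forward-backward compatibility the slides attribute to the schema formats themselves.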
