Hadoop Summit 2010 Frameworks Panel: Elephant Bird

Transcript

  • 1. Hadoop Frameworks
    • Kevin Weil @kevinweil
    Twitter
  • 2.
    • A framework for working with structured data within the Hadoop ecosystem
    Elephant Bird
  • 3.
    • A framework for working with structured data within the Hadoop ecosystem
      • Protocol Buffers
      • Thrift
      • JSON
      • W3C Logs
    Elephant Bird
  • 4.
    • A framework for working with structured data within the Hadoop ecosystem
      • InputFormats
      • OutputFormats
      • Hadoop Writables
      • Pig LoadFuncs
      • Pig StoreFuncs
      • HBase LoadFuncs
    Elephant Bird
  • 5.
    • A framework for working with structured data within the Hadoop ecosystem… plus:
      • LZO Compression
      • Code Generation
      • Hadoop Counter Utilities
      • Misc Pig UDFs
    Elephant Bird
  • 6.
    • You should only need to specify the data schema
    Why?
  • 7.
    • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
    Why?
  • 8.
    • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
    • Everything else can be codegen’d.
    Why?
  • 9.
    • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
    • Everything else can be codegen’d.
    • Less Code. Efficient Storage. Focus on the Data.
    Why?
  • 10.
    • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
    • Everything else can be codegen’d.
    • Less Code. Efficient Storage. Focus on the Data.
    • Underlies 20,000 Hadoop jobs at Twitter every day.
    Why?
  • 11.
    • You should only need to specify the (flexible, forward-backward compatible, self-documenting) data schema
    • Everything else can be codegen’d.
    • Less Code. Efficient Storage. Focus on the Data.
    • Underlies 20,000 Hadoop jobs at Twitter every day.
    • http://github.com/kevinweil/elephant-bird : contributors welcome!
    Why?
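
The "specify only the schema, codegen the rest" idea from the slides can be illustrated with a small sketch. This is not Elephant Bird's API or its generated code: the schema layout, the `generate_record` helper, and the JSON-lines encoding are all assumptions made for illustration. It shows a record class plus a reader and writer derived from one schema, where the reader tolerates unknown and missing fields to mimic forward-backward compatibility.

```python
import json
from dataclasses import asdict, field, make_dataclass

# Hypothetical schema: field name -> (type, default value).
# This layout is an assumption, not a Protocol Buffers or Thrift definition.
STATUS_SCHEMA = {
    "id": (int, 0),
    "user": (str, ""),
    "text": (str, ""),
}


def generate_record(name, schema):
    """Derive a record class, a line reader, and a line writer from a schema."""
    cls = make_dataclass(
        name,
        [(fname, ftype, field(default=default))
         for fname, (ftype, default) in schema.items()],
    )
    known = set(schema)

    def write(record):
        # One JSON object per line, keys sorted for stable output.
        return json.dumps(asdict(record), sort_keys=True)

    def read(line):
        raw = json.loads(line)
        # Unknown fields are dropped; missing fields fall back to defaults.
        return cls(**{k: v for k, v in raw.items() if k in known})

    return cls, read, write


Status, read_status, write_status = generate_record("Status", STATUS_SCHEMA)

line = write_status(Status(id=1, user="kevinweil", text="hello"))
roundtrip = read_status(line)

# A record written with a newer schema (extra "lang" field, no "user" or "text").
newer = read_status('{"id": 2, "lang": "en"}')
```

In this sketch the only hand-written input is `STATUS_SCHEMA`; the class and both conversion functions follow from it, which is the division of labour the slides argue for.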