Webinar: The Future of Hadoop
 

With a community of over 500 contributors, Apache Hadoop and related projects are evolving at an ever increasing rate. Join the co-creator of Apache Hadoop, Doug Cutting, and Cloudera’s Chief Scientist, Jeff Hammerbacher, for a discussion of the most exciting new features being developed by the Apache Hadoop community.


    Presentation Transcript

    • The Future of Hadoop Doug Cutting | A Founder of Apache HadoopJeff Hammerbacher | Chief Scientist, Cloudera Welcome to the webinar! Audio/Telephone: +1 (215) 383-1016 Access Code: 421-634-457 Audio Pin: Shown after joining the Webinar Hadoop, Hbase, Pig, Hive, Bigtop, Avro, Flume & Whirr are trademark of the Apache Software Foundation
    • Housekeeping▪ All lines are on mute▪ Ask questions at any time using the Questions panel on GoToMeeting▪ Slides and recording will be available on www.cloudera.com/events ©2011 Cloudera, Inc. All Rights Reserved.
    • Presentation Outline▪ 1. Context▪ 2. Apache Bigtop▪ 3. Apache Hadoop Core▪ 4. Apache HBase, Hive, and Pig▪ 5. Other components▪ Questions and Discussion ©2011 Cloudera, Inc. All Rights Reserved.
    • 1. Context
    • Context: Data
      ▪ 1.8 ZB will be created and replicated in 2011
        ▪ Up 9x in the last five years
        ▪ More than 90% of this data is unstructured
        ▪ Enterprises have some liability for 80% of this data
        ▪ Enterprises will spend $4T on managing data in 2011
        ▪ Source: IDC Digital Universe Report 2011
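
As a rough worked example of the growth figure on this slide: if the 9x increase over five years is treated as smooth compound growth (an assumption; the slide gives only the endpoints), the implied yearly growth factor and the implied starting volume can be computed as follows. The variable names are illustrative.

```python
# Figures taken from the slide: 1.8 ZB in 2011, up 9x in the last five years.
volume_2011_zb = 1.8
growth_factor = 9.0
years = 5

# Assumption: growth compounds evenly each year.
annual_factor = growth_factor ** (1 / years)

# Volume implied for five years earlier.
volume_five_years_ago_zb = volume_2011_zb / growth_factor

print(f"annual growth: {annual_factor:.2f}x")
print(f"implied volume five years earlier: {volume_five_years_ago_zb:.1f} ZB")
```

Under that assumption the data volume grows by roughly 55% per year.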
    • Context: Hadoop
      ▪ Apache Hadoop and related software are designed for this world
      ▪ Volume
        ▪ Commodity hardware and open source software lowers cost and increases capacity
      ▪ Velocity
        ▪ Data ingest speed aided by append-only and schema-on-read design
      ▪ Variety
        ▪ Multiple tools to structure, process, and access data
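
The "append-only and schema-on-read" point can be illustrated with a minimal sketch in plain Python. This is not Hadoop code; `ingest`, `raw_log`, and `read_bytes_per_user` are hypothetical names. Records are appended exactly as they arrive, and a schema is applied only when a consumer reads them.

```python
import json

# Ingest side: append raw lines as-is; nothing is validated or reshaped on write.
raw_log = []

def ingest(line):
    raw_log.append(line)

ingest('{"user": "a", "bytes": "120"}')
ingest('{"user": "b", "bytes": "80", "agent": "curl"}')  # an extra field is accepted

# Read side: each consumer picks the fields and types it needs at query time.
def read_bytes_per_user(lines):
    totals = {}
    for line in lines:
        record = json.loads(line)
        user = record["user"]
        totals[user] = totals.get(user, 0) + int(record["bytes"])
    return totals

totals = read_bytes_per_user(raw_log)
print(totals)
```

Because writing involves no parsing or validation, ingest stays cheap, and a second consumer can later read the same lines with a different schema.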
    • Context: Hadoop
    • Context: HDFS and MapReduce
      ▪ Apache Hadoop = HDFS + MapReduce
        ▪ Similar to kernel of an operating system
        ▪ Referred to as “Hadoop Core”
      ▪ Related components are often deployed with Hadoop
        ▪ For example: HBase, Hive, Pig, Oozie, Flume, Sqoop
        ▪ Together, these components form a “Hadoop Stack”
        ▪ Not all components must be deployed
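
For readers new to the MapReduce half of "Hadoop Core", the programming model can be sketched in a few lines of single-process Python. This only illustrates the map, shuffle, and reduce steps; it does not use the Hadoop API, and the function names are illustrative.

```python
from collections import defaultdict

def map_phase(line):
    # Emit a (key, value) pair for every word in one input line.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Group all values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Combine the grouped values for one key into a single result.
    return key, sum(values)

lines = ["hadoop stores data", "hadoop processes data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

In a real cluster the input lines would be blocks of files stored in HDFS, and the map and reduce calls would run in parallel on many machines.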
    • Context: Bigtop
      ▪ What standards should all components follow?
      ▪ How can we ensure all components of the stack work together?
      ▪ How can we find the right version of each component?
      ▪ How can we make it easy to install an additional component?
    • 2. Apache Bigtop
    • Apache Bigtop
      ▪ Now incubating at Apache
      ▪ Hadoop ecosystem-wide project, including:
        ▪ Interoperability testing of components
        ▪ Packaging of compatible versions of components
      ▪ Like a Fedora, Debian or CentOS for Hadoop ecosystem
      ▪ Releases are not a single artifact
        ▪ Rather a set of interdependent, compatible components
    • Apache Bigtop
      ▪ Current components
        ▪ Hadoop
        ▪ HBase
        ▪ Hive
        ▪ Pig
        ▪ Oozie
        ▪ Sqoop
        ▪ Flume
        ▪ ZooKeeper
        ▪ Whirr
    • Apache Bigtop
      ▪ Outputs
        ▪ Source
        ▪ RPM
        ▪ Deb
      ▪ Tests
        ▪ Integration
        ▪ Package
        ▪ Smoke
      ▪ Release 0.1.0 under vote now!
    • 3. Apache Hadoop Core
    • Apache Hadoop Core
      ▪ Current stable releases based on branches from 0.20
      ▪ Upcoming release: 0.22
        ▪ Includes both security and new implementation of append
        ▪ Not expected to be run at scale or commercially supported
        ▪ Nearly ready for vote
      ▪ Upcoming release: 0.23
        ▪ Build and dependency management moved to Maven
        ▪ Branch to happen soon
    • HDFS
      ▪ Robustness
        ▪ HDFS-1073: Checkpointing of image and edits log
      ▪ Availability
        ▪ HDFS-1623: High availability
      ▪ Performance
        ▪ HDFS-941: Faster random reads
        ▪ HDFS-2080: Faster checksums
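
The checksum item is easier to place with a picture of what checksums do in a block store. The sketch below is a generic illustration, not HDFS code: it assumes one CRC-32 per fixed-size chunk, and the 512-byte chunk size and all function names are assumptions chosen for the example. Every byte read passes through a check like this, which is why faster checksums matter for read performance.

```python
import zlib

CHUNK_SIZE = 512  # assumed chunk size for this illustration

def chunk_checksums(data, chunk_size=CHUNK_SIZE):
    # One CRC-32 per fixed-size chunk of the data.
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def find_corrupt_chunks(data, expected, chunk_size=CHUNK_SIZE):
    # Recompute checksums on read and report the chunks that no longer match.
    actual = chunk_checksums(data, chunk_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]

block = bytes(range(256)) * 6          # 1536 bytes, i.e. three chunks
sums = chunk_checksums(block)          # stored alongside the data at write time

damaged = bytearray(block)
damaged[600] ^= 0xFF                   # flip the bits of one byte in the second chunk
corrupt = find_corrupt_chunks(bytes(damaged), sums)
print(corrupt)
```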
    • HDFS
      ▪ Scalability
        ▪ HDFS-1052: Federation of the NameNode
        ▪ Source of diagram: http://www.hortonworks.com/an-introduction-to-hdfs-federation/
    • MapReduce
      ▪ Modularity
        ▪ MAPREDUCE-279: MapReduce 2.0
        ▪ Break JobTracker into ResourceManager and ApplicationMaster
        ▪ Replace TaskTracker with NodeManager
        ▪ Source of diagram: http://www.odbms.org/download/dean-keynote-ladis2009.pdf
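
The division of responsibilities described on this slide can be sketched as a toy model. The class names mirror the roles named above, but the methods, the slot-based capacity, and the placement logic are simplifications invented for illustration; they are not the MapReduce 2.0 API.

```python
class NodeManager:
    """Per-machine agent; here it only tracks free capacity."""
    def __init__(self, name, slots):
        self.name = name
        self.free_slots = slots

class ResourceManager:
    """Cluster-wide: hands out containers and knows nothing about job logic."""
    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, count):
        granted = []
        for node in self.nodes:
            while node.free_slots > 0 and len(granted) < count:
                node.free_slots -= 1
                granted.append(node.name)
        return granted

class ApplicationMaster:
    """Per-job: requests containers and tracks that job's own tasks."""
    def __init__(self, job_name, tasks):
        self.job_name = job_name
        self.tasks = tasks

    def run(self, resource_manager):
        containers = resource_manager.allocate(len(self.tasks))
        return dict(zip(self.tasks, containers))

nodes = [NodeManager("node1", 2), NodeManager("node2", 2)]
rm = ResourceManager(nodes)
placement = ApplicationMaster("wordcount", ["map-0", "map-1", "reduce-0"]).run(rm)
print(placement)
```

The point of the split is that job-specific scheduling moves out of the single cluster-wide service and into one ApplicationMaster per job.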
    • MapReduce
      ▪ Potential New Frameworks
        ▪ MAPREDUCE-2719: Distributed shell
        ▪ MAPREDUCE-2720: Distributed Java commands
        ▪ MPI: Communication-intensive parallelism
        ▪ Fast scans and aggregations
          ▪ OpenDremel
        ▪ Bulk Synchronous Parallel
          ▪ Giraph, Golden Orb, Hama, et al.
        ▪ Actor Model (streaming)
          ▪ S4, Akka, Storm, et al.
    • 4. HBase, Hive, and Pig
    • Apache HBase
      ▪ Upcoming release: 0.92.0
      ▪ Server-side triggers
        ▪ HBASE-2000: Coprocessors
      ▪ Availability
        ▪ HBASE-1730/4213: Online schema changes
      ▪ Performance
        ▪ HBASE-3857: HFile 2.0
      ▪ HBase book in September!
    • Apache Hive
      ▪ Upcoming release: 0.8
      ▪ Data transfer
        ▪ HIVE-306: INSERT INTO
        ▪ HIVE-1918: EXPORT/IMPORT
      ▪ Indexes
        ▪ HIVE-1644: Automatically use indexes
        ▪ HIVE-1803: Bitmap indexes
      ▪ Data formats
        ▪ HIVE-895: Avro support
    • Apache Pig
      ▪ Recent release: 0.9
      ▪ Scripting
        ▪ PIG-1479: Embedding Pig in Python
        ▪ PIG-1793: Macro expansion
      ▪ Debugging
        ▪ PIG-1712: ILLUSTRATE rework
      ▪ Data formats
        ▪ PIG-1748: Avro support
    • 5. Other Components
    • Other Components
      ▪ Apache Incubator
        ▪ Sqoop, Flume, and Oozie now incubating
        ▪ Whirr graduated to a top-level Apache project
      ▪ Apache Avro
        ▪ Interoperability with Protocol Buffers and Thrift
        ▪ Column-oriented file format
        ▪ Python MapReduce implementation
      ▪ Apache ZooKeeper
        ▪ Multi-update
        ▪ Kerberos authentication of clients
    • Q&A
      Visit www.hadoopworld.com
      ▪ November 8-9, 2011 in New York City
      ▪ Early bird discount ends September 5, 2011
      Enter Today: www.facebook.com/cloudera
      ▪ Click the “Be a Cloudera Hero for Apache Hadoop” tab
      ▪ Share what you think Apache Hadoop can do for you
      ▪ Win a personal hackathon with Doug Cutting in San Francisco, CA