May 2013 HUG: Building common denominator of Hadoop distributions with Bigtop

Bigtop is stepping up in its role as the foundation of a standard Hadoop-based data analytics stack, essentially putting most commercial offerings on a common footing. Six out of seven commercial vendors use the Bigtop framework to power their distributions based on ASF Hadoop.

Bigtop is also the must-have stabilization tool for the Hadoop platform: any downstream application or system developer can use it to make sure that their software will work with the next version of Hadoop.

Presenter(s):
Dr. Konstantin Boudnik, ASF Hadoop committer, Bigtop PMC; Director of Engineering, WANdisco
Roman Shaposhnik, VP, Apache Bigtop, IPMC member at ASF; Software engineer, Cloudera inc.

Published in: Technology

  1. Hadoop.next
  2. Who are we?
     ● Hadoop downstream community
     ● Well, specifically:
       – Roman Shaposhnik
         ● VP, Apache Bigtop, IPMC member at ASF
         ● Software engineer, Cloudera inc.
       – Dr. Konstantin Boudnik
         ● ASF Hadoop committer, Bigtop PMC
         ● Director of Engineering, WANdisco
  3. What are we dealing with?
     ● Hadoop 1.x
       – stable, but old
     ● Hadoop 2.0.x
       – modern, used to be alpha, now stabilizing
     ● Hadoop 2.1.x
       – modern, feature-driven
     ● Hadoop 3.x
       – perpetual trunk
  4. What are the implications?
     ● YARN's appeal as IaaS
     ● Fragmentation
     ● Repeat of “UNIX vendor wars”
     ● Cutting off vital sources of feedback
     ● Jaded downstream
     ● Confused users
     ● Delayed world domination
  5. What's downstream to do?
     ● mvn help:all-profiles
         Profile Id: hadoop_0.20.203 (active)
         Profile Id: hadoop_1.0
         Profile Id: hadoop_non_secure
         Profile Id: hadoop_facebook
         Profile Id: hadoop_0.23
         Profile Id: hadoop_yarn
         Profile Id: hadoop_2.0.0
         Profile Id: hadoop_2.0.1
         Profile Id: hadoop_2.0.2
         Profile Id: hadoop_2.0.3
         Profile Id: hadoop_trunk
         Profile Id: hadoop_cdh4.1.2
  6. That active profile?
     ● http://mvnrepository.com/artifact/org.apache.hbase/hbase/0.94.3
         <dependency>
           <groupId>org.apache.hbase</groupId>
           <artifactId>hbase</artifactId>
           <version>0.94.3</version>
         </dependency>
     ● Try finding Apache Giraph artifacts!
  7. We don't have the TCK, but...
     Zookeeper, HBase, Pig, Hive, Impala, Giraph, Hama, Hue, Solr, Crunch, Sqoop, Oozie, Whirr, Mahout, Flume
  8. Apache Bigtop
     “open-source software related to a system for integration, packaging, deployment and validation of a big data management software distribution based on Apache Hadoop”
  9. Remember what Debian did to Linux?
     (Diagram labels: GNU Software; Linux kernel)
  10. Bigtop is trying to do it with Hadoop
     (Diagram labels: Hadoop Ecosystem (Pig, Hive, Mahout); Hadoop (HDFS + MR); Linux kernel)
  11. What does Bigtop offer:
     ● Community focused on all of the above
     ● Software for:
       – Integration
       – Build (make, Maven)
       – Packaging (RPM, DEB)
       – Deployment (Puppet)
       – Testing (iTest)
     ● A continuous integration Jenkins server
  12. Embrace asynchronous nature
     ● Don't expect flag days
     ● Don't expect agreement on releases
     ● Do practice Last Known Good Builds
     (Diagram: successive build sets of components A, B, C and D at differing versions: Av1 Bv22 Cv3 Dv4; Av1 Bv2 Cv3 Dv2; ...; Av1 Bv2 Cv3 Dv4; Bv22; Dv44)
  13. Who's on-board?
     ● Cloudera
       – CDH4 is 100% based on Bigtop (hadoop v2)
     ● WANdisco
     ● TrendMicro
     ● Hortonworks, EMC, EBay, Intel (partially)
     ● Canonical
       – Ubuntu Server: Hadoop and Bigdata blueprint
     ● Illumos (early stages of interest)
  17. What's happening
     ● A special release: Bigtop 0.3.0-incubating
       – Hadoop 1.0.1
     ● Last stable release: Bigtop 0.5.0
       – Hadoop 2.0.2-alpha
     ● Next stable release: Bigtop 0.6.0
       – End of Mar 2013 release
       – Hadoop 2.0.4-alpha
       – Major focus on developers
  18. A special note on 2.0.4-alpha
     ● It really will be 2.0.4.1
     ● First release to use Bigtop as release criteria
     ● A 100% community effort
     ● First non-vendor stabilization effort
     ● A stable base for 18 applications and counting!
  19. <your idea here>
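
The "Last Known Good Builds" practice from slide 12 can be illustrated with a small sketch. This is not Bigtop code: the `BuildRecord` type, the two helper functions and the component names and versions are all hypothetical, chosen only to show the idea of integrating against the most recent combination of component versions that passed, instead of waiting for every project to agree on a release.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BuildRecord:
    """One integration build (hypothetical structure, not a Bigtop type)."""
    versions: Dict[str, str]  # component name -> version used in this build
    passed: bool              # True if the whole stack built and tested cleanly


def last_known_good(history: List[BuildRecord]) -> Optional[Dict[str, str]]:
    """Return the version set of the most recent passing build, or None."""
    for record in reversed(history):
        if record.passed:
            return dict(record.versions)
    return None


def candidate_set(lkg: Dict[str, str], updates: Dict[str, str]) -> Dict[str, str]:
    """Start from the last known good set and bump only the updated components."""
    merged = dict(lkg)
    merged.update(updates)
    return merged


if __name__ == "__main__":
    # Hypothetical history, oldest build first.
    history = [
        BuildRecord({"A": "v1", "B": "v2", "C": "v3", "D": "v2"}, passed=True),
        BuildRecord({"A": "v1", "B": "v22", "C": "v3", "D": "v4"}, passed=False),
    ]
    lkg = last_known_good(history)
    # Try the new D on top of the known-good base, leaving B where it was.
    print(candidate_set(lkg, {"D": "v4"}))
```

The point of the design is that a failing combination never blocks downstream work: each component moves forward on its own schedule, and the next candidate is always built from a base that is known to work.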
