Your SlideShare is downloading. ×
Hadoop Summit 2013 : Continuous Integration on top of hadoop
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Introducing the official SlideShare app

Stunning, full-screen experience for iPhone and Android

Text the download link to your phone

Standard text messaging rates apply

Hadoop Summit 2013 : Continuous Integration on top of hadoop

1,473
views

Published on

This topic is on Hadoop Summit 2013 San Jose.

This topic is on Hadoop Summit 2013 San Jose.

Published in: Technology

0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
1,473
On Slideshare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
41
Comments
0
Likes
2
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Continuous Integration on top of hadoop Wisely Chen & Neal Lee Tuesday, June 11, 13
  • 2. Agenda • Who I am • Problem • Solution • Demo • Q&A Tuesday, June 11, 13
  • 3. Who I am • Wisely Chen ( thegiive@gmail.com ) • Release manager of Yahoo![Taiwan] shopping and data team • Love to promote open source tech at Taiwan • Ruby and Rails : Coscup 2006, Ubisunrise 2007, OSDC 2007 • Puppet : PHPConf 2012 , RubyConf 2012 • Release Practice :Webconf 2013, Coscup 2012 Tuesday, June 11, 13
  • 4. Who I am • Neal Lee (@neal_lee) • Data Engineer at Yahoo![Taiwan] • Aiming to build up an easy use of self-service BI platform connecting to Hadoop. Tuesday, June 11, 13
  • 5. Story 1 Tuesday, June 11, 13
  • 6. Another Story Tuesday, June 11, 13
  • 7. Yet Another Story Tuesday, June 11, 13
  • 8. Solution Tuesday, June 11, 13
  • 9. One click • Manual commit code to SCM • And DONE • Auto unit testing • Auto push beta for performance testing • Auto push to production grid • Auto trigger code Tuesday, June 11, 13
  • 10. This feeling is 爽! Tuesday, June 11, 13
  • 11. Continuous Integration Tuesday, June 11, 13
  • 12. Continuous Integration • A software engineering practice • Maintain code repos • Automate the build • Make the build self-testing • Everyone commit to the baseline everyday • Every commit should be a build • Test in a clone of production environment • Make it easy to get the latest deliverables • Everyone can see the result of latest build • Automate deployment Tuesday, June 11, 13
  • 13. We focus on • A software engineering practice • Maintain code repos • Automate the build • Make the build self-testing • Everyone commit to the baseline everyday • Every commit should be a build • Test in a clone of production environment • Make it easy to get the latest deliverables • Everyone can see the result of latest build • Automate deployment Tuesday, June 11, 13
  • 14. CI flow 4. CI slave exec local UnitTest 7. CI slave exec Performanc 11. CI exec pig People DEV Alpha Beta Grid Prod Grid 2. notify CI 5. deploy 8. deploy CI Master 1. Commit Code SCM 3. Call 6. Call 10. Call 9. git tag 12. notify user Tuesday, June 11, 13
  • 15. CI flow 4. CI slave exec local UnitTest CI slave exec Performanc CI exec pig People DEV Alpha Beta Grid Prod Grid 2. notify CI CI Master 1. Commit Code SCM 3. Call 5. Notify user Tuesday, June 11, 13
  • 16. CI flow 4. CI slave exec local UnitTest 7. CI slave exec Performanc CI exec pig People DEV Alpha Beta Grid Prod Grid 2. notify CI 5. deploy CI Master 1. Commit Code SCM 3. Call 6. Call 8.Notify user Tuesday, June 11, 13
  • 17. CI flow 4. CI slave exec local UnitTest 7. CI slave exec Performanc CI exec pig People DEV Alpha Beta Grid Prod Grid 2. notify CI 5. deploy 8. deploy CI Master 1. Commit Code SCM 3. Call 6. Call 9. Notify user Tuesday, June 11, 13
  • 18. Unit Test 4. CI slave exec local UnitTest 7. CI slave exec Performanc 11. CI exec pig People DEV Alpha Beta Grid Prod Grid 2. notify CI 5. deploy 8. deploy CI Master 1. Commit Code SCM 3. Call 6. Call 10. Call 9. git tag 12. notify user Tuesday, June 11, 13
  • 19. PigUnit • A simple xUnit framework • No cluster set up is required in local mode • Unit testing, regression testing, and rapid prototyping on the fly Tuesday, June 11, 13
  • 20. Using PigUnit • Coding • Write PigUnit test case • Run local PigUnit test • Push to grid • Run Pig on grid • Get right result ! Tuesday, June 11, 13
  • 21. Unit test is live doc • Unit test is runnable live doc • Pass test case and meet previous requirement Tuesday, June 11, 13
  • 22. Performance Test 4. CI slave exec local UnitTest 7. CI slave exec Performanc 11. CI exec pig People DEV Alpha Beta Grid Prod Grid 2. notify CI 5. deploy 8. deploy CI Master 1. Commit Code SCM 3. Call 6. Call 10. Call 9. git tag 12. notify user Tuesday, June 11, 13
  • 23. Vaidya • Rule based performance diagnosis of M/R jobs • Extensible framework • You can add your own rules • Write complex rules using existing rules Tuesday, June 11, 13
  • 24. CI toolset CI slave exec local UnitTest CI slave exec Performanc CI exec pig People DEV Alpha Beta Grid Prod Grid notify CI deploy deploy CI Commit Code SCM Call Vaidya BASH Tuesday, June 11, 13
  • 25. CI is flexible • MapReduce can use MapUnit • Hive can use hive_test • Pig can use PigUnit Tuesday, June 11, 13
  • 26. Github trigger CI Tuesday, June 11, 13
  • 27. CI testing build pipeline Tuesday, June 11, 13
  • 28. Testing Trend Tuesday, June 11, 13
  • 29. DEMO Tuesday, June 11, 13
  • 30. Conclusion • Auto testing will save your life • CI will boost your productivity • This process can feed in any platform Tuesday, June 11, 13
  • 31. 謝謝大家 Tuesday, June 11, 13