Feb 2013 HUG: HIT (Hadoop Integration Testing) for Automated Certification and Deployment


Published on

HIT, which stands for Hadoop Integration Testing, is a Yahoo! framework for assembling Hadoop components into a full Stack and running integration tests to make sure that the components can inter-operate with each other. HIT aims to:
- build fully automated, modular, scalable and flexible Hadoop stack deployment and test framework
- develop integration processes and tools for development, quality engineering, operations, and customers
- grow participation and evolve into a comprehensive self-service stack deployment and test solution

HIT is designed as an open system to plug in any type of testing. We will also share new developments around HIT and how it can be a Platform for all testing and automation.

Speaker: Baljit Deot, Technical Yahoo!, Cloud Engineering Group, Yahoo!

Published in: Technology
  • Be the first to comment

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide
  • Introduction of tools like Igor or yinst should have been done already, but was not. And therefore it falls to HIT project to do it.
  • Feb 2013 HUG: HIT (Hadoop Integration Testing) for Automated Certification and Deployment

    1. 1. Hadoop Integrated TestingBaljit DeotFeb 20, 2013
    2. 2. Agenda Overview Solution FutureYahoo! Confidential & Proprietary. 2 2/22/2013
    3. 3. Overview Hadoop Grid != HDFS and MR › It is the whole ecosystem, including monitoring Set up of the Hadoop ecosystem for testing is a complex problem › Consistently repeatable › Co-ordinated install of multiple packages › Setup necessary schemas and do the “wiring” › Make the packages work with other internal proprietary systems › Automated! Run Hadoop tests for everything in the Hadoop ecosystem Record and report results across builds/releasesYahoo! Confidential & Proprietary. 3 2/22/2013
    4. 4. Solution Yahoo’s HIT (Hadoop Integrated Testing) › Deploy and certify the complete Hadoop ecosystem with one click • Assemble all products on a given cluster • Run integration/functional tests • Record and report results › Repeatable automated environment setup • Select available test environment • Wipe clean and install Hadoop and all components every time • Setup necessary database schemasYahoo! Confidential & Proprietary. 4 2/22/2013
    5. 5. Legend: HIT Deployment Hadoop-related HIT-related Hadoop cluster Test driver NN dnHudson job Isolated virtual environment Core tests Pig tests NN2 dn HIT HIT Oozie tests Nova tests JT DAQ tests ….. dn results HDFS proxy Pig Vaidya ….. Hadoop client Oozie distcp Oozie cli ….. dnResultsstorage HDFS MR Hadoop core dn DAQ Nova
    6. 6. Legend:Workflow 2 Deploy cluster Hadoop-related1 Start certification HIT-relatedHudson HIT job 3 Deploy HITTAG: H22.rc2.5 Click to run HIT Results page 5 View resultsTAG: H22.rc2.5 Hadoop: pass Pig: pass Hive: pass Oozie: pass … 4 Run tests6
    7. 7. HIT configurable job 7
    8. 8. HIT Key Values HIT system benefits › Enable integration testing › Provide safety net for deployment to sandbox and production grids › Provide comprehensive reference platform › Empower engineering teams through self-service Hadoop stack integration and testing. › Provide test solution for teams with no test environment › Enforce repeatable automated environment in QE teams › Give QE a way to deploy a full Hadoop Stack › Standard tool for entry/exit criterion for QE › Viable framework for CI – nightly 8
    9. 9. Future direction Command line interface › Enable developer pre-commit testing “on the box” › Develop a fully automated commit-to-deployment process Contributions to the communityYahoo! Confidential & Proprietary. 9 2/22/2013