Lessons Learned in Software Development: QA Infrastructure – Maintaining Robustness in Commercial Software

1st DSV-PhD Workshop: Keynote speech by Marcus Lagergren, Oracle Inc.

Published in: Technology

  • 1. Lessons Learned in Software Development: QA Infrastructure – Maintaining Robustness in Commercial Software
    Marcus Lagergren
    Consulting Member of Technical Staff
    Oracle Corporation
  • 2. About the Speaker
    Marcus Lagergren holds a master’s degree from KTH, with a major in Theoretical Computer Science.
    Marcus was one of the founders of Appeal Virtual Machines, which was acquired by BEA in 2002; BEA was in turn acquired by Oracle in 2008.
    Marcus has worked on almost all aspects of the JRockit Virtual Machine and is now working with virtualization technology.
    Marcus likes power tools and scuba diving.
  • 3. Agenda
    Robustness in commercial apps with tight release schedules.
    Utopian vision: perpetually stable bits, so we can spin off a release at any time.
    Build Systems
    Source control and Development
    Regression testing
  • 4. Agenda
    Result databases
    Automatic Testing
    Complex and not-so-standard testing
    Development aspects
  • 5. Why listen to me?
    We’ve spent the last 10 years developing a JVM and the last 3 years developing a guest operating system for two commercial hypervisors.
    Hundreds of thousands of man-hours spent on robustness alone.
    Harder-to-debug software hardly exists.
    We’ve had to invent stuff from day one.
    OK, from day 365 or so: lessons learned.
    We’ve made many mistakes along the way.
    No one gets to Utopia, but at least we have a reasonably good idea of which direction to go in.
  • 6. QA infrastructure
    QA infrastructure is harder and probably even more important than development infrastructure.
    The most valuable lesson we have learned is that it must be developed in parallel with the application, and significant effort must be spent on it.
    It is at least as important as the application itself.
    Sometimes the boundaries between app and test infrastructure aren’t even clear.
  • 7. QA infrastructure
    QA and Dev, if separate departments or roles, should always work together.
    Preferably as physically close to each other as possible.
    They should be able to fill in for each other and be encouraged to do each other’s work.
    Very dangerous to have a separate QA department on another floor.
    Very dangerous for QA to just do black-box testing without understanding what’s in the box.
    QA staff should be treated like any other developer.
  • 8. Build System
    Build system, test system and source control are parts of the same distributed system.
    Mobility: build anything anywhere, locally or globally (distributed). “A distributed cross compiler.”
    Build system should be self-contained and part of source control.
    Do a sync on a fresh laptop, have all the details.
    We chose to put binaries there as well, to produce deterministic bits and provide self-sufficiency.
    Not always a good idea, but mostly a good idea.
  • 9. Source Control and Development
    Need good support for distributed development.
    Should be able to handle directories as separate source control entities.
    Gatekeepers of main branches, distributed team-based development.
    Source control, builds and development should only require vi + prompt.
    More complex environments on top for ease of use.
    Easier to extend with different UIs.
  • 10. Test System
    Also under source control.
    Distributed system – very important.
    Virtualize if possible. Maximize resource usage.
    Local and remote test runs possible.
    Submit jobs: “crunch through these tests”.
    “Check in if passes tests”.
    Test machines
    Performance test machines (dedicated)
    Functionality test machines (not necessarily dedicated)
    Any machine can volunteer CPU cycles for functional testing.
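
The “check in if passes tests” job can be pictured as a small gate. This is only a sketch: the class name CheckinGate and its submit method are invented for illustration, tests are modelled as suppliers of an exit code (0 means pass), and the actual source control operation is left as a callback.

```java
import java.util.List;
import java.util.function.IntSupplier;

/** Hypothetical "check in if passes tests" gate; not taken from any real test system. */
class CheckinGate {

    /**
     * Runs every test in order. The check-in callback runs only if all of them
     * return exit code 0. Returns true if the change was checked in.
     */
    static boolean submit(List<IntSupplier> tests, Runnable checkin) {
        for (IntSupplier test : tests) {
            if (test.getAsInt() != 0) {
                return false; // first failing test blocks the check-in
            }
        }
        checkin.run();
        return true;
    }
}
```

In a distributed setup the loop body would be farmed out to whichever test machines have volunteered cycles; the gate logic itself stays the same.
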
  • 11. Building Blocks – Tests
    Many tests, especially regression tests, for an app needn’t be more than a main class with a return value.
    Keep it simple!
    “I spent a few hours distilling this huge program down to a reproducer for BUG123456”
    Claim: if it’s simple enough to write and submit a test, more than 50% of the bugs can get regression tests as part of the original bug fix.
    I will address the other 50% later.
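
A minimal sketch of such a test: a main class whose exit code is the verdict. The bug is invented for illustration (integer overflow when computing the midpoint of two large indices), and the class and method names are assumptions, not part of any real suite.

```java
/**
 * Minimal regression test: a main class whose exit code is the verdict.
 * The bug it guards against is made up for this example.
 */
class MidpointRegressionTest {

    // Code under test (hypothetical): overflow-safe midpoint of two non-negative ints.
    // The buggy form would have been (low + high) / 2, which overflows for large inputs.
    static int midpoint(int low, int high) {
        return low + (high - low) / 2;
    }

    // Returns 0 on success and 1 on failure, so a harness only needs the exit code.
    static int run() {
        int low = Integer.MAX_VALUE - 2;
        int high = Integer.MAX_VALUE;
        int expected = Integer.MAX_VALUE - 1;
        return midpoint(low, high) == expected ? 0 : 1;
    }

    public static void main(String[] args) {
        int status = run();
        if (status != 0) {
            System.exit(status); // non-zero exit code signals failure to the test system
        }
    }
}
```

Nothing here needs a test framework: compile, run, look at the exit code. That is what keeps the cost of adding a reproducer close to zero.
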
  • 12. Building Blocks – Tests
    Easy-to-write tests make it possible for the test suite to grow naturally.
    If 10 minutes of spare time can lead to a new test being written, checked in and enabled as part of the global test suite, you have succeeded.
    Encourage developers to check in unit tests for new functionality together with the functionality.
    Need the infrastructure for it in the app.
    Might want to enforce this strictly, but it might hinder development too.
  • 13. Building Blocks – Result Database
    Store results in a cheap database with a sensible layout somewhere.
    Any freeware is fine – get it up and running.
    Easy to maintain and back up.
    Query from local machines about historical test results.
    “When exactly did this performance regression appear?”
    “List all benchmark scores on this machine for this benchmark since January 1”
    “Has this functional test failed before? What were the bug fixes?”
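
The first two queries above can be sketched against an in-memory stand-in for the database. Everything here is an assumption made for illustration: the ResultStore class, its fields, the use of a build number instead of a date, and the convention that higher scores are better. A real setup would put the same questions to an SQL database.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for a result database (hypothetical layout). */
class ResultStore {

    /** One benchmark run. Higher scores are assumed to be better. */
    static final class Result {
        final String benchmark;
        final String machine;
        final int build; // assumed non-negative and increasing over time
        final double score;

        Result(String benchmark, String machine, int build, double score) {
            this.benchmark = benchmark;
            this.machine = machine;
            this.build = build;
            this.score = score;
        }
    }

    private final List<Result> results = new ArrayList<>();

    void record(String benchmark, String machine, int build, double score) {
        results.add(new Result(benchmark, machine, build, score));
    }

    /** "List all benchmark scores on this machine for this benchmark since build N." */
    List<Result> history(String benchmark, String machine, int sinceBuild) {
        List<Result> matches = new ArrayList<>();
        for (Result r : results) {
            if (r.benchmark.equals(benchmark) && r.machine.equals(machine) && r.build >= sinceBuild) {
                matches.add(r);
            }
        }
        return matches;
    }

    /**
     * "When exactly did this performance regression appear?"
     * Returns the earliest recorded build whose score is below the threshold, or -1 if none.
     */
    int firstBuildBelow(String benchmark, String machine, double threshold) {
        int first = -1;
        for (Result r : history(benchmark, machine, Integer.MIN_VALUE)) {
            if (r.score < threshold && (first == -1 || r.build < first)) {
                first = r.build;
            }
        }
        return first;
    }
}
```
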
  • 14. Building Blocks – Tests
    Use “terror harnesses” that attack the cross sections between modules.
  • 15. Building Blocks – Performance
    Anything can affect performance.
    EVERYTHING affects performance.
    We need automatic regression warnings. Anyone who submits a performance regression will get an e-mail from the test system.
    Continuously make it easy to add more benchmarks.
    Automation: deviations, baselines, invariants.
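
One possible reading of “deviations, baselines” is a simple statistical check: flag a score that falls more than k standard deviations below the mean of recent baseline runs. The sketch below assumes exactly that rule, and assumes higher scores are better; the class name, the choice of k, and the rule itself are illustrative, not a description of the system behind these slides. Sending the warning e-mail is left out.

```java
/** Hypothetical baseline check for automatic performance regression warnings. */
class RegressionDetector {

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (divides by n - 1); needs at least two values. */
    static double stdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    /** True if the score is more than k standard deviations below the baseline mean. */
    static boolean isRegression(double[] baseline, double score, double k) {
        return score < mean(baseline) - k * stdDev(baseline);
    }
}
```

With a baseline of 100, 102, 98, 101, 99 the mean is 100 and the sample standard deviation is about 1.58, so with k = 3 anything below roughly 95.26 raises a warning.
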
  • 16. Testing – The need for continuous automatic testing
    Need continuous automatic testing.
    Example from real life: JRockit Solaris has been made available off and on over the years. Bit rot sets in immediately when a platform is removed from automated testing.
    Release version may break debug version and vice versa.
    Linux version may break Windows version and vice versa.
    Use extremely strict and picky compiler flags.
  • 17. Testing – So What About the Other 50%?
    Simple Java programs with main functions may not be enough for all the bugs.
    How do we test for a specific optimization bug in the code generator?
    How do we test for a strange boundary case that crashes the GC, that happens after two weeks in production?
    Key observation: We need to export a state.
  • 18. Testing – So What About the Other 50%?
    Create a very special heap with a few objects in nasty places. Load it and trigger a garbage collection. Save it and compare to a reference.
    Serialize an IR from just before an offending optimization. Load it and trigger the optimization. Save the resulting IR and compare it to a reference.
    The compare would be more of an “equals” than a “memcmp”.
    We need a level of modularization that’s good enough for this.
    The collection of tests should grow naturally, but the VM design should allow the ways of testing the VM to grow naturally as well.
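
A toy version of the load, trigger, compare cycle. The IR format (one “op dest args” instruction per line), the constant-folding pass and all names are invented for this sketch; a real JIT IR is far richer. The point is the shape of the test: the saved state is text, and the comparison is a structural equals, so formatting differences in the reference do not matter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical state-based test for a toy optimization pass. */
class IrStateTest {

    /** One instruction: op, destination, operands. Compared structurally. */
    static final class Insn {
        final String op;
        final String dest;
        final List<String> args;

        Insn(String op, String dest, List<String> args) {
            this.op = op;
            this.dest = dest;
            this.args = args;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Insn)) return false;
            Insn other = (Insn) o;
            return op.equals(other.op) && dest.equals(other.dest) && args.equals(other.args);
        }

        @Override public int hashCode() {
            return Objects.hash(op, dest, args);
        }
    }

    /** Loads a saved IR: one "op dest args..." instruction per line, blank lines ignored. */
    static List<Insn> load(String text) {
        List<Insn> ir = new ArrayList<>();
        for (String line : text.split("\n")) {
            String[] t = line.trim().split("\\s+");
            if (t.length < 3) continue; // blank or malformed line
            ir.add(new Insn(t[0], t[1], Arrays.asList(t).subList(2, t.length)));
        }
        return ir;
    }

    /** The optimization under test: folds add/mul when both operands are known constants. */
    static List<Insn> foldConstants(List<Insn> ir) {
        Map<String, Long> known = new HashMap<>();
        List<Insn> out = new ArrayList<>();
        for (Insn i : ir) {
            boolean arithmetic = i.op.equals("add") || i.op.equals("mul");
            if (i.op.equals("const")) {
                known.put(i.dest, Long.parseLong(i.args.get(0)));
                out.add(i);
            } else if (arithmetic && i.args.size() == 2
                    && known.containsKey(i.args.get(0)) && known.containsKey(i.args.get(1))) {
                long a = known.get(i.args.get(0));
                long b = known.get(i.args.get(1));
                long value = i.op.equals("add") ? a + b : a * b;
                known.put(i.dest, value);
                out.add(new Insn("const", i.dest, Collections.singletonList(Long.toString(value))));
            } else {
                out.add(i); // operand unknown at compile time: leave untouched
            }
        }
        return out;
    }

    /** Load the saved state, trigger the optimization, compare to the reference with equals. */
    static boolean passes(String savedState, String reference) {
        return foldConstants(load(savedState)).equals(load(reference));
    }
}
```

The “save” step is skipped here because the comparison happens in memory; writing the optimized IR back out would use the same one-instruction-per-line format.
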
  • 19. Testing – So What About the Other 50%?
    But of course it’s not as simple as that.
    What about multithreaded apps? Race conditions?
    Plenty of threads operate on the same memory, e.g. multithreaded GC. How can we make test cases?
    Synchronization points.
    Randomized input, randomized sleeps. Try to cover the malicious side effects of parallelism.
    Things like the RaceTrack algorithm can find some (not all) races in static code, but the world is dynamic. Testing needs to be, too.
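
A small sketch of the two ingredients named above: a synchronization point that releases all workers at once, and randomized sleeps that perturb the interleaving. The shared state is just a counter and the invariant is that no increment is lost; the class name, thread counts and sleep probabilities are arbitrary choices for illustration. Seeding the random generators makes the sleep pattern repeatable, but the thread schedule itself is still up to the OS.

```java
import java.util.Random;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stress test: many threads, one shared counter, randomized sleeps. */
class RaceStressTest {

    /** Returns the final counter value; it should equal threads * iterations. */
    static int stress(int threads, int iterations, long seed) {
        AtomicInteger counter = new AtomicInteger();     // shared state under test
        CountDownLatch start = new CountDownLatch(1);    // synchronization point
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            Random random = new Random(seed + t);        // one generator per thread
            workers[t] = new Thread(() -> {
                try {
                    start.await();                       // all workers start together
                    for (int i = 0; i < iterations; i++) {
                        if (random.nextInt(10) == 0) {
                            Thread.sleep(random.nextInt(2)); // perturb the interleaving
                        }
                        counter.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[t].start();
        }
        start.countDown();                               // release every worker at once
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        int actual = stress(8, 1000, 42L);
        if (actual != 8 * 1000) {
            System.err.println("lost updates: got " + actual);
            System.exit(1);
        }
    }
}
```

Replacing the AtomicInteger with a plain int field is the quickest way to see the harness catch a race, although a single run passing never proves the absence of one.
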
  • 20. Testing – So What About the Other 50%?
    Disclaimer: Sometimes we just need to crunch a lot of code for a long, long time. Nothing else suffices to reproduce a problem, or the framework that would make it possible doesn’t exist.
    So make sure the distributed system burns those free CPU cycles.
    And make the dumps full and comprehensible.
    Don’t lose them, dammit! No wiping them after 24h. Disk is cheap.
    Surprisingly effective if you have enough beta testers.
  • 21. Testing – Retrofitting a framework
    You will probably have to do this, since people don’t understand the importance of fundamental QA from day 1.
    Situation: We need the QA infrastructure but don’t have it. Our app has come a long way.
    Learn from history.
    For example, go over 500 bug parade entries for HotSpot.
    How many can be tested by small deterministic reproducers?
    What about the rest? Brainstorm what functionality the VM would need if we had to write a simple reproducer for each problem.
  • 22. Development – The platform matrix
    Try to keep the amount of common code as large as possible.
    It is always a choice between platform specific features and test matrix growth.
    Initially, our performance-critical code was native. As our JIT got better, we would write more and more in Java. Native is much worse. “premethods”
    Augmented Java – intrinsics, “pd_addr”, preprocessed Java files.
  • 23. Development – The platform matrix
    Other seemingly platform-dependent things can be made platform independent.
    Example: Native stubs. The bulk of the work is parameter marshalling; the register allocator can do that already.
    Beware of “false abstraction”.
    That extra parameter that is NULL on all platforms except IA64.
    Implementation language: Debugging is an issue.
    Powerful C/C++ debuggers exist. Meta-debugging is usually harder.
  • 24. Development
    Don’t lose focus. Modularity first.
    Example: “the fastest server-side JVM”, “startup time is an issue”, “client applications are an issue”, “we need zero-overhead runtime instrumentation”. Run, fool! Run!
    When optimizing for performance, it is important not to look just at e.g. SPECjbb™ and SPECjvm98™. Real-world applications do a lot of other things.
    “There is no generic commutative plus operator”. At least nobody cares.
  • 25. Development – Policy
    Don’t be too much of a quality fascist when code is written.
    If you spend all your time preventing larger check-ins or demand 100% testing on everything, nothing will ever get checked in.
    If you demand a strictly documented process with specifications for everything, all anyone will ever do is write specifications and hold meetings.
    Both of the above are good in smaller amounts.
    It’s more of an awareness thing.
    And the infrastructure should quickly and mercilessly raise the alarm as soon as something breaks, to prevent further damage.
  • 26. Lessons Learned
    Summary – The important stuff to bring with you
    Build the test infrastructure in parallel with the application
    Start at the same time! Don’t put it off. It is part of the app development process and should be in the time budget.
    Ideally, Dev and QA teams should be fused and be able to do each other’s jobs. No separate compartments.
    Don’t be afraid to couple it tightly in places if that is what is required to maintain stability.
    Use all available CPU cycles for testing