NICTA Copyright 2012 From imagination to impact
Eliciting Operations Requirements for Applications
L. Bass, R. Jeffrey, I. Weber, H. Wada and L. Zhu

Operations Requirements
● "Through 2015, 80% of outages will be caused by people and process issues. 50% are caused by change, config and release" - Gartner
● Devs and Ops are (still) isolated, but Ops are an important source of product requirements
  ○ Before unit testing, less attention was paid to "testability"
  ○ In the DevOps era, we should incorporate "operatability" into products
● Making applications operations-process aware!
  ○ But where do the requirements come from?

Overview of Our Study
● Studied sources of operations requirements and discussed them in the context of our spin-out
  ○ Operations personnel
  ○ Internal development efforts
  ○ Operations standards
  ○ Organizational process descriptions
  ○ Academic studies
● Model processes and the product
  ○ Verify whether a product satisfies operations requirements

Standards and Organizational Process
● Process standards, such as ISO 15504 or ITIL, are a good source but not specific enough to turn into product requirements
● Organizational process descriptions tend to provide more details
  ○ e.g., resource migration in Amazon Web Services
● We found standards useful to (1) implement (automate) in a product, and (2) define a method by which operators validate the process
media.amazonwebservices.com/AWS_Migrate_Resources_To_New_Region.pdf

Example Operational Requirement
● CP-6 Alternate Storage Site, NIST 800-53
  ○ "The organization establishes an alternate storage site including necessary agreements to permit the storage and recovery of information system backup information"
● Derived product requirement
  ○ "The product shall maintain a backup in an alternate storage site. The product shall provide a method to assess the recoverability of the system"
● Actual implementation in our product
  ○ Set up a backup site and a scheduled job as part of product initialization; otherwise, the launch fails
  ○ Provide a report to assess the quality of the backup (e.g., timestamp, execution time, disk capacity, ...)
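
The two implementation bullets above (fail the launch without a backup site and schedule, then report on backup quality) might look roughly like the following. This is a minimal sketch, not the product's code: `BackupConfig`, `BackupRun`, `LaunchError`, `initialize` and `backup_report` are hypothetical names introduced for illustration, and the report fields simply mirror the examples listed on the slide.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


class LaunchError(RuntimeError):
    """Raised when the product must refuse to start (hypothetical)."""


@dataclass
class BackupConfig:
    # Hypothetical settings; the names are illustrative only.
    alternate_site: Optional[str]  # storage location other than the primary
    schedule_cron: Optional[str]   # schedule of the backup job
    primary_site: str = "primary"


@dataclass
class BackupRun:
    # Facts about the most recent backup run (hypothetical record).
    finished_at: datetime
    duration_s: float
    bytes_written: int
    capacity_bytes: int


def initialize(config: BackupConfig) -> None:
    """Refuse to launch unless an alternate site and a schedule exist."""
    if not config.alternate_site:
        raise LaunchError("no alternate storage site configured")
    if config.alternate_site == config.primary_site:
        raise LaunchError("alternate site must differ from the primary site")
    if not config.schedule_cron:
        raise LaunchError("no backup schedule configured")


def backup_report(run: BackupRun, now: datetime) -> dict:
    """Summarize the latest backup so operators can judge recoverability."""
    age_hours = (now - run.finished_at).total_seconds() / 3600
    return {
        "timestamp": run.finished_at.isoformat(),
        "age_hours": round(age_hours, 1),
        "execution_time_s": run.duration_s,
        "disk_used_pct": round(100 * run.bytes_written / run.capacity_bytes, 1),
    }
```

Making the check part of initialization means a missing backup site shows up as a failed launch instead of being found during a recovery attempt.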

Academic Studies
● Differences between environments are the most common source of upgrade problems
  ○ Called "hidden dependencies": incorrect file path, incorrect network address, library conflict, ...
● The list of hidden dependencies is a useful source of product requirements
● Actual implementation in our product
  ○ e.g., run a dependency check at boot and terminate the app immediately to prevent fatal issues from occurring later (e.g., data corruption)
  ○ A boot failure is easy to detect, which makes Ops happy
T. Dumitras, "Why do upgrades fail and what can we do about it?: towards dependable, online upgrades in enterprise system", Middleware 2009
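
A boot-time dependency check of the kind described above could be sketched as follows. The three checks correspond to the hidden-dependency examples on the slide (file path, network address, library conflict). The function names `check_dependencies` and `boot`, and the idea of passing expected library versions as a dictionary, are assumptions made for this illustration.

```python
import os
import socket
import sys
from importlib import metadata
from typing import Dict, List


def check_dependencies(paths: List[str], hosts: List[str],
                       libraries: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for path in paths:
        # Incorrect file path
        if not os.path.exists(path):
            problems.append(f"missing path: {path}")
    for host in hosts:
        # Incorrect network address (name does not resolve)
        try:
            socket.getaddrinfo(host, None)
        except socket.gaierror:
            problems.append(f"unresolvable host: {host}")
    for name, expected in libraries.items():
        # Library missing or installed in an unexpected version
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"library not installed: {name}")
            continue
        if found != expected:
            problems.append(f"library conflict: {name} {found} != {expected}")
    return problems


def boot(paths: List[str], hosts: List[str], libraries: Dict[str, str]) -> None:
    """Terminate immediately if any dependency check fails."""
    problems = check_dependencies(paths, hosts, libraries)
    if problems:
        for problem in problems:
            print(f"boot check failed: {problem}", file=sys.stderr)
        sys.exit(1)
```

Exiting with a non-zero status at boot turns a latent environment difference into a failure that a process supervisor or deployment script can observe directly.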

Internal DevOps Experience
● Context: Our spin-out provides a SaaS solution for replicating resources in AWS
● Issue: Cleaning up resources is expensive
  ○ Tests
  ○ Handling unexpected failures
● "Undo" functionality to revert the resource status to a certain point
  ○ Easy to run tests
  ○ Easy to clean up the mess
I. Weber, et al., "Automatic undo for cloud management via AI planning," HotDep '12
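
To make the "undo" idea concrete, the sketch below reverts a set of resources to a checkpoint by diffing two snapshots. Note that this is a deliberately simple state-diff approach, not the AI-planning technique of the cited paper. The `State` representation and the functions `plan_undo` and `apply_undo` are hypothetical, and the sketch ignores the ordering constraints a real cloud API imposes between operations.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot format: resource id -> attributes
State = Dict[str, dict]


def plan_undo(checkpoint: State, current: State) -> List[Tuple[str, str]]:
    """Compute (action, resource_id) steps that take `current` back to `checkpoint`."""
    steps = []
    # Resources created after the checkpoint must be deleted.
    for rid in sorted(current.keys() - checkpoint.keys()):
        steps.append(("delete", rid))
    # Resources deleted after the checkpoint must be re-created.
    for rid in sorted(checkpoint.keys() - current.keys()):
        steps.append(("create", rid))
    # Resources changed after the checkpoint must be restored.
    for rid in sorted(checkpoint.keys() & current.keys()):
        if checkpoint[rid] != current[rid]:
            steps.append(("update", rid))
    return steps


def apply_undo(checkpoint: State, current: State,
               steps: List[Tuple[str, str]]) -> State:
    """Apply the steps to an in-memory copy of `current` and return the result."""
    restored = {rid: dict(attrs) for rid, attrs in current.items()}
    for action, rid in steps:
        if action == "delete":
            del restored[rid]
        else:  # "create" and "update" both copy the checkpointed attributes
            restored[rid] = dict(checkpoint[rid])
    return restored
```

For example, taking a checkpoint before a test run and applying the resulting plan afterwards removes whatever the test created and restores whatever it changed.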

Towards Formal Validation
● Incorporating Ops requirements into development and the product is useful; however, how do we verify that the implementation is correct?
● Our ongoing work: modeling the process and the product together
  ○ Does the product satisfy the ops requirements?
  ○ Does the process operate the product as required?

Example
● Model the mixed-version upgrading process
● Version conflict between clients and servers over a long-running process
● We are evaluating this method in a real system
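
As a toy illustration of what modeling a mixed-version upgrade can mean, the sketch below enumerates every intermediate state of an upgrade from version 1 to version 2 and reports the states in which some client could reach an incompatible server. It is not the authors' model. The state encoding, the compatibility relation, and the `allow_client_first` switch (whether clients may upgrade before all servers have) are assumptions chosen for illustration.

```python
from typing import Iterator, List, Set, Tuple

# (server versions, client versions); every component starts at version 1
State = Tuple[Tuple[int, ...], Tuple[int, ...]]


def successors(state: State, allow_client_first: bool) -> Iterator[State]:
    """Yield states reachable by upgrading exactly one server or one client."""
    servers, clients = state
    for i, v in enumerate(servers):
        if v == 1:
            yield (servers[:i] + (2,) + servers[i + 1:], clients)
    servers_done = all(v == 2 for v in servers)
    if allow_client_first or servers_done:
        for i, v in enumerate(clients):
            if v == 1:
                yield (servers, clients[:i] + (2,) + clients[i + 1:])


def has_conflict(state: State, compatible: Set[Tuple[int, int]]) -> bool:
    """True if any client may reach a server it cannot talk to."""
    servers, clients = state
    return any((c, s) not in compatible for c in clients for s in servers)


def find_conflicts(n_servers: int, n_clients: int,
                   compatible: Set[Tuple[int, int]],
                   allow_client_first: bool) -> List[State]:
    """Exhaustively explore the upgrade process and collect conflicting states."""
    start: State = ((1,) * n_servers, (1,) * n_clients)
    seen = {start}
    frontier = [start]
    conflicts = []
    while frontier:
        state = frontier.pop()
        if has_conflict(state, compatible):
            conflicts.append(state)
        for nxt in successors(state, allow_client_first):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return sorted(conflicts)


# Assumed relation of (client, server) pairs: version 2 servers still accept
# version 1 clients, but version 1 servers reject version 2 clients.
COMPATIBLE = {(1, 1), (2, 2), (1, 2)}
```

Under these assumptions, `find_conflicts(2, 1, COMPATIBLE, allow_client_first=False)` finds no conflicting state, while allowing clients to upgrade first exposes every state in which a version 2 client coexists with a version 1 server.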

Conclusion
● Operations, including release, are a large source of outages
● To improve the "operatability" of products, we studied operations requirements
● Future work: validate whether "operatability" is satisfied by implementations