Lessons learned validating 60,000 pages of API documentation


In 2002, the US Department of Justice and EU regulators ordered Microsoft to publish documentation sufficient to allow third parties to interoperate with hundreds of its server and client programs. This presentation highlights the novel requirements engineering and model-based testing devised to validate the resulting 60,000 pages of API documentation.

This process resulted in over 50,000 defect corrections. Lessons learned from this project can improve quality, reduce costs, and facilitate interoperability of any complex system.

Published in: Software, Technology

  1. 1. LESSONS LEARNED VALIDATING 60,000 PAGES OF API DOCUMENTATION Robert V. Binder ACM Chicago Chapter Meeting Chicago May 8, 2013 Copyright © Robert V. Binder, 2013. All rights reserved.
  2. 2. Overview • Background • Microsoft Protocol QA Process • Scope and approach • Requirements engineering • Model-based testing • Non-Microsoft Applications • Q & A 2
  3. 3. The evil Dogbert mocks Dilbert as “Boron,” the most boring man in the universe …
  5. 5. What is a Protocol? • Rules for interaction among endpoints using messages • Data • Content and format • Behavior • Request/Response • Acceptable sequences 5
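The "acceptable sequences" rule is what makes a protocol more than a message format. As a toy illustration (my own sketch, not from the talk; states and message names are invented), a protocol's sequencing rules can be written as a finite state machine and a message sequence checked against it:

```python
# Toy protocol state machine: the "acceptable sequences" rule as
# explicit transitions. States and messages are invented for illustration.
TRANSITIONS = {
    # (current state, message) -> next state
    ("CLOSED", "CONNECT_REQ"): "CONNECTING",
    ("CONNECTING", "CONNECT_ACK"): "OPEN",
    ("OPEN", "DATA"): "OPEN",
    ("OPEN", "CLOSE_REQ"): "CLOSED",
}

def accepts(messages, start="CLOSED"):
    """Return True if the message sequence is acceptable under the rules."""
    state = start
    for msg in messages:
        if (state, msg) not in TRANSITIONS:
            return False  # out-of-order message: protocol violation
        state = TRANSITIONS[(state, msg)]
    return True

print(accepts(["CONNECT_REQ", "CONNECT_ACK", "DATA", "CLOSE_REQ"]))  # True
print(accepts(["DATA"]))  # False: DATA before the handshake
```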
  6. 6. Layers, Protocols, Stacks • Layer: level of abstraction • Each layer is a protocol • Stack of layers • L ↔ L-1 ok • L ↔ L ± m NOT ok • Layer uses other protocols • HTTP over TCP or RPC • IP over WiFi or LAN • Protocol may define own data or use standard format (XML) 6 [Stack diagram labels: SOAP, BING, HTTP, XML, TCP, UDP, IP, 802.11, 802.2, XMP]
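Layering can be pictured as each protocol wrapping the payload of the layer above with its own header, with peers talking only to the same layer on the other side. A toy sketch of mine (invented bracket "headers", not any real frame format):

```python
# Toy layering/encapsulation: each layer wraps the payload of the layer
# above it; the receiver strips headers in the opposite order.
def encapsulate(payload, layers):
    for layer in layers:                      # top of stack first
        payload = f"[{layer}|{payload}]"      # add this layer's header
    return payload

def decapsulate(frame, layers):
    for layer in layers:                      # outermost layer first
        prefix = f"[{layer}|"
        assert frame.startswith(prefix) and frame.endswith("]")
        frame = frame[len(prefix):-1]         # strip this layer's header
    return frame

wire = encapsulate("GET /", ["HTTP", "TCP", "IP", "802.11"])
print(wire)  # [802.11|[IP|[TCP|[HTTP|GET /]]]]
print(decapsulate(wire, ["802.11", "IP", "TCP", "HTTP"]))  # GET /
```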
  7. 7. Protocols Everywhere • Cellular: CDMA, GSM, SMS, MMS, WAP … • Network: 802.11 (WiFi), 802.16 (WiMax) … • Wireless: Bluetooth, Zigbee, ANT, ISO 14443 … • Routing: OSPF, IGP, RIP, CIDR, BGP … • RFCs: HTTP, FTP, SOAP, TCP, IP, IPv4, IPv6 (1000s) • Corba: GIOP, IIOP, ESIOP, RMI, IDL • FIX (Financial Information eXchange) • Amazon API, REST, BING API, Netflix API, Google Protocol Buffers, … 7
  9. 9. Publish or Perish • US Federal Court and EU order • “Consent Decree” • Microsoft to publish server-side API documentation • Goal: interoperability for third parties • Hard milestones/deadlines imposed by Federal Judge • Microsoft Open Specification Initiative 10
  10. 10. Cast of Thousands • MSFT project management • 100s of senior MSFT developers wrote/revised TDs • TD publication staff • More than 350 test suite devs (mostly in Hyderabad & Beijing) • ~20 Independent Reviewers (5 System Verification Assoc.) • Process Architects (MSFT & System Verification Assoc.) • MSFT Netmon and other tool developers • MSFT Plugfest team • “Technical Committee” chartered by court, with over 50 FTE reviewers of published TDs 11
  11. 11. What is a Microsoft Protocol? • All product groups • Windows Server • Office • Exchange • SQL Server • Others • 500+ protocols • Remote Desktop • Active Directory • File System • Security • Many others • Remote API for a service 12
  12. 12. Microsoft Technical Document (TD) • Publish protocols as “Technical Documents” • One TD for each protocol • Similar to RFC • Must strictly follow template • Black-box spec – no internals • All data and behavior specified with text • OS version differences – endnotes 13
  13. 13. All Technical Docs Public 14 http://msdn.microsoft.com/library/jj712081
  14. 14. Challenges • Validation of documentation, not as-built implementation • Is each TD well-formed? • Follows TD standards • Consistency, correctness, completeness • Is each TD all a third party needs to develop: • A client that interoperates with an existing service? • A service that interoperates with existing clients? • Only use over-the-wire messages 15
  15. 15. Test-Model Driven Protocol Verification 16 [Diagram: data and behavior statements in each Technical Document are turned into a Requirements Specification; Modeling yields a Model-based Test Suite whose model assertions generate and check the responses of actual Windows services (WS 2000, WS 2003, WS 2008) during Test Execution and Analysis. The model approximates a third-party implementation and validates consistency with the actual Windows implementation.] Stobie et al, © 2010 Microsoft. Adapted with permission.
  16. 16. Protocol Quality Assurance Process 17
      • Study: scrutinize TD; define test strategy → Review: TD ready? Strategy?
      • Plan: complete test requirements; high-level test plan → Review: test rqmts? Config? Plan?
      • Design: complete model; complete adapters → Review: model? Adapters?
      • Final: generate & run test suite; prepare user doc → Review: coverage? Test code?
      [Iterates over TD v1, v2 … vn among Authors, Reviewers, and Test Suite Developers]
  17. 17. Results • Published 500+ TDs • 60,000+ pages • 50,000+ “Technical Document Issues” • Most identified before tests run • Many Plugfests, many 3rd party users • Released high interest test suites as open source • US DoJ case closed June 12, 2011 18
  19. 19. TD Statements — [MS-DHCPM] Microsoft Dynamic Host Configuration Protocol (DHCP) 20
      • Data statement: DHCP_OPTION_ID … a unique integer that identifies the option defined for a user class and a vendor class. The option ID range for DHCPv4 options is 1 to 255, while the option ID range for DHCPv6 options is 0 to 65536.
      • Behavior statement: R_DhcpRemoveOptionValue — When processing this call, the DHCP server MUST do the following … If the DHCPv4ClassedOptValue corresponding to the OptionID parameter is not present, then return ERROR_OPTION.<24> Otherwise, …
      • Endnote: <24> Windows XP and Windows Server 2003 DHCP clients request only option code 249 in the Parameter Request List.
  20. 20. TD Statements to TD Test Requirements 21
      Columns: Req ID | Doc Sect | Description | Pos/Neg/Derived | Inform/Norm | Verification
      TSCH_R142 | 2.4.1 | The client MUST set the File Version (2 bytes; it contains the version of the .JOB file format) field of the FIXDLEN_DATA structure to 0x0001. | R110 2 R113 1 | Norm |
      TSCH_R145 | 2.4.1 | The server MUST ignore the value in the App Name Len Offset field of the FIXDLEN_DATA structure. | | Norm | Non-testable
      TSCH_R146 | 2.4.1 | The Trigger Offset (2 bytes) field of the FIXDLEN_DATA structure MUST contain the offset in bytes within the .JOB file where the task triggers are located. | R110 2 R113 1 | Norm |
      TSCH_R1332 | | Upon receipt of the SchRpcGetSecurity call, the server MUST return S_OK on success. | | Norm | Test Case
      TSCH_R1333 | | The SchRpcEnumFolders method MUST retrieve a list of folders on the server. | | Norm | Adapter
      TSCH_R1350 | | [Upon receipt of the SchRpcEnumFolders call, the server MUST] Return [the value 0x80070003,] the HRESULT version of the Win32 error ERROR_PATH_NOT_FOUND, if the path argument does not name a folder in the XML task store, or if the caller does not have either read or write access to that folder. | | Norm | Test Case
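One way to see why discretizing a TD into numbered test requirements pays off: once each statement is a record with explicit category and verification fields, filtering and tracing become mechanical. A hypothetical sketch (field names mirror the table's columns; `TestRequirement` is my own invention, not the project's actual schema):

```python
# Hypothetical record type for tabulated test requirements; values are
# taken from the slide's table, the class itself is illustrative.
from dataclasses import dataclass, field

@dataclass
class TestRequirement:
    req_id: str
    description: str
    normative: bool              # Inform/Norm column
    verification: str            # "Test Case", "Adapter", or "Non-testable"
    derived_from: list = field(default_factory=list)

reqs = [
    TestRequirement("TSCH_R145",
        "The server MUST ignore the value in the App Name Len Offset field.",
        True, "Non-testable"),
    TestRequirement("TSCH_R1332",
        "Upon receipt of the SchRpcGetSecurity call, the server MUST "
        "return S_OK on success.", True, "Test Case"),
    TestRequirement("TSCH_R1333",
        "The SchRpcEnumFolders method MUST retrieve a list of folders.",
        True, "Adapter"),
]

# Only normative statements that are testable move on to modeling and coding.
actionable = [r for r in reqs if r.normative and r.verification != "Non-testable"]
print([r.req_id for r in actionable])  # ['TSCH_R1332', 'TSCH_R1333']
```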
  21. 21. Document Debugging (bug → fix)
      • Template, MUST-SHOULD-MAY → TDI: author rewrite
      • Ambiguous, unclear, inconsistent → TDI: author rewrite
      • Missing or incorrect → TDI: author rewrite
      • SUT response inconsistent → TDI: code bug and/or rewrite
      • Implicit antecedent → Add antecedent […]
      • Cause or effect too broad → Add derived, narrow domain
      • No effect for corrupt/missing cause → Add derived, negative effect
      • Unobservable or uncontrollable → Add derived, observable effect
      • Infeasible → Add derived, narrowed scope
  22. 22. Document Debugging • Every TD statement analyzed • Scrutinize • Categorize • Make context explicit • Trace dependencies • Assess testability • Allocate 23
  23. 23. Scrutinize • Ambiguous phrasing • Misuse of MUST, SHOULD, MAY • Inconsistent • Unclear • TD template violations • Write bug report for author correction 24
  24. 24. Categorize • Normative or Informative? • Like code comments (informative)? Conceptually, cells are numbered in a dataset as if the dataset were a p-dimensional array, where p is the number of axes. • Or like code (normative)? SVR_RESP (1 byte): A single byte whose value MUST be 0x05. • If removed, would that prevent 3rd party interop? • No modeling/testing for informative 25
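Since normative statements are signalled by RFC 2119 keywords such as MUST, SHOULD, and MAY, a first-pass scan can flag candidates, though the slide's question (if removed, would that prevent 3rd party interop?) still needs human judgment. A rough heuristic of my own, not the project's tooling:

```python
# Heuristic first pass: flag statements containing RFC 2119 keywords
# (uppercase, as the TD template requires) as candidates for "normative".
import re

NORMATIVE_KEYWORDS = re.compile(
    r"\b(MUST(?: NOT)?|SHOULD(?: NOT)?|MAY|REQUIRED|SHALL(?: NOT)?|OPTIONAL)\b")

def looks_normative(statement):
    """True if the statement contains an RFC 2119 requirement keyword."""
    return bool(NORMATIVE_KEYWORDS.search(statement))

print(looks_normative(
    "SVR_RESP (1 byte): A single byte whose value MUST be 0x05."))      # True
print(looks_normative(
    "Conceptually, cells are numbered as if the dataset were an array."))  # False
```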
  25. 25. Make Context Explicit 26 • Add implicit antecedents • Use [ ] to indicate addition • Preserves meaning in code, test results, log files
      Original TD statements: If the computeByClause is present, one group is created for each unique combination of values in the column or columns specified in the computeByClause. Otherwise, all rows of the child RecordSet are treated as a single group.
      Test Requirement 1: If the computeByClause is present, one group is created for each unique combination of values in the column or columns specified in the computeByClause.
      Test Requirement 2: Otherwise [if the computeByClause is not present], all rows of the child RecordSet are treated as a single group [in the computeByClause.]
  26. 26. Trace Dependencies • Is there a stated observable effect: • For every cause? • When a cause is missing or corrupted? • Record analysis with linked requirements 27
      Columns: Req ID | Description | Pos | Derived | Verification
      R100 | Actions: This part MUST be present and MUST specify the action to be performed once the task is started. | Pos: R110 | Derived: ??? | Test Case
      R110 | The server MUST execute multiple actions sequentially, in the order specified in the Actions field. | | | Test Case
      Stobie et al, © 2010 Microsoft. Adapted with permission.
  27. 27. Assess Testability • A test requirement is testable if: • Sufficient to generate and/or evaluate in code • Observable over-the-wire • Non-testable if: • Unobservable • Uncontrollable • Infeasible • Excessive cost to develop test 28
  28. 28. Assess Testability • Unobservable or uncontrollable: All the structures MUST begin on 8-byte boundaries, although the data that is contained within the structure need not be aligned to 8-byte boundaries. • Can’t detect misalignment at test endpoint: After close, the server MUST release all resources. • No way to check using protocol • Infeasible: The server MUST return a unique ID. • No way to conclusively determine uniqueness 29
  29. 29. Assess Testability • What to do about non-testable statements? • Punt? • Interpretation unpredictable (testers and users) • Lowers coverage • Skip? • Taints credibility • Lowers coverage • Add derived test requirement • Rewrite non-testable • Strictly limited revision or elaboration • Significant requirements engineering innovation 30
  30. 30. Derived Test Requirements • Case: only an instance of a domain • Partial: observable subset • Inferred: indirect result of several causes 31
      Columns: Req ID | Description | Derived | Verification | Comments
      R42 | MUST accept any positive number | | Non-testable | Infeasible
      R1042 | MUST accept 1024 | 42:c | Test Case |
      R39 | Ignored by the server on receipt | | Non-testable | Server internal behavior
      R1039 | Reply is the same whether 0 or non-zero is used for Field | 39:p | Test Case |
  31. 31. Fully Elaborated Test Requirements 32
      Columns: Req ID | Description | Pos | Derived | Verification
      R100 | Actions: This part MUST be present and MUST specify the action to be performed once the task is started. | Pos: R110 | Derived: R1100 | Test Case
      R110 | The server MUST execute multiple actions sequentially, in the order specified in the Actions field. | | | Test Case
      R113 | pErrorInfo: If this parameter is non-NULL and the XML task definition is invalid, the server MUST return additional error information. | | | Test Case
      R114 | 0x8004131A SCHED_E_MISSINGNODE: The task XML is missing a required element or attribute. | | | Test Case
      R1100 | If Action is missing, SCHED_E_MISSINGNODE is returned in pErrorInfo | | Derived from: R100, R113:i, R114:i | Test Case
  32. 32. Allocate Test Requirements • To Test Case • Develop model contract and/or test code • Generate the condition and send a message • Evaluate response, pass or fail • To Adapter • Basic data structure and format checked as side-effect • Netmon parsing • Transport layer marshaling 33
  34. 34. Model-Based Testing 35 [Diagram: Test Requirements (possibly ambiguous, missing, contradictory, incorrect, obscured, or incomplete) are developed into a Test Model (which can harbor errors or omissions); the model generates Inputs (test sequences) and Expected Outputs (test oracle); a Run controls, observes, and evaluates the SUT (whose code may be missing or incorrect); any mismatch surfaces as a Bug. Coverage is measured on Requirements, Model, and Code.]
  35. 35. Why Model-based Testing? • Effectiveness • Scope • Automate generation of huge number of tests • Mitigate brittleness/breakage risk • Highly structured behavior well-suited to modeling • Easier to assess model than huge test suite • Consistent and automatic transition coverage versus arbitrary or ad hoc strategies 36
  36. 36. Spec Explorer • Model-based testing tool • Developed at Microsoft Research • Productized after extensive use • Visual Studio “Power Tool” • Development UI 37 • Generates standalone executable test suite • Tests for any service or API – not limited to Microsoft
  37. 37. Spec Explorer Personality • Entire model in C#: no pictures/UML • Inline calls to Spec Explorer framework • Include any programmable function or .NET capability • Bottom-up modeling • Aggregate model synthesized • State machine slicing defines scenarios • Coverage strategy • Constraint solver • All transitions of the explored model/scenario, short or long path • Combinational selection of transition parameter values 38
  38. 38. Spec Explorer Model Program 39

```csharp
static class Model {
    public enum TimerMode { Reset, Running, Stopped }

    static bool displayTimer = false;
    static TimerMode timerMode = TimerMode.Reset;
    static bool timerFrozen = false;

    // [Rule] identifies behaviors to be explored
    [Rule]
    static void StartStopButton() {
        // Precondition: rest of body executed only if true
        Condition.IsTrue(displayTimer);
        // Body: update model state, log, generate expected result;
        // model state drives exploration
        if (timerMode == TimerMode.Running) {
            timerMode = TimerMode.Stopped;
            timerFrozen = false;
        } else
            timerMode = TimerMode.Running;
    }

    [Rule] static void ModeButton() { ... }
    [Rule] static void ResetLapButton() { ... }

    // [AcceptingState] identifies the goal state and stops exploration
    [AcceptingState]
    static bool IsTimerReset() { ... }
    ...
}
```
  39. 39. Model Exploration • Scenarios and slices • Any subset of all [Rule] methods • Uses regexp-like syntax • Represent use cases • Data-driven slice • Manages state explosion problem • Explore • Constraint solver finds feasible paths using initial data values and symbolic execution • Supports iterative model development 40
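The rule/precondition/exploration mechanics can be imitated outside Spec Explorer. A much-simplified Python analogue (my sketch, not Spec Explorer's algorithm: no constraint solving or symbolic execution, just breadth-first enumeration of the feasible transitions of a stopwatch-like model):

```python
# Simplified exploration of a rule-based model: each rule is a guard
# plus a state update; exploration enumerates feasible transitions
# breadth-first from the initial state.
from collections import deque

RESET, RUNNING, STOPPED = "Reset", "Running", "Stopped"

def start_stop(state):
    display, mode = state
    if not display:
        return None                      # precondition fails: rule disabled
    return (display, STOPPED if mode == RUNNING else RUNNING)

def mode_button(state):
    display, mode = state
    return (not display, mode)           # toggle the timer display (simplified)

RULES = {"StartStopButton": start_stop, "ModeButton": mode_button}

def explore(initial, max_depth=3):
    """Enumerate feasible (state, rule, next_state) transitions up to max_depth."""
    seen = {initial}
    frontier = deque([(initial, 0)])
    transitions = []
    while frontier:
        state, depth = frontier.popleft()
        if depth == max_depth:
            continue
        for name, rule in RULES.items():
            nxt = rule(state)
            if nxt is None:
                continue                 # rule not enabled in this state
            transitions.append((state, name, nxt))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return transitions

for t in explore((False, RESET)):        # state = (displayTimer, timerMode)
    print(t)
```

Each path through the resulting transition graph corresponds to a candidate test sequence, which is the role the next slide's visualization plays in Spec Explorer.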
  40. 40. SE’s Visualization of an Exploration 41 [Graph: nodes are updated model states; edges show an enabled rule (SUT message + feasible data binding) and the expected response from the SUT; exploration stops when SE reaches an accepting state] • Each root-to-leaf path: test case(s) • Explores feasible pre/post bindings • User limits depth and action set • Generate code for each test case Stobie et al, © 2010 Microsoft. Used with permission.
  41. 41. Traceability in Model Programs 42

```csharp
// Helper method called when a Rule is selected; requirement ids and
// verbatim TD text are hardcoded to keep the code clear and the logs
// readable.
static TschErrorCode CaptureRequirement(User caller) {
    if (!caller.isAdmin) {
        RequiresCapture(1087, "In response to NetrJobGetInfo request the " +
            "server MUST Return ERROR_ACCESS_DENIED if the caller does not have " +
            "administrative privileges on the server.");
        RequiresCapture(1091, "In response to NetrJobGetInfo request, the " +
            "server MUST use Windows Error Codes as specified in [MS-ERREF].");
        return TschErrorCode.ERROR_ACCESS_DENIED;
    } else {
        // This action returns success if caller has admin privilege and
        // the requested job exists in the job list
        if (atsvcJobsStore.ContainsKey(jobId)) {
            RequiresCapture(1025, "If the server implements the ATSvc " +
                "interface, it MUST implement the NetrJobGetInfo (Opnum 3) method.");
            RequiresCapture(1785, "NetrJobGetInfo method MUST have " +
                "administrator privileges.");
            return TschErrorCode.ERROR_SUCCESS;
        }
    }
}
```

Stobie et al, © 2010 Microsoft. Used with permission.
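The same capture idea in miniature: each time a modeled action fires, it logs the requirement id and verbatim text, so every check in the log ties back to a TD statement. A Python sketch with invented names (`requires_capture` and `netr_job_get_info` only echo the slide's C#; they are not real APIs):

```python
# Traceability capture in miniature: every check records the requirement
# id and its verbatim text so logs are self-describing.
captured = []

def requires_capture(req_id, text):
    """Record that a requirement was exercised and log it."""
    captured.append(req_id)
    print(f"MS-TSCH_R{req_id}: {text}")

def netr_job_get_info(caller_is_admin):
    # Behavior loosely modeled on the slide's excerpt, heavily simplified.
    if not caller_is_admin:
        requires_capture(1087, "server MUST return ERROR_ACCESS_DENIED if the "
                               "caller lacks administrative privileges.")
        return "ERROR_ACCESS_DENIED"
    requires_capture(1025, "the server MUST implement the NetrJobGetInfo "
                           "(Opnum 3) method.")
    return "ERROR_SUCCESS"

print(netr_job_get_info(False))  # ERROR_ACCESS_DENIED (after the R1087 log line)
```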
  42. 42. Typical Test Configuration 43 [Diagram: the Test Suite and its Adapter on a client OS exchange messages through the Transport layers with the Endpoint Under Test (behind its own Adapter) on a server OS; Netmon captures the over-the-wire traffic]
  43. 43. Netmon Capture 44

```
20540 TSAP TSAP:TestCase Name=....Test_ITask_RegisterFlagsS8, Message= MS-TSCH_R1215
20541 TSAP TSAP:TestCase Name=....Test_ITask_RegisterFlagsS8, Message= MS-TSCH_R1569
20544 TSCH TSCH:ITaskSchedulerService::SchRpcDelete Response, ReturnValue=1 vstesthost.exe
20545 TSCH TSCH:ITaskSchedulerService::SchRpcGetTaskInfo Request, Path=CH1223330325 Flags=0 (0x0)
20546 TSCH TSCH:ITaskSchedulerService::SchRpcGetTaskInfo Response, Enabled=0 State=0 ReturnValue=1
20547 TSAP TSAP:TestCase Name=....Test_ITask_RegisterFlagsS8, Message= MS-TSCH_R5
20548 TSAP TSAP:TestCase Name=....Test_ITask_RegisterFlagsS8, Message= MS-TSCH_R17
20549 TSAP TSAP:TestCase Name=....Test_ITask_RegisterFlagsS8, Message= MS-TSCH_R10
20550 TSAP TSAP:TestCase Name=....Test_ITask_RegisterFlagsS8, Message=Assert.IsTrue succeeded. The SchRpcDelete method MUST delete a task in the task store.
20551 TSAP TSAP:TestCase Name=....Test_ITask_RegisterFlagsS8, Message= MS-TSCH_R1892
20552 TSAP TSAP:TestCase Name=....Test_ITask_RegisterFlagsS8, Message=Assert.IsTrue succeeded. Upon receipt of the SchRpcDelete call the server MUST delete the task from the XML task store.
```

Stobie et al, © 2010 Microsoft. Adapted with permission. Netmon free download: http://www.microsoft.com/en-us/download/details.aspx?id=4865
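Captures like the one above are mechanically searchable: the MS-TSCH_Rnnnn tokens let a post-processing step tally which requirements a run actually exercised. A hypothetical sketch of mine (sample lines paraphrased from the slide, not real Netmon output or Microsoft tooling):

```python
# Post-processing sketch: extract MS-TSCH_Rnnnn requirement ids from
# capture lines to see which requirements a test run exercised.
import re

CAPTURE = [
    "20540 TSAP TSAP:TestCase Name=Test_ITask_RegisterFlagsS8, Message= MS-TSCH_R1215",
    "20544 TSCH TSCH:ITaskSchedulerService::SchRpcDelete Response, ReturnValue=1",
    "20551 TSAP TSAP:TestCase Name=Test_ITask_RegisterFlagsS8, Message= MS-TSCH_R1892",
]

req_ids = [m.group(1)
           for line in CAPTURE
           if (m := re.search(r"MS-TSCH_R(\d+)", line))]
print(req_ids)  # ['1215', '1892']
```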
  44. 44. Complete Traceability 45 Technical Document → Requirements Spec → Model → Test Suite → Logs → Network Captures Stobie et al, © 2010 Microsoft. Used with permission.
  45. 45. Productivity 46 • Phases measured: Requirements Study, Modeling, Adapter Coding, Test Coding, Test Execution • Model-based testing: 1.4 days/requirement • Traditional testing: 2.4 days/requirement • 42% less time per requirement with MBT • Total effort: 250 person-years (mostly junior SDETs) • Saved 50 person-years with model-based testing
  46. 46. Lessons Learned – Technology • Technical documentation for complex systems is meaningless (or worse) without validation. Different mindset needed for doc validation • Many published standards (RFCs) have significant deficiencies • Shallow coverage (1 test/rqmt) was effective because existing services were “golden” • MBT productivity and effectiveness even greater for deeper testing of new services • Be practical: use hand-coded test logic when convoluted behavior defies modeling 47
  47. 47. Lessons Learned - Process • Allow flexibility for improvement • Insist on compliance for results • Ongoing training crucial • Bottom-up evolution • Transparent community participation • Strict and consistent enforcement • Usable conformance test suites highly valuable 48
  48. 48. Q & A 49 rvbinder@tezzter.com
  49. 49. Resources and Sources 50
      • Microsoft Open Specification web site: http://www.microsoft.com/openspecifications
      • Technical Documents: http://msdn.microsoft.com/library/jj712081
      • Project Overview: http://queue.acm.org/detail.cfm?id=1996412
      • Spec Explorer: About http://msdn.microsoft.com/en-us/library/ee620411.aspx ; Download http://visualstudiogallery.msdn.microsoft.com/en-us/271d0904-f178-4ce9-956b-d9bfa4902745
      • Netmon and protocol parsers: http://blogs.technet.com/b/netmon/
      • Protocol Test Suites (must provide Live Id to log in): http://msdn.microsoft.com/en-us/openspecifications/cc816059.aspx
      Credits: Protocol map, Copyright 2001, Agilent Technologies. Some slides adapted with permission from Stobie, Kicillof, and Grieskamp, “Discretizing Technical Documentation for End-to-End Traceability Tests,” INRIA 2010. Selected charts and figures from Grieskamp, Kicillof, Stobie, and Braberman, “Model-based quality assurance of protocol documentation: tools and methodology,” Software Testing, Verification and Reliability 21: 55-71 (March 2011).