Hello everyone! Good Afternoon. I’m Jitendra Subramanyam from CAST Software. I work closely with Bill and unfortunately, Bill couldn’t be here – he wrenched his shoulder and had to have some surgery. [He does send his regrets.] Bill is the Director of CISQ – the Consortium for IT Software Quality. In his absence, I’m going to give you an update on CISQ quality metrics and some examples of what those metrics might look like in the field. As you can tell, I’m not from Texas, and I’m not as loud as Bill, but I’ll do my best to convey the letter and the spirit of his message! [“Confidence As a Product” Confidence in measuring against a standard. Clearly defining *WHAT* to measure and specifying *HOW* to measure it. (Soley: Standards create a market and an ecosystem around that market) – Reliability (automation is the key to consistency). Confidence that you’re measuring things that matter – Validating the metrics: Verifiability Confidence that the standard is being applied properly – Certification]
CISQ is a global consortium of IT executives from private and public sector organizations, IT service providers, and technical experts coming together to define the metrics for measuring quality (the *WHAT*) and specifying *HOW* to measure them. These groups are brought together by the SEI and OMG. This brings us to the main objectives of CISQ.
CISQ has 4 main objectives. Objective 1: Raise awareness of software quality issues. Objective 2: Develop an automated standard for software quality. Automation is key because it increases repeatability, makes measurement cost effective, and enables benchmarking. Objective 3: Promote acceptance of the standard. Bill was instrumental in doing this for the CMM standard and he wants to take a similar approach here as well. (Involve all parties, make sure the standards are clear and applicable to how people do their work.) Objective 4: Build a system to assess and certify whether services and products are up to the CISQ standard. Both SEI and OMG have a lot of experience doing this.
Any organization can become a member of CISQ and have their folks join CISQ technical groups and attend executive webinars and meetings. I’ll tell you about the technical groups in just a moment. So far, CISQ participants have come from corporations like FedEx, IBM, Morgan Stanley, McKesson; system integrators like Capgemini, Booz, TCS; government agencies like DHS, HHS; and universities like the Technical University Munich and the University of Memphis. You can also sign up for membership on the CISQ web site at www.it-cisq.org.
You’ve probably seen some version or the other of this widely-reproduced cartoon. One scientist is saying to the other, “I think you should be more explicit here in step two.” Indeed! To create a standard means to define it clearly and have a repeatable way to measure it. As you know, there’s already a considerable amount of “infrastructure” around a quality standard. CISQ is not trying to reinvent the wheel.
Let me describe the elements of what’s already out there. To the right are the two tangible outputs of CISQ: a set of defined metrics, and a living repository of weaknesses and anti-patterns. To get there we piggyback on several elements that are already in place. OMG has two task forces that are suitable for CISQ: the Architecture Modernization Platform Task Force and the Software Assurance Platform Task Force. In addition, there are three OMG meta-models that provide guidance on how to write the definitions: the Structured Metrics Meta-Model, the Abstract Syntax Tree Meta-Model, and the Knowledge Discovery Meta-Model. As much as possible, we also plan on incorporating and staying consistent with existing standards: ISO 9126 and the newer ISO 25000 series, the Common Vulnerability Scoring System, and the Common Weakness Enumeration from MITRE. So we’re not building from scratch but standing on the shoulders of giants. CISQ will get the bulk of its work done through technical groups, and there are 5 of them.
CISQ work products will be created by these 5 Technical Working Groups: Size, Maintainability, Reliability & Performance, Security, and Metrics Best Practices. These five focus areas were decided during the two inaugural meetings for CISQ that took place late last year – one in Frankfurt, Germany and the other in Arlington, Virginia. Any organization can become a member of CISQ and have their folks join these technical groups. Bill is finalizing the 2010 calendar for Technical Group meetings and work products. He’ll have an update on the CISQ web site very shortly.
CISQ aims to create three types of certification: for developers, appraisers, and the tools themselves. For the developer and appraiser certifications, CISQ will again leverage existing knowledge from OMG and SEI. Certifying tools has proven difficult in the past, but we’re hoping to explore some options with SEI and OMG.
CAST Application Intelligence 08/07/13
In addition to defining quality metrics clearly, specifying how to automate their measurement, and certification, a quality standard like CISQ must specify how to aggregate quality measures from the component level up to the application level. Two facts about software quality make this nontrivial. The first is that software quality is contextual. A module can be excellent in quality or highly dangerous depending on the context in which it operates. And context depends on interactions that cross component, interface, language, and technology boundaries. [A module that does connection pooling can be just fine until you add a database around it that doesn’t like the specific way in which the connections are handled. That’s not the poor component’s problem, but that’s the contextual nature of quality. Calls to tables that look fine one day start to look terrible when those same tables have grown by 100x (or contain binary files like images).] So CISQ will take the entire application into account when defining and measuring quality and provide clear rules for aggregating from one layer to another. The second fact that makes aggregation difficult is that software quality measurement cannot simply work at the physical level; it must be aware of the logical structure of the application as well.
Software quality is structural. What do I mean by that? Think about how you would sum 1+2+3+ and so on +100. Now think about summing to 1 billion. The point is, the software we deal with has billions and billions of states. At best, performance tests cover only a tiny fraction of these states. To have any confidence in our software, we have to rise to the structural or meta-model level. It’s at the structural level that we get a better grip on these billions of states. So back to the addition problem. You can simply add the numbers by brute force. But the reliable way to do it is to take advantage of a structural pattern. In this instance, put the 100 aside. 1+99 is 100; 2+98 is 100. You get 49 of these – that’s 49 hundred. Add the remaining 50 and the 100 you set aside, you get 5050. You solve the problem at the structural level. It’s much more reliable to do it this way and you’re much more confident that you’ve got it right. At CAST we’re committed to full compatibility with the CISQ standard. Our metrics already take context and structure into account and we’ll continue to work closely with CISQ to ensure complete compatibility. To give you a concrete sense of existing software quality metrics, I’ll quickly cover the ones we use at CAST.
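To make the addition example concrete, here is a minimal Python sketch that compares the brute-force sum with the pairing pattern described above. The function names are hypothetical and chosen only for illustration; the pairing version assumes an even upper bound, as in the 1-to-100 case.

```python
def brute_force_sum(n: int) -> int:
    """Add 1 + 2 + ... + n one term at a time."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def paired_sum(n: int) -> int:
    """Sum 1..n with the pairing pattern from the talk (n must be even, n >= 2).

    Set n aside, pair 1 with n-1, 2 with n-2, and so on. Each pair adds up
    to n. The middle value n/2 has no partner and is added separately.
    """
    if n < 2 or n % 2 != 0:
        raise ValueError("this sketch assumes an even n >= 2")
    pairs = n // 2 - 1   # for n = 100: 49 pairs, i.e. 49 hundred
    middle = n // 2      # for n = 100: the remaining 50
    return pairs * n + middle + n


if __name__ == "__main__":
    print(brute_force_sum(100))   # loops over 100 terms
    print(paired_sum(100))        # 49 * 100 + 50 + 100
    print(paired_sum(10**9))      # same three steps, no billion-term loop
```

The structural version does the same small amount of work whether the upper bound is 100 or 1 billion, which is the point of the analogy: reasoning about the pattern scales where enumeration does not.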
The metrics at the tip of the iceberg are what usually get measured: defects, response time, outage duration. The submerged part (complexity, robustness, and maintainability) holds the root causes of the problems that show up above the waterline. At CAST we make these root causes of outages, what’s below the waterline, explicit. We make them measurable, and we automate their measurement.
At the highest level, these are the quality metrics we automate and make measurable. I’ll give you a moment to scan the slide. If you look at the bottom right, you’ll see the term “Critical Violations”. Critical violations occur when the software deviates from well-accepted rules of software engineering. To put it simply: the more critical violations, the lower the quality of the software. When critical violations are fixed, software performance, robustness, and transferability (in other words, QUALITY) will improve.
We’ve tested this out in the field. This is a large technology company’s internal global accounts system which tracks credit requests as they flow through the system. It is a large, important, and highly-visible corporate system. We measured the number of new violations introduced per back-fired function point. That’s the Y axis on the RIGHT. The Y axis on the LEFT shows production defects per back-fired function point as recorded in IBM’s defect tracking system. There’s a strong correlation between CAST quality metrics and actual production defects. So we’re not just making it up. The way we define and measure software quality tracks what goes on in the real world. Tracking CAST quality metrics has enabled the internal IT team at this company to reduce their development and M&E costs on the global credit management system. It’s something I’m sure their CFO appreciates!
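For readers who want to try this kind of comparison on their own release data, here is a minimal Python sketch. It normalizes counts by function points and computes a Pearson correlation between the two series. The talk does not say which correlation statistic was used, so Pearson is an assumption, and the function names and the numbers in the usage section are illustrative placeholders, not data from the study.

```python
from math import sqrt


def per_function_point(count: float, function_points: float) -> float:
    """Normalize a raw count (violations or defects) by application size."""
    if function_points <= 0:
        raise ValueError("function_points must be positive")
    return count / function_points


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values only: one entry per release, NOT the study's data.
    size_fp = [1500, 1600, 1650, 1700]
    new_violations = [30, 48, 33, 17]
    production_defects = [9, 14, 10, 6]

    violations_per_fp = [per_function_point(v, s) for v, s in zip(new_violations, size_fp)]
    defects_per_fp = [per_function_point(d, s) for d, s in zip(production_defects, size_fp)]
    print(pearson(violations_per_fp, defects_per_fp))
```

A coefficient near 1 would mean that releases with more new violations per function point also tend to show more production defects per function point. With only a handful of releases, treat any such number as indicative rather than conclusive.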
A second example from the field. The Retirement Services arm of a large bank has been using CAST for 8 years. Performance is key to them because even minor business disruption can lead to large losses of revenue. When a problem is found, there’s a premium on fixing it quickly. Tracking quality enables them to find and fix problems more efficiently. In the period spanning Q4 of 2007 to Q2 of 2009, the cost of fixing a defect per 100 resource hours has dropped dramatically, almost by an order of magnitude. There may be some ups and downs, but the overwhelming trend is a significant drop in cost of defects – a clear sign of rising quality despite the very diverse technology environment in which they operate – a result of multiple acquisitions over the last 15 years. Quality and size trends are used in Agile development to check quality at the end of each sprint. They’re also setting objective, precise, actionable quality targets for their outsource providers. So different CAST customers, different technology landscapes, similar quality results.
Over the last 10 years, we’ve analyzed thousands of applications. We’re building the biggest software quality database in the world with quality data from these applications. The database is called AppMarQ, short for Application Quality Benchmark. We’ve started to use AppMarQ to generate benchmarking reports at the company level. Here’s an example from a retail company in the UK. A benchmark like this one can quickly highlight and prioritize areas for improvement. For example, you can test the 20% of modules that contribute to 80% of problems, or train developers to correct the 3 most common critical violations. With quality benchmarks on the right and additional information like maintenance costs, development costs, and customer satisfaction on the left, we can begin to answer questions like: if I improve quality by 10%, how much will maintenance costs drop? How much quality is enough? We’ve looked at some of the ways CAST quality metrics are used in the field. Let me wrap up by looking ahead.
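The two prioritization examples above (the modules behind most of the problems, and the most common critical violations) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how AppMarQ works: the function names, the module names, and the counts are all hypothetical.

```python
from collections import Counter


def modules_covering(problem_counts: dict[str, int], share: float = 0.8) -> list[str]:
    """Return the worst modules that together account for `share` of all problems."""
    total = sum(problem_counts.values())
    if total == 0:
        return []
    # Rank modules from most to fewest problems.
    ranked = sorted(problem_counts.items(), key=lambda item: item[1], reverse=True)
    selected: list[str] = []
    running = 0
    for name, count in ranked:
        if running >= share * total:
            break
        selected.append(name)
        running += count
    return selected


def most_common_violations(violations: list[str], top: int = 3) -> list[tuple[str, int]]:
    """Return the `top` most frequent violation types with their counts."""
    return Counter(violations).most_common(top)


if __name__ == "__main__":
    # Hypothetical problem counts per module.
    counts = {"billing": 60, "catalog": 25, "search": 10, "login": 5}
    print(modules_covering(counts))

    # Hypothetical list of critical violation types found in a scan.
    found = ["sql-in-loop"] * 3 + ["empty-catch"] * 2 + ["unclosed-cursor"]
    print(most_common_violations(found))
```

The first function gives a test team a short list to focus on; the second gives a training lead the handful of violation types worth teaching first.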
CISQ is a member-driven organization. Members shape the particular metrics to focus on and their uses in the field. Of late we’ve had requests for additional objectives and topics for the executive forums.
[Watts Humphrey is a software metrics process pioneer and guru.] CISQ is the map. Measuring against these well-defined metrics tells you where you are. The CISQ standard gives us reliability, verifiability, and certification – greatly improving confidence in the software product. Let me stop there. Thank you for your attention.
Software Quality Measurement
Dr. Bill Curtis
• Application quality metrics
• Method for automated measurement
• Technical certification
THE ECOSYSTEM
1. Raise international awareness of the critical challenge of IT software quality
2. Develop standard, automatable measures and anti-patterns for evaluating IT software quality
3. Promote global acceptance of the standard in acquiring IT software and services
4. Develop an infrastructure of authorized assessors and products using the standard
Platform Task Force
Platform Task Force
• Develop a definition for automating Function Points
• Measure elements affecting maintenance cost, effort, & time
• Measure elements affecting availability and responsiveness
• Measure elements affecting vulnerability to attack and loss
• Define methods for using code measures internally and externally
Technical Working Groups
Certify that developers
understand how to
developers on many of
Certify that appraisers
are capable of using the
standards effectively in
SEI has developed licensing services for appraisers in areas such as CMMI
Certify that tools which
implement the defined
measures and anti-
Proven difficult in the
past, but options will
Software Quality is Contextual
[Diagram: application technology layers, showing Java and C++ code, frameworks (Struts MVC, Spring), web and client/server applications, CICS Monitor (COBOL), Tuxedo Monitor (C), and a data management layer (EJB, Hibernate, iBATIS)]
Drivers of business disruption risk and cost thrive at the interface between technologies, beyond siloed skill sets and expertise
Application Structure Meta-Model
Health Factors, Cost Drivers, Risk Drivers
Rules From Industry
Rules from CAST
Application Analysis Engine
Software Quality is Structural
Software Quality: From Symptom to Cause
poor response time, degraded performance
program structure
coding practices
Steve McConnell (1993), Code Complete.
CAST Application Quality Metrics
Business Risk Exposure
Maintainability (as defined by the SEI)
Size in KLOC
Size in Back-Fired Function
Size in CAST-Computed
Cyclomatic: Number of Objects of Low, Medium, High, and Very High Cyclomatic Complexity
CAST Complexity: Number of Objects of Low, Medium, High, and Very High CAST
Number of Passed Checks
Number of Failed Checks
Number of Critical Violations
Reduced Development and Maintenance Costs
[Chart: CAST Violations vs. Actual QA Defects]
Application Analyzed: Global, comprehensive tracking system of requests from the first receipt of the credit request to the final approval of the request by the
Technologies: J2EE, DB2
~10x Reduction in Cost of Fixing Defects
Industry: Financial Services
Applications: 75 supported application/functions run by the Business Groups and
Very complex technology environment, grown over last 15 years (J2EE, .NET, COBOL, Oracle, DB2)
AppMarQ Benchmark and Prioritization
Driver is at or exceeds Median of World-Class
Driver is between Median of Peer Group and
Driver is below Peer Group Median
Cost Driver Scores
Cost & Risk Matrix
2010 AND BEYOND
• CISQ will pursue member-driven objectives
– Determined by CISQ Executive Forum
– Consensus among CISQ members of problem to be addressed
• Early requests for additional objectives:
– Defect and failure-related definitions
– Business value measures related to application quality
– Productivity/Size measurement
• Use of Executive Forum for addressing industry
– Outsourcing quality SLAs
– Regulatory compliance