Software Testing - Reviews
Outline
 Introduction
 Types of reviews
 Formal reviews
 Exercise
“A stitch in time saves nine.”
Cost to fix a defect (relative cost by phase introduced vs. phase detected)

Time introduced | Requirements | Architecture | Construction | System test | Post-release
Requirements    | 1×           | 3×           | 5–10×        | 10×         | 10–100×
Architecture    | -            | 1×           | 10×          | 15×         | 25–100×
Construction    | -            | -            | 1×           | 10×         | 10–25×
Detect errors at the phase in which they were introduced
You will see different but similar figures elsewhere
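The multipliers above can be captured in a simple lookup. This is an illustrative sketch only: the phase names, the `relative_fix_cost` function, and the midpoint values chosen for the ranges are assumptions for demonstration, not figures from any one study.

```python
# Relative cost-to-fix lookup based on the table above.
# Range entries (e.g. 10-100x) are represented here by their midpoints;
# all names and values are illustrative assumptions.
PHASES = ["requirements", "architecture", "construction", "system test", "post-release"]

# COST[introduced][detected] -> relative cost multiplier
COST = {
    "requirements": {"requirements": 1, "architecture": 3, "construction": 7.5,
                     "system test": 10, "post-release": 55},
    "architecture": {"architecture": 1, "construction": 10,
                     "system test": 15, "post-release": 62.5},
    "construction": {"construction": 1, "system test": 10, "post-release": 17.5},
}

def relative_fix_cost(introduced: str, detected: str) -> float:
    """Relative cost of fixing a defect introduced in one phase and
    detected in the same or a later phase."""
    if PHASES.index(detected) < PHASES.index(introduced):
        raise ValueError("a defect cannot be detected before it is introduced")
    return COST[introduced][detected]

print(relative_fix_cost("requirements", "post-release"))  # midpoint of the 10-100x range
```

The point the table makes is visible immediately: the same requirements defect is an order of magnitude (or two) cheaper to fix if caught in the phase where it was introduced.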
Objectives
 Improve the quality of a work product
 Identify and document defects/issues earlier
 By…
 Verifying that the work product conforms to standards, product specifications, and
requirements
 Reaching consensus on work products
 Increasing product knowledge amongst peers
 Identifying improvements to the process
Definition: Review (IEEE Std 1028-1988):
“An evaluation of software element(s) or
project status to ascertain discrepancies from
planned results and to recommend
improvement. This evaluation follows a
formal process...”
Definition: Peer Review (SEI CMM):
“A review of a software work product,
following defined procedures, by peers of
the producers of the product for the
purpose of identifying defects and
improvements.”
Who Reviews?
PROOF-READING
“…reviewing can be viewed as being analogous to
proof-reading or writing a critique of a paper or
book. As many writers are aware, it is extremely
difficult to proof-read and critique one’s own
work.”
Who Reviews?
PSYCHOLOGICALLY DIFFICULT
“...it is extremely difficult, after a programmer has
been constructive while designing and coding a
program, to suddenly, overnight, change his or her
perspective and attempt to form a completely
destructive frame of mind toward the program.”
Completely deleting a project, or disposing of a digital
circuit you designed, is quite hard.
Who Reviews?
MISINTERPRETATION OF REQUIREMENTS
“...the program may contain errors due to the
programmer’s misunderstanding of the problem
statement or specification. If this is the case, it is
likely that the programmer will have the same
misunderstanding when attempting to test his or her
own program.”
Who Reviews?
Not the author of the work product but…
… his or her peers
Why Choose Reviews?
 Reviews are cost effective
 Reviews are often the only verification activity
available at the start of the project
 A review can find multiple errors while a test can
generally only find one
 Reviews encourage teamwork and people to work
together (two heads are better than one)
 Work products that have been reviewed are of higher
quality (e.g. lower defects)
 Reviews are a good way to learn about work products
(especially for new people)
 Reviews are flexible (different types of reviews can
be designed to meet a variety of quality objectives)
Benefits of reviews

Measurement               | World-class benchmark
Cost of poor quality      | Reduced from 35% to 15%
Defect removal efficiency | 70–90% of defects removed before test
Inspection cost/saving    | Average cost $2,500; saving $25,000
Post-release defect rates | 0.01 defects per KSLOC
Productivity              | Doubled in 3 years (30% increase per year)
Return on investment      | 7:1 – 12:1
Cycle times               | Reduced by 10–25%
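The "defect removal efficiency" row above is a standard metric: the fraction of all known defects that were removed before release. A minimal sketch of the calculation (function name is mine, not from the source):

```python
def defect_removal_efficiency(found_before_release: int, found_after_release: int) -> float:
    """DRE = defects removed before release / total known defects,
    expressed as a percentage."""
    total = found_before_release + found_after_release
    return 100.0 * found_before_release / total

# e.g. 90 defects caught in reviews and testing, 10 escaped to the field:
print(defect_removal_efficiency(90, 10))  # 90.0, at the top of the 70-90% benchmark
```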
Reduced Life-Span
[Figure: effort over time for a normal life cycle without inspection
versus a life cycle with inspection. With inspection, effort is
front-loaded and total effort is reduced.]
Defects - Detect and Prevent
 Of course, we would like to detect more errors earlier
by performing better reviews - “Product inspection”
Defects - Detect and Prevent
 … But we really want to prevent errors entering the
product altogether by managing the process better -
“Process improvement”
 “learning by making mistakes”
 ...by ensuring that there is proper
feedback from the reviews
 A “formal” review process provides the
framework necessary for this
Levels of formality
Informal <------------------------------------------> Formal
 Informal end: a group of peers hold a meeting to discuss a work
product
 Formal end: planned, tracked, defined process, defined roles,
metrics used, training required, checklists used, disciplined,
controlled, improving
 In reality, most actual review processes lie somewhere in between
Informal Reviews - Low Rigour
 A group of work colleagues meeting to discuss some
work in progress
 Bounce ideas off peers
 Quickly resolve easy to spot problems
 Raise consensus on difficult development issues
 Reduce the number of minor errors and formal review
“show stoppers”
 Educate other team members on a new area of
functionality
 Enhance the chances that the finished product will pass
the formal review
Walkthrough - Medium rigour
 The author presents their ideas to a group of 3 - 7 peers
 Not a rigorous process, the presenter simply walks the
participants through the document/code
 Although it is intended to find errors, it is primarily for others
to get an understanding
Formal Review (Program Inspections) - Highly rigorous
 A managed process (someone is responsible for it)
 Reviews are planned and tracked
 Participants are trained
 Metrics used to measure the effectiveness of the reviews
 Strict entry and exit criteria act as quality gates
 Supported by checklists
 A work product is not passed onto the next development
phase until it has passed the formal review
 The formal review process is a mechanism for monitoring
and identifying process improvement initiatives
Fagan Inspections
 Developed by Michael E Fagan in the 1970s whilst at IBM
 Fagan inspections are rigorously controlled reviews within the
context of a process improvement programme
 Inspections are attended by 3 to 5 hand picked peers most
appropriate for the material under inspection
 Focuses on capturing errors and can find up to 85% of
requirements and software errors
When to do what?
“Use walkthroughs for training, reviews for consensus, but use
inspections to improve the quality of the document and its process.”
Tom Gilb
When not to use peer reviews
 When the change is small
 When the cost of finding errors during review exceeds
the cost of finding them by other methods
When Reviews Go Wrong
 Your task is to consider in groups of
3 the following question:
 Why/How might reviews “go wrong”?
 How might they be abused?
 You are free to consider any aspect of reviews and the
review process (however constituted).
Stages in a Formal Review Process
 Planning: what will be inspected, who will attend, where and
when; material is distributed
 Preparation: attendees read and understand the material
 Inspection: lasts for a maximum of 2 hours; the objective is to
find defects
 Rework: the defects detected are tasked for repair
 Follow-up: the moderator is responsible for ensuring all actions
are completed or justified
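The five stages form a fixed pipeline, which can be sketched as a small state machine. Everything here (class name, field names) is a hypothetical illustration, not part of any review standard:

```python
from dataclasses import dataclass, field

# Minimal sketch of tracking a formal review through the five stages
# above. All names are illustrative assumptions.
STAGES = ["planning", "preparation", "inspection", "rework", "follow-up"]

@dataclass
class Review:
    work_product: str
    moderator: str
    stage: str = "planning"
    defects: list = field(default_factory=list)

    def advance(self):
        """Move to the next stage; follow-up is the final stage."""
        i = STAGES.index(self.stage)
        if i == len(STAGES) - 1:
            raise RuntimeError("review already complete")
        self.stage = STAGES[i + 1]

r = Review(work_product="design.doc", moderator="Alice")
r.advance()     # planning -> preparation
print(r.stage)  # preparation
```

A real tool would also enforce the entry and exit criteria ("quality gates") mentioned earlier before allowing `advance()`.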
Inspections…
 3-4 member teams
 Only one of them is the programmer of the code under
inspection
 Error locations are detected precisely
 30-70 percent of errors are detected
 Effective even for detecting complex errors
 Led by a moderator
Moderator
 Schedules and moderates the session
 Records errors
 Makes sure errors are corrected
Process
 The programmer reads the code
 Other members ask questions
 The programmer can find errors just by reading the code
 Upon completion, the programmer is given a list of errors
Human Agenda
 Adopt an appropriate attitude…
Side Benefits
 Programming style
 Choice of algorithms
 Introduction to products and procedures
Checklists
Desk Checking
 A desk check can be viewed as a one-person inspection
or walk-through: A person reads a program, checks it
with respect to an error list, and/or walks test data
through it.
Peer Ratings
 Peer rating is a technique of evaluating anonymous
programs in terms of their overall quality,
maintainability, extensibility, usability, and clarity. The
purpose of the technique is to provide programmer self-
evaluation.
Usability Testing
Definitions
 “Usability testing” is the common name for
multiple forms of both user and non-user based
system evaluation focused on a specific aspect
of the system use
 Done for many, many years prior, but
popularized in the media by Jakob Nielsen in
the 1990s
What does “usability” mean?
 ISO 9126
 “A set of attributes that bear on the effort needed for
use, and on the individual assessment of such use, by
a stated or implied set of users”
 ISO 9241
 “Extent to which a product can be used by specified
users to achieve specified goals with effectiveness,
efficiency and satisfaction in a specified context of
use.”
What does “usability” mean?
 Jakob Nielsen
 Satisfaction
 Efficiency
 Learnability
 Low Errors
 Memorability
 Ben Shneiderman
 Ease of learning
 Speed of task completion
 Low error rate
 Retention of knowledge over time
 User satisfaction
Usability Testing is…
 Any of a number of methodologies used to try
to determine how a product’s design
contributes to or hinders its use when used by
the intended users to perform the intended tasks
in the intended environment
 Most common forms include
 Expert Review/Heuristic Evaluations
 User-based testing
When is usability assessed?
 On an existing product to determine if usability problems
exist
 During the design phase of a product
 During the development phase of a product to assess
proposed changes
 Once a design is completed to determine if goals were
met
Expert Review
 Aka: Heuristic Evaluation
 One or more usability experts review a
product, application, etc.
 Free format review or structured review
 Subjective but based on sound usability and
design principles
 Highly dependent on the qualifications of the
reviewer(s)
Expert Review (Concluded)
 Nielsen’s 10 Most Common Mistakes Made by
Web Developers (three versions)
http://www.nngroup.com/articles/top-10-mistakes-
web-design/
 Shneiderman’s 8 Golden Rules
 Constantine & Lockwood Heuristics
 Forrester Group Heuristics
 Norman’s 4 Principles of Usability
User based Usability Testing
 An empirical study of a product’s usability by observing
actual users do real tasks with the product
 Involves:
 Real users
 Real tasks
 Specific usability goals/concerns
 Observing and recording the testing
 Data analysis
What to test?
 Has each user interface been tailored to the intelligence, educational
background, and environmental pressures of the end user?
 Are the outputs of the program meaningful, non-insulting to the user, and
devoid of computer gibberish?
 Are the error diagnostics, such as error messages, straightforward?
 Where accuracy is vital, such as in an online banking system, is sufficient
redundancy present in the input?
 Does the system contain an excessive number of options, or options that
are unlikely to be used?
 Does the system return some type of immediate acknowledgment to all
inputs?
 Are the user actions easily repeated in later sessions?
 Did the user feel confident while navigating the various paths or menu
choices?
Process
 Planning
 Test Design
 Test User Selection
 Execute Tests
 Analyse Data
Generate Tests
 Create real user tasks and present them in varying random order
 For example, among the processes you might test in a customer tracking
application are:
 Locate an individual customer record and modify it.
 Locate a company record and modify it.
 Create a new company record.
 Delete a company record.
 Generate a list of all companies of a certain type.
 Print this list.
 Export a selected list of contacts to a text file or spreadsheet format.
 Import a text file or spreadsheet file of contacts from another
application.
 Add a photograph to one or more records.
 Create and save a custom report.
 Customize the menu structure.
…
 Provide detailed instruction on how to perform these
tasks
 Design questionnaires and interviews to be conducted
after the user performs these tasks
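The "random order" point above matters because participants learn the system as they go; giving every user the same order confounds task difficulty with practice effects. A minimal sketch (the task strings and seeding scheme are illustrative assumptions):

```python
import random

# Present tasks in a different, reproducible random order per participant
# to reduce order/learning effects. Task names are illustrative.
tasks = [
    "Locate an individual customer record and modify it",
    "Create a new company record",
    "Delete a company record",
    "Generate a list of all companies of a certain type",
    "Export a selected list of contacts to a spreadsheet",
]

def task_order_for(participant_id: int) -> list:
    """Return a per-participant random ordering, reproducible by seeding
    the generator with the participant id."""
    rng = random.Random(participant_id)
    order = tasks[:]          # copy so the master list is untouched
    rng.shuffle(order)
    return order

print(task_order_for(1))
```

Seeding per participant means the session script can be regenerated later for analysis, which an unseeded `shuffle` would not allow.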
Select Users
 How to select users?
User Analysis & Profiles
 Who are your actual users? You may need to break
your users into typical user categories. Consider:
 Demographics: age, sex, race, education level, cultural
background, socioeconomic status,…
 Experience level with the product, with products of the same
genre, with required technology,...
 Other things:
 motivation
 learning style
 subject matter knowledge
 location of use
 physical characteristics
 people with disabilities or impairments (from color blindness
and learning disabilities to more severe disabilities)
User Analysis & Profiles
 Create user profiles:
 Break users into clear subgroups
 Profile/Define the characteristics of each subgroup
 Choose user profiles to test:
 Ideally users from all major profiles will be tested
 If limited testing: Choose profiles based on highest
number of users in that profile or profiles that you
think may have the greatest usability issues
How many users is enough?
 Based on the work of Jakob Nielsen
 E = 100 × (1 − (1 − L)^n)
where:
 E = percent of usability problems found
 n = number of testers
 L = proportion of usability problems found by a single tester
 Nielsen suggests L = 31% (0.31)
…
 Number of users also depends on
 How critical system is
 Complexity of system
 Application
 Budget and time available
Gather Data during Experiments
 Think-aloud protocol
 But is this representative of real use of the system?
Gather Data during Experiments
 Eye Tracking
Usability Questionnaire
 Develop questionnaires that generate responses that
can be counted and analysed across the spectrum of
testers
 Yes/no answers
 True/false answers
 Agree/disagree on a scale
Example
1. The main menu was easy to navigate.
2. It was easy to find the proper software operation from the
main menu.
3. The screen design led me quickly to the correct software
operational choices.
4. Once I had operated the system, it was easy to remember
how to repeat my actions.
5. The menu operations did not provide enough feedback to
verify my choices.
6. The main menu was more difficult to navigate than other
similar programs I use.
7. I had difficulty repeating previously accomplished operations.
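Note that items 5-7 in the example are negatively worded, so their ratings must be reversed before the responses can be averaged (the same idea as SUS scoring). A sketch, assuming a 1-5 agree/disagree scale; the function and item numbering are mine:

```python
# Items 5-7 of the example questionnaire are negatively worded:
# reverse-score them so that higher always means better.
NEGATIVE_ITEMS = {5, 6, 7}

def usability_score(responses: dict) -> float:
    """responses maps item number (1-7) to a 1-5 agreement rating.
    Returns the mean item score after reversing negative items
    (reverse on a 1-5 scale is 6 - rating); max is 5.0."""
    scores = [6 - rating if item in NEGATIVE_ITEMS else rating
              for item, rating in responses.items()]
    return sum(scores) / len(scores)

print(usability_score({1: 5, 2: 4, 3: 4, 4: 5, 5: 1, 6: 2, 7: 1}))
```

Mixing positively and negatively worded items, as the example does, is deliberate: it discourages respondents from ticking the same column all the way down.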
Analyzing the Data
1. Collate data into findings:
a. Choose an approach:
 Top-down approach: predetermine categories of findings (like
navigation, design, terminology) and go through data looking for
“hits”
 Bottom-up approach: put each observation on a sticky note/note
card, sort into categories and label categories
b. Determine time and errors/success
 Examine findings for each user, user profile, and task
 Use analysis techniques such as statistics (even averages help)
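Step (b) above, determining time and errors/success, typically means summarising per task and per user profile. A minimal sketch of the bookkeeping; the record layout and field names are assumptions for illustration:

```python
from statistics import mean, stdev

# Illustrative per-observation records: one row per (user, task) attempt.
observations = [
    {"user": "u1", "task": "create record", "seconds": 42, "success": True},
    {"user": "u2", "task": "create record", "seconds": 65, "success": True},
    {"user": "u3", "task": "create record", "seconds": 120, "success": False},
]

def summarise(obs, task):
    """Success rate and time statistics for one task across users."""
    rows = [o for o in obs if o["task"] == task]
    times = [o["seconds"] for o in rows]
    return {
        "n": len(rows),
        "success_rate": sum(o["success"] for o in rows) / len(rows),
        "mean_time": mean(times),
        "sd_time": stdev(times) if len(times) > 1 else 0.0,
    }

print(summarise(observations, "create record"))
```

Even these simple averages, broken down by user profile, are often enough to locate which subgroup struggles with which task.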
Analyzing the Data
2. Analyze data:
a. Determine cause of problems
b. Determine scope/severity of problems
c. Make recommendations/changes
3. Report findings
Why Statistics?
• Testing is used to support a decision
• For example, “this design change is going to be better for users”,
or “design A is better than design B”
• Research is used to test a hypothesis based on a theory
• e.g. smoking increases the likelihood of developing cancer
• Usability testing is generally done with small samples
(mostly due to the cost associated with any alternatives)
• Statistics provide a way to relate the small
sample tested to the larger population
• All statistical analysis assumes the data obtained is
valid and reliable
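One concrete way statistics relate a small sample to the population is an interval estimate for a task success rate. With five users, "4 out of 5 succeeded" pins the population rate down far less than it appears. A sketch using a Wilson score interval (the function is mine; z = 1.96 gives roughly a 95% interval):

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion; better behaved
    than the normal approximation at the small n typical of usability tests."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half

# 4 of 5 users completed the task: the plausible population success
# rate is still very wide with such a small sample.
lo, hi = wilson_interval(4, 5)
print(round(lo, 2), round(hi, 2))
```

The interval spans from well under half to nearly certain success, which is exactly why small-sample usability results should inform decisions rather than "prove" them.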
Further Reading
 Software Reviews, Chapter 12, Software Testing by
Jorgensen
 Program Inspections, Walkthroughs, and Reviews,
Chapter 3, The Art of Software Testing, Glenford J.
Myers