2. Quality Assurance
Quality assurance is an essential component of any web archiving
program. All collection methods involve some degree of automation,
so quality assurance is needed to ensure that the selection policy and
the collection list are actually being implemented successfully. The
greater the scale of collection undertaken, the more basic the level of
quality assurance that can be employed. There is a trade-off between
the number of resources that can be collected and the quality control
which can be applied to them, and a policy decision is required as to
the minimum acceptable level of assurance.
3. The Quality Assurance Process
[Figure: The Quality Assurance Process. Diagram labels: Pre-Collection Testing, Collection, Post-Collection Testing, Issue Log, Test Script.]
4. PRE-COLLECTION TESTING
Pre-collection testing is concerned with the identification of
potential issues that may affect the quality of collected content
before its acquisition. It is clearly desirable to identify and
remove all the potential problems before collection. Pre-collection
testing will typically include two approaches:
1- Resource Analysis and
2- Test Collection
5. Resource Analysis
Resource analysis involves the manual or automated analysis of the
target web resource, in order to identify the appropriate collection
method and any issues that are likely to arise during collection.
Resource analysis should determine:
- Whether the website is static or dynamic
- Whether the target resource is reachable by links or only through database queries
- The suitable collection method (remote harvesting or another technique)
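An automated first pass over these questions can be sketched as below. This is only an illustration under stated assumptions: the function name `analyse_url`, the extension list and the rule "query string or server-side extension means dynamic" are hypothetical heuristics, not part of any archiving tool, and a real analysis would also inspect the pages themselves.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: these extensions usually indicate server-side page generation.
DYNAMIC_EXTENSIONS = (".php", ".asp", ".aspx", ".jsp", ".cgi")

def analyse_url(url: str) -> dict:
    """Return a rough pre-collection profile for a single URL."""
    parts = urlparse(url)
    # A query string suggests content is retrieved through parameterised requests.
    query_driven = bool(parse_qs(parts.query))
    dynamic_ext = parts.path.lower().endswith(DYNAMIC_EXTENSIONS)
    looks_dynamic = query_driven or dynamic_ext
    return {
        "url": url,
        "looks_dynamic": looks_dynamic,
        "query_driven": query_driven,
        # Static-looking resources are flagged as candidates for remote
        # harvesting; anything else is left for a person to decide.
        "suggested_method": "review manually" if looks_dynamic else "remote harvesting",
    }

if __name__ == "__main__":
    for u in ("http://example.org/about.html",
              "http://example.org/search.php?q=archive"):
        print(analyse_url(u))
```

The output is a starting point for manual review, not a decision: a URL with no query string can still be served from a database.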
6. Test Collection
A test collection may be beneficial, particularly if the target web
resource is going to be collected repeatedly rather than on a single
occasion. This will allow the selected collection method to be
fully evaluated, and any necessary corrections to be made to the
collection parameters.
7. Post Collection Testing
Post-collection testing is carried out after the collection
has been made. The most feasible approach will usually be to
test a representative sample of the collected material,
the size of the sample being determined by the volume of the
collection and the available resources.
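One way to draw such a sample is sketched below. The function `draw_sample`, the 5% default fraction and the cap of 200 checks are illustrative assumptions, not recommended values. Simple random sampling is also an assumption: a sample that is representative across content types may need to be stratified.

```python
import random

def draw_sample(urls, fraction=0.05, max_checks=200, seed=0):
    """Pick a random sample of collected URLs for manual testing.

    fraction   -- share of the collection to test (scales with volume)
    max_checks -- upper bound set by the staff time available
    seed       -- fixed so the same sample can be re-drawn later
    """
    urls = list(urls)
    if not urls:
        return []
    wanted = max(1, round(len(urls) * fraction))
    size = min(wanted, max_checks, len(urls))
    return random.Random(seed).sample(urls, size)

if __name__ == "__main__":
    collected = [f"http://example.org/page{i}" for i in range(1000)]
    sample = draw_sample(collected)
    print(len(sample), "of", len(collected), "URLs selected for testing")
```

Fixing the seed means the same URLs can be re-tested after a problem has been corrected.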
To ensure consistency, the test should be based on a standard
test script, which describes the precise tests to be conducted and
allows the results to be recorded. The test script should be followed
using two browser windows, one for the live and one for the archived
version of the web resource: this allows a valid comparison
of the results.
8. Examples of Test Types
- Availability of website snapshot
- Functionality of the Navigation
- Date and Time
- Frames
- Text
- Images
- Multimedia Content (audio/video/flash animations)
- Downloadable Content
- Search Facility
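The test types above could be held in a simple test script structure that records a result for each check, as in the sketch below. The class name `QATestScript` and its methods are hypothetical and show only one possible layout. The live-versus-archived comparison itself is still made by a person in two browser windows.

```python
from dataclasses import dataclass, field

# The test types listed above, in the same order.
TEST_TYPES = [
    "Availability of website snapshot",
    "Functionality of the navigation",
    "Date and time",
    "Frames",
    "Text",
    "Images",
    "Multimedia content",
    "Downloadable content",
    "Search facility",
]

@dataclass
class QATestScript:
    """Results of one pass through the script for one archived resource."""
    resource_url: str
    results: dict = field(default_factory=dict)

    def record(self, test_type: str, passed: bool, note: str = "") -> None:
        # Reject anything outside the standard script to keep results comparable.
        if test_type not in TEST_TYPES:
            raise ValueError(f"unknown test type: {test_type}")
        self.results[test_type] = (passed, note)

    def outstanding(self) -> list:
        """Test types that have not been carried out yet."""
        return [t for t in TEST_TYPES if t not in self.results]

    def failures(self) -> list:
        """Test types where the archived version did not match the live one."""
        return [t for t, (passed, _) in self.results.items() if not passed]

if __name__ == "__main__":
    script = QATestScript("http://example.org/")
    script.record("Images", True)
    script.record("Search facility", False, "search form does not work in the archive")
    print("Failed:", script.failures())
    print("Still to test:", script.outstanding())
```

Each failure reported by `failures()` would normally become an entry in the issue log.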
9. Issue Tracking
Testing at the pre- and post-collection stages may identify issues that need to
be addressed, and an efficient system for logging, tracking and resolving those
issues lies at the heart of the quality assurance process. Every issue identified
must be recorded in a standard issue log, which should include the following
information:
- Nature of the issue
- Severity of the issue
- The date when it was identified
- The name of the person who identified it
- The individual, process or organization able to resolve the issue
- The expected resolution date.
Once the issue has been passed on for resolution, the log should be
monitored periodically for outstanding issues, and any necessary action taken
to facilitate their resolution.
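A minimal sketch of such an issue log is given below, with one field per item in the list above and a helper for the periodic check. The names `Issue` and `overdue`, the free-text severity values and the sample entries are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Issue:
    nature: str                # what went wrong
    severity: str              # assumption: free text such as "low" or "high"
    identified_on: date        # date when the issue was identified
    identified_by: str         # person who identified it
    resolver: str              # individual, process or organization able to resolve it
    expected_resolution: date  # expected resolution date
    closed: bool = False

def overdue(log, today):
    """Open issues whose expected resolution date has already passed."""
    return [i for i in log if not i.closed and i.expected_resolution < today]

if __name__ == "__main__":
    log = [
        Issue("Images missing from snapshot", "high", date(2024, 1, 10),
              "A. Tester", "Crawl operator", date(2024, 1, 20)),
        Issue("Search facility not working", "low", date(2024, 1, 12),
              "A. Tester", "Crawl operator", date(2024, 3, 1)),
    ]
    for issue in overdue(log, today=date(2024, 2, 1)):
        print("Overdue:", issue.nature, "->", issue.resolver)
```

Running `overdue` on a schedule is one way to carry out the periodic monitoring.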
10. Issue Tracking
Once the issue has been resolved, the following information
should be saved in the appropriate log:
- The date on which the issue was resolved
- The manner in which it was resolved
- An indication of whether the issue is now closed or, if the
resolution is unsatisfactory, remains open.
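Recording a resolution could look like the sketch below, which treats a log entry as a plain dictionary. The function `record_resolution` and the key names are hypothetical. The point shown is that an unsatisfactory resolution is still recorded but leaves the issue open.

```python
from datetime import date

def record_resolution(entry: dict, resolved_on: date, manner: str,
                      satisfactory: bool) -> dict:
    """Return a copy of a log entry with the resolution details added."""
    updated = dict(entry)  # leave the original entry untouched
    updated["resolved_on"] = resolved_on
    updated["resolution"] = manner
    # The issue is closed only if the resolution was satisfactory.
    updated["status"] = "closed" if satisfactory else "open"
    return updated

if __name__ == "__main__":
    entry = {"nature": "Images missing from snapshot", "status": "open"}
    done = record_resolution(entry, date(2024, 1, 18),
                             "Resource re-collected with revised parameters",
                             satisfactory=True)
    print(done["status"], done["resolved_on"])
```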