Method Evaluation—A Practical Guide
Chris Sheehan¹ (info@cotswoldconcierge.com), Jianwen He¹ (chaska63@yahoo.com), Mari Smith²
CHAPTER 5.2
¹ Fourth edition revision. ² First edition author.

© 2013 David G. Wild. Published by Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/B978-0-08-097037-0.00026-9

The constant stream of new and improved immunodiagnostic products presents laboratories with numerous opportunities to improve quality and reduce costs. However, change can be disruptive and sometimes does not bring about the benefits hoped for. Although it is reasonably safe to simply follow the mainstream trends, this strategy delays the positive aspects of new technology reaching your laboratory. How can new methods be screened quickly and effectively, with the minimum of fuss?

Sometimes there is a suspicion that performance is suboptimal, perhaps after an analyzer has been repaired. In such cases it is useful to run an efficient test to confirm that the results are satisfactory or to pinpoint the problem area.

Two of the authors worked together in a commercial immunoassay development team. Previously it took about 6 months to carry out performance tests as part of the pre-launch validation and verification of a new product. If an opportunity for optimizing an assay became apparent part-way through, it was sometimes too disruptive to adjust the reagent formulation and repeat the entire testing program. The team developed a rapid but comprehensive screening process that reduced the time taken to carry out essentially the same performance tests to just four intense days, including the analysis and presentation of results. The team's most successful product launch was the result of numerous iterations of this process, which allowed multiple tweaks to be made until performance was optimized. Later, the same team introduced similar protocols for rapid troubleshooting of performance issues in the field.

Laboratories that develop an efficient and effective strategy are more likely to benefit from change and avoid the pitfalls. The strategy should be phased, so that poor products are quickly screened out without taking up valuable laboratory time. When a product is first introduced into the laboratory, extra tests can detect "teething problems" early, so that they can be resolved quickly. This chapter explains how to perform an initial screen quickly and effectively. The approach should not be considered a definitive guide to assay verification and validation, or method evaluation, which is explained in more depth elsewhere in this book.

Assay Groups
The extent of testing required depends on the nature of the test. In another chapter (see IMMUNOASSAY PERFORMANCE MEASURES), the key elements of performance are explained for quantitative immunoassays. They are:

- Sensitivity
- Imprecision (within- and between-assay) and minimal distinguishable difference in concentration
- Specificity and cross-reactivity
- Absence of interferences
- Accuracy, recovery, and dilution linearity
- Correlation
- Assay drift

Further requirements for clinical application are explained in another chapter (see CLINICAL CONCEPTS). Of particular importance to method evaluation in this context are clinical sensitivity and specificity: the avoidance of false negatives and positives. Qualitative and semiquantitative immunoassays are slightly different, and testing tends to focus on the cutoff or the boundaries between the low, gray, and high zones. Clinical sensitivity and specificity are particularly important for these assays (see QUALITATIVE IMMUNOASSAY—FEATURES AND DESIGN).
Initial Screen Using Available Information
Whether you are intending to replace one test or a whole system, it is worthwhile considering and listing the features required. Include the positive attributes of the product you are already using. Include assay-related factors, such as "less cross-reactivity from drug x"; instrument features, such as "stored calibration curve"; and general concerns, such as "better service support." Cost is always an important consideration, so try to differentiate between internal costs, such as technician labor, and external costs due to the price of the system and consumables. The costs associated with maintenance contracts should also be taken into account.

Then group these requirements in order of importance, identifying each one as "essential," "highly desirable," or "nice to have." These requirements will also depend on how the information is used; for example, a test that rules out a disease may have different requirements for cutoffs and imprecision than a rule-in test.
The adverse consequences of false negatives and positives must be considered. For example, screening blood units for infectious diseases means that cutoffs are set so that false negatives are minimized. There are, of course, trade-offs of cost as well as quality to be considered. The use of receiver operating characteristic (ROC) curves and the measurement of the area under such curves (AUROC) are important ways of achieving these trade-offs (a computational sketch is given below). An excellent website (http://www.anaesthetist.com/mnm/stats/roc) shows how this is achieved in practice.

The next step is to gather information about the available products so that a shortlist can be drawn up. Find out which methods are being used by your colleagues and ask them for advice. Questions such as "Do you like product Y?" are generally unhelpful. It is better to ask specific questions using your list of features as a guide. Some laboratories specialize in assessing many different systems for manufacturers, and their findings are particularly revealing. However, they may have different requirements from yours, so again ask detailed questions.

External quality assessment schemes are a useful reservoir of information. The printouts show how many participants are using each method, and it may be useful to review two printouts, perhaps a year apart, to check which methods are growing in popularity and which are in decline. The estimates of imprecision for each method, expressed as percent coefficients of variation (%CV), give a good indication of between-assay and between-laboratory imprecision, and the number of outlying results can reflect unreliability (although sometimes outliers are due to the use of inconsistent units of concentration). The method means are particularly useful, as they reflect method biases and can indicate how much change is likely when switching from one method to another. Some external quality assessment scheme organizers also offer advice on method performance. In some countries, national organizations or government departments carry out independent evaluations, e.g., the Department of Health in the United Kingdom.

Visit the manufacturer's website and download information. Also, Google the products for any useful insights. Visit the manufacturer's stand at exhibitions and call their technical support line with a few questions. For each product reviewed, record the relevant information in the list of requirements. Check whether accessory equipment or reagents are required and whether all the products listed are available in your country. Estimate the labor requirements and turnaround times. Also check the maintenance and service requirements and any environmental specifications, e.g., for laboratory temperature and humidity. Reagent stability during transport may be a concern if your laboratory is isolated from primary distribution routes or very distant from the manufacturer. Check whether refrigerated transport is required for the typical shipment time. (When the first shipment is ordered, check the total shipping time if you have a concern.)

Published, independent technical evaluations may be found using online journal search tools. Check for performance data and warnings of unexpected results or interferences. For new tests, check for clinical issues.
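A minimal sketch of the ROC/AUROC calculation mentioned above, assuming hypothetical paired data: a list of assay results and the confirmed disease status (1 = diseased, 0 = not) for each sample. The data are made up for illustration; in practice the cutoff sweep would be run over the samples with known clinical status.

```python
# Sweep the cutoff over all observed result values and compute the ROC curve,
# then integrate the area under it by the trapezoidal rule.

def roc_points(results, status):
    """Return (false positive rate, true positive rate) pairs."""
    positives = sum(status)
    negatives = len(status) - positives
    points = []
    for cutoff in sorted(set(results), reverse=True):
        tp = sum(1 for r, s in zip(results, status) if r >= cutoff and s == 1)
        fp = sum(1 for r, s in zip(results, status) if r >= cutoff and s == 0)
        points.append((fp / negatives, tp / positives))
    return [(0.0, 0.0)] + points

def auroc(results, status):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = roc_points(results, status)
    area = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        area += (x2 - x1) * (y1 + y2) / 2
    return area

# Made-up values: higher results should indicate disease.
results = [2.1, 5.4, 3.3, 8.9, 1.2, 7.6, 2.0, 0.9]
status  = [0,   1,   0,   1,   0,   1,   1,   0]
print(f"AUROC = {auroc(results, status):.3f}")  # 0.875 for this toy data
```

An AUROC of 1.0 indicates perfect discrimination; 0.5 indicates no discrimination. Comparing AUROC values is one way to weigh the trade-off between clinical sensitivity and specificity across candidate cutoffs.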
Other sources of key requirements include competitive methods' instructions for use, regulatory agency websites (such as the FDA's), and published warning and recall notices issued by, e.g., the FDA in the USA and the MHRA in the UK. Clinical guidelines may also be recommended by approved bodies such as NICE or the Royal Colleges in the UK, or by professional societies in the USA, especially where new tests are being introduced and little published data are available. The Clinical and Laboratory Standards Institute (CLSI), formerly the National Committee for Clinical Laboratory Standards (NCCLS), publishes guidelines that can help with method evaluations.

At this stage, it should be possible to draw up a shortlist of products that best suit your requirements, based on the prioritized list of features.

Cost-Effective Initial Test Evaluation
With comparatively little effort, you can carry out a basic test of product performance, evaluate the quality of the documentation, and evaluate the quality of service from the supplier. Although a manufacturer's full evaluation of a new product can involve a team of people in many months' work, it is surprising how much information can be obtained by running fewer than 200 tests for each analyte under consideration, in a carefully designed experiment. If the experiment is successful, a more detailed assessment can then be carried out. The test should include, in the following sequence:

- Calibrators, if required by the protocol, in duplicate
- Controls in duplicate
- A panel of at least five patient samples at a range of concentrations, in duplicate; the samples should have been previously tested by a predicate method
- A representative zero-concentration sample (or as low a concentration as available), with ten replicates
- Sample dilutions (neat, 1/2, 1/5, and 1/10) in duplicate
- The diluent used above, in duplicate
- External Quality Assurance (QA)/proficiency testing scheme samples, in duplicate
- Controls (the same controls as run near the beginning of the sequence) in duplicate

This experiment should be run after the analyzer or other equipment has been in standby mode (e.g., after the weekend) and before any other tests are run that day, and then repeated when the analyzer or equipment has been running and is warmed up, ideally a few days later. Although limited in scale, this experiment provides useful data on a number of basic performance parameters. The patient samples and controls should also be run in the test currently used in the laboratory. The external QA scheme (proficiency testing) samples should be from a set for which all the results have been published by the scheme organizer. By running the test immediately after the analyzer and any other relevant equipment has been in standby mode, start-up effects may become apparent in the duplicates of the first few tests in the run. A sketch of the run plan follows.
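A minimal sketch of the screening run plan described above, assuming a hypothetical assay with 6 calibrators, 2 controls, 5 patient samples, and 3 EQA samples. It simply lays out the sequence and confirms the total stays well under the ~200 tests per analyte mentioned above.

```python
run_plan = [
    # (item, number of levels, replicates per level)
    ("calibrators",       6, 2),
    ("controls (start)",  2, 2),
    ("patient panel",     5, 2),
    ("zero sample",       1, 10),
    ("sample dilutions",  4, 2),   # neat, 1/2, 1/5, 1/10
    ("diluent",           1, 2),
    ("EQA/PT samples",    3, 2),
    ("controls (end)",    2, 2),
]

total = 0
for item, levels, reps in run_plan:
    tests = levels * reps
    total += tests
    print(f"{item:<18} {tests:>3} tests")

# The whole sequence is run twice: once from standby, once warmed up.
print(f"per run: {total}, both runs: {2 * total}")
```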
ANALYSIS OF RESULTS FROM INITIAL EVALUATION
Assuming the method either uses calibrators or allows the user to view the stored calibration curve, and provides the concentrations and raw signal levels for the calibrator points, the calibration curves may be re-plotted by hand using linear graph paper and a flexicurve, and the values of the unknowns read off the curve, to provide a check for curve-fitting errors. This may seem an antiquated approach, but there is no better way to detect curve-fitting bias.

Curve Fitting
Curve-fitting software has largely replaced manual data reduction, though the latter is a useful check. Use linear regression to compare the sample and control concentrations between the curve-fitted and manual methods. Assess the acceptability of the manufacturer's curve-fitting method by checking for bias against the manually fitted calibration curve.

Within-Assay Imprecision
Calculate and tabulate the means and %CVs for the controls, samples, and calibrators, using the data for the individual replicates from the assay printout. There should be two sets of data, from the first and second runs. The %CVs should be calculated on the values for concentration, not the signal levels. Plot a precision profile of concentration against %CV using data from all of the sources, and use different symbols to distinguish between samples, controls, and calibrators.

Between-Assay Differences and Stored Calibration Curve Stability
Calculate the percent differences between the results of the two assays for each control and sample. Any differences may be due to kit, operator, or occasion effects. To test whether changes are due to instability of the stored calibration curve, compare the values generated by the stored calibration curve with those derived from a manual plot of all the calibrators.

Drift
Compare the values of the controls obtained at the beginning and end of the assay to detect assay drift. Check that the control results from the beginning and end of the run were determined using the same reagent pack or cartridge. Drift is unlikely to occur in a fully automated analyzer because the timings are strictly controlled, but it can happen. Sample evaporation, particle clumping, reagent equilibration, and mixing effects can all contribute to "drift," where identical samples give different results. Also compare the raw signal levels from the two occasions on which the experiment was carried out.

Sensitivity
For analytical sensitivity determinations, calculate the sensitivity from the signal levels of the 10 replicates of the zero or near-zero sample. Analytical sensitivity is the theoretical concentration equivalent to two standard deviations above or below the zero calibrator mean (converted to concentration units). Note that for immunometric (non-competitive/sandwich) assays this is primarily a measure of background noise. For a more in-depth assessment of sensitivity, see FINAL EVALUATION TESTS—ASSAY SENSITIVITY.

Accuracy
Compare the results for the external QA scheme samples with those obtained from other methods and with the all-laboratory trimmed means. Also compare the patient sample values with those obtained using the current laboratory method. Other approaches to accuracy are explained in FINAL EVALUATION TESTS—ACCURACY. A sketch of the imprecision and sensitivity calculations follows.
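A minimal sketch of the within-assay %CV and analytical sensitivity calculations described above, assuming hypothetical replicate data. The interpolation back to concentration units assumes a calibration function is available; here a toy linear `signal_to_conc` stands in for the real curve or manual plot.

```python
from statistics import mean, stdev

def percent_cv(concentrations):
    """%CV of replicate concentration values (not raw signals)."""
    return 100 * stdev(concentrations) / mean(concentrations)

# Duplicate concentration results for each control/sample (made-up values).
replicates = {
    "control low":  [4.9, 5.2],
    "control high": [48.7, 47.9],
    "sample A":     [12.1, 12.6],
}
for name, values in replicates.items():
    print(f"{name:<13} mean={mean(values):6.2f}  %CV={percent_cv(values):5.2f}")

# Analytical sensitivity: mean zero-sample signal + 2 SD, converted to a
# concentration via the calibration curve (toy linear placeholder only).
zero_signals = [102, 98, 101, 97, 103, 99, 100, 104, 96, 100]

def signal_to_conc(signal):
    # Placeholder calibration: replace with the real stored or manual curve.
    return max(0.0, (signal - 100.0) / 20.0)

threshold = mean(zero_signals) + 2 * stdev(zero_signals)
print(f"analytical sensitivity ~ {signal_to_conc(threshold):.3f} units")
```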
Dilutional Recovery
Samples should be diluted using a validated diluent (if available), a zero calibrator, or a patient sample or pool with a low endogenous concentration. Do not make serial dilutions; dilute to each level separately. The concentration of the diluent does not have to be zero, but the concentration (if not zero) must be determined using the assay and allowed for in the dilution calculations. Plot the dilution curves for the sample(s) tested, with the dilution on the x-axis and the measured concentration on the y-axis. This should result in an approximately straight line, although some allowance must be made for within-assay imprecision. This experimental design is intended as a rapid screen; triplicates would normally be used for this type of experiment. A sketch of the expected-value calculation follows.
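A minimal sketch of the dilutional recovery check, assuming hypothetical measured values. The expected concentration at each dilution allows for a non-zero diluent concentration, as described above.

```python
neat_conc = 40.0      # measured concentration of the undiluted sample
diluent_conc = 0.5    # diluent concentration measured in the same assay

# (fraction of sample in the mixture, measured concentration) - made up
dilutions = [(1.0, 40.0), (0.5, 20.9), (0.2, 8.7), (0.1, 4.9)]

for fraction, measured in dilutions:
    # Expected value: sample contribution plus the diluent's contribution.
    expected = fraction * neat_conc + (1 - fraction) * diluent_conc
    recovery = 100 * measured / expected
    print(f"1/{round(1 / fraction):<2} expected={expected:6.2f} "
          f"measured={measured:6.2f} recovery={recovery:6.1f}%")
```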
Other Information
Check the appearance of the reagents for particulates, clumping, settlement, or cloudiness; the ease of use of the packaging; and the quality of the instructions. Estimate the total assay time, hands-on time, and any other logistical parameters relevant to your requirements. In addition, the need for start-up machine preparation, storage space for reagents, and the number of preventative maintenance events all affect test adoption and use, as does the need for appropriate services such as water, drainage, hazardous waste disposal, and interfacing. Also pay attention to the quality of the installation work if new equipment is involved. Was training provided? Was maintenance explained and demonstrated? If the manufacturer is new to you, telephone the customer service organization and ask one or two questions to check the quality and speed of their responses. Ask the manufacturer for reliability data, and ask other users about their experience of reliability with the method under evaluation.

Evaluation against Current Method Using Clinical Samples
The subject of clinical evaluation is dealt with in depth in another chapter (see CLINICAL CONCEPTS). Assessing the clinical utility of a diagnostic test is a difficult and time-consuming task. Few general laboratories have the time or resources to carry out a full clinical trial on each new product introduced to the laboratory. The clinical chemist must rely on the manufacturer, to a certain extent, for information that enables a judgment to be made about the suitability of any new product. However, it is important that reference intervals are established for the local population when a new test is introduced. Establishing the equivalence between new and old tests for the same analyte is usually one of the most important requirements of method evaluation, as a change to the reference interval can cause confusion and errors in diagnosis. For most immunoassay analytes, it is not possible to estimate accuracy, as few have reference methods with which true concentrations can be determined.

The most useful test is to run a range of samples in the new and old methods (correlation). For quantitative assays, agreement is sought; for qualitative assays, concordance. Care should be taken that samples covering the entire analytical range of the assays are included, with samples at critical levels, e.g., at the cutoff level in a qualitative assay. The samples should be assayed at approximately the same time in both methods. If dilution is likely to be required, high-concentration samples should also be included, run at the appropriate dilutions. If historical values are used for the comparison, method artifacts due to sample age, fresh versus frozen samples, and freeze-thawing effects should be taken into account. Samples that are known to be problematic (rogue samples) and samples with a confirmed pathological status should be highlighted in the plots and in the analysis. The assays used to measure the samples should be validated by including controls, ensuring that they give the appropriate values, and by checking that the shape of the calibration curve is as expected.

At least 20, but ideally 50, samples should be assessed for an initial assessment, and the results plotted graphically. More samples would be required for a full method comparison. The comparison method should be on the x-axis and the new method under test on the y-axis. The results can be compared using linear regression, but use of a functional relationship (Deming or Passing–Bablok regression) is better. These are linear regression methods with an allowance for errors in both parameters, whereas ordinary linear regression assumes that there is no error in the x parameter. (See IMMUNOASSAY PERFORMANCE MEASURES for details of the Deming functional correlation method; a sketch is also given below.) Linear regression analysis may be carried out using EXCEL. The form of the equation produced by linear regression is y = mx + c. It is useful to plot the line of best fit to the data and, for reference, a 45° line through the origin. The quality of this experiment is much improved if it is run twice, on separate occasions, and the means from the two experiments are used.

The slope and the intercept indicate how closely the two methods agree across the concentration range. If the intercept is near 0 and the slope is close to 1, the methods may be suitable for use with the same reference interval or with a slight adjustment. However, this may not apply if the reference interval is close to 0, as is the case with thyrotropin (TSH) and some tumor markers.
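A minimal sketch of Deming regression for method comparison, assuming the error variances of the two methods are equal (lambda = 1), the usual default when the true ratio is unknown. The paired results are made up for illustration.

```python
import math

def deming(x, y, lam=1.0):
    """Return (slope, intercept), allowing error in both methods."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    slope = ((syy - lam * sxx) +
             math.sqrt((syy - lam * sxx) ** 2 + 4 * lam * sxy ** 2)) / (2 * sxy)
    intercept = my - slope * mx
    return slope, intercept

# Comparison method (x) vs new method (y): hypothetical paired results.
x = [2.0, 5.1, 9.8, 15.2, 22.0, 30.5, 41.0, 55.3]
y = [2.3, 5.6, 10.4, 15.9, 23.1, 31.8, 42.6, 57.0]
slope, intercept = deming(x, y)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Unlike ordinary least squares, the Deming estimator treats both methods as subject to measurement error, which is why it is preferred for method comparison.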
If the regression was performed using a computer software package, the printout will probably include confidence intervals for the slope and intercept. If the confidence interval of the intercept includes 0, the intercept is not significantly different from 0. Likewise, if the confidence interval of the slope includes 1.00, the slope is not significantly different from the ideal. Significant differences need to be carefully considered for their implications for the introduction of the new method.

It is important that any regression comparison is conducted across an appropriate dynamic range. If the test is mostly used at a particular low or high range, then comparisons should be done in detail at that part of the range, as comparisons across a wide range can be misleading due to the location of one or two samples.

The correlation coefficient, r, is also useful, although it requires some getting used to. It is a statistical test of whether there is a relationship between the two methods, so, unless something has gone disastrously wrong, the value of r is likely to show a strongly significant correlation between the two tests. However, the nearer the value is to 1.000, the better the correlation. If both tests are of high quality, the value may be greater than 0.995, although lower values are not at all uncommon. If r is below 0.990, check the data to ensure that the consequences of changing methods are fully understood. To understand how well one method correlates with another in a clinical, rather than a purely statistical, context, it is necessary to review r together with the closeness of the intercept to 0 and the closeness of the slope to 1.

Check the plot for apparent outliers. If these are near the concentration extremes, they will have a considerable effect on the slope, intercept, and r value. Rerun both of the tests on each of the samples involved, as they may represent rogue samples that have given quite different results in the two tests. If samples with values from a reference method are used, those values must be representative and ideally determined with two reagent lots.

The correlation test does not help you to determine which test is better, simply how well they agree. However, samples with known clinical status can be used to distinguish the test that has the better clinical diagnostic value.

For some tests that are semiquantitative or have wide sample variability, it may be more appropriate to compare classes. Classes are essentially ranges of values. This approach is widely used in allergy immunoassay comparisons, where six classes are compared: two methods are considered equivalent if there is good class concordance rather than absolute numerical agreement.

For qualitative assays, the results should be analyzed for the concordance of the tested assay with the reference or predicate method (a sketch follows below). For discordant samples, a tiebreaker is needed. Tiebreaker approaches include:

- a third, confirmatory method
- the clinical outcome
- diluted sample results
- follow-up studies

For more information about evaluating the clinical performance of immunoassays, see Zweig and Robertson (1987).
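A minimal sketch of the qualitative concordance analysis against a predicate method, assuming hypothetical paired positive/negative calls. Positive and negative percent agreement are reported alongside overall agreement, and the discordant samples that need a tiebreaker are listed.

```python
new_method = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "pos", "neg", "neg"]
predicate  = ["pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos", "pos", "neg"]

both_pos = sum(1 for n, p in zip(new_method, predicate) if n == p == "pos")
both_neg = sum(1 for n, p in zip(new_method, predicate) if n == p == "neg")
pred_pos = predicate.count("pos")
pred_neg = predicate.count("neg")

print(f"overall agreement: {100 * (both_pos + both_neg) / len(predicate):.1f}%")
print(f"positive percent agreement: {100 * both_pos / pred_pos:.1f}%")
print(f"negative percent agreement: {100 * both_neg / pred_neg:.1f}%")

# Discordant samples need a tiebreaker (confirmatory method, clinical
# outcome, dilution, or follow-up), as listed above.
discordant = [i for i, (n, p) in enumerate(zip(new_method, predicate)) if n != p]
print("discordant sample indices:", discordant)
```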
Final Evaluation Tests
IMPRECISION
There are three aspects of imprecision that are important when evaluating a method: within-assay, between-assay, and between-lot. Within-assay imprecision indicates the suitability of a method for the measurement of singleton samples; between-assay imprecision is important in assessing the ability to measure serial and repeat samples reliably; and between-lot imprecision is important in assessing the ability of the manufacturer to make the product reproducibly. Between-assay and between-lot imprecision are important because patient results are always compared to a fixed reference interval.

A number of factors can significantly affect the imprecision estimates obtained in the experiment, in comparison with the manufacturer's claims, and these should be considered when designing the experiment. The following elements should be taken into account:

(1) The nature of the material being tested, e.g., freeze-dried or liquid.
(2) The concentration of the control or sample.
(3) The number of replicates of the control or sample to be run in single assays.

Within-Assay Imprecision
Run replicates of control(s) or sample(s) in the same assay, either consecutively or randomly across the assay, and determine the mean, standard deviation (SD), and %CV. Use the concentrations, not the raw signal levels. Ideally, 20 replicates of each control should be run at no fewer than three different concentrations, but the exact number may depend on sample availability or previous knowledge about imprecision. See LABORATORY QUALITY ASSURANCE.

Between-Assay Imprecision
To estimate between-assay imprecision (which may or may not include different lots), ensure that the same controls or samples are run in all of the evaluation assays. The results can then be plotted by assay and the overall mean, SD, and %CV calculated. If liquid patient samples are being used, care must be taken that sufficient aliquots are made available and stored deep-frozen for the duration of the exercise. Alternatively, freshly reconstituted or deep-frozen aliquots of commercial controls may be used.

It is often incorrectly stated that one method has less imprecision than another, because of a lower standard deviation or %CV, when there is no statistical difference between them. To find out whether there is a statistically significant difference between the imprecision of two methods, the F-test is used:

F = SD1^2 / SD2^2

where SD1 is the greater of the two standard deviation values. Use a set of F tables at the appropriate confidence level (e.g., P<0.05, which gives a 95% confidence level) and read off the F-statistic for the appropriate degrees of freedom (n−1) for the two estimates of imprecision. Unless the value of F from the equation above is at least as high as the statistic from the table, the two methods do not have significantly different levels of imprecision. The F-test may also be carried out in EXCEL to compare the imprecision of two data populations for the same sample or control; that test returns the probability that the imprecision of the two populations is the same. A sketch of the calculation follows.
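A minimal sketch of the F-test described above, assuming hypothetical replicate concentration results for the same sample in two methods. It uses scipy for the critical value; an F table works just as well.

```python
from statistics import stdev
from scipy.stats import f

method_1 = [10.1, 10.4, 9.8, 10.6, 10.0, 9.7, 10.3, 10.2]   # made-up data
method_2 = [10.2, 10.9, 9.4, 11.1, 9.6, 10.8, 9.2, 10.7]

sd1, sd2 = stdev(method_1), stdev(method_2)
# SD1 must be the greater of the two, so F >= 1.
if sd2 > sd1:
    sd1, sd2 = sd2, sd1
    method_1, method_2 = method_2, method_1

F = sd1 ** 2 / sd2 ** 2
df1, df2 = len(method_1) - 1, len(method_2) - 1
critical = f.ppf(0.95, df1, df2)  # one-sided, P < 0.05

print(f"F = {F:.2f}, critical value = {critical:.2f}")
if F >= critical:
    print("The two methods have significantly different imprecision.")
else:
    print("No significant difference in imprecision.")
```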
Between-Lot Imprecision
There are several ways to detect lot-to-lot variation. Probably the simplest is to plot the confidence intervals for the controls on the laboratory control charts. Calculate the control means and confidence intervals for the assays performed with each kit lot. (The confidence interval is the mean ± t × the standard error; the standard error is the standard deviation divided by the square root of the number of data points.) Significant lot-to-lot differences become apparent when the confidence intervals do not overlap. Alternative methods include analysis of variance and CUSUM analysis (see LABORATORY QUALITY ASSURANCE). Care needs to be taken over the storage of liquid samples.

ACCURACY
To the purist, accuracy can only be determined against a reference method. Unfortunately, for most of the analytes measured by immunoassay, there is no accepted reference method. Where a reference method has been accepted, such as gas chromatography–mass spectrometry (GC–MS), calibrated reference materials can be obtained. These are simply tested in the immunoassay (at clinically useful concentrations and ideally on at least two occasions) to determine whether the immunoassay test results agree with the reference values.

In the absence of a reference method, various alternative, indirect approaches can be used to check for method differences and standardization. For example, the comparison of patient sample, control, or QC scheme/proficiency testing data between methods is a useful practical test of a new method against one that has agreed well with other methods in the past (see EVALUATION AGAINST CURRENT METHOD USING CLINICAL SAMPLES).

The most useful method for assessing accuracy in the absence of a reference method is the recovery test, which tests accuracy by measuring the difference between the endogenous concentration of a sample and the concentration once a known amount of a pure preparation of the analyte has been added. To carry out a recovery test you need:

(1) A pure preparation of the analyte under test, at as high a concentration as possible, to minimize the amount of solution to be added to the samples. The matrix of this "spiking" solution should be chosen carefully so that the integrity of the final sample is not affected.
(2) Two or three patient samples or pools with different endogenous concentrations.

Choose two to three concentrations at which the samples or pools will be spiked, and calculate the appropriate amounts of the spiking solution to be added to the samples or pools. Also make a "control" spike that is similar in every way to the spiking solution but without the added analyte, to test for possible effects due to interference or dilution. Assay all samples or pools and calculate the recovery as follows:

recovery (%) = 100 × (measured concentration of spiked sample − endogenous concentration) / added concentration

If the concentration of the spiking pool is such that a significant amount of solution must be added to reach the required concentration of added analyte, ensure that you correct for the effects of dilution on the endogenous concentration before using the equation above. A sketch of the calculation follows.
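A minimal sketch of a recovery calculation with dilution correction, assuming hypothetical spiking data: a sample spiked with a known amount of analyte, where the spike volume slightly dilutes the sample.

```python
sample_volume = 1.00       # mL
spike_volume = 0.05        # mL of spiking solution added
spike_conc = 500.0         # concentration of the spiking solution
endogenous_conc = 10.0     # measured before spiking
measured_spiked = 32.5     # measured after spiking (made-up result)

total_volume = sample_volume + spike_volume
# Added analyte concentration in the final mixture.
added_conc = spike_conc * spike_volume / total_volume
# Endogenous concentration corrected for dilution by the spike volume.
corrected_endogenous = endogenous_conc * sample_volume / total_volume

recovery = 100 * (measured_spiked - corrected_endogenous) / added_conc
print(f"added = {added_conc:.2f}, corrected endogenous = "
      f"{corrected_endogenous:.2f}, recovery = {recovery:.1f}%")
```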
INTERFERENCE TESTING AND SPECIFICITY
Manufacturers usually state the cross-reactivity of closely related metabolites or drugs. The effects of common interfering substances, such as hemoglobin and bilirubin, may also have been tested. Knowledge of the patient base in a particular area may lead a laboratory to test other substances, such as locally common drugs or their metabolites, to see if they are affecting patient sample values. Enzyme immunoassays are more prone to nonspecific interferences from patient samples because of color, turbidity, enzymic activity, or reducing agents (see SUBJECT PREPARATION, SAMPLE COLLECTION, AND HANDLING). There are also other types of interference from samples, such as human anti-mouse antibodies and rheumatoid factor (see INTERFERENCES IN IMMUNOASSAY).

To confirm that a substance does not interfere with a test, first ascertain the maximum physiological concentration that is likely to be encountered. Then spike at least three samples or controls (containing different levels of analyte) with a range of levels of the potentially interfering substance (see RECOVERY). Assay each sample, spiked and unspiked, on at least four occasions, spread over a number of runs and different days. Check that the imprecision for all the samples is satisfactory. To assess interference at each level, calculate the confidence interval of the difference from the unspiked control (paired t-test):

difference ± t × SD × √(2/n)

where SD is the between-assay standard deviation at the appropriate concentration, derived either from the manufacturer's package insert or from internal data, and n is the number of determinations per sample. There is no significant interference or cross-reactivity if the confidence interval spans 0. A sketch of the paired comparison follows.
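A minimal sketch of the spiked-versus-unspiked interference check using a paired t-test, assuming hypothetical results from four occasions. Here the SD of the paired differences is used directly, in place of an externally supplied between-assay SD; if the confidence interval of the mean difference spans 0, there is no significant interference at this level of the substance.

```python
from statistics import mean, stdev
from scipy.stats import t

unspiked = [10.2, 10.0, 10.4, 9.9]   # same sample, four occasions (made up)
spiked   = [10.6, 10.1, 10.9, 10.3]  # spiked with the candidate interferent

diffs = [s - u for s, u in zip(spiked, unspiked)]
n = len(diffs)
half_width = t.ppf(0.975, n - 1) * stdev(diffs) / n ** 0.5
lo, hi = mean(diffs) - half_width, mean(diffs) + half_width

print(f"mean difference = {mean(diffs):.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
print("no significant interference" if lo <= 0 <= hi
      else "significant interference detected")
```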
ASSAY SENSITIVITY
Assay sensitivity is defined as the lowest concentration that is distinguishable from 0, and as such it is often used as the lowest detectable limit and the lowest limit of the reportable range. Sensitivity can be determined in a number of ways, each of which has validity as long as the limitations are understood. Claims for sensitivity made by manufacturers should include a description of the method used.

Analytical Sensitivity
Analytical sensitivity is defined as the concentration equal to N standard deviations above the zero calibrator, where the most usual value for N is 2. To determine sensitivity in this way, a zero-concentration calibrator or sample is assayed approximately 20 times within an assay. The mean and standard deviation of the signal are calculated, and the mean signal level ± 2 standard deviations is interpolated as an unknown from the calibration curve. This "concentration," which is a function of both the imprecision of the signal generated and the slope of the calibration curve, is the analytical sensitivity.

There are significant limitations to this procedure. From a practical standpoint, a genuine zero sample, with a matrix typical of a patient sample, is required, as is a data processing package into which signal levels can be entered to determine concentration. However, the main disadvantage, especially for immunometric (non-competitive) assays, is that, since there is no analyte present, this test is simply a measurement of system background noise. For this reason, functional sensitivity is a much more relevant test of the ability of an assay to measure low concentrations with adequate precision.

Functional Sensitivity
Functional sensitivity is determined from the imprecision of very low concentration samples, either within- or between-assay. The most common method is to dilute a low sample or control to a series of levels that are then assayed on a number of different occasions (usually more than 6). A precision profile is constructed from these data, and the concentration that gives a certain level of imprecision (usually 20% CV) is deemed to be the sensitivity. The choice of the number of determinations, the imprecision level, the method of defining the precision profile, the nature of the sample to be diluted, and the diluent can all affect the final determination of sensitivity.

Sensitivity may also be expressed in terms of limit of blank (LOB), limit of detection (LOD), and limit of quantitation (LOQ) (Armbruster and Pry, 2008).

Limit of Blank
LOB is the highest apparent analyte concentration expected to be found when replicates of a blank sample containing no analyte are tested:

LOB = mean(blank) + 1.645 × SD(blank)

Limit of Detection
LOD is the lowest analyte concentration likely to be reliably distinguished from the LOB and at which detection is feasible. LOD is determined by utilizing both the measured LOB and test replicates of a sample known to contain a low concentration of analyte:

LOD = LOB + 1.645 × SD(low-concentration sample)

LOB and LOD are similar in concept to analytical sensitivity and share its limitations.

Limit of Quantitation
LOQ is the lowest concentration at which an analyte can be reliably detected and at which certain predefined goals for bias and imprecision are met. The LOQ may be equivalent to the LOD or it could be at a much higher concentration. LOQ is similar to functional sensitivity and is generally preferred when assessing methods for clinical applications. A sketch of the LOB/LOD calculation follows.
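A minimal sketch of the LOB/LOD calculation in the style of Armbruster and Pry (2008), assuming hypothetical replicate measurements already converted to concentration units.

```python
from statistics import mean, stdev

blank_reps = [0.02, 0.00, 0.03, 0.01, 0.02, 0.04, 0.01, 0.02]       # made up
low_sample_reps = [0.11, 0.14, 0.09, 0.12, 0.15, 0.10, 0.13, 0.12]  # made up

lob = mean(blank_reps) + 1.645 * stdev(blank_reps)
lod = lob + 1.645 * stdev(low_sample_reps)

print(f"LOB = {lob:.3f}, LOD = {lod:.3f}")
# LOQ would additionally require checking predefined bias and imprecision
# goals (e.g., 20% CV) at candidate low concentrations.
```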
LINEARITY
For those methods that do not use a set of five or more calibrators to describe the calibration curve, especially those that use single- or two-point calibration and assume linearity between the points, an assessment of linearity should be carried out. The simplest method is to measure multiple dilutions of a high-concentration sample and plot the results graphically. The undiluted sample should be near the upper limit of the assay range and should be diluted down to below the lowest non-zero calibrator, so that the dilutions cover the entire assay range. Alternatively, if a sample of the pure analyte is available, a set of five or more calibrators may be made up in analyte-free serum. This curve can be used to provide control and sample values that can be compared to the reduced calibration curve. The t-test or comparison of confidence limits can be used to compare the results.

MISCELLANEOUS
All of the staff involved in the evaluation should be encouraged to report their comments on elements such as ease of use, safety requirements, and the assay time from the start of the procedure to the availability of the first results. The availability of both routine information and troubleshooting assistance from the manufacturer should be considered, as well as the cost of the method for the typical run size used in the laboratory. Reagent quality and supply, instrument service, and customer support are the key factors to be assessed when choosing an assay or system from an alternative manufacturer. The cost of generic reagents that have to be ordered separately should also not be overlooked.

DIFFERENT ASSAY TECHNOLOGIES
Newer technologies, such as point-of-care, microarray, and other systems using multiple analytes, still need evaluation using the approaches given above. Combination tests may increase confidence in a diagnosis, but the quality attributes of the individual tests remain relevant, as they can cause aggregate imprecision. Qualitative tests appear to be simple, but their results can have significant implications for patients, and they suffer from the same causes of imprecision and inaccuracy. When assessing these tests, check that they are designed to warn the user of any performance issues. The fundamental evaluation of immunoassay performance is just as relevant in non-diagnostic applications, such as drug discovery and proteomics research.
Suggested Format of Evaluation Report
An important aspect of method evaluation is the presentation of the data and conclusions in a report that is easily understood and provides justification for the course of action decided upon. These are suggested section headings and topics to be considered:

Summary. Summarize the conclusions of the evaluation and list the major positive and negative points.

Introduction. State why the evaluation was performed. Summarize how the decision to evaluate this particular method was reached. State the objectives of the evaluation.

Manufacturer's information. Summarize the information from the manufacturer's literature: technical data, protocol, shelf life, costs, etc.

Design of evaluation. Describe which elements of performance were evaluated, how this was done, and the analysis carried out. Also include lot numbers of reagents used and any commercial controls or components required but not supplied by the assay manufacturer.

Results. Summarize the results for each of the experiments. Ensure all data are appended or referred to in a numbered laboratory notebook, if not quoted in the main text of the report. Include a non-technical assessment of ease of use, etc.

Conclusion. State the general conclusions of the evaluation, including the recommendations.

Useful Guidelines
CLSI, formerly known as the NCCLS, develops consensus-driven standards and has issued the following guidelines. For full details see REFERENCES AND FURTHER READING below.

- Preliminary Evaluation of Quantitative Clinical Laboratory Measurement Procedures
- User Protocol for Evaluation of Qualitative Test Performance
- Evaluation of Precision Performance of Quantitative Measurement Methods
- User Verification of Performance for Precision and Trueness
- Expression of Measurement Uncertainty in Laboratory Medicine
- Evaluation of the Linearity of Quantitative Measurement Procedures
- Interference Testing in Clinical Chemistry
- Method Comparison and Bias Estimation Using Patient Samples
- Statistical Quality Control for Quantitative Measurement Procedures: Principles and Definitions
References and Further Reading
Armbruster, D.A. and Pry, T. Limit of blank, limit of detection and limit of quantitation. Clin. Biochem. Rev. (Suppl. 1), S49–S52 (2008).
CLSI. Evaluation of Precision Performance of Quantitative Measurement Methods; Approved Guideline, 2nd edn, EP05-A2 (CLSI, Wayne, Pennsylvania, 2004).
CLSI. Expression of Measurement Uncertainty in Laboratory Medicine; Approved Guideline, C51-A (CLSI, Wayne, Pennsylvania, 2012).
CLSI. Evaluation of the Linearity of Quantitative Measurement Procedures: A Statistical Approach; Approved Guideline, EP06-A (CLSI, Wayne, Pennsylvania, 2003).
CLSI. Interference Testing in Clinical Chemistry; Approved Guideline, 2nd edn, EP07-A2 (CLSI, Wayne, Pennsylvania, 2005).
CLSI. Method Comparison and Bias Estimation Using Patient Samples; Approved Guideline, 2nd edn (Interim Revision), EP09-A2-IR (CLSI, Wayne, Pennsylvania, 2010).
CLSI. Preliminary Evaluation of Quantitative Clinical Laboratory Measurement Procedures; Approved Guideline, 3rd edn, EP10-A3 (CLSI, Wayne, Pennsylvania, 2006).
CLSI. Statistical Quality Control for Quantitative Measurement Procedures: Principles and Definitions; Approved Guideline, 3rd edn, C24-A3 (CLSI, Wayne, Pennsylvania, 2006).
CLSI. User Protocol for Evaluation of Qualitative Test Performance; Approved Guideline, 2nd edn, EP12-A2 (CLSI, Wayne, Pennsylvania, 2008).
CLSI. User Verification of Performance for Precision and Trueness; Approved Guideline, 2nd edn, EP15-A2 (CLSI, Wayne, Pennsylvania, 2006).
Cummings, J., Ward, T.H., Greystoke, A., Ranson, M. and Dive, C. Biomarker method validation in anticancer drug development. Br. J. Pharmacol. 153, 646–656 (2008).
Ekins, R.P. The precision profile: its use in assay design, assessment and quality control. In: Immunoassays for Clinical Chemistry, 2nd edn (eds Hunter, W.M. and Corrie, J.E.T.) (Churchill Livingstone, London, 1983).
Feldkamp, C.S. and Smith, S.W. Practical guide to immunoassay method evaluation. In: Immunoassay, a Practical Guide (eds Chan, D.W. and Perlstein, M.T.), 49–95 (Academic Press, Orlando, 1987).
Hopley, L. and van Schalkwyk, J. The magnificent ROC. http://www.anaesthetist.com/mnm/stats/roc
Jin, H. and Zangar, R.C. Antibody microarrays for high-throughput, multianalyte analysis. Cancer Biomark. 6, 281–290 (2010).
Khan, M.N. and Findlay, J.W. Ligand-Binding Assays: Development, Validation, and Implementation in the Drug Development Arena (Wiley-Blackwell, Oxford, 2009).
Marchiò, C., Dowsett, M. and Reis-Filho, J.S. Revisiting the technical validation of tumour biomarker assays: how to open a Pandora's box. BMC Med. 9, 41 (2011).
Zweig, M.H. and Robertson, E.H. Clinical validation of immunoassays: a well-designed approach to a clinical study. In: Immunoassay, a Practical Guide (eds Chan, D.W. and Perlstein, M.T.), 97–127 (Academic Press, Orlando, 1987).
