By Stephen Lacy and Daniel Riffe
This study views intercoder reliability as a sampling problem. It develops a formula for generating sample sizes needed to have valid reliability estimates. It also suggests steps for reporting reliability. The resulting sample sizes will permit a known degree of confidence that the agreement in a sample of items is representative of the pattern that would occur if all content items were coded by all coders.

Every researcher who conducts a content analysis faces the same question: How large a sample of content units should be used to assess the level of reliability? To an extent, sample size depends on the number of content units in the population and the homogeneity of the population with respect to variable coding complexity. Content can be categorized easily for some variables, but not for others. How does a researcher ensure that variations in degree of difficulty are included in the reliability assessment? As in most applications involving representativeness, the answer is probability sampling, assuring that each unit in the reliability check is selected randomly.¹ Calculating sampling error for reliability tests is possible with probability sampling, but few content analyses address this point.

This study views intercoder reliability as a sampling problem, requiring clarification of the term "population." Content analysis typically refers to a study's "population" as all potentially codable content from which a sample is drawn and analyzed. However, this sample itself becomes a "population" of content units from which a sample of test units is randomly drawn to check reliability. This article suggests content samples need reliability estimates that represent the population. The resulting sample sizes will permit a known degree of confidence that the agreement in a sample of test units is representative of the pattern that would occur if all study units were coded by all coders.
Stephen Lacy is a professor in the Michigan State University School of Journalism, and Daniel Riffe is a professor in the E.W. Scripps School of Journalism at Ohio University. The authors thank Fred Fico for his comments and suggestions. J&MC Quarterly, 963-973.

Background

Reproducibility reliability is the extent to which coding decisions can be replicated by different researchers.² In principle, the use of multiple independent coders applying the same rules in the same way assures that categorized content does not represent the bias of one coder. Research methods texts discuss reliability in terms of measurement error resulting from problems in coding instructions, failure of coders to achieve a common frame of reference, and coder mistakes.³ Few texts or studies address whether the content units tested represent the population of items studied. Often, reliability samples have been selected haphazardly or based on convenience (e.g., the first 50 items to be coded might be used).⁴

Research texts vary in their approach to sampling for reliability tests. Weber's⁵ only pertinent recommendation is that "The best test of the clarity of category definitions is to code a small sample of the text." Wimmer and Dominick⁶ urge analysts to conduct a pilot study on a sample of the "content universe" and, assuming satisfactory results, then to code the main body of data. Kaid and Wadsworth⁷ suggest that "levels of reliability should be assessed initially on a subsample of the total sample to be analyzed before proceeding with the actual coding." Singletary⁸ has noted that reliability checks introduce sampling error when probability samples are used: a subsample, "probably between 10% and 25%," should be reanalyzed by independent coders to calculate overall intercoder reliability. How large a subsample? "When a very large sample is involved, a subsample of 5-7 percent of the total is probably sufficient for assessing reliability."⁹ Stempel concludes that reliability estimates "should be based on several samples of content from the material in the study"¹⁰ and that a "minimum standard would be the selection of three passages to be coded by all coders." Most texts do not discuss reliability in the context of probability sampling and the resulting sampling error. Krippendorf argues that probability sampling to get a representative sample is not necessary.¹¹

Yet early inquiries into reliability testing did address probability sampling. An early article by Janis, Fadner, and Janowitz¹² comparing the reliability of different coding schemes provided reliability coefficients with confidence intervals. Scott's¹³ article introducing his pi included an equation accounting for sampling error, though that component was dropped from subsequent references to the statistic. Cohen¹⁴ discussed sampling error while introducing kappa, but the article concentrated on measurement error due to chance agreement.

Schutz¹⁵ dealt with measurement error and sample size. He explored the impact of "chance agreement" on reliability measures: i.e., in a given test, some coder agreements could occur by chance,¹⁶ though the existence of coding criteria reduces the influence chance could have. Schutz offered a formula that enabled a researcher to set a minimal acceptable level of reliability and then compute the level that must be achieved in a reliability test to account for chance agreements.¹⁷ For example, if the minimal acceptable level of agreement is 80%,¹⁸ the researcher might need to achieve a level as high as 83% in the reliability test in order to control for chance agreement; even if chance agreement could be eliminated, the "remainder" level of agreement would then exceed the acceptable level. Schutz incorporated sampling error into his formula.

Sampling Error and Estimating Sample Size: A Formula

The goal of the following analysis is to generate a formula for estimating simple random sample sizes for reliability tests. The formula can be used to generate samples with confidence intervals that tell researchers whether the minimal acceptable reliability figure has been achieved. The formula allows the researcher to be certain that the observed sample reliability level is high enough. For example, if the reliability coefficient must equal or exceed .80 to be acceptable, the sample used must have a confidence interval that does not dip below .80. If, in a given test, the confidence interval does dip below .80, the researcher cannot conclude that the "true" reliability of the population equals or exceeds the minimal acceptable level.

The reason for a one-tailed confidence interval is illustrated in Figure 1. A researcher's conclusion of acceptable reliability is not affected by whether the population agreement exceeds the sample agreement on the positive side of the interval, because acceptance is based on a minimal standard. The relevant area for determining the acceptability of a reliability test involves the negative side of the confidence interval. In the figure, the minimal acceptable agreement level is 80% and the sample level of agreement is 90%; the resulting area of concern is the area between 80% and 90%, which would fall on the negative side of the interval.

[Figure 1: Why Reliability Confidence Interval Uses a One-Tailed Test. A continuum for the level of agreement in coding decisions, from 0% to 100%, marking the minimal acceptable level (80%) and the sample level of agreement (90%) with its -5%/+5% confidence interval; the relevant area for determining acceptability of the reliability test lies between 80% and 90%.]

Survey researchers use the formula for the standard error of proportion to estimate the minimal sample size necessary to infer to the population at a given level of confidence. A similar procedure is used here. For simplicity, this analysis uses "simple agreement" (total agreements divided by total decisions) with a dichotomous decision (the coders either agree or disagree). We start with the equation for the standard error of proportion and add the finite population correction (FPC). The FPC is used when the sample makes up 10% or more of the population. It reduces the standard error but is often ignored because it has little impact when a sample is a small proportion of the population. The resulting formula is:

SE = sqrt[ PQ / (n-1) ] x sqrt[ (N-n) / (N-1) ]   (Equation 1)

But with the radical removed and the distributive property applied, the formula solved for n becomes:

n = [ (N-1)(SE)^2 + PQN ] / [ (N-1)(SE)^2 + PQ ]   (Equation 2)

where N = the population size (the number of content units in the study), P = the population level of agreement, Q = (1-P), and n = the sample size for the reliability check.

In order to solve for n, the researcher must follow five steps:

Step 1. Determine N, the number of content units being studied.¹⁹

Step 2. Set a minimal level of intercoder reliability for the test units. Content analysis texts warn that an acceptable level of intercoder reliability should reflect the nature and difficulty of the categories and content. It usually has been determined before reaching the point of checking the reliability of the instrument. For example, a minimum level of 80% simple agreement is often used with new coding procedures,²⁰ but this level is lower than recommended by others.²¹

Step 3. Estimate P, the level of agreement in coding all study units. This is the level of agreement among all coders if they coded every content unit in the study. This is the most difficult step because it involves estimating the unknown population reliability figure. Two approaches are possible. The first is to estimate P based on a pretest of the coding instrument and on previous research. The second is to assume a P that exceeds the minimal acceptable reliability figure by a certain level. The second approach creates the question: How many percentage points above the minimal reliability level should P be? For this analysis, it will be assumed that the population level should be set at 5 percentage points above the minimal acceptable level of agreement. Five percentage points is useful because it is consistent with a confidence interval of 5%, a level consistent with minimal requirement recommendations by Krippendorf and the analysis of Schutz. For example, if the minimal acceptable reliability figure is .8, then the assumed P would be .85. If the reliability figure in the test equals or exceeds .85, chances are 95 out of 100 that the population (content units in the study) figure equals or exceeds .80.

Step 4. Determine the acceptable level of probability for estimating the confidence interval. We assume most content analysts will use the same levels of probability for the sampling error in intercoder reliability checks as are used with most sampling error estimates: the 95% (p=.05) and 99% (p=.01) levels of probability.

Step 5. Once the acceptable probability level is determined, the formula for confidence intervals is used to calculate the standard error (SE):

Confidence interval probability = Z (SE)   (Equation 3)

Z is the standardized point on the normal curve that corresponds with the acceptable level of probability. Using the normal curve, we find that the one-tailed Z-score²² associated with .05 is 1.64.

Once the five steps have been taken, the resulting figures are plugged into Equation 2, which allows the researcher to solve for n, the number of units needed for the reliability test.

A Simulation

Assume an acceptable minimal level of agreement of 85% and a P of 90% in a study using 1,000 content units (e.g., newspaper stories). The desired level of certainty is the traditional .05 level.
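Steps 4 and 5 reduce to finding a one-tailed Z-score and dividing it into the desired confidence interval. A small Python sketch of the five steps for the simulation's values, using only the standard library (the variable names are illustrative, not from the article):

```python
from statistics import NormalDist

N = 1000                    # Step 1: content units in the study
minimal_level = 0.85        # Step 2: minimal acceptable reliability
P = minimal_level + 0.05    # Step 3: assumed population agreement, 5 points above
probability = 0.95          # Step 4: level of probability (p = .05)

# Step 5: one-tailed Z for the chosen probability level, then the standard
# error implied by a 5-percentage-point confidence interval (Equation 3).
z = NormalDist().inv_cdf(probability)  # about 1.645; the text rounds to 1.64
se = round(0.05 / z, 2)                # 0.03
```

The values of `P`, `se`, and `N` produced here are the inputs Equation 2 needs to yield the number of test units.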

Our example confidence interval is 5% and our desired level of probability is 95%. So, using Equation 3:

.05 = 1.64 (SE), or SE = .05 / 1.64 = .03

Recall Equation 1,

SE = sqrt[ PQ / (n-1) ] x sqrt[ (N-n) / (N-1) ]

which becomes Equation 2 when solved for n:

n = [ (N-1)(SE)^2 + PQN ] / [ (N-1)(SE)^2 + PQ ]

Now we can plug in our numbers and determine how large a random sample we will need to achieve, at minimum, the standard 85% reliability agreement. PQ = .90(.10), or .09, and the resulting SE at the p = .05 confidence level was .03, which squares to .0009. So Equation 2 becomes:

n = [ (999)(.0009) + .09(1000) ] / [ (999)(.0009) + .09 ] = 90.8991 / .9891 = 91.9

In other words, if we achieve at least 90% agreement in a simple random sample of 92 test units (rounded from 91.9) taken from the 1,000 study units, chances are 95 out of 100 that 85% or better agreement would exist if all study units were coded by all coders and reliability measured.

Table 1 solves Equation 2 for n with three hypothetical levels of P (85%, 90%, and 95%) and with numbers of study units equal to 100, 250, 500, 1,000, 5,000, and 10,000. The sample sizes are based on a confidence interval with 95% probability. The table demonstrates how higher P levels and smaller numbers of study units affect the number of test units needed. However, the number of test units needed decreases much faster with higher levels of P than with the decline in the number of study units.²³ Table 2 presents the numbers of test units for the 99% level of probability. It assumes the same agreement levels as Table 1. The figures for a given number of study units and agreement level are higher in Table 2 because they represent the increased number of test units needed to reach the higher level of probability.

The main problem in determining an appropriate sample of test units is estimating the level of P. The higher the assumed percentage, the smaller the sample will be. This might produce an incentive to overestimate the level, because doing so would reduce the amount of work in the reliability test. Assuming a study-unit level 5 percentage points above the minimal level will control for this incentive, because the higher the assumed level, the higher will be the minimal acceptable level of reliability. A problem can occur if the level of agreement in the test units generates a confidence interval that dips below the minimal acceptable level of reliability.
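The arithmetic above can be checked directly. This snippet, added for illustration, reproduces the simulation's result:

```python
import math

# Values from the simulation: study units, assumed agreement, SE from Equation 3
N, P, se = 1000, 0.90, 0.03
Q = 1 - P

# Equation 2: number of test units for the reliability check
n = ((N - 1) * se**2 + P * Q * N) / ((N - 1) * se**2 + P * Q)
print(round(n, 1))    # 91.9
print(math.ceil(n))   # 92 test units to draw at random
```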

TABLE 1
Number of Content Units Needed for Reliability Test, Based on Various Population Sizes, Three Assumed Levels of Population Intercoder Agreement, and a 95% Level of Probability

                               Assumed Level of Agreement in Population (Study Units)
Population Size (Study Units)       85%       90%       95%
10,000                              141       100        54
5,000                               139        99        54
1,000                               125        92        52
500                                 111        84        49
250                                  91        72        45
100                                  59        51        36

Note: The numbers are taken from the equation for standard error of proportions and are adjusted with the finite population correction. The standard error was used to find a sample size that would have sampling error equal to or less than 5% for the assumed population level of agreement. The equation is SE = sqrt[PQ/(n-1)] x sqrt[(N-n)/(N-1)], where P = percentage of agreement in population, Q = (1-P), N = the population size, and n = the sample size.

For example, suppose the minimal acceptable level is .85 and the test units' reliability level equals, say, .86. The resulting confidence interval (.86 minus .05 equals .81) dips below the minimal acceptable level of .85. This indicates that the reliability figure for the population of study units might not exceed the acceptable level of .85. Under this condition, the researcher could randomly select more content units for the reliability check or accept a lower minimal level of agreement, or both. Additional units could be randomly selected and added to the original test units to calculate a new reliability figure and confidence interval based on the larger sample. If the first approach is used, the larger sample size can be determined by plugging the test units' reliability level (.86) into Equation 2 as P.

Limitations of the Analysis

This analysis may seem limited because (a) it is based on a dichotomous decision, (b) it involves two coders, and (c) it uses a simple agreement measure of reliability. However, the first two are not limitations. Equation 2 would easily fit nominal content with more than two categories,²⁴ so using a dichotomous decision is not a problem. Nor is the number of coders: sampling error is not affected by the number of coders, who introduce measurement error after the reliability sample is selected. However, the impact of more complex coding schemes might affect the representativeness of a reliability sample if some of the categories occur infrequently. These infrequent categories have less likelihood of being in the sample, which means the full range of categories has not been tested. If this is the case, the researcher should randomly stratify the test units.

The use of a simple agreement measure is not a problem either. At least three other measures of reliability, besides simple agreement among coding pairs, are available for nominal-level data: Scott's pi,²⁵ Krippendorf's alpha,²⁶ and Cohen's kappa.²⁷ Several discussions of the relative advantages and disadvantages of these measures are available.²⁸ These three measures were developed to deal with measurement error due to chance, not with error introduced through sampling. The representativeness of a sample of test units is not dependent on the test applied, so the two tables can be useful for selecting a sample of test units to establish equivalence reliability whichever coefficient is calculated. Equation 2 is limited, however, to nominal data because it is based on the standard error of proportions. A parallel analysis for interval- and ratio-level categories could be developed using the standard error of means.

TABLE 2
Number of Content Units Needed for Reliability Test, Based on Various Population Sizes, Three Assumed Levels of Population Intercoder Agreement, and a 99% Level of Probability

                               Assumed Level of Agreement in Population (Study Units)
Population Size (Study Units)       85%       90%       95%
10,000                              271       193       104
5,000                               263       190       103
1,000                               218       165        95
500                                 179       142        87
250                                 132       111        75
100                                  74        67        52

Note: The numbers are taken from the equation for standard error of proportions and are adjusted with the finite population correction. The standard error was used to find a sample size that would have sampling error equal to or less than 5% for the assumed population level of agreement. The equation is SE = sqrt[PQ/(n-1)] x sqrt[(N-n)/(N-1)], where P = percentage of agreement in population, Q = (1-P), N = the population size, and n = the sample size.

Using the Tables

Some beginning researchers might struggle with the task of making assumptions and solving the equations. The tables reduce the task to a lookup. First, the researcher should select the level of probability appropriate for the study: if 95%, use Table 1; if 99%, use Table 2. Second, select an assumed level of agreement among the study units. If the variables are straightforward counting measures, such as source of newspaper stories, take the assumed agreement level among study units to be 90%. If the variables involve coding meanings of content, such as political leaning of news stories, take the assumed agreement level of 85% among study units, which will select a larger number of test units.²⁹

Third, find the population size in the tables that is closest to but greater than the number of study units being analyzed, and take the number of test units from the table. For example, suppose a researcher studying coverage of economic news in network newscasts has 425 stories from 40 newscasts selected from the previous year. Variables involve numbers of stories devoted to various types of economic news. Accepting a confidence level of 95%, the researcher would look down the 90% level-of-agreement column in Table 1 until she or he came to a population size of 500 (the closest population size that is greater than 425). The number of units needed for the reliability check equals 84.

Of course, for sampling error to have meaning, the sample must be a probability sample. The analysis in this article is based on simple random sampling for reliability tests; the formula used here is the unbiased estimator for simple random samples. Samples based on proportion or stratification will require adjustments available in many statistics books.³⁰ However, under some circumstances, other forms of probability sampling, such as stratified random sampling, might be preferable for selecting reliability test samples. For example, if certain categories of a variable make up a small proportion of the content units being studied, the researcher might oversample these categories.³¹

Summary

An inevitable question from graduate students conducting their first content analysis is how many items to use in the intercoder reliability test. This article has attempted to answer this question and to suggest a procedure for estimating sampling error in reliability samples. The role of selection bias in determining reliability coefficients seems to have gotten lost since earlier explorations of reliability. The study of content needs a more rigorous way of dealing with potential selection bias. This bias can only be estimated through probability sampling, and using probability samples and confidence intervals for reliability figures would help add rigor.

When reporting reliability levels, confidence intervals should be reported with the measures of reliability. Simple agreement confidence intervals can be calculated using the standard error of proportions. The confidence intervals for Scott's pi and Cohen's kappa can be calculated by referring to the formulas presented in the original articles for these coefficients.

NOTES

1. Three factors affect sampling error: the size of the sample, the homogeneity of the population, and the proportion of the population in the sample. The last factor has little impact unless the proportion is large.
2. The term reliability is used here to refer to reproducibility. Reproducibility reliability, also called equivalence reliability, differs from stability and accuracy reliability. Stability concerns the same coder testing the reliability of the same content at two points in time. Accuracy reliability involves comparing coding results with some known standard. See Klaus Krippendorf, Content Analysis: An Introduction to Its Methodology (Beverly Hills, CA: Sage, 1980), 130-32.
3. Guido H. Stempel III, "Content Analysis," in Research Methods in Mass Communication, ed. Guido H. Stempel III and Bruce H. Westley (Englewood Cliffs, NJ: Prentice-Hall, 1981), 127.
4. Stephen Lacy and Daniel Riffe, "Sins of Omission and Commission in Mass Communication Quantitative Research," Journalism Quarterly 70 (spring 1993): 126-32.

5. Robert Philip Weber, Basic Content Analysis, 2d ed. (Newbury Park, CA: Sage University Paper Series on Quantitative Applications in the Social Sciences, 07-075, 1990), 23.
6. Roger D. Wimmer and Joseph R. Dominick, Mass Media Research: An Introduction, 3d ed. (Belmont, CA: Wadsworth, 1991), 173.
7. Lynda Lee Kaid and Anne Johnston Wadsworth, "Content Analysis," in Measurement of Communication Behavior, ed. Philip Emmert and Larry L. Barker (NY: Longman, 1989), 208.
8. Michael Singletary, Mass Communication Research (NY: Longman, 1994), 297.
9. Guido H. Stempel III, "Statistical Designs for Content Analysis," in Research Methods in Mass Communication, ed. Stempel and Westley, 128.
10. Stempel, "Content Analysis," 143. This advice, while sound, adds a bothersome vagueness to content analysis; it is a bit like a professor's response that the length of an essay should be "as long as it takes."
11. Krippendorf argues that reliability samples "need not be representative of the population characteristics" but "must be representative of all distinctions made within the sample of data at hand" (emphasis in original). He suggests purposive or stratified sampling to ensure that "all categories of analysis, all decisions specified by various forms of instructions, are indeed represented in the reliability data regardless of how frequently they may occur in the actual data" (emphasis added). See Krippendorf, Content Analysis, 146.
12. Irving L. Janis, Raymond H. Fadner, and Morris Janowitz, "The Reliability of a Content Analysis Technique," Public Opinion Quarterly 7 (summer 1943): 293-96.
13. William A. Scott, "Reliability of Content Analysis: The Case of Nominal Scale Coding," Public Opinion Quarterly 19 (fall 1955): 321-25.

14. J. A. Cohen, "A Coefficient of Agreement for Nominal Scales," Educational and Psychological Measurement 20 (1960): 37-46.
15. William C. Schutz, "Reliability, Ambiguity and Content Analysis," Psychological Review 59 (1952): 119-29.
16. But just because chance could affect reliability does not mean it does, and the "odds" of agreement through randomness change once a coding criterion is introduced and used. Strictly interpreted, chance agreement cannot be eliminated; its effect can only be acknowledged and compensated for.
17. Presumably, these chance agreements could lead content analysts to overestimate the extent of coder agreement due to the precision of the coding instrument. In effect, Schutz sought a way to control for the effect of those chance agreements.
18. Schutz's ("Reliability, Ambiguity and Content Analysis") analysis starts with the .8 level of simple agreement. This analysis will use .80 to remain consistent with Schutz.
19. If reliability is checked separately for each coding category in the content analysis, then N equals the total number of units selected for the content analysis. If the reliability is checked for total decisions made in using the coding procedure, as when figuring overall agreement for reliability, N equals the number of coding decisions that will be made by each coder; that is, N equals the number of units analyzed multiplied by the number of categories being used. This analysis assumes that each variable is checked and reported separately, which means N equals the number of content units in the population.
20. Wimmer and Dominick (Mass Media Research, 181) report a rule of thumb of at least .75 for intercoder reliability.
21. Krippendorf (Content Analysis) recommends generally using the .8 level for intercoder reliability, although he says some data with reliability figures as low as .67 could be reported for highly speculative conclusions. It is not clear whether Krippendorf's agreement level figures are for simple agreement among coders or for some other reliability measure. Under some conditions, this would be consistent with a simple agreement of .9 and a Scott's pi or Krippendorf's alpha of .8. See also Singletary (Mass Communication Research, 296), who states that a Scott's pi of .7 is the consensus value for the statistic.
22. Note that this is a one-tailed test. Content analysis researchers are concerned that the reliability figure exceeds a minimal level, which would be on the negative side of a confidence interval. The acceptance of a coding instrument as reliable is not affected by whether the population reliability figure exceeds the reliability test figure on the positive side of the confidence interval.
23. As Table 1 shows, population size reduces the number of test units noticeably when the number of study units falls under 1,000.
24. Multiple-category variables differ from dichotomous variables because multiple categories are not independent of each other. However, this lack of independence is a bias in coding and not in the selection of units for a reliability test.

25. Scott, "Reliability of Content Analysis."
26. Krippendorf, Content Analysis.
27. Cohen, "Coefficient of Agreement for Nominal Scales."
28. For examples, see Maria Adele Hughes and Dennis F. Garrett, "Intercoder Reliability Estimation Approaches in Marketing: A Generalization Theory Framework for Quantitative Data," Journal of Marketing Research 27 (May 1990): 185-195; and Richard H. Kolbe and Melissa S. Burnett, "Content-Analysis Research: An Examination of Applications with Directives for Improving Research Reliability and Objectivity," Journal of Consumer Research 18 (September 1991): 243-250.
29. Coding simple content, such as numbers of stories, typically yields higher levels of reliability because cues for coding are more explicit. The population agreement will be higher than with coding schemes that deal with word meanings. A lower reliability figure is an acceptable trade-off for studying categories that concern meaning.
30. For example, C. A. Moser and G. Kalton, Survey Methods in Social Investigation, 2d ed. (NY: Basic Books, 1972).
31. This procedure might create problems when content has infrequent categories that are difficult to identify. It could require quota sampling, or selecting and checking content units for these infrequent categories until the proportion of the test units equals the estimated proportion of the infrequent categories. Frequency of categories could be estimated by a pretest, and different sampling rates could be used for categories that appear less frequently; disproportionate sampling of the less frequent categories would be useful, but the results for particular categories would have to be weighted to reflect the proportions in the study units. Another way of handling infrequent categories would be to increase the reliability test sample size above the minimum recommended here. Larger samples will increase the probability of including infrequent categories among the test units. This will, of course, lead to coding of additional units from categories that appear frequently, but the resulting reliability figure will be more representative of the content units being studied. If the larger sample does not include sufficient numbers of the infrequent categories, additional units can be selected. Just generating a stratified reliability sample that would include sufficient numbers of units for each of these categories would be time consuming and difficult. Some would question whether the logistical problems outweigh the potential impact of such a "micro" measure of reliability on the overall validity of the data. No one would argue that all variables need to be tested in a reliability check, but a large number of categories within a variable (e.g., a twenty-six-category scheme for coding the variable "news topic") could create logistical problems.

Copyright of Journalism & Mass Communication Quarterly is the property of the Association for Education in Journalism & Mass Communication, and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.
