By Stephen Lacy and Daniel Riffe
This study views intercoder reliability as a sampling problem. It develops a formula for generating sample sizes needed to have valid reliability estimates. It also suggests steps for reporting reliability. The resulting sample sizes will permit a known degree of confidence that the agreement in a sample of items is representative of the pattern that would occur if all content items were coded by all coders.

Every researcher who conducts a content analysis faces the same question: How large a sample of content units should be used to assess the level of reliability?

To an extent, sample size depends on the number of content units in the population and the homogeneity of the population with respect to variable coding complexity. Content can be categorized easily for some variables, but not for other variables. How does a researcher ensure that variations in degree of difficulty are included in the reliability assessment? As in most applications involving representativeness, the answer is probability sampling, assuring that each unit in the reliability check is selected randomly. Calculating sampling error for reliability tests is possible with probability sampling, but few content analyses address this point.

This study views intercoder reliability as a sampling problem, requiring clarification of the term "population." Content analysis typically refers to a study's "population" as all potentially codable content from which a sample is drawn and analyzed. However, this sample itself becomes a "population" of content units from which a sample of test units is randomly drawn to check reliability. This article suggests content samples need to have reliability estimates representing the population. The resulting sample sizes will permit a known degree of confidence that the agreement in a sample of test units is representative of the pattern that would occur if all study units were coded by all coders.
Background

Reproducibility reliability is the extent to which coding decisions can be replicated by different researchers. In principle, the use of multiple independent coders applying the same rules in the same way assures that categorized content does not represent the bias of one coder. Research methods texts discuss reliability in terms of measurement error resulting from problems in coding instructions, failure of coders to achieve a common frame of reference, and coder mistakes. Few texts or studies address whether the content units tested represent the population of items studied.

Stephen Lacy is a professor in the Michigan State University School of Journalism, and Daniel Riffe is a professor in the E.W. Scripps School of Journalism at Ohio University. The authors thank Fred Fico for his comments and suggestions. J&MC Quarterly, 963-973.


Often, reliability samples have been selected haphazardly or based on convenience (e.g., the first 50 items to be coded might be used).

Research texts vary in their approach to sampling for reliability tests. Weber's only pertinent recommendation is that "The best test of the clarity of category definitions is to code a small sample of the text." Stempel concludes that reliability estimates "should be based on several samples of content from the material in the study" and that a "minimum standard would be the selection of three passages to be coded by all coders." Wimmer and Dominick urge analysts to conduct a pilot study on a sample of the "content universe" and, assuming satisfactory results, then to code the main body of data. Then a subsample, "probably between 10% and 25%," should be reanalyzed by independent coders to calculate overall intercoder reliability. Kaid and Wadsworth suggest that "levels of reliability should be assessed initially on a subsample of the total sample to be analyzed before proceeding with the actual coding." How large a subsample? "When a very large sample is involved, a subsample of 5-7 percent of the total is probably sufficient for assessing reliability."

Most texts do not discuss reliability in the context of probability sampling and the resulting sampling error. Singletary has noted that reliability checks introduce sampling error when probability samples are used. Krippendorf argues that probability sampling to get a representative sample is not necessary.

Yet early inquiries into reliability testing did address probability sampling. Scott's article introducing his pi included an equation accounting for sampling error, though that component was dropped from subsequent references to pi in statistics and content analysis texts. Cohen discussed sampling error while introducing kappa. An early article by Janis, Fadner, and Janowitz comparing reliability of different coding schemes provided reliability coefficients with confidence intervals.

Schutz dealt with measurement error and sample size. He explored the impact of "chance agreement" on reliability measures: i.e., some coder agreements could occur by chance, though the existence of coding criteria reduces the influence chance could have. Schutz offered a formula that enabled a researcher to set a minimal acceptable level of reliability and then compute the level that must be achieved in a reliability test to account for chance agreements. The formula allows the researcher to be certain that the observed sample reliability level is high enough that, even if chance agreement could be eliminated, the "remainder" level of agreement would exceed the acceptable level. For example, if the minimal acceptable level of agreement is 80%, the researcher might need to achieve a level as high as 83% in the reliability test in order to control for chance agreement. Schutz incorporated sampling error into his formula, but the article concentrated on measurement error due to chance agreement.

Sampling Error and Estimating Sample Size: A Formula

The goal of the following analysis is to generate a formula for estimating simple random sample sizes for reliability tests. The formula can be used to generate samples with confidence intervals that tell researchers if the minimal acceptable reliability figure has been achieved. For example, if the reliability coefficient must equal or exceed .80 to be acceptable, the sample used must have a confidence interval that does not dip below .80. If, in a given test, the confidence interval does dip below .80, the researcher cannot conclude that the "true" reliability of the population equals or exceeds the minimal acceptable level.
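The acceptance rule described in the preceding paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function name `reliability_is_acceptable` and the convention of passing whole percentage points are assumptions made here, and the interval width is simply supplied rather than derived.

```python
def reliability_is_acceptable(sample_agreement, interval, minimum):
    """Return True when the lower end of the confidence interval
    stays at or above the minimal acceptable level.

    All arguments are whole percentage points (a hypothetical convention
    that keeps the comparison free of floating-point rounding).
    """
    lower_bound = sample_agreement - interval  # negative side of the interval
    return lower_bound >= minimum


# 90% sample agreement with a 5-point interval against an 80% minimum
print(reliability_is_acceptable(90, 5, 80))  # True

# 84% sample agreement: the interval dips to 79%, below the minimum
print(reliability_is_acceptable(84, 5, 80))  # False
```

Only the lower bound is checked because acceptance rests on a minimal standard; how far the population figure might exceed the sample figure does not change the decision.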

FIGURE 1. Why Reliability Confidence Interval Uses a One-Tailed Test
[Figure: a continuum for level of agreement in coding decisions, running from 0% to 100%, that marks the minimal acceptable level (80%) and a confidence interval of -5% to +5% around the sample level of agreement. One region is labeled "Relevant area for determining acceptability of reliability test."]

The reason for a one-tailed confidence interval is illustrated in Figure 1. The minimal acceptable agreement level is 80%, and the sample level of agreement is 90%. The resulting area of concern is the gray area between 90% and 80%, which involves the negative side of the confidence interval. A researcher's conclusion of acceptable reliability is not affected by whether the population agreement exceeds 5% on the positive side because acceptance is based on a minimal standard, which would fall on the negative side of the interval.

For simplicity, this analysis uses "simple agreement" (total agreements divided by total decisions) with a dichotomous decision (the coders either agree or disagree).

Survey researchers use the formula for standard error of proportion to estimate a minimal sample size necessary to infer to the population at a given level of confidence. A similar procedure is used here. We start with the equation for the standard error of proportion and add the finite population correction (FPC). The FPC is used when the sample makes up 10% or more of the population. It reduces the standard error but is often ignored because it has little impact when a sample has a small proportion of the population. The resulting formula is:

$$SE = \sqrt{\frac{PQ}{n-1}} \times \sqrt{\frac{N-n}{N-1}} \qquad \text{(Equation 1)}$$

But with the radical removed and the distributive property applied, the formula becomes:

$$n = \frac{(N-1)(SE)^2 + PQN}{(N-1)(SE)^2 + PQ} \qquad \text{(Equation 2)}$$

Where N = the population size (number of content units in the study), P = the population level of agreement, Q = (1 - P), and n = the sample size for the reliability check.
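Equations 1 and 2 translate directly into code. The sketch below is illustrative only: the function names and the sample inputs (P = .90, N = 1,000, SE = .03) are assumptions chosen for the example, and the round trip at the end simply checks that the n produced by Equation 2 returns the same SE when substituted back into Equation 1.

```python
import math


def standard_error(p, n, N):
    """Equation 1: standard error of a proportion with the
    finite population correction (FPC)."""
    q = 1.0 - p
    return math.sqrt(p * q / (n - 1)) * math.sqrt((N - n) / (N - 1))


def sample_size(p, N, se):
    """Equation 2: number of test units n for a target standard error."""
    q = 1.0 - p
    se_term = (N - 1) * se ** 2
    return (se_term + p * q * N) / (se_term + p * q)


# Illustrative inputs (assumed for this example)
n = sample_size(p=0.90, N=1000, se=0.03)
print(round(n, 1))                                # 91.9

# Round trip: plugging n back into Equation 1 recovers the target SE
print(round(standard_error(0.90, n, 1000), 4))    # 0.03
```

Because Equation 2 is only an algebraic rearrangement of Equation 1, the round trip should agree to within floating-point error for any valid inputs.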

Equation 2 allows the researcher to solve for n, which represents the number of test units. In order to solve for n, the researcher must follow five steps:

Step 1. The first step is to determine N (the number of content units being studied). It usually has been determined before reaching the point of checking for the reliability of the instrument.

Step 2. The researcher must set a minimal level of intercoder reliability for the test units. Content analysis texts warn that an acceptable level of intercoder reliability should reflect the nature and difficulty of categories and content. For example, a minimum level of 80% simple agreement is often used with new coding procedures, a level consistent with minimal requirement recommendations by Krippendorf and the analysis of Schutz. But this level is lower than recommended by others.

Step 3. The level of agreement in coding all study units (P) must be estimated. This is the level of agreement among all coders if they coded every content unit in the study. This step is the most difficult step because it involves estimating the unknown population reliability figure. Two approaches are possible. The first is to estimate P based on a pretest of the coding instrument and on previous research. The second is to assume a P that exceeds the minimal acceptable reliability figure by a certain level. The second approach creates the question: How many percentage points above the minimal reliability level should P be? For this analysis, it will be assumed that the population level should be set at 5 percentage points above the minimal acceptable level of agreement. Five percentage points is useful because it is consistent with a confidence interval of 5%. For example, if the minimal acceptable reliability figure is .8, then the assumed P would be .85. If the reliability figure equals or exceeds .85, chances are 95 out of 100 that the population (content units in the study) figure equals or exceeds .80.

Step 4. The researcher must determine the acceptable level of probability for estimating the confidence interval. We assume most content analysts will use the same levels of probability for the sampling error in intercoder reliability checks as are used with most sampling error estimates, i.e., the 95% (p=.05) and 99% (p=.01) levels of probability.

Step 5. Once the acceptable probability level is determined, the formula for confidence intervals is used to calculate the standard error (SE). The formula is:

$$\text{Confidence interval} = Z\,(SE) \qquad \text{(Equation 3)}$$

Z is the standardized point on the normal curve that corresponds with the acceptable level of probability.

Once the five steps have been taken, the resulting figures are plugged into Equation 2 and the number of units needed for the reliability test is determined.

A Simulation

Assume an acceptable minimal level of agreement of 85% and P of 90% in a study using 1,000 content units (e.g., newspaper stories). The desired level of certainty is the traditional .05 level. Using the normal curve, we find that the one-tailed Z-score associated with .05 is 1.64. Then we solve for standard error (SE), using the formula:

$$\text{Confidence interval} = Z\,(SE) \qquad \text{(Equation 3)}$$

Our example confidence interval is 5% and our desired level of probability is 95%. So,

$$.05 = 1.64\,(SE) \quad \text{or} \quad SE = .05/1.64 = .03$$

Recall that our formula for sample size begins with SE,

$$SE = \sqrt{\frac{PQ}{n-1}} \times \sqrt{\frac{N-n}{N-1}} \qquad \text{(Equation 1)}$$

and becomes

$$n = \frac{(N-1)(SE)^2 + PQN}{(N-1)(SE)^2 + PQ} \qquad \text{(Equation 2)}$$

Now we can plug in our numbers and determine how large a random sample we will need to achieve at minimum the standard 85% reliability agreement. Our confidence interval is .05, and the resulting SE at the p = .05 confidence level was .03, squared to .0009. PQ is .90 (.10) or .09. So Equation 2 looks like

$$n = \frac{(999)(.0009) + .09(1000)}{(999)(.0009) + .09} = \frac{90.899}{.989} = 91.9$$

In other words, if we achieve at least 90% agreement in a simple random sample of 92 test units (rounded from 91.9) taken from 1,000 study units, chances are 95 out of 100 that 85% or better agreement would exist if all study units were coded by all coders and reliability measured.

Table 1 solves Equation 2 for n with three hypothetical levels of P (85%, 90%, and 95%) and with numbers of study units equal to 100, 250, 500, 1,000, 5,000, and 10,000. The sample sizes are based on a confidence interval with 95% probability. The table demonstrates how higher P levels and smaller numbers of study units affect the test units needed. Thus, with 1,000 study units and an assumed true agreement level of 90%, 92 test units are needed. However, the number of test units needed decreases much faster with higher levels of P than with the decline in the number of study units.

Table 2 assumes the same agreement levels as Table 1. Table 2 presents numbers of test units for the 99% level of probability. The figures for a given number of study units and agreement level are higher in Table 2 because they represent the increased number of test units needed to reach the higher level of probability.

The main problem in determining an appropriate sample of test units is estimating the level of P. The higher the assumed percentage, the smaller will be the sample. This might produce an incentive to overestimate this level because it would reduce the amount of work in the reliability test. Assuming a study unit level of 5 percentage points above the minimal level will control for this incentive because the higher the assumed level, the higher will be the minimal acceptable level of reliability.
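The simulation can be reproduced in a few lines of Python. This is a sketch under stated assumptions, not the authors' code: the function name `reliability_sample_size` and its parameters are hypothetical, and the `round_se` switch exists only to show that the figure of 91.9 depends on rounding SE to .03 before squaring. `statistics.NormalDist` from the Python standard library supplies the one-tailed Z-score.

```python
import math
from statistics import NormalDist


def reliability_sample_size(N, P, interval=0.05, z=1.64, round_se=True):
    """Solve Equation 2 for n from N, the assumed P, the interval, and Z."""
    se = interval / z              # Equation 3 rearranged: SE = interval / Z
    if round_se:
        se = round(se, 2)          # the worked example rounds SE to .03
    pq = P * (1.0 - P)
    se_term = (N - 1) * se ** 2
    return (se_term + pq * N) / (se_term + pq)


# One-tailed Z for the .05 level; the text uses the rounded value 1.64
z_95 = NormalDist().inv_cdf(0.95)
print(round(z_95, 3))                               # 1.645

# The simulation: 1,000 study units, assumed P of 90%, 5-point interval
n_article = reliability_sample_size(N=1000, P=0.90)
print(round(n_article, 1), math.ceil(n_article))    # 91.9 92

# Same inputs, but without rounding SE before squaring
n_unrounded = reliability_sample_size(N=1000, P=0.90, round_se=False)
print(round(n_unrounded, 1))                        # 89.3
```

With the unrounded SE (.05/1.64, about .0305) the same inputs give a little over 89 test units, so the rounded figure of 92 is the more conservative of the two.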

A problem can occur if the level of agreement in the test units generates a confidence interval that does dip below the minimal acceptable level of reliability. For example, if the test units' reliability level equals .86, the lower end of the confidence interval (.86 minus .05, or .81) dips below the minimal acceptable level of .85. This indicates that the reliability figure for the population of study units might not exceed the acceptable level of .85. Under this condition, the researcher could randomly select more content units for the reliability check or accept a lower minimal level of agreement, say .80. If the first approach is used, the larger sample size can be determined by plugging the test units' reliability level (.86) into Equation 2 as P. Additional units could be randomly selected and added to the original test units to calculate a new reliability figure and confidence interval based on a larger sample.

TABLE 1
Number of Content Units Needed for Reliability Test, Based on Various Population Sizes, Three Assumed Levels of Population Intercoder Agreement, and a 95% Level of Probability

                     Assumed Level of Agreement in Population (Study Units)
Population Size
(Study Units)           85%      90%      95%
10,000                  141      100       54
5,000                   139       99       54
1,000                   125       92       52
500                     111       84       49
250                      91       72       45
100                      59       51       36

Note: The numbers are taken from the equation for standard error of proportions and are adjusted with the finite population adjustment. The equation is $SE = \sqrt{PQ/(n-1)} \times \sqrt{(N-n)/(N-1)}$, where P = percentage of agreement in population, Q = (1 - P), N = the population size, and n = the sample size. The standard error was used to find a sample size that would have sampling error equal to or less than 5% for the assumed population level of agreement.

TABLE 2
Number of Content Units Needed for Reliability Test, Based on Various Population Sizes, Three Assumed Levels of Population Intercoder Agreement, and a 99% Level of Probability

                     Assumed Level of Agreement in Population (Study Units)
Population Size
(Study Units)           85%      90%      95%
10,000                  271      193      104
5,000                   263      190      103
1,000                   218      165       95
500                     179      142       87
250                     132      111       75
100                      74       67       52

Note: The numbers are taken from the equation for standard error of proportions and are adjusted with the finite population adjustment. The equation is $SE = \sqrt{PQ/(n-1)} \times \sqrt{(N-n)/(N-1)}$, where P = percentage of agreement in population, Q = (1 - P), N = the population size, and n = the sample size. The standard error was used to find a sample size that would have sampling error equal to or less than 5% for the assumed population level of agreement.

Limitations of the Analysis

This analysis may seem limited because it is: (a) based on a dichotomous decision, (b) with two coders, and (c) it uses a simple agreement measure of reliability. However, the first two are not limitations. Sampling error is not affected by the number of coders, who introduce measurement error after the reliability sample is selected. Neither is using a dichotomous decision a problem. Equation 2 would easily fit nominal content with more than two categories. However, the impact of more complex coding schemes might affect the representativeness of a reliability sample if some of the categories occur infrequently. These infrequent categories have less likelihood of being in the sample, which means the full range of categories has not been tested. If this is the case, the researcher should randomly stratify the test units, select a larger number of test units, or both, as discussed in note 11.

The use of simple agreement in a reliability test is not a problem either. At least three other measures of reliability, besides agreement among coding pairs, are available for nominal level data. These are Scott's pi, Krippendorf's alpha, and Cohen's kappa. Several discussions of the relative advantages and disadvantages of these measures are available. These three measures were developed to deal with measurement error due to chance and not with error introduced through sampling. The representativeness of a sample of test units is not dependent on the test applied. Equation 2 is limited, however, to nominal data because it is based on the standard error of proportions. A parallel analysis to this one for interval and ratio level categories could be developed using the standard error of means.

Using the Tables

Some beginning researchers might struggle with the task of making assumptions and solving the equations. However, the two tables can be useful for selecting a sample of test units to establish equivalence reliability. First, the researcher should start by selecting the level of probability appropriate for the study. If 95%, use Table 1; if 99%, use Table 2. Second, if the variables are straightforward counting measures, such as source of newspaper stories, take the assumed agreement level among study units to be 90%. If the variables involve coding meanings of content, such as political leaning of news stories, take the assumed agreement level of 85% among study units.

Third, find the population size in the tables that is closest to but greater than the size of the study units being analyzed. Take the number of test units from the table.

For example, a researcher studying coverage of economic news in network newscasts has 425 stories from 40 newscasts selected from the previous year. Variables involve numbers of stories devoted to various types of economic news. Accepting a confidence level of 95%, the researcher would look down the 90% level of agreement column in Table 1 until she or he came to a population size of 500 (the closest population size that is greater than 425). The number of units needed for the reliability check equals 84.

The analysis in this article is based on simple random sampling for reliability tests. However, under some circumstances, other forms of probability sampling, such as stratified random sampling, might be preferable for selecting reliability test samples. For example, if certain categories of a variable may make up a small proportion of the content units being studied, the researcher might oversample these categories. The formula used here is the unbiased estimator for simple random samples; samples based on proportion or stratification will require adjustments available in many statistics books.

When reporting reliability level, confidence intervals should be reported with both measures of reliability. Simple agreement confidence intervals can be calculated using the standard error of proportions. The confidence intervals for Scott's pi and Cohen's kappa can be calculated by referring to the formulas presented in the original articles for these coefficients. Of course, for sampling error to have meaning, the sample must be a probability sample.

An inevitable question from graduate students conducting their first content analysis is how many items to use in the intercoder reliability test. This article has attempted to answer this question and to suggest a procedure for estimating sampling error in reliability samples. The role of selection bias in determining reliability coefficients seems to have gotten lost since earlier explorations of reliability. The study of content needs a more rigorous way of dealing with potential selection bias. This bias can only be estimated through probability sampling. Using probability samples and confidence intervals for reliability figures would help add rigor.

NOTES

2. The term reliability is used here to refer to reproducibility. Reproducibility reliability, also called equivalence reliability, differs from stability and accuracy reliability. Stability concerns the same coder testing reliability of the same content at two points in time. Accuracy reliability involves comparing coding results with some known standard. See Klaus Krippendorf, Content Analysis: An Introduction to Its Methodology (Beverly Hills, CA: Sage, 1980), 130-32.
3. Guido H. Stempel III, "Content Analysis," in Research Methods in Mass Communication, ed. Guido H. Stempel III and Bruce H. Westley (Englewood Cliffs, NJ: Prentice-Hall, Inc., 1981), 127.
4. Stephen Lacy and Daniel Riffe, "Sins of Omission and Commission in Mass Communication Quantitative Research," Journalism Quarterly 70 (spring 1993): 126-32.

5. Robert Philip Weber, Basic Content Analysis, 2d ed. (Newbury Park, CA: Sage University Paper Series on Quantitative Applications in the Social Sciences, 07-075), 23.
6. Stempel, "Content Analysis," 128.
7. Guido H. Stempel III, "Statistical Designs for Content Analysis," in Research Methods in Mass Communication, ed. Stempel and Westley, 143.
8. Roger D. Wimmer and Joseph R. Dominick, Mass Media Research: An Introduction, 3d ed. (Belmont, CA: Wadsworth, 1991), 173.
9. Lynda Lee Kaid and Anne Johnston Wadsworth, "Content Analysis," in Measurement of Communication Behavior, ed. Philip Emmert and Larry L. Barker (NY: Longman, 1989), 208.
10. Michael Singletary, Mass Communication Research (NY: Longman, 1994), 297.
11. See Krippendorf, Content Analysis, 146. Krippendorf argues that reliability samples "need not be representative of the population characteristics" but "must be representative of all distinctions made within the sample of data at hand" (emphasis in original). He suggests purposive or stratified sampling to ensure that "all categories of analysis, all decisions specified by various forms of instructions, are indeed represented in the reliability data regardless of how frequently they may occur in the actual data" (emphasis added). This procedure might create problems when content has infrequent categories that are difficult to identify. It could require quota sampling, or selecting and checking content units for these infrequent categories until a proportion of the test units equals the estimated proportion of the infrequent categories. No one would argue that all variables need to be tested in a reliability check, but a large number of categories within a variable (e.g., a twenty-six-category scheme for coding the variable "news topic") could create logistical problems. Just generating a stratified reliability sample that would include sufficient numbers of units for each of these categories would be time consuming and difficult. Some would question whether the logistical problems outweigh the potential impact of such a "micro" measure of reliability on the overall validity of the data. If a researcher suspects that some variable categories will occur infrequently in a simple random sample for a reliability check, disproportionate sampling of the less frequent categories would be useful. Frequency of categories could be estimated by a pretest and different sampling rates could be used for categories that appear less frequently. When figuring overall agreement for reliability, the results for particular categories would have to be weighted to reflect the proportions in the study units. Another way of handling infrequent categories would be to increase the reliability test sample size above the minimum recommended here. Larger samples will increase the probability of including infrequent categories among the test units. If the larger sample does not include sufficient numbers of the infrequent categories, additional units can be selected. This will, of course, lead to coding of additional units from categories that appear frequently, but the resulting reliability figure will be more representative of content units being studied.
12. William A. Scott, "Reliability of Content Analysis: The Case of Nominal Scale Coding," Public Opinion Quarterly 19 (fall 1955): 321-25.
13. J. Cohen, "A Coefficient of Agreement for Nominal Scales," Educational and Psychological Measurement 20 (1960): 37-46.

14. Irving L. Janis, Raymond H. Fadner, and Morris Janowitz, "The Reliability of a Content Analysis Technique," Public Opinion Quarterly 7 (summer 1943): 293-96.
15. William C. Schutz, "Reliability, Ambiguity and Content Analysis," Psychological Review 59 (1952): 119-29.
16. Presumably, these chance agreements could lead content analysts to overestimate the extent of coder agreement due to the precision of the coding instrument. In effect, Schutz sought a way to control for the effect of those chance agreements. But just because chance could affect reliability does not mean it does, and the "odds" of agreement through randomness change once a coding criterion is introduced and used.
17. But of course it can't. Its effect can only be acknowledged and compensated for.
18. Strictly interpreted, N equals the number of coding decisions that will be made by each coder. If reliability is checked separately for each coding category in the content analysis, then N equals the total number of units selected for the content analysis. If the reliability is checked for total decisions made in using the coding procedure, N equals the number of units analyzed multiplied by the number of categories being used. This analysis assumes that each variable is checked and reported separately, which means N equals the number of content units in the population.
19. This advice, while sound, adds a bothersome vagueness to content analysis. This is a bit like a professor's response that the length of an essay should be "as long as it takes." How long is a piece of string?
20. Krippendorf, Content Analysis, recommends generally using the .8 level for intercoder reliability, although he says some data with reliability figures as low as .67 could be reported for highly speculative conclusions. It is not clear whether Krippendorf's agreement level figures are for simple agreement among coders or for some other reliability measure. Schutz's ("Reliability, Ambiguity and Content Analysis") analysis starts with the .8 level of simple agreement. This analysis will use .80 to remain consistent with Schutz.
21. Wimmer and Dominick (Mass Media Research, 181) report a rule of thumb of at least .9 for simple agreement and a Scott's pi or Krippendorf's alpha of .75 for intercoder reliability. See Singletary (Mass Communication Research, 296), who states that a Scott's pi of .7 is the consensus value for the statistic. Under some conditions this would be consistent with a simple agreement of .8.
22. Note that this is a one-tailed test. Content analysis researchers are concerned that the reliability figure exceeds a minimal level, which would be on the negative side of a confidence interval. The acceptance of a coding instrument as reliable is not affected by whether the population reliability figure exceeds the reliability test figure on the positive side of the confidence interval.
23. Three factors affect sampling error: the size of the sample, the homogeneity of the population, and the proportion of the population in the sample. The last factor has little impact unless the proportion is large. As Table 1 shows, population size reduces the number of test units noticeably when the number of study units falls under 1,000.
24. Multiple-category variables differ from dichotomous variables because multiple categories are not independent of each other. However, this lack of independence is a bias in coding and not in the selection of units for a reliability test.

25. Scott, "Reliability of Content Analysis."
26. Krippendorf, Content Analysis.
27. Cohen, "Coefficient of Agreement for Nominal Scales."
28. For examples, see Marie Adele Hughes and Dennis E. Garrett, "Intercoder Reliability Estimation Approaches in Marketing: A Generalizability Theory Framework for Quantitative Data," Journal of Marketing Research 27 (May 1990): 185-195; and Richard H. Kolbe and Melissa S. Burnett, "Content-Analysis Research: An Examination of Applications with Directives for Improving Research Reliability and Objectivity," Journal of Consumer Research 18 (September 1991): 243-250.
29. Coding simple content, such as numbers of stories, typically yields higher levels of reliability because cues for coding are more explicit. The population agreement will be higher than for coding schemes that deal with word meanings. A lower reliability figure is an acceptable trade-off for studying categories that concern meaning.
30. For example, see C. A. Moser and G. Kalton, Survey Methods in Social Investigation, 2d ed. (NY: Basic Books, 1972).

Copyright of Journalism & Mass Communication Quarterly is the property of Association for Education in Journalism & Mass Communication and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.