
Psychological Bulletin © 2017 American Psychological Association
2017, Vol. 143, No. 7, 775-782 0033-2909/17/$12.00 http://dx.doi.org/10.1037/bul0000112

REPLY

Violent Video Game Effects Remain a Societal Concern:
Reply to Hilgard, Engelhardt, and Rouder (2017)

Sven Kepes
Virginia Commonwealth University

Brad J. Bushman
Ohio State University

Craig A. Anderson
Iowa State University

This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
This document is copyrighted by the American Psychological Association or one of its allied publishers.

A large meta-analysis by Anderson et al. (2010) found that violent video games increased aggressive
thoughts, angry feelings, physiological arousal, and aggressive behavior and decreased empathic
feelings and helping behavior. Hilgard, Engelhardt, and Rouder (2017) reanalyzed the data of
Anderson et al. (2010) using newer publication bias methods (i.e., precision-effect test, precision-
effect estimate with standard error, p-uniform, p-curve). Based on their reanalysis, Hilgard, Engel-
hardt, and Rouder concluded that experimental studies examining the effect of violent video games
on aggressive affect and aggressive behavior may be contaminated by publication bias, and these
effects are very small when corrected for publication bias. However, the newer methods Hilgard,
Engelhardt, and Rouder used may not be the most appropriate. Because publication bias is a potential
problem in any scientific domain, we used a comprehensive sensitivity analysis battery to examine
the influence of publication bias and outliers on the experimental effects reported by Anderson et al.
We used best meta-analytic practices and the triangulation approach to locate the likely position of
the true mean effect size estimates. Using this methodological approach, we found that the combined
adverse effects of outliers and publication bias were less severe than what Hilgard, Engelhardt, and
Rouder found for publication bias alone. Moreover, the obtained mean effects using recommended
methods and practices were not very small in size. The results of the methods used by Hilgard,
Engelhardt, and Rouder tended to not converge well with the results of the methods we used,
indicating potentially poor performance. We therefore conclude that violent video game effects
should remain a societal concern.

Keywords: violent video games, aggression, meta-analysis, publication bias, outliers

Supplemental materials: http://dx.doi.org/10.1037/bul0000112.supp

Anderson et al. (2010) published a large meta-analysis of 381 effects from violent video game studies involving more than 130,000 participants. They found that violent video games increased aggressive thoughts, angry feelings, physiological arousal, and aggressive behavior, and decreased empathic feelings and helping behavior. Hilgard, Engelhardt, and Rouder (2017) reanalyzed the data of Anderson et al. on experimental effects of violent-game exposure on aggressive affect, aggressive behavior, aggressive cognitions, and physiological arousal, as well as correlations between violent game play and aggressive affect, behavior, and cognitions in cross-sectional studies. Hilgard et al. (2017) examined a total of 13 meta-analytic distributions (see their Table 3). For the most part, there is agreement between the mean estimates of Hilgard, Engelhardt, and Rouder and Anderson et al., although Hilgard, Engelhardt, and Rouder concluded that the estimates of Anderson et al. of the experimental effects of violent video games on aggressive behavior and aggressive affect should be adjusted downward. Their conclusions are based on several relatively new publication bias methods, including the precision-effect test (PET), precision-effect estimate with standard error (PEESE), p-uniform, and p-curve.

In this response, we follow a two-pronged approach. First, we provide a brief critique of the methods Hilgard et al. (2017) used. Second, given the shortcomings highlighted in our critique and taking a strong inference approach (Platt, 1964), we reanalyze the experimental data with additional recommended statistical techniques to determine with greater confidence whether Anderson et al.'s (2010) conclusions need to be altered.

Sven Kepes, Department of Management, School of Business, Virginia Commonwealth University; Brad J. Bushman, School of Communication and Department of Psychology, Ohio State University; Craig A. Anderson, Department of Psychology, Iowa State University.
Correspondence concerning this article should be addressed to Brad J. Bushman, School of Communication, Ohio State University, 3016 Derby Hall, 154 North Oval Mall, Columbus, OH 43210. E-mail: bushman.20@osu.edu


Hilgard et al.'s (2017) Methodological and Statistical Approach

Hilgard et al. (2017) suggest that trim and fill, the publication bias assessment method Anderson et al. (2010) used, is "best viewed as a sensitivity analysis rather than a serious estimate of the unbiased [meta-analytic] effect size" (p. 760). In turn, they imply that their publication bias assessment methods are not sensitivity analyses and should be viewed as more serious because they provide an accurate for-bias-adjusted mean estimate. Such an implication is misleading because all methods that assess the robustness of a naive meta-analytic mean estimate should be viewed as sensitivity analyses (Kepes, McDaniel, Brannick, & Banks, 2013). By naive we mean the meta-analytic mean effect without any adjustment for potential biases (Copas & Shi, 2000).

Sensitivity analyses examine the degree to which the results of a naive meta-analysis remain stable when conditions of the data or the analysis change (Greenhouse & Iyengar, 2009). We know of no valid method that can provide a for-bias-adjusted mean estimate of the true underlying population effect size. Instead, sensitivity analyses tend to estimate the degree to which a naive meta-analytic mean may be adversely affected by publication and/or other biases. Furthermore, it is important to note that all methods become less stable with small distributions. In fact, most publication bias assessment methods should not be applied to meta-analytic distributions with fewer than 10 samples, including funnel plot- and regression-based methods (Kepes, Banks, McDaniel, & Whetzel, 2012; Sterne et al., 2011).

In addition, Hilgard et al. (2017) focused on one type of sensitivity analysis: publication bias. Yet as Hilgard et al. (2017) noted, heterogeneity can adversely affect the results of publication bias analyses (as well as the results of a naive meta-analysis). Because outliers can be a major source of between-study heterogeneity, they should be considered when examining the potential effects of publication bias (Kepes & McDaniel, 2015). Like publication bias (Kepes et al., 2012; Rothstein, Sutton, & Borenstein, 2005), the effects of outliers tend to lead to upwardly biased mean estimates to the extent that they are on one side of the distribution (Viechtbauer & Cheung, 2010). Furthermore, because between-study heterogeneity due to outliers can be mistakenly attributed to publication bias, a comprehensive assessment of the influence of publication bias should also include a thorough assessment of outliers or otherwise influential data points (Kepes & McDaniel, 2015). In other words, to obtain precise and robust estimates regarding the potential presence of publication bias, one should account for outliers when conducting publication bias analyses.

Unfortunately, Hilgard et al. (2017) used only leave-one-out (i.e., one-sample-removed) analyses to identify outliers. In this type of sensitivity analysis, the influence of each individual sample on the naive mean is assessed. This approach poses two problems. First, no consideration is given to the possibility that more than one outlier has adverse effects on the naive meta-analytic mean estimates. Second, it is unclear what criteria Hilgard, Engelhardt, and Rouder used when determining whether a particular sample should be left out or excluded from subsequent analyses.

Taken together, although Hilgard et al. (2017) presented their reanalysis of Anderson et al.'s (2010) meta-analytic data set as the most up-to-date and comprehensive reanalysis possible, it is not without its own shortcomings. Albeit more sophisticated than Anderson et al.'s original analysis, their assertion is not necessarily correct. We believe the most sophisticated analysis uses best meta-analytic practices (e.g., Kepes & McDaniel, 2015; Kepes et al., 2013; Viechtbauer & Cheung, 2010) and the triangulation approach (Jick, 1979) to locate the likely position of the true mean effect size estimate using a comprehensive sensitivity analysis battery (Kepes et al., 2012). We use this more comprehensive approach to determine whether the results reported by Hilgard et al. (2017) or by Anderson et al. (2010) are more accurate. However, before we proceed to reanalyzing the data, we briefly review the publication bias methods used by Hilgard, Engelhardt, and Rouder.

PET-PEESE

The PET-PEESE (Stanley & Doucouliagos, 2014) approach to publication bias is a combination of two weighted regression models. As Hilgard et al. (2017) stated, PET "extrapolates from the available data to estimate what the effect would be in a hypothetical study with perfect precision" (p. 760). PEESE works in a similar manner, except that precision is modeled as a quadratic function instead of a linear function. Both PET and PEESE may incorporate multiple moderator variables, although Hilgard, Engelhardt, and Rouder did not use them in that way. Furthermore, both PET and PEESE are modified versions of Egger's test of the intercept and, as such, some of the shortcomings associated with the Egger test (Moreno et al., 2009; Stanley & Doucouliagos, 2014; Sterne & Egger, 2005) may also apply to PET and/or PEESE.

PET is known to underestimate the size of nonzero effects (Stanley & Doucouliagos, 2007), and PEESE can yield inaccurate results the closer the true mean effect size is to zero (Stanley & Doucouliagos, 2012), which is why Stanley and Doucouliagos (2014) outlined conditional decision rules to determine which of the two models should be used to assess the potential presence of publication bias (see also Kepes & McDaniel, 2015; van Elk et al., 2015). In a reanalysis of data regarding the predictive validity of conscientiousness, Kepes and McDaniel (2015) found that their PET-PEESE results converged relatively well with the results of a battery of other publication bias assessment methods, indicating that the method tended to perform quite well with real data. More recently, Stanley and Doucouliagos (2017) conducted a simulation and concluded that PET-PEESE properly accounts for heterogeneity and performs quite well, although another simulation study found that variants related to PET and PEESE did not perform well (Moreno et al., 2009). Therefore, there is somewhat contradictory evidence regarding the performance of PET-PEESE.

P-uniform

The p-uniform method is essentially a selection model (McShane, Böckenholt, & Hansen, 2016) that uses only significant studies to estimate the true effect using a fixed-effects model. The developers explicitly stated that it is not applicable in the presence of between-study heterogeneity (van Assen, van Aert, & Wicherts, 2015). In support of this view, p-uniform exhibited very low convergence rates with other publication bias assessment methods when using real data (Kepes & McDaniel, 2015), probably because of its sensitivity to heterogeneity. More recently, a comprehensive
simulation study highlighted p-uniform's poor performance in "realistic" settings, which have been defined as settings with "flexible publication rules and heterogeneous effect" as opposed to "restrictive" settings, which involve "rigid publication rules and homogeneous effect sizes" (McShane et al., 2016, p. 731). More traditional selection models that use the complete data when estimating the adjusted mean effect (e.g., Hedges & Vevea, 2005) should be used instead because they tend to perform better (McShane et al., 2016).

P-Curve

Like p-uniform, the p-curve method uses only significant studies to estimate an overall mean effect. Therefore, as with p-uniform, for the p-curve method to work, the nonsignificant studies have to be estimating the same overall mean effect as the significant studies, and typically that is not the case when there is between-study heterogeneity (as there is in virtually all real data in the social sciences). Indeed, when the developers of the p-curve method tested it against a gold standard of replications of 13 effects across 36 laboratories, they focused on the effects that proved homogeneous across the laboratories, for exactly this reason (Simonsohn, Nelson, & Simmons, 2014). Not surprisingly, as with p-uniform, McShane et al.'s (2016) simulation study found that p-curve did not perform well in realistic settings and concluded that traditional selection models (e.g., Hedges & Vevea, 2005) are more appropriate for assessing the potential presence of publication bias in meta-analytic studies.

Summary

Although Hilgard et al. (2017) used more recently developed publication bias methods than Anderson et al. (2010) did, past research has shown that several of their methods tend to perform poorly when applied to real data. It is therefore questionable whether the methods Hilgard, Engelhardt, and Rouder used to assess publication bias perform better than the trim-and-fill method used by Anderson et al. (2010). Thus, Hilgard, Engelhardt, and Rouder's obtained results and conclusions could be erroneous, as could Anderson et al.'s results, especially because neither set of authors used a comprehensive approach to account for outlier-induced between-study heterogeneity, which can adversely affect naive meta-analytic estimates and publication bias results (Kepes & McDaniel, 2015; Viechtbauer & Cheung, 2010).

Our Methodological and Statistical Approach

We implemented a comprehensive battery of sensitivity analyses using the R programming language and the metafor (Viechtbauer, 2015) and meta (Schwarzer, 2015) packages. Following best-practice recommendations (Kepes et al., 2012; Kepes & McDaniel, 2015; Rothstein et al., 2005; Viechtbauer & Cheung, 2010), we used trim-and-fill (Duval, 2005), cumulative meta-analysis (Kepes et al., 2012), selection models (Vevea & Woods, 2005), the one-sample removed analysis (Borenstein, Hedges, Higgins, & Rothstein, 2009), and a battery of multivariate influence diagnostics (Viechtbauer, 2015; Viechtbauer & Cheung, 2010). Given that Hilgard et al. (2017) based their conclusions to a large extent on the results from their PET and PEESE analyses, we included them as well (Stanley & Doucouliagos, 2014). Furthermore, there is value in assessing the level of convergence between PET-PEESE and other, more established methods (e.g., trim-and-fill, selection models), especially because of the newness of the method. However, following the recommendations by Stanley and Doucouliagos (2014), we use the conditional PET-PEESE model and report only the appropriate estimate of the respective mean effect.

With regard to trim and fill, we use the recommended fixed-effects (FE) model with the L0 estimator (Kepes et al., 2012). To address some of the legitimate criticisms of the trim-and-fill method, we also use the random-effects (RE) model with the same estimator to assess the robustness of the results from the FE model (Moreno et al., 2009). In addition to the general cumulative meta-analysis by precision, which typically gets plotted in a forest plot (see Kepes et al., 2012), we also present the cumulative meta-analytic mean of the five most precise effect sizes (i.e., the effect sizes from the five largest primary studies; for a similar approach, see Stanley, Jarrell, & Doucouliagos, 2010). This method helps shed some light on the issue of low statistical power that often plagues social science studies. For the selection models, we use a priori models (e.g., Hedges & Vevea, 2005) with recommended p value cut points to model moderate and severe instances of publication bias (Vevea & Woods, 2005).

Our comprehensive approach involved five steps. First, we performed a naive meta-analysis for each relevant subsample of studies on violent video games. Second, we applied our comprehensive battery of publication bias analyses. Third, we assessed the potential presence of outliers using a battery of multidimensional, multivariate influence diagnostics (Viechtbauer, 2015; Viechtbauer & Cheung, 2010). Fourth, we deleted any identified outlier(s) from the meta-analytic distribution and reran all analyses. Hence, all meta-analytic and publication bias analyses were applied to data with and without identified outliers. Fifth, we conducted all analyses with and without the two studies identified by Hilgard et al. (2017, p. 763) as being problematic (i.e., Graybill, Kirsch, & Esselman, 1985; Panee & Ballard, 2002).1 This comprehensive approach allows us to present the possible range of mean effect size estimates instead of relying on a single value, which is aligned with the advantages of the triangulation approach and customer-centric science (Aguinis et al., 2010; Jick, 1979; Kepes et al., 2012). In fact, our comprehensive approach is required or recommended in some areas in the medical and social sciences (American Psychological Association, 2008; Higgins & Green, 2011; Kepes et al., 2013).

Results

The results of our analyses are displayed in Table 1 (the bottom panel displays the results with identified outliers removed). The first three columns report what distribution was analyzed as well as

1 We note that these two studies with the four samples were deleted across study type (e.g., experimental studies, cross-sectional studies, longitudinal studies) and outcome (e.g., aggressive affect, aggressive cognition, aggressive behavior, physiological arousal). Thus, the removal of the two studies did not affect the number of correlations in all meta-analytic distributions equally. In fact, some meta-analytic distributions were completely unaffected by their removal (e.g., aggressive cognition best experiments).
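To make the conditional PET-PEESE decision rule concrete, the following Python sketch illustrates the logic. Our actual analyses were run in R with metafor and meta; this translation is purely illustrative, and its use of a normal approximation for the intercept test (the original procedure uses a t-test) is our simplifying assumption, not the published algorithm.

```python
import numpy as np
from math import erf, sqrt

def _wls(X, y, w):
    # Weighted least squares; returns coefficients and their standard errors.
    XtW = X.T * w
    cov_unscaled = np.linalg.inv(XtW @ X)
    beta = cov_unscaled @ (XtW @ y)
    resid = y - X @ beta
    s2 = (w * resid ** 2).sum() / (len(y) - X.shape[1])
    se_beta = np.sqrt(np.diag(s2 * cov_unscaled))
    return beta, se_beta

def conditional_pet_peese(r, se, alpha=0.05):
    """Illustrative conditional PET-PEESE (after Stanley & Doucouliagos, 2014).
    r: observed effect sizes; se: their standard errors.
    Normal approximation for the one-tailed intercept test is an assumption."""
    r, se = np.asarray(r, float), np.asarray(se, float)
    w = 1.0 / se ** 2
    ones = np.ones_like(r)
    # PET: regress effects on their standard errors (linear term);
    # the intercept is the bias-adjusted mean in a perfectly precise study.
    pet_b, pet_se = _wls(np.column_stack([ones, se]), r, w)
    z = pet_b[0] / pet_se[0]
    p_one_tailed = 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))
    if p_one_tailed < alpha:
        # Evidence of a nonzero effect: switch to PEESE (quadratic in SE).
        peese_b, _ = _wls(np.column_stack([ones, se ** 2]), r, w)
        return peese_b[0]
    return pet_b[0]
```

When the PET intercept is significantly above zero, the function reports the PEESE estimate; otherwise it reports the PET estimate, mirroring the conditional rule described above.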

Table 1
Meta-Analytic and Publication Bias Results for the Anderson et al. (2010) Data Set

Each panel lists, per distribution, the naive meta-analysis columns (Distribution, k, N, ro, 95% CI, 90% PI, Q, I2, τ, osr ro), followed by the publication bias analyses: FE trim and fill (FPS, ik, t&fFE ro, t&fFE 95% CI), RE trim and fill (FPS, ik, t&fRE ro, t&fRE 95% CI), CMA (pr5 ro), selection models (smm ro, sms ro), and PET-PEESE (pp ro).

Original distributions
Aggressive affect
All experiments 37 3,015 .23 .16, .29 .05, .47 111.22 67.63 .16 .20, .24; .23 L 9 .14 .07, .22 0 .23 .16, .29 .08 .19 .13 .34
All experiments (w/o 2 s) 36 2,979 .21 .15, .28 .05, .45 102.30 65.79 .16 .19, .22; .21 L 8 .14 .07, .22 L 7 .15 .08, .22 .08 .18 .13 .33
Best experiments 21 1,454 .33 .25, .41 .09, .54 49.15 59.31 .15 .28, .34; .34 L 6 .25 .15, .34 0 .33 .25, .41 .22 .31 .29 .55
Best experiments (w/o 2 s) 20 1,418 .32 .24, .39 .09, .52 43.82 56.64 .14 .27, .33; .32 L 6 .24 .15, .34 0 .32 .24, .39 .22 .30 .28 .55
Aggressive cognition
All experiments 48 4,289.5 .21 .16, .25 .04, .37 90.00 47.78 .10 .19, .21; .21 0 .21 .16, .25 R 6 .23 .19, .28 .21 .18 .13 .25
All experiments (w/o 2 s) 47 4,173.5 .19 .16, .23 .07, .31 66.31 30.63 .07 .19, .20; .19 0 .19 .16, .23 0 .19 .16, .23 .21 .18 .15 .22
Best experiments 24 2,887 .22 .18, .27 .11, .33 35.11 34.49 .07 .21, .23; .22 L 5 .20 .15, .25 L 5 .20 .15, .25 .23 .21 .20 .19
Best experiments (w/o 2 s) Same results as above
Aggressive behavior
All experiments 45 3,464 .19 .14, .24 .02, .36 79.08 44.36 .10 .18, .20; .19 L 8 .15 .10, .21 L 8 .15 .10, .21 .14 .17 .13 .23
All experiments (w/o 2 s) 44 3,428 .18 .14, .21 .08, .27 52.94 18.78 .06 .17, .18; .18 L 7 .16 .11, .20 L 7 .16 .11, .20 .14 .16 .14 .17
Best experiments 27 2,513 .21 .17, .25 .18, .24 19.41 .0 .0 .20, .23; .21 L 10 .18 .15, .22 L 10 .18 .15, .22 .16 .20 .19 .07
Best experiments (w/o 2 s) Same results as above
Physiological arousal
All experiments 29 1,906 .15 .09, .21 .03, .31 45.48 38.44 .10 .13, .16; .15 L 1 .14 .08, .20 0 .15 .09, .21 .09 .12 .07 .11
All experiments (w/o 2 s) 28 1,870 .15 .09, .21 .02, .31 43.59 38.06 .10 .13, .16; .15 L 3 .13 .06, .20 0 .15 .09, .21 .09 .12 .08 .09
Best experiments 15 969 .20 .10, .29 .05, .42 30.43 53.99 .14 .17, .22; .20 0 .20 .10, .29 0 .20 .10, .29 .19 .16 n/a .27
Best experiments (w/o 2 s) 14 933 .21 .11, .31 .02, .43 27.62 52.93 .14 .18, .24; .21 L 5 .10 .01, .21 0 .21 .11, .31 .19 .18 .11 .23
Distributions without identified outliers
Aggressive affect
All experiments 36 2,985 .20 .14, .25 .0, .38 75.53 53.66 .12 .19, .21; .20 L 8 .14 .08, .20 L 7 .15 .09, .21 .08 .17 .14 .01
All experiments (w/o 2 s) 35 2,949 .19 .13, .24 .0, .36 66.24 48.67 .11 .18, .20; .19 L 7 .14 .09, .20 L 6 .15 .10, .21 .08 .16 .13 .01
Best experiments 20 1,424 .28 .23, .33 .21, .34 20.25 6.15 .03 .27, .29; .28 L 6 .24 .18, .30 L 6 .24 .18, .30 .22 .27 .26 .0
Best experiments (w/o 2 s) 19 1,388 .27 .21, .31 .22, .31 14.35 .0 .0 .26, .28; .27 L 5 .24 .18, .29 L 5 .24 .18, .29 .22 .26 .25 .0
Aggressive cognition

All experiments 46 3,966.5 .19 .15, .22 .08, .29 58.45 23.01 .06 .18, .19; .19 0 .19 .15, .22 0 .19 .15, .22 .18 .17 .15 .20
All experiments (w/o 2 s) 46 3,966.5 .19 .15, .22 .08, .29 58.45 23.01 .06 .18, .19; .19 0 .19 .15, .22 0 .19 .15, .22 .18 .17 .15 .20
Best experiments No outlier(s) identified (see the original distribution for the results)
Best experiments (w/o 2 s) No outlier(s) identified (see the original distribution for the results)
Aggressive behavior
All experiments 43 3,074 .18 .14, .22 .08, .28 51.26 18.07 .06 .18, .19; .18 L 6 .16 .12, .20 L 6 .16 .12, .20 .17 .17 .15 .19
All experiments (w/o 2 s) Same results as above
Best experiments 26 2,159 .23 .19, .27 .19, .26 14.91 .0 .0 .22, .23; .23 L 7 .20 .17, .24 L 7 .20 .17, .24 .18 .22 .21 .18
Best experiments (w/o 2 s) Same results as above
Physiological arousal
All experiments 28 1,872 .13 .08, .18 .02, .24 33.90 20.35 .06 .12, .14; .13 L 2 .12 .06, .18 L 1 .13 .07, .18 .09 .10 .06 .08
All experiments (w/o 2 s) 27 1,836 .13 .08, .19 .02, .24 32.17 19.18 .06 .12, .14; .13 L 2 .13 .07, .18 L 1 .13 .08, .19 .09 .11 .07 .06
Best experiments No outlier(s) identified (see the original distribution for the results)
Best experiments (w/o 2 s) No outlier(s) identified (see the original distribution for the results)

Note. w/o 2 s = without the two studies excluded by Hilgard et al. (2017); k = number of correlation coefficients in the analyzed distribution; N = meta-analytic sample size; ro = random-effects weighted
mean observed correlation; 90% PI = 90% prediction interval; Q = weighted sum of squared deviations from the mean; I2 = ratio of true heterogeneity to total variation; τ = between-sample standard deviation;
osr = one sample removed, including the minimum and maximum effect size and the median weighted mean observed correlation; trim and fill = trim-and-fill analysis; FPS = funnel plot side (i.e., side
of the funnel plot in which samples were imputed; L = left; R = right); ik = number of trim-and-fill samples imputed; t&fFE ro = fixed-effects trim-and-fill adjusted observed mean; t&fFE 95% CI = fixed-effects
trim-and-fill adjusted 95% confidence interval; t&fRE ro = random-effects trim-and-fill adjusted observed mean; t&fRE 95% CI = random-effects trim-and-fill adjusted 95% confidence interval; CMA =
cumulative meta-analysis; pr5 ro = meta-analytic mean estimate of the five most precise effects; smm ro = one-tailed moderate selection model's adjusted observed mean; sms ro = one-tailed severe selection
model's adjusted observed mean; PET-PEESE = precision-effect test with precision-effect estimate with standard error; pp ro = PET-PEESE adjusted observed mean; n/a = not applicable (because
sms ro presented nonsensical results because of high variance estimates).
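For readers who want a reference point for the table's naive columns, the quantities ro, Q, I2, and τ can be computed with a standard DerSimonian-Laird random-effects model. The Python sketch below is a simplified stand-in for the R metafor routines we actually used; the Fisher-z transformation and n - 3 variance formula are common conventions and are our assumptions here, not necessarily the exact weighting in Table 1.

```python
import numpy as np

def naive_re_meta(r, n):
    """Naive (unadjusted) random-effects meta-analysis of correlations
    via the DerSimonian-Laird estimator on Fisher-z values.
    Returns (ro, Q, I2, tau), echoing the naive columns of Table 1."""
    r, n = np.asarray(r, float), np.asarray(n, float)
    z = np.arctanh(r)                  # Fisher z transform
    v = 1.0 / (n - 3.0)                # sampling variance of z
    w = 1.0 / v
    z_fe = (w * z).sum() / w.sum()     # fixed-effects mean (used for Q)
    Q = (w * (z - z_fe) ** 2).sum()    # Cochran's Q
    k = len(r)
    c = w.sum() - (w ** 2).sum() / w.sum()
    tau2 = max(0.0, (Q - (k - 1)) / c)              # between-study variance
    I2 = 100.0 * max(0.0, (Q - (k - 1)) / Q) if Q > 0 else 0.0
    w_re = 1.0 / (v + tau2)            # random-effects weights
    ro = np.tanh((w_re * z).sum() / w_re.sum())     # back-transformed mean
    return ro, Q, I2, np.sqrt(tau2)
```

With perfectly homogeneous inputs the function returns Q near zero and τ = 0, and ro collapses to the common correlation, which is a quick sanity check on any implementation.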

its number of samples (k) and individual observations (N). Col- exceptions, particularly for PET-PEESE (e.g., aggressive affect
umns 4 10 display the nave meta-analytic results, including the all experiments and aggressive affect best experiments).
RE meta-analytic mean (the nave mean; ro), the 95% confidence
interval, the 90% prediction interval (PI), Cochrans Q statistic, I2,
tau (), and the one-sample removed analysis (minimum, maxi- Discussion
mum, and median mean estimates). Columns 1118 show the Recent research indicates that publication bias and outliers can
results from the trim-and-fill analyses; for the recommended FE as distort meta-analytic results and associated conclusions (e.g.,
well as the RE model, respectively. For each model, the table Banks, Kepes, & McDaniel, 2015; Kepes, Banks, & Oh, 2014;
includes the side of the funnel plot on which the imputed samples Kepes & McDaniel, 2015; Viechtbauer & Cheung, 2010). Hilgard
are located (FPS), the number of imputed samples (ik), the trim- et al. (2017) concluded that some of the Anderson et al. results
and-fill adjusted mean effect size (t&fFE ro or t&fRE ro), and the overestimated the impact of violent video game playing on aggres-
respective 95% confidence interval. Column 19 contains the cu- sive tendencies. Below, we will address some of the main conclu-
mulative mean for the five most precise samples (pr5 ro). Columns sions of Hilgard, Engelhardt, and Rouder.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

20 and 21 illustrate the results from the moderate (smm ro) and
This document is copyrighted by the American Psychological Association or one of its allied publishers.

severe selection (sms ro) models. Column 22 contains the result of


the PET-PEESE (pp ro) analysis). Finally, although not discussed Bias in Nave Meta-Analytic Mean Estimates From
in the Results section because of space considerations, we have Experimental Data
included the forest plots that display the cumulative meta-analyses
by precision in the supplemental materials (for interpretation Hilgard et al. (2017), noted that they
guidelines, see Kepes et al., 2012). Because of space limitations, detect[ed] substantial publication bias in experimental research on the
we also focused on experimental effects, which are the effects effects of violent games on aggressive affect and aggressive behav-
Hilgard et al. (2017) claimed were most biased. Obviously exper- ior and that after adjustment for bias, the effects of violent games
imental effects also allow the strongest causal inferences. on aggressive behavior in experimental research are estimated as
Upon first glance, our results for experimental studies seem to being very small, and estimates of effects on aggressive affect are
be aligned with the results reported by Hilgard et al. (2017). Like much reduced. (p. 757)
Hilgard, Engelhardt, and Rouder, we found that many of the nave
meta- analytic mean estimates were adversely affected by publi- Although we agree that some the nave meta-analytic means
cation bias. However, contrary to Hilgard, Engelhardt, and Rouder, involving experimental studies reported by Anderson et al. (2010)
we did not obtain results that would come close to nullifying the appear to have been adversely affected by publication bias, we do
original nave meta-analytic mean reported by Anderson et al. not agree with the notion that the effects are very small once
(2010). For example, for the aggressive affect best experiments, publication bias was considered. As our results indicate, after
all but the PET-PEESE publication bias assessment methods in- accounting for the potential influence of publication bias and
dicate that the originally obtained nave meta-analytic mean (ro outliers, most mean correlations between exposure to violent video
.32) may be overestimated by potentially .05.09 (1533%) after games and aggressive behavior in experimental samples were
the deletion of identified outliers (e.g., t&fFE ro .24, t&fRE ro between .15 and .25. Effect sizes of this magnitude are not trivial
.24, pr5 ro .22, smm ro .27, sms ro .26). Only the in size. Indeed, most effects observed in social sciences are of this
PET-PEESE estimate suggests a vastly different mean estimate (pp magnitude. For example, one meta-analysis examined the magni-
ro .0), indicating that the results of this method did not converge tude of effects obtained in social psychology studies during the
well with the results of the other, more established methods. By past century. The average effect size obtained from 322 meta-
contrast, for the aggressive behavior best experiments distribu- analyses of more than 25,000 social psychology studies involving
tion, the most important distribution for drawing causal inferences more than 8 million participants was r .20 (Richard, Bond, &
about the effects of violent video games on aggression, it appears Stokes-Zoota, 2003).
as if neither outliers nor publication bias adversely affected the naïve meta-analytic mean. After the deletion of one outlier, the originally obtained naïve mean (ro = .21) remained essentially the same (e.g., ro = .23, t&fFE ro = .20, t&fRE ro = .20, pr5 ro = .18, smm ro = .22, sms ro = .21, pp ro = .18). Overall, our results indicate that some distributions are essentially unaffected by outliers and publication bias, whereas others are noticeably affected by both. The two studies Hilgard et al. (2017) removed from the meta-analytic data set seem to have no real influence on the final results. Likewise, our results suggest that outliers did have a potentially distorting effect on the originally obtained naïve mean estimate. In sum, publication bias did seem to have noticeably adversely affected some original naïve meta-analytic video game effects. By contrast, outliers seem to have a more negligible but sometimes detectable influence. Once the identified outliers were removed, most of the publication bias assessment methods yielded very similar results, with occasional exceptions.

Also, although the reduction in the mean estimates seems large in magnitude for the distributions involving aggressive affect (e.g., for all experiments, mostly differences between .06 and .09 or 25% and 30%; for best experiments, mostly differences between .06 and .11 or 27% and 33%), the obtained mean effect magnitudes of around .15 (all experiments) or .25 (best experiments) lead us to believe that, although reduced, the effect is not very small, as Hilgard et al. (2017) indicated. Furthermore, once the potential influence of outliers was taken into consideration, the obtained results from our publication bias assessment methods were very consistent, indicating that the underlying true effect is quite robust. The PET-PEESE method was the only one that yielded occasionally widely diverging results. The other methods, especially both trim-and-fill methods and the selection models, tended to yield converging results. Following the triangulation approach, we can thus conclude that the true mean effect sizes for, for instance, aggressive affect are likely between .15 and .25 (see Table 1).
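To make the triangulation logic concrete, the following is a minimal Python sketch of two of the estimators compared above: the naïve fixed-effect mean and a PET-PEESE meta-regression adjustment. The (effect size, standard error) pairs are hypothetical and the decision rule is simplified; this is not the Anderson et al. (2010) data or the exact procedure used in our analyses.

```python
# Hypothetical (effect size r, standard error) pairs -- illustrative only,
# not the Anderson et al. (2010) data.
studies = [(0.25, 0.12), (0.18, 0.08), (0.22, 0.10), (0.15, 0.05),
           (0.30, 0.15), (0.19, 0.06), (0.12, 0.04), (0.21, 0.09)]

def wls_intercept(xs, ys, ws):
    """Intercept of a weighted least-squares regression of y on x."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    num = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    den = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    slope = num / den
    return ybar - slope * xbar

rs = [r for r, se in studies]
ses = [se for r, se in studies]
ws = [1.0 / se ** 2 for se in ses]          # inverse-variance weights

naive = sum(w * r for w, r in zip(ws, rs)) / sum(ws)    # naive fixed-effect mean
pet = wls_intercept(ses, rs, ws)                        # PET: regress r on SE
peese = wls_intercept([se ** 2 for se in ses], rs, ws)  # PEESE: regress r on SE^2

# Simplified PET-PEESE rule: fall back to PEESE when the PET intercept is
# positive (the full procedure uses a one-tailed t test on the intercept).
adjusted = peese if pet > 0 else pet

for label, est in [("naive", naive), ("PET", pet),
                   ("PEESE", peese), ("PET-PEESE", adjusted)]:
    print(f"{label:>10}: {est:.3f}")
```

Trim-and-fill and selection-model estimates would be computed from the same inputs; triangulation then asks whether these differently motivated adjustments land in the same narrow range of plausible true means.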
Other Issues

Hilgard et al. (2017) recommended the exclusion of two studies. Although their exclusion may be justifiable on conceptual or methodological grounds, we did not find support for the notion that the four samples in these two studies had a real, meaningful effect on the obtained meta-analytic results, regardless of whether or not we took the potential effects of publication bias and outliers into consideration. Furthermore, we found that more than one outlier was detected in several meta-analytic distributions. The leave-one-out method used by Hilgard, Engelhardt, and Rouder is not capable of handling such situations. Relatedly, our results indicated that outliers, in addition to publication bias, did have a noticeable effect on the originally reported mean estimates (Anderson et al., 2010). Thus, outliers and publication bias had a combined adverse effect on the meta-analytic mean estimates, although neither outliers nor publication bias dramatically changed the main conclusions of the Anderson et al. meta-analytic study. In other words, the Anderson et al. (2010) conclusions remain valid.

We also found that the PET-PEESE results did not always converge well with the other methods under conditions of noticeable heterogeneity, as is often the case with real data in the social sciences (see Moreno et al., 2009). As an example, PET-PEESE tended to function relatively poorly for the aggressive affect (all experiments) distributions when compared with the other methods, even after the deletion of the one identified outlier, potentially because of the relatively large heterogeneity in the data (i.e., before the removal of the outlier: Q = 111.22, I² = 67.63, τ = .16, 90% PI [.05, .47]; after the removal of the identified outlier: Q = 75.53, I² = 53.66, τ = .12, 90% PI [.00, .38]).

Limitations and Strengths

Although our findings regarding the influence of publication and other biases on meta-analytic mean estimates echo the results of prior research (e.g., Banks et al., 2015; Kepes & McDaniel, 2015; Viechtbauer & Cheung, 2010), our meta-analytic study, like all meta-analyses, has limitations. For example, all methods used to assess the potential presence of publication bias have their shortcomings, especially with heterogeneous data (Kepes et al., 2012; Kepes & McDaniel, 2015). That is why we looked for convergence across methods when triangulating the true underlying mean effect. Furthermore, by forming theoretically derived subgroup distributions and deleting the outliers that were identified by a comprehensive battery of multivariate influence diagnostics (Viechtbauer, 2015; Viechtbauer & Cheung, 2010), we reduced the degree of heterogeneity noticeably, as an inspection of our statistics for heterogeneity (e.g., Q, I², τ, and 90% PI) before and after outlier removal indicates. In addition, besides the recommended fixed-effects trim-and-fill model (Duval, 2005; Kepes et al., 2012), we also used the random-effects trim-and-fill model to evaluate potential performance problems with the fixed-effects trim-and-fill model (Moreno et al., 2009). More weight should be given to the results of the fixed-effects trim-and-fill model if the random-effects model yielded similar results. Finally, some methods, such as traditional selection models, are relatively robust to heterogeneous influences (Hedges & Vevea, 2005; Vevea & Woods, 2005), which is why they have been recommended to assess the potential for publication bias in the presence of heterogeneity (Kepes et al., 2012; McShane et al., 2016). For the vast majority of our analyzed distributions, especially after outlier removal, the results of the various publication bias assessment methods converged, increasing our confidence in the obtained results and associated conclusions.

We do not dispute that publication bias is a serious problem in general or that it may have affected some of the estimates in the Anderson et al. (2010) meta-analysis. In fact, we found that outliers, in addition to publication bias, affected some estimates reported by Anderson et al. We also echo prior calls for comprehensive reanalyses of previously published meta-analytic reviews (e.g., Kepes et al., 2012). However, such reanalyses should follow best-practice recommendations and, therefore, be primarily conducted with appropriate and endorsed methods instead of relying on relatively new and potentially unproven methods, especially p-uniform and p-curve.

We also agree with the suggestion of Hilgard, Engelhardt, and Rouder (Hilgard et al., 2017) to combat publication bias through the prospective registration of meta-analyses (see Kepes & McDaniel, 2013), as the International Committee of Medical Journal Editors requires for clinical trials (De Angelis et al., 2004). Finally, we agree with numerous other recommendations, ranging from alternative editorial review processes to more stringent data sharing requirements and closer attention to the statistical power of our primary studies, that have been made to improve the accuracy and trustworthiness of our cumulative scientific knowledge (e.g., Banks et al., 2015; Kepes, Bennett, & McDaniel, 2014; Kepes & McDaniel, 2013; Maxwell, 2004; O'Boyle, Banks, & Gonzalez-Mulé, 2017).

The results of our cumulative meta-analysis by precision, both the cumulative mean of the five most precise samples (see Table 1) and the forest plots of the complete cumulative meta-analyses (see our supplemental materials), make it seem evident that small-sample studies with small-magnitude effects (most likely effect sizes that failed to reach the magical p value threshold of .05) were being suppressed from the publicly available literature (see Kepes et al., 2012). By contrast, from the forest plots in our supplemental materials, one may infer that small-sample studies (i.e., underpowered studies) that, maybe by chance, reached an acceptable level of statistical significance (i.e., p < .05) were getting published. This selective publishing seems to have adversely affected our cumulative knowledge regarding the effects of violent video games.

Finally, we acknowledge that our conclusions may change as more evidence regarding the superiority of an existing or new publication bias assessment method becomes available. However, given that we used multiple recommended methods that rely on different statistical assumptions and that their results tended to converge on a narrow range of possible true mean estimates, we have confidence in our results and the associated conclusions. We also note that our comprehensive approach to sensitivity analysis is recommended in some areas in the medical and social sciences (American Psychological Association, 2008; Higgins & Green, 2011; Kepes et al., 2013). Therefore, we suggest that all future meta-analytic reviews follow the approach we used to assess the robustness of their obtained results.

Future Research

Like many other meta-analyses, the data in the Anderson et al. (2010) meta-analysis are heterogeneous. One of the biggest causes of heterogeneous effects is hidden moderator variables.
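One standard way to probe a candidate moderator is a subgroup analysis that splits the total heterogeneity statistic Q into within- and between-group components. The sketch below is a minimal pure-Python illustration using hypothetical effect sizes grouped by an invented moderator (game played alone vs. with others); it is not the Anderson et al. (2010) data.

```python
# Hypothetical (r, SE) pairs grouped by an invented moderator; illustrative only.
groups = {
    "alone":    [(0.24, 0.08), (0.28, 0.10), (0.22, 0.06)],
    "together": [(0.10, 0.07), (0.14, 0.09), (0.08, 0.05)],
}

def fixed_effect(studies):
    """Inverse-variance weighted mean and Cochran's Q for a set of studies."""
    weights = [1.0 / se ** 2 for _, se in studies]
    mean = sum(w * r for w, (r, _) in zip(weights, studies)) / sum(weights)
    q = sum(w * (r - mean) ** 2 for w, (r, _) in zip(weights, studies))
    return mean, q

all_studies = [s for g in groups.values() for s in g]
overall_mean, q_total = fixed_effect(all_studies)

q_within = 0.0
for name, studies in groups.items():
    mean, q = fixed_effect(studies)
    q_within += q
    print(f"{name:>9}: mean r = {mean:.3f}, Q = {q:.2f}")

# Q_between = Q_total - Q_within; compared against a chi-square distribution
# with (number of groups - 1) df, a large value suggests the moderator
# accounts for part of the between-study heterogeneity.
q_between = q_total - q_within
print(f"Q_total = {q_total:.2f}, Q_within = {q_within:.2f}, "
      f"Q_between = {q_between:.2f}")
```

In this toy example the moderator absorbs most of the heterogeneity, which is the pattern a successful moderator analysis looks for.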
Although Anderson et al. considered numerous moderators (e.g., participant gender; participant age; Eastern vs. Western country; type of design: experimental, cross-sectional, or longitudinal; type of outcome: aggressive cognition, aggressive affect, physiological arousal, aggressive behavior, empathy, helping; game characteristics such as human vs. nonhuman targets, first- vs. third-person perspectives), these moderators did not fully account for the between-study heterogeneity observed in the effects. Thus, future research should examine other possible moderator variables, such as publication year (to see whether the effects have changed over time), amount of blood and gore in the game, whether the violence is justified or unjustified, whether players use a gun-shaped controller or a standard controller, whether the video game is played cooperatively or competitively, and whether the video game is played alone or with other players, to name a few. There were not enough studies to test these latter potential moderators in 2010, but there may be now.

Conclusion

In conclusion, the trustworthiness of our cumulative knowledge regarding the effects of violent video games is of clear concern to society, which is why we applaud Hilgard et al.'s (2017) attempt to assess the trustworthiness of this literature. However, our conclusions about violent video game effects differ from those of Hilgard, Engelhardt, and Rouder. Contrary to the conclusions of Hilgard, Engelhardt, and Rouder, ours are based on results from a comprehensive battery of sensitivity analyses and are thus likely to be more robust to potential adverse effects.

There was convergence in our results across various methods when we triangulated the true underlying mean effect for the relations between violent video games and aggression. Contrary to what Hilgard et al. (2017) suggested, that effect was not very small in size. As stated in our title, although the magnitudes of the mean effects were reduced by publication bias and outliers, violent video game effects remain a societal concern.

References

Aguinis, H., Werner, S., Abbott, J. L., Angert, C., Park, J. H., & Kohlhausen, D. (2010). Customer-centric science: Reporting significant research results with rigor, relevance, and practical impact in mind. Organizational Research Methods, 13, 515-539. http://dx.doi.org/10.1177/1094428109333339

American Psychological Association. (2008). Reporting standards for research in psychology: Why do we need them? What might they be? American Psychologist, 63, 839-851. http://dx.doi.org/10.1037/0003-066X.63.9.839

Anderson, C. A., Shibuya, A., Ihori, N., Swing, E. L., Bushman, B. J., Sakamoto, A., . . . Saleem, M. (2010). Violent video game effects on aggression, empathy, and prosocial behavior in eastern and western countries: A meta-analytic review. Psychological Bulletin, 136, 151-173. http://dx.doi.org/10.1037/a0018251

Banks, G. C., Kepes, S., & McDaniel, M. A. (2015). Publication bias: Understanding the myths concerning threats to the advancement of science. In C. E. Lance & R. J. Vandenberg (Eds.), More statistical and methodological myths and urban legends (pp. 36-64). New York, NY: Routledge.

Borenstein, M., Hedges, L. V., Higgins, J. P., & Rothstein, H. R. (2009). Introduction to meta-analysis. West Sussex, UK: Wiley. http://dx.doi.org/10.1002/9780470743386

Copas, J., & Shi, J. Q. (2000). Meta-analysis, funnel plots and sensitivity analysis. Biostatistics, 1, 247-262. http://dx.doi.org/10.1093/biostatistics/1.3.247

De Angelis, C., Drazen, J. M., Frizelle, F. A. P., Haug, C., Hoey, J., Horton, R., . . . the International Committee of Medical Journal Editors. (2004). Clinical trial registration: A statement from the International Committee of Medical Journal Editors. New England Journal of Medicine, 351, 1250-1251. http://dx.doi.org/10.1056/NEJMe048225

Duval, S. J. (2005). The trim and fill method. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment, and adjustments (pp. 127-144). West Sussex, UK: Wiley.

Graybill, D., Kirsch, J. R., & Esselman, E. D. (1985). Effects of playing violent versus nonviolent video games on the aggressive ideation of aggressive and nonaggressive children. Child Study Journal, 15, 199-205.

Greenhouse, J. B., & Iyengar, S. (2009). Sensitivity analysis and diagnostics. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 417-433). New York, NY: Russell Sage Foundation.

Hedges, L. V., & Vevea, J. L. (2005). Selection methods approaches. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment, and adjustments (pp. 145-174). West Sussex, UK: Wiley.

Higgins, J. P., & Green, S. (Eds.). (2011). Cochrane handbook for systematic reviews of interventions: Version 5.1.0 [updated September 2011]. The Cochrane Collaboration. Available at www.cochrane-handbook.org

Hilgard, J., Engelhardt, C. R., & Rouder, J. N. (2017). Overstated evidence for short-term effects of violent games on affect and behavior: A reanalysis of Anderson et al. (2010). Psychological Bulletin, 143, 757-774. http://dx.doi.org/10.1037/bul0000074

Jick, T. D. (1979). Mixing qualitative and quantitative methods: Triangulation in action. Administrative Science Quarterly, 24, 602-611. http://dx.doi.org/10.2307/2392366

Kepes, S., Banks, G. C., McDaniel, M. A., & Whetzel, D. L. (2012). Publication bias in the organizational sciences. Organizational Research Methods, 15, 624-662. http://dx.doi.org/10.1177/1094428112452760

Kepes, S., Banks, G. C., & Oh, I.-S. (2014). Avoiding bias in publication bias research: The value of null findings. Journal of Business and Psychology, 29, 183-203. http://dx.doi.org/10.1007/s10869-012-9279-0

Kepes, S., Bennett, A. A., & McDaniel, M. A. (2014). Evidence-based management and the trustworthiness of our cumulative scientific knowledge: Implications for teaching, research, and practice. Academy of Management Learning & Education, 13, 446-466. http://dx.doi.org/10.5465/amle.2013.0193

Kepes, S., & McDaniel, M. A. (2013). How trustworthy is the scientific literature in industrial and organizational psychology? Industrial and Organizational Psychology: Perspectives on Science and Practice, 6, 252-268. http://dx.doi.org/10.1111/iops.12045

Kepes, S., & McDaniel, M. A. (2015). The validity of conscientiousness is overestimated in the prediction of job performance. PLoS ONE, 10, e0141468. http://dx.doi.org/10.1371/journal.pone.0141468

Kepes, S., McDaniel, M. A., Brannick, M. T., & Banks, G. C. (2013). Meta-analytic reviews in the organizational sciences: Two meta-analytic schools on the way to MARS (the Meta-Analytic Reporting Standards). Journal of Business and Psychology, 28, 123-143. http://dx.doi.org/10.1007/s10869-013-9300-2

Maxwell, S. E. (2004). The persistence of underpowered studies in psychological research: Causes, consequences, and remedies. Psychological Methods, 9, 147-163. http://dx.doi.org/10.1037/1082-989X.9.2.147

McShane, B. B., Böckenholt, U., & Hansen, K. T. (2016). Adjusting for publication bias in meta-analysis: An evaluation of selection methods
and some cautionary notes. Perspectives on Psychological Science, 11, 730-749. http://dx.doi.org/10.1177/1745691616662243

Moreno, S. G., Sutton, A. J., Ades, A. E., Stanley, T. D., Abrams, K. R., Peters, J. L., & Cooper, N. J. (2009). Assessment of regression-based methods to adjust for publication bias through a comprehensive simulation study. BMC Medical Research Methodology, 9, 2. http://dx.doi.org/10.1186/1471-2288-9-2

O'Boyle, E. H., Jr., Banks, G. C., & Gonzalez-Mulé, E. (2017). The chrysalis effect: How ugly initial results metamorphosize into beautiful articles. Journal of Management, 43, 376-399. http://dx.doi.org/10.1177/0149206314527133

Panee, C. D., & Ballard, M. E. (2002). High versus low aggressive priming during video-game training: Effects on violent action during game play, hostility, heart rate, and blood pressure. Journal of Applied Social Psychology, 32, 2458-2474.

Platt, J. R. (1964). Strong inference: Certain systematic methods of scientific thinking may produce much more rapid progress than others. Science, 146, 347-353. http://dx.doi.org/10.1126/science.146.3642.347

Richard, F. D., Bond, C. F., Jr., & Stokes-Zoota, J. J. (2003). One hundred years of social psychology quantitatively described. Review of General Psychology, 7, 331-363. http://dx.doi.org/10.1037/1089-2680.7.4.331

Rothstein, H. R., Sutton, A. J., & Borenstein, M. (2005). Publication bias in meta-analysis: Prevention, assessment, and adjustments. West Sussex, UK: Wiley. http://dx.doi.org/10.1002/0470870168

Schwarzer, G. (2015). Meta-analysis package for R: Package meta (Version 4.3-2) [Computer software]. Retrieved from http://portal.uni-freiburg.de/imbi/lehre/lehrbuecher/meta-analysis-with-r

Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014). P-curve and effect size: Correcting for publication bias using only significant results. Perspectives on Psychological Science, 9, 666-681. http://dx.doi.org/10.1177/1745691614553988

Stanley, T. D., & Doucouliagos, H. (2007). Identifying and correcting publication selection bias in the efficiency-wage literature: Heckman meta-regression. Economics Series, 11. Retrieved from https://ideas.repec.org/p/dkn/econwp/eco_2007_11.html

Stanley, T. D., & Doucouliagos, H. (2012). Meta-regression analysis in economics and business. New York, NY: Routledge.

Stanley, T. D., & Doucouliagos, H. (2014). Meta-regression approximations to reduce publication selection bias. Research Synthesis Methods, 5, 60-78. http://dx.doi.org/10.1002/jrsm.1095

Stanley, T. D., & Doucouliagos, H. (2017). Neither fixed nor random: Weighted least squares meta-regression. Research Synthesis Methods, 8, 19-42. http://dx.doi.org/10.1002/jrsm.1211

Stanley, T. D., Jarrell, S. B., & Doucouliagos, H. (2010). Could it be better to discard 90% of the data? A statistical paradox. American Statistician, 64, 70-77. http://dx.doi.org/10.1198/tast.2009.08205

Sterne, J. A., & Egger, M. (2005). Regression methods to detect publication bias and other bias in meta-analysis. In H. R. Rothstein, A. J. Sutton, & M. Borenstein (Eds.), Publication bias in meta-analysis: Prevention, assessment, and adjustments (pp. 99-110). West Sussex, UK: Wiley. http://dx.doi.org/10.1002/0470870168.ch6

Sterne, J. A. C., Sutton, A. J., Ioannidis, J. P. A., Terrin, N., Jones, D. R., Lau, J., . . . Higgins, J. P. T. (2011). Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. British Medical Journal, 343, d4002. http://dx.doi.org/10.1136/bmj.d4002

van Assen, M. A. L. M., van Aert, R. C. M., & Wicherts, J. M. (2015). Meta-analysis using effect size distributions of only statistically significant studies. Psychological Methods, 20, 293-309. http://dx.doi.org/10.1037/met0000025

van Elk, M., Matzke, D., Gronau, Q. F., Guan, M., Vandekerckhove, J., & Wagenmakers, E.-J. (2015). Meta-analyses are no substitute for registered replications: A skeptical perspective on religious priming. Frontiers in Psychology, 6, 1365. http://dx.doi.org/10.3389/fpsyg.2015.01365

Vevea, J. L., & Woods, C. M. (2005). Publication bias in research synthesis: Sensitivity analysis using a priori weight functions. Psychological Methods, 10, 428-443. http://dx.doi.org/10.1037/1082-989X.10.4.428

Viechtbauer, W. (2015). Meta-analysis package for R: Package metafor (Version 1.9-5) [Computer software]. Retrieved from http://www.metafor-project.org/doku.php

Viechtbauer, W., & Cheung, M. W. L. (2010). Outlier and influence diagnostics for meta-analysis. Research Synthesis Methods, 1, 112-125. http://dx.doi.org/10.1002/jrsm.11

Received October 3, 2016
Revision received May 2, 2017
Accepted May 4, 2017

