Technical Affairs

Mike Aamodt, Associate Editor


Question

Several issues ago, you reported some meta-analysis results. I thought I understood your results but after reading several journal articles using meta-analysis, I'm not so sure. Could you explain meta-analysis?

Answer

I was worried that no one would ever ask! In the following pages, I will take a shot at explaining the concept of meta-analysis, how it is done, and how to understand meta-analysis results. I will try to keep my answer to a "one pocket protector" level of difficulty.

Meta-Analysis Made Simple

In the old days (prior to 1980), a research topic was reviewed by reading all the articles on it and then drawing a conclusion. For example, suppose that a personnel analyst was asked to review the literature to see if education was related to police performance. The analyst would find every article on the topic, perhaps count the articles that showed significant results, and then reach a conclusion such as "given that eight articles showed significant results and nine did not, we must conclude that education is not related to police performance."

Unfortunately, there are two reasons why such a conclusion might be inaccurate. First, it may be that there is a relationship between education and performance but the relationship is relatively small and a large number of subjects would be needed in each study to detect the relationship. For example, you might have five studies each with samples of 50 officers. The correlations between education level and performance in the five studies are .14, .20, .16, .21, and .17. Though the size of the coefficients is consistent across the five studies, none of the correlations by itself is statistically significant due to small sample sizes in each study.
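The five-study example can be checked with a few lines of Python. This is only a sketch: it uses Fisher's z transformation with a normal approximation, which is a convenient textbook test and not the Hunter and Schmidt procedure described later. The function names are made up for the illustration.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def fisher_z_test(r, n):
    """Approximate test of H0: rho = 0 using Fisher's z (assumed method)."""
    return two_sided_p(math.atanh(r) * math.sqrt(n - 3))

rs = [0.14, 0.20, 0.16, 0.21, 0.17]  # correlations from the example
n = 50                               # officers per study

# Test each study on its own.
individual_p = [fisher_z_test(r, n) for r in rs]

# Pool the studies: average the Fisher z values (equal weights, because
# every study has the same n) and test against the combined standard error.
mean_z = sum(math.atanh(r) for r in rs) / len(rs)
pooled_p = two_sided_p(mean_z * math.sqrt(len(rs) * (n - 3)))
pooled_r = math.tanh(mean_z)

for r, p in zip(rs, individual_p):
    print(f"r = {r:.2f}, n = {n}: p = {p:.3f}")
print(f"pooled r = {pooled_r:.3f}, N = {n * len(rs)}: p = {pooled_p:.4f}")
```

Under these assumptions none of the five studies reaches p < .05 by itself, while the pooled estimate of 250 officers does.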

Second, the studies may have had small sample sizes, and any differences among the results may be due purely to sampling error. To explain this last point further, imagine that you have a bowl containing three red balls, three white balls, and three blue balls. You are then asked to close your eyes and pick three balls from the bowl. Because there are equal numbers of red, white, and blue balls in the bowl, you would expect to draw one of each color. However, in any given draw from the bowl, it is unlikely that you will get one of each color. If you have no life and draw three balls at a time for ten hours, you might get three red balls on some draws, no white balls on other draws, and three white balls on other draws. Thus, even though we know there are an equal number of each color of ball, any one draw may or may not represent what we know is "the truth."
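For readers who would rather not draw balls for ten hours, the odds can be counted directly. The sketch below lists every possible set of three balls from the bowl and counts how many match each outcome; the variable names are illustrative only.

```python
from collections import Counter
from itertools import combinations

bowl = ["red"] * 3 + ["white"] * 3 + ["blue"] * 3

# Every equally likely set of three balls, identified by position in the bowl.
draws = list(combinations(range(len(bowl)), 3))

def color_counts(draw):
    """Count how many balls of each color a draw contains."""
    return Counter(bowl[i] for i in draw)

one_of_each = sum(1 for d in draws if len(color_counts(d)) == 3)
all_same = sum(1 for d in draws if len(color_counts(d)) == 1)
no_white = sum(1 for d in draws if color_counts(d)["white"] == 0)

total = len(draws)
print(f"one of each color: {one_of_each}/{total} = {one_of_each / total:.3f}")
print(f"three of one color: {all_same}/{total} = {all_same / total:.3f}")
print(f"no white balls: {no_white}/{total} = {no_white / total:.3f}")
```

Only 27 of the 84 possible draws contain one ball of each color, so the "expected" result turns up less than a third of the time.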

The same is true in research. Suppose we know that the true correlation between education level and performance is .20 (God faxed this info to me yesterday). A study at one agency might yield a correlation of .10, another agency might report a correlation of .50, and yet another agency might report a correlation of .30. If all three studies had small samples, the differences among the studies and differences from the "truth" might be due purely to sampling error. This is where meta-analysis saves the day.
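A small simulation makes the same point for correlations. It is a sketch under stated assumptions: scores are bivariate normal, the true correlation is fixed at .20, and the sample sizes (30 and 500) and the number of simulated studies are arbitrary choices, not figures from any real agency.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation computed from raw scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_study(rho, n, rng):
    """Draw n (education, performance) pairs with true correlation rho."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1 - rho ** 2) * noise)
    return pearson_r(xs, ys)

rng = random.Random(1997)  # fixed seed so the run is repeatable
TRUE_RHO = 0.20

small = [simulate_study(TRUE_RHO, 30, rng) for _ in range(1000)]
large = [simulate_study(TRUE_RHO, 500, rng) for _ in range(200)]
mean_small = sum(small) / len(small)

print(f"n = 30:  mean r = {mean_small:.2f}, "
      f"range {min(small):.2f} to {max(small):.2f}")
print(f"n = 500: range {min(large):.2f} to {max(large):.2f}")
```

The small studies scatter widely around .20, including some negative correlations, even though nothing about the "truth" has changed. The large studies stay much closer to it.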

Meta-analysis is a statistical method for combining research results. Since the first meta-analysis was published by Gene Glass in 1976, the number of published meta-analyses has increased tremendously and the methodology has become increasingly complex. The current meta-analysis Gods are Frank Schmidt and John Hunter, and almost every meta-analysis uses the methods they suggested in their 1990 book Methods of Meta-Analysis.

Though meta-analyses will vary somewhat in their methods and their purpose, most meta-analyses involving personnel selection issues try to answer three questions:

(1) What is the mean validity coefficient found in the literature for a given predictor (e.g. interviews, assessment centers, cognitive ability)?

(2) If we had perfect measures of the construct, a perfect measure of performance, and no restriction in range, what would be the "true correlation" between our construct and performance?

(3) Can we generalize the meta-analysis results to every agency (validity generalization), or is our construct a better predictor of performance in some situations than in others?

Conducting a Meta-Analysis

Finding Studies

The first step in a meta-analysis is to locate studies on the topic of interest. It is common to use both an "active search" and a "passive search." An active search tries to identify every research study within a given parameter. For example, a meta-analyst might concentrate her active search on journal articles and dissertations published between 1970 and 1996 and referenced in one of three computerized literature data bases (Psych Lit, First Search, Dissertation Abstracts International) or referenced in an article found during the computer search. A passive search might include queries to professionals known to be experts in the area, papers presented at conferences, or technical reports known to the author.

The major difference between an active and passive search is that the goal of an active search is to include every relevant study within the given parameters, whereas the goal of the passive search is to find other relevant research without any thought that every study on the topic was found. Though this may not seem much of a difference, it is. These days, there are so many potential sources for research - thousands of journals, conference presentations, theses, dissertations, technical reports, and unpublished research articles - that relevant studies are going to be missed. Thus the credibility of a meta-analysis hinges on the scope and inclusion accuracy of its active search.

Choosing Studies to Include in the Meta-Analysis

Once all the relevant studies on a topic have been located, the next step is to determine which of these studies will be included in the meta-analysis. To be included in a meta-analysis, an article must report the results of an empirical investigation and include a correlation coefficient, another statistic (e.g. F, t, chi-square) that could be converted to a correlation coefficient, or tabular data that can be entered into the computer to yield a correlation coefficient (many meta-analyses use Cohen's d rather than a correlation coefficient, but the rules for including an article are the same). Articles that report results without the above statistics (e.g. "we found a significant relationship between education and academy performance" or "we didn't see any real differences between our educated and uneducated officers") cannot be included in a meta-analysis.

Often, meta-analysts will have other rules about keeping studies. For example, in a meta-analysis on employee wellness programs, the decision to include only studies using both pre- and post-measures of absenteeism as well as experimental and control groups resulted in only three usable studies.

Converting Research Findings to Correlations

Once the research articles have been located and the decision has been made as to which ones to include, statistical results (e.g. F, t, chi-square) are converted into correlation coefficients using the formulas provided in Rosenthal (1984). In some cases, raw data or data listed in tables can be entered into a statistical program (e.g. SAS, SPSS) to directly determine a correlation coefficient.
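The conversions themselves are short. The sketch below uses the standard textbook formulas for the simplest cases; it is an assumption that these are the ones a given meta-analysis applies, and each comes with a restriction noted in the code. The function names are invented for the example.

```python
import math

def r_from_t(t, df):
    """r from a t statistic with df degrees of freedom (keeps the sign of t)."""
    r = math.sqrt(t ** 2 / (t ** 2 + df))
    return math.copysign(r, t)

def r_from_f(f, df_error):
    """r from an F statistic with ONE numerator df (two-group comparison).
    F carries no sign; the direction must come from the group means."""
    return math.sqrt(f / (f + df_error))

def r_from_chi_square(chi2, n):
    """r (phi) from a chi-square with ONE df, i.e. a 2 x 2 table of n cases.
    Chi-square carries no sign; the direction must come from the table."""
    return math.sqrt(chi2 / n)

def r_from_d(d):
    """r from Cohen's d, assuming two groups of equal size."""
    return d / math.sqrt(d ** 2 + 4)

# Hypothetical results, not taken from any real study.
print(f"t(48) = 2.50      ->  r = {r_from_t(2.5, 48):.2f}")
print(f"F(1, 48) = 6.25   ->  r = {r_from_f(6.25, 48):.2f}")
print(f"chi2(1) = 4.0, N = 100  ->  r = {r_from_chi_square(4.0, 100):.2f}")
print(f"d = 0.50          ->  r = {r_from_d(0.5):.2f}")
```

Because F with one numerator degree of freedom equals t squared, the first two lines give the same correlation.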

Cumulating Validity Coefficients

After the individual correlation coefficients have been computed, the validity coefficient for each study is weighted by the size of the sample and combined using the method suggested by Hunter and Schmidt (1990). In addition to the mean validity coefficient, the observed variance, amount of variance expected due to sampling error, and a 95% confidence interval are calculated.
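As a rough sketch of this step, the code below computes the sample-size-weighted mean, the observed variance, and the variance expected from sampling error using the formulas commonly attributed to Hunter and Schmidt. The confidence interval uses a standard error of the square root of (observed variance / number of studies), which is one common choice and an assumption here. The three correlations echo the earlier agency example; the sample sizes are hypothetical.

```python
import math

def cumulate(rs, ns):
    """Combine correlations rs from studies with sample sizes ns."""
    k = len(rs)
    total_n = sum(ns)

    # Mean validity coefficient, weighted by sample size.
    mean_r = sum(n * r for r, n in zip(rs, ns)) / total_n

    # Observed variance of the coefficients, also weighted by sample size.
    var_observed = sum(n * (r - mean_r) ** 2 for r, n in zip(rs, ns)) / total_n

    # Variance expected from sampling error alone, based on the average n.
    mean_n = total_n / k
    var_sampling_error = (1 - mean_r ** 2) ** 2 / (mean_n - 1)

    # 95% confidence interval around the mean (assumed standard error).
    se_mean = math.sqrt(var_observed / k)
    ci = (mean_r - 1.96 * se_mean, mean_r + 1.96 * se_mean)

    return {
        "k": k,
        "N": total_n,
        "mean_r": mean_r,
        "var_observed": var_observed,
        "var_sampling_error": var_sampling_error,
        "ci_95": ci,
    }

# Correlations from the three-agency example; sample sizes are made up.
result = cumulate(rs=[0.10, 0.50, 0.30], ns=[40, 30, 60])
for name, value in result.items():
    print(name, value)
```

Note that the largest study pulls the weighted mean toward its own correlation of .30.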

Correcting for Artifacts

When conducting a meta-analysis, it is desirable to adjust correlation coefficients to correct for error associated with predictor unreliability, criterion unreliability, restriction of range, and a host of other artifacts (see Hunter & Schmidt, 1990 for a thorough discussion). These adjustments answer the second question: if we had perfect measures of the construct, a perfect measure of performance, and no restriction in range, what would be the "true correlation" between our construct and performance?

These adjustments can be made in one of two ways. The most desirable way is to correct the validity coefficient from each study based on the predictor reliability, criterion reliability, and restriction of range associated with that particular study. To do this, however, each study must provide this information, and very few actually do.

When the necessary information is not available for each study, the mean validity coefficient is corrected rather than each individual coefficient. The numbers used to make these corrections come either from the average of information found in the studies that provided reliability or range restriction information or from other meta-analyses. For example, an estimate of the reliability of supervisor ratings of overall performance (r=.52) can be borrowed from the meta-analysis on rating reliability by Viswesvaran, Ones, and Schmidt (1996). To correct for restriction of range, the value of u = .70 suggested by Roth and his colleagues (1996) can be used.
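The two corrections mentioned here can be sketched as follows. The formulas are the standard correction for attenuation and the standard correction for direct range restriction on the predictor; it is an assumption that these exact forms were used for any particular meta-analysis. The .20 fed into the reliability correction is a hypothetical value; the other inputs are the ones quoted in the text and in Table 1.

```python
import math

def correct_for_criterion_unreliability(r, ryy):
    """Correction for attenuation: divide by the square root of the
    criterion reliability ryy."""
    return r / math.sqrt(ryy)

def correct_for_range_restriction(r, u):
    """Correction for direct range restriction on the predictor.
    u = restricted SD / unrestricted SD (so u < 1 means restriction)."""
    big_u = 1 / u
    return (r * big_u) / math.sqrt(1 + r ** 2 * (big_u ** 2 - 1))

# Mean validities from Table 1, corrected with u = .70.
grades_rr = correct_for_range_restriction(0.27, 0.70)
commendations_rr = correct_for_range_restriction(-0.05, 0.70)
print(f"academy grades: .27 -> {grades_rr:.2f}")
print(f"commendations: -.05 -> {commendations_rr:.2f}")

# Hypothetical validity of .20 against supervisor ratings (reliability .52).
ratings_corrected = correct_for_criterion_unreliability(0.20, 0.52)
print(f"supervisor ratings: .20 -> {ratings_corrected:.2f}")
```

With u = .70, this range restriction formula gives about .37 and -.07, which matches the rrr column of Table 1 to two decimals, although the table itself does not say which formula produced it.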

Searching for Moderators

Being able to generalize meta-analysis findings across all similar organizations and settings (validity generalization) is an important goal of any meta-analysis. It is standard practice in meta-analysis to generalize results when at least 75% of the observed variability in validity coefficients can be attributed to sampling error. When less than 75% can be attributed to sampling error, a search is conducted to find variables that might moderate the size of the validity coefficient. For example, education might predict performance better in larger police departments than in smaller ones.
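The 75% rule reduces to a ratio and a comparison. In this sketch the two variances are hypothetical inputs (in practice they come from the cumulation step), and the function names are illustrative.

```python
def percent_due_to_sampling_error(var_observed, var_sampling_error):
    """Percentage of observed variance that sampling error can explain."""
    if var_observed <= 0:
        return 100.0  # no observed variability left to explain
    return min(100.0, 100.0 * var_sampling_error / var_observed)

def needs_moderator_search(percent, threshold=75.0):
    """Apply the 75% rule: search for moderators below the threshold."""
    return percent < threshold

# Two hypothetical meta-analyses.
pct_a = percent_due_to_sampling_error(var_observed=0.0200,
                                      var_sampling_error=0.0130)
pct_b = percent_due_to_sampling_error(var_observed=0.0100,
                                      var_sampling_error=0.0090)

for label, pct in (("A", pct_a), ("B", pct_b)):
    action = "search for moderators" if needs_moderator_search(pct) else "generalize"
    print(f"meta-analysis {label}: {pct:.0f}% due to sampling error -> {action}")
```

In the first case only about 65% of the variability is explained by sampling error, so a moderator search follows; in the second, about 90% is explained and the result would be generalized.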

Understanding Meta-Analysis Results

Now that you have an idea about how a meta-analysis is conducted, let's talk about how to understand meta-analysis results that you might find in a published article. In Table 1 you will find the partial results of a meta-analysis we conducted on education and police performance. The numbers in the table represent the validity of education in predicting academy grades and commendations received as a police officer. The "K" column indicates the number of studies included in the meta-analysis and the "N" column indicates the total number of subjects in the studies.

The "r" column represents the mean validity coefficient across all studies (weighted by the size of the sample). This coefficient answers our question about the typical validity coefficient found in validation studies on the topic of education and police performance. To determine if this coefficient is "statistically significant," we look at the next two columns which represent the lower and upper limits to our 95% confidence interval. If the interval includes zero, we cannot say that our mean validity coefficient is significant. From the figures in Table 1, we would conclude that education is a significant predictor of grades in the academy but not of commendations received as a police officer.

The next two columns represent our mean validity coefficient corrected for range restriction (rrr) and for both range restriction and criterion unreliability (rcr,rr). These coefficients represent what the "true validity" of education would be if we had a perfectly reliable measure of academy grades and no range restriction. Notice that there was no correction made for criterion unreliability in commendations because no such information was available.

The final column represents the percentage of observed variance that is due to sampling error. Notice that in both cases, this percentage is less than 75% so we would try to find a variable that might moderate the relationship between education and academy performance and between education and commendations received.
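The reading of Table 1 described above can be written out as two simple checks per row. The figures are copied from Table 1; the dictionary keys and the function name are made up for the sketch.

```python
table_1 = [
    {"criterion": "Academy Grades", "k": 13, "n": 3212, "r": 0.27,
     "ci": (0.16, 0.39), "pct_sampling_error": 65},
    {"criterion": "Commendations", "k": 11, "n": 4127, "r": -0.05,
     "ci": (-0.15, 0.06), "pct_sampling_error": 25},
]

def interpret(row, threshold=75):
    """Return (significant, moderator_search) for one row of the table."""
    lower, upper = row["ci"]
    # The mean validity is significant only if the interval excludes zero.
    significant = not (lower <= 0 <= upper)
    # Below the threshold, sampling error does not explain enough variance.
    moderator_search = row["pct_sampling_error"] < threshold
    return significant, moderator_search

for row in table_1:
    significant, moderator_search = interpret(row)
    print(f"{row['criterion']}: r = {row['r']:.2f}, "
          f"significant = {significant}, "
          f"look for moderators = {moderator_search}")
```

Both rows call for a moderator search, but only the academy grades interval excludes zero.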

I hope this answer helped. Meta-analysis is one of those topics that is difficult to explain without getting real technical.

Table 1

                                     95% CI
Criteria         K    N       r      L      U      rrr    rcr,rr   % of observed variance due to sampling error
Academy Grades   13   3,212   .27    .16    .39    .37    .40      65%
Commendations    11   4,127   -.05   -.15   .06    -.07            25%

References

   Hunter, J. E., & Schmidt, F. L. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage Publications.

   Rosenthal, R. (1984). Meta-analytic procedures for social research. Beverly Hills, CA: Sage Publications.

   Roth, P. L., BeVier, C. A., Switzer, F. S., & Schippmann, J. S. (1996). Meta-analyzing the relationship between grades and job performance. Journal of Applied Psychology, 81(5), 548-556.

   Viswesvaran, C., Ones, D. S., & Schmidt, F. L. (1996). Comparative analysis of the reliability of job performance ratings. Journal of Applied Psychology, 81(5), 557-574.

 

Mike Aamodt is a Professor at Radford University and has graciously agreed to continue his excellent work as our Associate Editor for Technical Affairs. If you have a technical question you want answered, please send it to Mike by email (maamodt@runet.edu), phone [(540) 831-5513], or fax [(540) 831-6113].


© Copyright 1997 by the IPMA Assessment Council. All rights reserved.