
Technical Affairs Section

Mike Aamodt, Associate Editor


Reinventing, restructuring, rightsizing, and reengineering the Technical Affairs Section

When Beverly Waldron called a few months back, we had a bad phone conversation and I thought she asked if I would be willing to associate with the Editor of the ACN. It was a strange request but being a nice guy, I thought "hey, if no one else will be associated with her, I would. Heck, maybe we could become pen pals." A few months later I discovered that I had actually volunteered to be an Associate Editor of the ACN, not to associate with the Editor.

This realization sent me into a panic; I was now responsible for the technical affairs section of the best darn little newsletter in America. I called Beverly and started asking the questions my students pose to me: How long should the column be? When is it due? Do I have to have references? Will it be graded? Will there be a chance for extra credit if I mess up? Beverly replied with the same answer I rely on for my students: "it's hard to say so I guess it's really up to you."

Being so empowered, I began thinking about how to make the column useful. We could have debates, but no one ever participates. We could present new research, but that's what a journal is for. We could have editorials, but who cares. And then it dawned on me: why not take commonly asked questions (commonly asked by HR people with pocket protectors, that is) and answer those questions? A given column could consist of a long answer to one complex question or a series of short answers to less complex questions. I ran the idea by me the next day and thought it was perfect (this empowerment stuff can really go to your head).

So, here is how the technical affairs section of the ACN will work for 1996. If you have a technical question that you need answered in a simple way, send it to me and I'll find an expert to answer it (kind of like a Dear Abby for public personnel, except I promise not to recycle my columns). Or, if you are aware of a technical issue that you would like to address in a mode similar to that of our first column, let me know and I'll probably thank you repeatedly for offering. Any questions, answers, or other correspondence can be sent to me at the following address:

Michael G. Aamodt, Ph.D.
Department of Psychology
Box 6946
Radford University
Radford, VA 24142
Phone: (540) 831-5513
Fax: (540) 831-6113
Email: maamodt@runet.edu


Is all that item-writing advice true?

Mike Aamodt, Radford University

There are many sources that provide advice about writing test items and creating tests. Though much of this advice appears true from a common sense perspective, it either hasn't been empirically tested or has not been supported by the results of empirical tests. The purpose of the present article is to review common test advice and report the results of meta-analyses or, where necessary, single studies that have tested some of this advice.

For those of you who are not yet familiar with meta-analysis, a brief explanation might be beneficial. In the dark ages (prior to 1975), research on a particular topic was synthesized through a formal written review of the literature. For example, a reviewer might read 10 studies conducted on the relationship between job satisfaction and performance and conclude that "because only four of the ten studies showed a significant relationship between the two variables, we can conclude that there is not a significant relationship between job satisfaction and performance." These traditional reviews suffer from many problems. The most important of these are potential bias, a failure to consider the sample sizes of the studies reviewed (a study of 10 subjects carries the same weight as one with 5,000 subjects), and an overemphasis on statistical significance (a finding significant at the .05 level supports a hypothesis, one significant at .06 does not).

Meta-analysis, the modern method for reviewing research, is a statistical method of reaching conclusions based on previous research. With meta-analysis, a researcher goes through each research article, determines the effect size for each article by converting the test statistic in the article (e.g., F, t, chi-square) into a correlation coefficient (r) or a d score, and then finds a statistical average (weighted by the size of the sample) of the effect sizes across studies. Thus, meta-analysis will result in one number, called the mean effect size, that indicates the effectiveness of some variable. The beauty of meta-analysis is that it can answer a complex question with one easy-to-interpret number (the mean effect size). Instead of saying that the results of 20 studies are mixed, meta-analysis allows a researcher to say "based on the 20 studies conducted on the topic, the bottom line is that a particular technique works and the size of the effect is d (the mean effect size)" or "across 15 studies the average correlation between two variables is r."
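For readers who like to see the arithmetic, here is a minimal Python sketch of the steps just described: convert each study's test statistic to r, then average the r values weighted by sample size. The conversion formulas are the standard textbook ones (the d-to-r conversion assumes two groups of equal size), the function names are my own, and the three "studies" are made-up numbers for illustration only. A real meta-analysis would also correct for artifacts such as sampling error and unreliability, which this sketch does not attempt.

```python
import math


def t_to_r(t, df):
    """Convert a t statistic with df degrees of freedom to a correlation r."""
    return t / math.sqrt(t * t + df)


def d_to_r(d):
    """Convert a d score to r, assuming two groups of equal size."""
    return d / math.sqrt(d * d + 4)


def weighted_mean_effect(effects, sample_sizes):
    """Average the effect sizes, weighting each study by its sample size."""
    total_n = sum(sample_sizes)
    return sum(e * n for e, n in zip(effects, sample_sizes)) / total_n


# Hypothetical studies (invented for illustration):
# (t statistic, degrees of freedom, sample size)
studies = [(2.10, 48, 50), (0.90, 18, 20), (3.50, 498, 500)]

rs = [t_to_r(t, df) for t, df, _ in studies]
ns = [n for _, _, n in studies]

print("r for each study:  ", [round(r, 3) for r in rs])
print("unweighted mean r: ", round(sum(rs) / len(rs), 3))
print("weighted mean r:   ", round(weighted_mean_effect(rs, ns), 3))
```

Comparing the unweighted and weighted means shows how much pull the largest study has on the bottom line, which is exactly what the old-style tally of significant versus non-significant studies ignores.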

With this brief explanation of meta-analysis in mind, the answers to the test construction advice below can be considered "the final word" based on the available research. The advice covered below was limited to advice for which a meta-analysis of the relevant research has been conducted. If we get enough requests (at least one), we will have a more detailed explanation of meta-analysis in an upcoming issue.

Item Writing Advice: Never use "none-of-the-above" as an option in multiple-choice exams.

Research Finding: A meta-analysis by Knowles and Welch (1992) examined 20 studies found in 12 articles looking at the effects of using "none-of-the-above" as an option. The total sample size across the studies was nearly 16,000. The results of their meta-analysis indicated that there was no significant effect from using "none-of-the-above" on either the difficulty or the discrimination index of a test. Thus, the findings of this meta-analysis suggest that there is nothing wrong with properly using "none-of-the-above" as an option.

Item Writing Advice: To lessen the effect of guessing, always use at least four options on a multiple-choice question.

Research Finding: A meta-analysis of 14 studies by Aamodt and McShane (1992) found that using three options rather than four will make a 100-item test 1.2 points easier, have no effect on item discrimination, and save 4.59 minutes in test taking time. Because the subjects across the 14 studies were able to complete 2.69 items per minute, using three rather than four choices would allow about 12 items to be added to an exam without increasing the time needed to complete the exam. A recent study in the public sector by Sidick, Barrett, and Doverspike (1994) found that three-option tests were at least as good as, if not superior to, five-option tests. Based on the results of the meta-analysis and the Sidick et al. study, using three options rather than four or five will not greatly affect test scores or item discrimination indices but will significantly reduce item writing and test taking time.
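The "about 12 items" figure is simply the time saved multiplied by the answering rate reported above. A small Python sketch of that arithmetic follows; the function name is my own, and the calculation assumes the added items would be answered at the same 2.69 items per minute as the rest of the test.

```python
import math


def extra_items(minutes_saved, items_per_minute):
    """Items that fit into the time saved, assuming a constant answering rate."""
    return minutes_saved * items_per_minute


# Figures reported for a 100-item test in the Aamodt and McShane (1992) meta-analysis
gained = extra_items(minutes_saved=4.59, items_per_minute=2.69)

print(f"Items that fit in the time saved: {gained:.2f}")
print(f"Whole items that could be added:  {math.floor(gained)}")
```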

Item Writing Advice: Place the easy items at the beginning of the test and the more difficult items toward the end.

Research Finding: A meta-analysis of 26 studies by Aamodt and McShane (1992) found that placing easier items first resulted in a 100-item test being approximately 1.5 points easier than using a random order, and 3 points easier and less anxiety-producing than beginning the test with the most difficult items.

Item Writing Advice: Keep all items concerning the same topic together on the test.

Research Finding: The Aamodt and McShane (1992) meta-analysis found that across the 16 studies testing this advice, the average score on tests that organize items by content is only almost half a point higher than on tests that do not. Thus, organizing items by content is neither necessary nor harmful.

References

Aamodt, M.G., & McShane, T. (1992). A meta-analytic investigation of the effect of various test item characteristics on test scores and test completion times. Public Personnel Management, 21(2), 151-160.

Hunter, J.E., & Schmidt, F.L. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park, CA: Sage Publications.

Knowles, S.L., & Welch, C.A. (1992). A meta-analytic review of item discrimination and difficulty in multiple-choice items using "none of the above." Educational and Psychological Measurement, 52, 571-577.

Sidick, J.T., Barrett, G.V., & Doverspike, D. (1994). Three-alternative multiple-choice tests: An attractive option. Personnel Psychology, 47(4), 829-835.


© Copyright 1996 by the IPMA Assessment Council. All rights reserved.