An organized set of measurements, all of which measure one property or characteristic. Different types of test-score scales use different units, for example, number correct, percentiles, or IRT scale scores.
Scores on a single scale with intervals of equal size. The scale can be applied to all groups taking a given test, regardless of group characteristics or time of year, making it possible to compare scores from different groups of examinees. Scale scores are appropriate for various statistical purposes; for example, they can be added, subtracted, and averaged across test levels. Such computations permit educators to make direct comparisons among examinees, compare individual scores to groups, or compare an individual's pre-test and post-test scores in a way that is statistically valid. These computations are not valid with percentiles or grade-level equivalents, because the intervals on those scales are not equal in size.
A question or incomplete statement that is followed by answer choices, one of which is the correct or best answer. Also referred to as a "multiple-choice" item.
Special Admissions Test
A test of a student's ability to participate in special programs or advanced learning situations. For example, an honors-level class or a magnet school may require the attainment of high scores on an assessment for admission.
A test in which one aspect of performance is measured by the number of tasks performed in a given time. A "pure" speed test is one in which examinees make no errors and that cannot be completed by any examinee in the allotted time.
A statistic used to express the extent to which a set of scores diverges from the average of all the scores in the group. In a normal distribution, approximately two-thirds (68.3%) of the scores lie within the limits of one standard deviation above and one standard deviation below the mean. Roughly one-sixth (about 15.9%) of the scores lie more than one standard deviation above the mean, and roughly one-sixth lie more than one standard deviation below the mean.
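As a rough illustration of the figures quoted for the normal distribution, the sketch below computes a standard deviation for a small, made-up set of scores and checks the share of a normal distribution that falls within one standard deviation of the mean. The score values and function names are hypothetical, chosen only for the example.

```python
from statistics import NormalDist, pstdev

# Hypothetical scores, used only to show the calculation.
scores = [70, 75, 80, 85, 90]

# Population standard deviation: root of the mean squared
# deviation from the group average.
sd = pstdev(scores)

def share_within_one_sd() -> float:
    """Share of a normal distribution within one SD of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(1.0) - z.cdf(-1.0)

def share_above_one_sd() -> float:
    """Share of a normal distribution more than one SD above the mean."""
    return 1.0 - NormalDist().cdf(1.0)

print(f"SD of scores: {sd:.2f}")
print(f"Within one SD: {share_within_one_sd():.3f}")
print(f"Above one SD:  {share_above_one_sd():.3f}")
```

The upper-tail share comes out slightly below one-sixth, which is why the "one-sixth" figure is best read as an approximation.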
Standard Error of Measurement
A measure of the amount of error to be expected in a score from a particular test. The smaller the standard error of measurement, the greater the accuracy of the test score. The standard error of measurement is the standard deviation of a theoretical distribution of errors, each of which is the difference between an obtained score and the true score. Thus, if the standard error of measurement is 5, the chances are about two to one that an obtained score lies within five units of the true score.
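The "two to one" figure can be checked with a short sketch. The formula used here to estimate the standard error of measurement from a test's standard deviation and reliability coefficient is the usual one from classical test theory; it is not stated in this glossary, so treat it, the function names, and the sample values as assumptions made for illustration. The odds calculation also assumes normally distributed errors.

```python
from math import sqrt
from statistics import NormalDist

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory estimate: SEM = SD * sqrt(1 - reliability).

    Assumption: this formula is not given in the glossary itself.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)

def odds_within(k: float = 1.0) -> float:
    """Odds that an obtained score lies within k SEMs of the true score,
    assuming normally distributed measurement error."""
    z = NormalDist()
    p = z.cdf(k) - z.cdf(-k)
    return p / (1.0 - p)

# Hypothetical test: SD of 10 score units, reliability of 0.75.
sem = standard_error_of_measurement(sd=10.0, reliability=0.75)
print(f"SEM: {sem:.1f}")
print(f"Odds of falling within one SEM: {odds_within():.2f} to 1")
```

With these made-up inputs the SEM is 5 score units, matching the example in the definition, and the odds come out a little better than two to one.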
A derived score scaled to produce an arbitrarily assigned mean and standard deviation. For example, deviation IQs are standard scores with a mean of 100 and, usually, a standard deviation of 15 or 16, depending on the test.
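A minimal sketch of how a raw score might be rescaled to a chosen mean and standard deviation is shown below. The function name, parameter names, and sample numbers are hypothetical; the defaults use a mean of 100 and a standard deviation of 16 purely as an example of an assigned scale.

```python
def to_standard_score(
    raw: float,
    group_mean: float,
    group_sd: float,
    new_mean: float = 100.0,
    new_sd: float = 16.0,
) -> float:
    """Rescale a raw score to a scale with an assigned mean and SD."""
    if group_sd <= 0:
        raise ValueError("group_sd must be positive")
    # Distance from the group mean, in standard deviation units.
    z = (raw - group_mean) / group_sd
    # Re-express that distance on the assigned scale.
    return new_mean + new_sd * z

# Hypothetical examinee: raw score 60 in a group with mean 50 and SD 10,
# i.e. one standard deviation above the mean.
print(to_standard_score(60, group_mean=50, group_sd=10))
```

A score one standard deviation above the group mean lands one assigned standard deviation above the assigned mean, whatever values are chosen for the new scale.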
The process of administering a test to a nationally representative sample of examinees using carefully defined directions, time limits, materials, and scoring procedures. The results produce norms to which the performance of other examinees can be compared, provided they took the test under the same conditions.
That part of the population that is used in the norming of a test, i.e., the reference population. The sample should represent the population in essential characteristics, some of which may be geographical location, age, or grade for K-12 students, or, for adults, participation in a specific type of program (for example, adult basic education).
A test constructed of items that are appropriate in level of difficulty and discriminating power for the intended examinees, and that fit the pre-planned table of content specifications. The test is administered in accordance with explicit directions for uniform administration and is interpreted using a manual that contains reliable norms for the defined reference groups.
A unit of a standard score scale that divides the norm population into nine groups with the mean at stanine 5. The word stanine draws its name from the fact that it is a STAndard score on a scale of NINE units.
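One way to assign stanines is sketched below. It relies on the conventional scheme in which each stanine from 2 through 8 spans half a standard deviation and stanine 5 is centered on the mean; those cut points are not given in this glossary and are an assumption here, as are the names used in the code.

```python
from bisect import bisect_right

# Assumed conventional boundaries, in standard deviation units (z-scores).
# Stanines 2 through 8 are each half an SD wide; 1 and 9 are open-ended.
STANINE_CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]

def stanine_from_z(z: float) -> int:
    """Map a z-score to a stanine from 1 to 9."""
    # Count how many boundaries lie at or below z, then shift to 1..9.
    return bisect_right(STANINE_CUTS, z) + 1

for z in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"z = {z:+.1f} -> stanine {stanine_from_z(z)}")
```

A score exactly on a boundary is placed in the higher stanine in this sketch; a particular test publisher may handle ties differently.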
The part of an item that asks a question, provides directions, or presents a statement to be completed.
A passage or graphic display about which questions are asked.
A test battery is a set of several tests designed to be administered as a unit. The individual subject-area tests (subtests) measure different areas of content and may be scored separately; subtest scores may also be combined into a single score.
One who prepares and develops tests.
A question or problem on a test.
A desired educational outcome such as "constructing meaning" or "adding whole numbers." Usually several different objectives are measured in one subtest.
One who uses test results for some decision-making purpose.
One who takes a test, whether by choice, direction, or necessity.