Valid and reliable assessments result from the application of systematic development efforts by both content experts and psychometricians. This is a multistep process that includes carefully defining the construct to be measured, developing a test blueprint aligned with the purpose of the assessment, thoughtful application of the art and science of item writing, and empirical data collection to support item and test validity. Research in test development is concerned with improved methods to enhance the validity and reliability of assessments.
Targeting Cognition in Item Design to Enhance Valid Interpretations of Test Performances: A Case Study and Some Speculations
Significant conceptual and empirical work over the last 10 years highlights the importance of understanding and specifying the knowledge, skills, and other abilities that achievement test items elicit from examinees. This paper discusses the need for a coherent and comprehensive understanding of how examinees interact with items and presents several coding frameworks for identifying item response demands.
Aligning Achievement Level Descriptors to Mapped Item Demands to Enhance Valid Interpretations of Scale Scores and Inform Item Development
Achievement level descriptors (ALDs) delineate the knowledge, skills, and abilities specified in the standards that a student at a given level of achievement should possess. This study examines the relationships among various cognitive and contextual coding frameworks and item difficulty when reviewing items in test administration order, with the ultimate goal of informing improved procedures for developing ALDs.
Various internal-consistency reliability coefficients have been proposed that make different assumptions regarding the degree of test-part parallelism, with implications for mixed-format tests. This study compares IRT model-derived coefficients with observed values for mixed-format tests.
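As one illustration of the assumptions at stake (this sketch is not drawn from the study itself), coefficient alpha for a test partitioned into $k$ parts treats the parts as essentially tau-equivalent, whereas stratified alpha relaxes this for mixed-format tests by grouping items into format strata $j$, each with its own reliability $\alpha_j$:

```latex
% Coefficient alpha for k test parts with part variances \sigma_i^2
% and total score variance \sigma_X^2 (assumes essential tau-equivalence):
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_X^2}\right)

% Stratified alpha for J format strata (e.g., multiple-choice vs.
% constructed-response), where \sigma_j^2 and \alpha_j are the variance
% and reliability of stratum j:
\alpha_{\text{strat}} = 1 - \frac{\sum_{j=1}^{J}\sigma_j^2\,(1 - \alpha_j)}{\sigma_X^2}
```

When the parallelism assumption fails across formats, coefficient alpha tends to understate reliability, which is one reason comparisons against model-derived coefficients are informative for mixed-format tests.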
We provide information that serves as a starting point and a continuing reference to guide school board members, educators, and policy leaders.
This paper describes different Item Response Theory (IRT) models for both multiple-choice items and constructed-response items.
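To illustrate the kinds of models such a paper might cover (the specific models discussed may differ), a common choice for dichotomously scored multiple-choice items is the two-parameter logistic (2PL) model, and for polytomously scored constructed-response items the generalized partial credit model (GPCM):

```latex
% 2PL: probability of a correct response given ability \theta,
% discrimination a_i, and difficulty b_i for item i:
P(X_i = 1 \mid \theta) = \frac{1}{1 + \exp\!\big(-a_i(\theta - b_i)\big)}

% GPCM: probability of score category x \in \{0,\dots,m_i\} given
% step parameters b_{iv} (with the v=0 term defined as 0):
P(X_i = x \mid \theta) =
  \frac{\exp\!\Big(\sum_{v=0}^{x} a_i(\theta - b_{iv})\Big)}
       {\sum_{c=0}^{m_i}\exp\!\Big(\sum_{v=0}^{c} a_i(\theta - b_{iv})\Big)}
```

Fitting both item formats within a single IRT framework places multiple-choice and constructed-response items on a common ability scale.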