Test Development

Test Development at QUTC

Behind every test is a rigorous test development process. There is no shortcut to creating tests that are valid, reliable, and fair. QUTC’s team includes many professionals, such as psychometricians, item writers, and reviewers. This team works together to develop standardized tests that go through multiple reviews to meet the highest standards for quality and fairness in the testing industry. QUTC follows international standards for educational and psychological testing and examines test validity, reliability/precision, and the error of measurement.

Test Development and Analysis Services

QUTC applies the science of assessment (a.k.a. psychometrics) to develop, analyze, and score tests. Building a standardized test is a rigorous undertaking. Standardized tests begin with a test framework that defines the objectives, the number of test items per objective, the specific skills to be assessed, and each test question’s cognitive complexity. Item Response Theory (IRT) is used for test development because it places item difficulty and person ability on the same measurement scale. IRT is a probabilistic model: for each item, it gives the likelihood that a student at a certain ability level will answer correctly. Classical test development is unable to make the same predictions. IRT analyzes items and test results with mathematical models (calibration), and each item’s measures are evaluated in terms of their adherence to model expectations. Test results are scaled (to facilitate score interpretation), linked (to enable the construction of different test forms whose different questions are related to one another), and equated (so that test scores on one test form are on the same scale as scores on another). Once the cut-scores for the test are determined, the same cut-scores can be applied to another version of the test that has a different set of test questions.
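To make the probabilistic idea concrete, the short Python sketch below computes the chance of a correct answer when a student's ability and an item's difficulty sit on the same scale. It assumes the one-parameter (Rasch) logistic model for illustration only; the text above does not say which IRT model QUTC uses, and the function name, item names, and numeric values are all hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch (1PL) model.

    Ability and difficulty are expressed on the same logit scale, so
    only their difference matters. (The Rasch model is an assumption
    made for this sketch.)
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical item difficulties in logits (illustrative values only).
item_difficulties = {"item_1": -1.0, "item_2": 0.0, "item_3": 1.5}
student_ability = 0.0  # hypothetical student of average ability

for name, difficulty in item_difficulties.items():
    p = rasch_probability(student_ability, difficulty)
    print(f"{name}: P(correct) = {p:.2f}")
```

Under this model, a student whose ability equals an item's difficulty has a 50% chance of answering it correctly. Easier items (lower difficulty) raise that probability and harder items lower it.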

QUTC has worked with a wide range of educators, including ministries of education, university colleges, and classroom instructors, supporting their efforts to improve educational quality, whether for fewer than a few hundred students or for tens of thousands.

QUTC has expertise with modern software and hardware to score tests and generate the needed item and test performance metrics. These metrics give content specialists feedback and assist them in their efforts to improve their assessments.

An example of a test development collaboration with QUTC

For many years, the Foundation Program’s Department of Math (FPDM) used an external placement exam, but when the content of the exam changed and no longer matched its curriculum, another test solution was sought. The Department created several committees and a task force, and sought independent advice from specialist reviewers and advisory boards, to create the necessary test assessment framework and item specifications. The test items then underwent multiple reviews and test piloting to create a new placement test. This test was to match the department’s outcomes and address its concern about the criterion scores used for exemption and placement. Analyses using discrimination and Cronbach reliability indices, as well as other statistical methods, showed that the new test met the goals of reliability, validity, and fairness expected of high-quality tests. Any underperforming items are removed, and new items are constantly being piloted and reviewed.
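As a rough illustration of the two kinds of indices named above, the Python sketch below computes Cronbach's alpha for a small response matrix and a discrimination index for one item. The response data are invented for this example and are not FPDM results. The corrected item-total correlation used here is one common discrimination index, chosen as an assumption, since the text does not say which index was applied.

```python
import math
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of item scores (one row per examinee)."""
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def item_discrimination(scores, item):
    """Corrected item-total correlation: item score vs. score on the rest."""
    item_scores = [row[item] for row in scores]
    rest_scores = [sum(row) - row[item] for row in scores]
    return pearson(item_scores, rest_scores)


# Hypothetical responses: 6 examinees x 4 items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

print(f"alpha = {cronbach_alpha(responses):.2f}")
for i in range(4):
    print(f"item {i + 1} discrimination = {item_discrimination(responses, i):.2f}")
```

An item with a low or negative discrimination value is answered correctly about as often by weak examinees as by strong ones, which is the kind of underperforming item that would be flagged for removal or revision.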