In response to these calls for broader measures, the developers of the IALSS survey commissioned frameworks to serve as the basis for introducing new measures into the comparative assessments of adults. Those responsible for the development of IALSS recognized that the design of any reliable and valid instrument should begin with a strong theoretical underpinning, represented by a framework that characterizes current thinking in the field. According to Messick (1994), any framework that takes a construct-centered approach to assessment design should: begin with a general definition or statement of purpose, one that guides the rationale for the survey and specifies what should be measured in terms of knowledge, skills or other attributes; identify the performances or behaviours that will reveal those constructs; and identify task characteristics and indicate how these characteristics will be used in constructing the tasks that elicit those behaviours.
This annex provides an overview of the frameworks used to develop tasks that measure prose and document literacy, numeracy, and problem solving in the IALSS survey. In characterizing these frameworks, the annex also provides a scheme for understanding the meaning of what has been measured in IALSS and for interpreting levels along each of the scales. It borrows liberally from more detailed chapters developed in conjunction with the IALSS survey (Murray, Clermont and Binkley, in press).
The results of the IALSS survey are reported along four scales – two literacy scales (prose and document), a single numeracy scale, and a problem-solving scale – each ranging from 0 to 500 points. One might imagine the assessment tasks arranged along their respective scales according to the difficulty they pose for adults and the level of proficiency needed to respond correctly to each task. The procedure used in IALSS to model these continua of difficulty and ability is Item Response Theory (IRT). IRT is a mathematical model used to estimate the probability that a particular person will respond correctly to a given task from a specified pool of tasks (Murray, Kirsch and Jenkins, 1998).
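A common way to make this concrete is the two-parameter logistic form of IRT; the specific parameterization used in IALSS is not reproduced here, so the following should be read as an illustrative sketch rather than the operational model. Under this form, the probability that a respondent with proficiency \(\theta\) answers item \(i\) correctly is

\[ P_i(\theta) = \frac{1}{1 + \exp[-a_i(\theta - b_i)]}, \]

where \(b_i\) is the item's difficulty and \(a_i\) its discrimination, both expressed on the same latent scale as \(\theta\) before transformation to the 0 to 500 reporting metric.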
The scale value assigned to each item reflects how representative samples of adults in participating countries perform on that item and is based on the premise that someone at a given point on the scale is equally proficient in all tasks at that point on the scale. For the IALSS survey, as for the IALS, proficiency was defined to mean that someone at a particular point on the proficiency scale would have an 80 percent chance of answering items at that point correctly.
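Under the illustrative two-parameter logistic sketch given above (again an assumption about functional form, not the documented IALSS specification), this 80 percent criterion places an item on the scale somewhat above its difficulty parameter, since setting \(P_i(\theta) = 0.80\) and solving for \(\theta\) gives

\[ 0.80 = \frac{1}{1 + \exp[-a_i(\theta - b_i)]} \quad\Longrightarrow\quad \theta = b_i + \frac{\ln 4}{a_i} \approx b_i + \frac{1.386}{a_i}. \]

In other words, an item is located at the proficiency level where a respondent has an 80 percent, rather than a 50 percent, chance of success, which shifts item placements upward relative to their difficulty parameters.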