A small-scale feasibility study was conducted in the USA and the Netherlands,
in which 44 additional items were tested on samples of about 55 cases in each country.
Statistical analyses conducted at this stage were quite similar to those used during the
first feasibility study, although the smaller sample prevented the use of IRT analysis.
Out of all the items tested in the two feasibility studies conducted in Stage 1 and
Stage 2, a pool of 81 items was selected for further testing at the Pilot stage. These
items had adequate psychometric characteristics, covered diverse levels of difficulty,
encompassed key facets of the conceptual framework for Numeracy, and could be adapted
without difficulty to a language other than English and to different units of measurement.
These items showed no notable error patterns suggesting that stimuli, questions,
or task contexts had been misunderstood. In addition, two easy Numeracy items were
selected to become part of the "Core", the screener test that each respondent must
pass in order to receive one or more full-length test booklets with items assessing
performance in the ALL skill domains.
7.3 Stage 3 (2001-2003): Preparations for pilot testing in participating countries
Scoring materials. Once items for the Pilot stage were selected, the Numeracy team
prepared detailed scoring guidelines for each item, as well as a training manual for
scorers. In addition, the team prepared a detailed manual describing "critical elements"
of each item that should be kept constant during the translation and adaptation work
that came next.
Translation and adaptation for Pilot. The 81 items selected at the end of Stage 2
were translated and adapted by all participating countries, sometimes into multiple
languages within the same country (e.g., Canada created English and French versions,
and Switzerland created German, French, and Italian versions). The translation process
was supported by training workshops held in 2000 and 2001 and by materials
prepared by the Numeracy team, i.e., the manual describing critical elements of each
item that should be kept constant across language versions, and a "translation and
adaptation manual". Item adaptation aimed to maintain cognitive equivalence of
task demands across versions. Thus, for example, when units of measure or monetary values in
some computational items were adapted to various country situations, guidelines
emphasized the need to keep item demands comparable.
Further, each country not only prepared country-specific and language-specific
sets of items, stimuli, and response pages, but also adapted the scoring instructions and
the training manual for scorers. During this stage, all translators could also post questions
by e-mail to a hotline staffed by members of the Numeracy team.
The next section provides an overview of the results from the Pilot phase and
further explanations regarding the properties of the final 40 items selected for the
Main assessment.