3. Measurement of problem solving
There are at least three potential sources for the design of problem-solving tests: tasks
used in psychological research, domain-specific problem-solving tasks, and tasks used
in previous large-scale assessments of cross-curricular or practical problem solving. These
three possibilities are examined in the following sections.
For the ALL study, candidate tasks must meet several requirements. They should tap
broad analytical problem-solving abilities and, in this respect, be theoretically sound.
They should furthermore be embedded within a real-life context that is realistic
enough to trigger genuine rather than artificial problem-solving processes, but that
does not make specialized knowledge a prerequisite. Finally, they should show adequate
psychometric properties and be compatible with the constraints imposed by a
large-scale assessment.
3.1 Tasks used in psychological research on problem solving
During the 20th century, psychological research on problem solving concentrated
on a few experimental paradigms. For example, the famous radiation problem in
cancer therapy (Duncker, 1945), the water-jug problems (Luchins, 1942), the
"Tower of Hanoi" (Newell and Simon, 1972) and its analogues, Wason's rule
induction task (Wason, 1966), traveling-salesman problems, and cryptarithmetic
problems were used again and again in experimental settings (cf. Anderson, 1999).
In addition to these puzzle-like problems, psychologists used knowledge-rich tasks
such as chess games, geometry problems, algebraic word problems, mechanical
reasoning, or computer programming. In the European tradition of problem-solving
research, computer simulations of various economic or ecological scenarios were
introduced as a means of investigating human behavior in ill-defined, dynamic,
intransparent, and complex problem situations (see Frensch and Funke, 1995).
Thus, one possible strategy for the design of problem-solving assessment
instruments could be to implement one or more of these paradigms.
However, the tasks used in experimental research a) are often well known to the
general public, b) are not appropriate for large-scale assessment, and c) are not
tailored to the lives and experiences of the target population. Thus, the challenge
would be to adapt these tasks, transform them into appropriate test formats, and
contextualize them in a way that is meaningful to subjects across participating
countries. The heterogeneity of the tasks raises another problem: mixing, for
example, a Tower-of-Hanoi-like problem with an "insight" problem would most
probably yield a test with low internal consistency and unknown validity.