3.2 Domain-specific problem-solving tasks

Problem solving can occur in any domain, and there are abundant domain-specific problem-solving tests, especially in educational and vocational research. The most interesting tests are those that use innovative formats, such as the "Clinical Reasoning Test" (Boshuizen et al., 1997), which is based upon case studies in patient management; the "Overall-Test" of complex, authentic decision-making in business education (Segers, 1997); or the "What if Test", which measures intuitive knowledge acquired in exploratory simulations of science phenomena (Swaak and de Jong, 1996). For science, Baxter and Glaser (1997) provide a systematic approach to performance assessment tasks, allowing for an analysis of cognitive complexity and problem-solving demands.

Within the domain of mathematics, there is a long tradition of research on problem-oriented thinking and learning (Hiebert et al., 1996; Schoenfeld, 1992) and on related assessment strategies (Charles, Lester and O'Daffer, 1987; for an integrated discussion from an educational, cognitive-psychological and measurement perspective, see Klieme, 1989). Collis, Romberg and Jurdak (1986), for example, developed a "Mathematical Problem-Solving Test" that uses so-called "super-items", each composed of a shared stem followed by a sequence of questions addressing increasing levels of cognitive complexity (a hypothetical illustration is sketched below). Since the seminal work by Bloom and colleagues (Bloom, Hastings and Madaus, 1971), there have been various attempts at differentiating levels of task complexity, a more recent example being the SOLO taxonomy (Collis et al., 1986). It is interesting to note that the earlier literature on taxonomies of learning objectives and related tasks did not contain any category like "problem solving", because Bloom and his colleagues conceptualized problem solving as an integration of all the levels they proposed (reproduction, understanding, application, and so on).
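To make the super-item format concrete, the sketch below models a super-item as a shared stem followed by questions ordered by cognitive level, and credits the highest consecutive level answered correctly. This is a minimal illustration under stated assumptions: the item content, the SOLO-style level labels and the consecutive-credit scoring rule are chosen for the example and are not necessarily the exact design used by Collis, Romberg and Jurdak.

```python
# A hypothetical model of a "super-item": one shared stem followed by
# questions ordered by increasing cognitive complexity (SOLO-style levels).
from dataclasses import dataclass

@dataclass
class SuperItem:
    stem: str          # the shared problem situation
    levels: list[str]  # question labels, lowest complexity first

def score(responses: list[bool]) -> int:
    """Credit the highest consecutive level answered correctly (an assumed rule)."""
    level = 0
    for correct in responses:
        if not correct:
            break
        level += 1
    return level

item = SuperItem(
    stem="A machine doubles any number fed into it.",
    levels=["unistructural", "multistructural", "relational", "extended abstract"],
)
# A student answering the first two questions correctly scores 2:
print(score([True, True, False, False]))  # -> 2
```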

3.3 Tasks used in previous large-scale assessments

Recently, several attempts have been made to implement measures of cross-curricular problem solving in large-scale assessments. A general test of cross-curricular competence developed by Meijer and Elshout-Mohr (1999) is based on critical thinking inventories; the test is promising but measures quite heterogeneous concepts.

Sternberg's practical cognition test and a problem-solving test developed by Baker (1998), O'Neil (1999) and colleagues at the Center for Research on Evaluation, Standards and Student Testing (CRESST) in the United States have also been piloted in large-scale assessments. In the pilot version of Sternberg's test, respondents are presented with a list of problem situations related to the workplace or everyday life and asked to choose one of several possible solutions to each. The respondent's answer pattern is then compared with the average pattern in his or her culture, and the degree of agreement between the respondent's choices and those of a nationally representative sample is regarded as an indicator of the respondent's "common sense" (one way of operationalizing this scoring is sketched below). This means that there is no "right" or "wrong" solution to the problems.

The CRESST problem-solving test is based on a framework that defines domain-dependent strategies, meta-cognition, content understanding and motivation as components of problem solving. Note that only the first two of these are understood as aspects of problem-solving competence in the present framework, whereas the third and fourth are regarded as prerequisites that need to be measured independently. To assess strategies, the CRESST authors confronted respondents with information on a technical device (a tire pump) or a similar biological system, described it as malfunctioning, and asked them to think about trouble-shooting actions. To assess content understanding, subjects were asked to explain how the device works by drawing a "knowledge map". Several field trials showed that the instrument is feasible in principle, although its difficulty (fewer than 25% of the adults were able to solve the trouble-shooting problem) and its reliability were not sufficiently convincing.
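The agreement-based scoring described above for Sternberg's pilot test can be illustrated with a short sketch. Assuming, for the sake of the example, that agreement is computed against the modal (most frequent) choice of the national sample for each situation (the actual procedure may have used a different comparison), the "common sense" indicator is simply the proportion of matching choices:

```python
# Illustrative sketch of agreement-based scoring; variable names and the
# use of a modal comparison are assumptions, not Sternberg's exact method.
from collections import Counter

def modal_choices(sample: list[list[int]]) -> list[int]:
    """Most frequent option chosen for each situation in the national sample."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*sample)]

def common_sense_score(respondent: list[int], sample: list[list[int]]) -> float:
    """Proportion of situations where the respondent matches the modal choice."""
    modal = modal_choices(sample)
    matches = sum(r == m for r, m in zip(respondent, modal))
    return matches / len(modal)

# Three sampled respondents' option choices across four problem situations:
national_sample = [[0, 2, 1, 1], [0, 2, 1, 3], [1, 2, 1, 1]]
print(common_sense_score([0, 2, 0, 1], national_sample))  # -> 0.75
```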

Trier and Peschar, as part of their work for OECD Network A (OECD, 1997), analyzed problem solving as one of the important cross-curricular competencies. They constructed an item to measure skills in written communication, in which the respondent is asked to plan a trip for a youth club; this essay-like planning task is based on "in-basket" documents. The task proved to be too difficult for the target population, and only low levels of objectivity were achieved in scoring the answers.