The survey’s national base sample, provincial top-up samples to the base, and supplementary samples related to age could have been selected from short-form Census households, but the long-form data were required to identify the remainder of the special subpopulations. In the case of the minority language samples, the quality of the long-form responses is judged to be superior to that of the short form. The presence of questions on the knowledge and use of languages, in addition to the question on mother tongue (language first learned and still understood), provides respondents with more opportunities to characterize their linguistic profile properly.

Sample design

A stratified multi-stage probability sample design was used to select the sample from the Census Frame. The sample was designed to yield separate samples for the two official languages, English and French. In addition, the sample size was increased to produce estimates for a number of population subgroups. Provincial ministries and other organizations sponsored supplementary samples to increase the base or to target specific subpopulations such as youth (ages 16 to 24 in Quebec and 16 to 29 in British Columbia), adults aged 25 to 64 in Quebec, linguistic minorities (English in Quebec and French elsewhere), recent and established immigrants, urban Aboriginal peoples, and residents of the northern territories.

In each of the 10 provinces, the Census Frame was further stratified into an urban stratum and a rural stratum. The urban stratum was restricted to urban centers of a particular size, as determined from the previous census. The remainder of the survey frame was delineated into primary sampling units (PSUs) by Statistics Canada’s Generalised Area Delineation System (GArDS). Each PSU was created to contain a sufficient population, measured by the number of dwellings, within a limited and reasonably compact area. In addition, a general indication of the education level of the population from the 1996 Census was incorporated to create PSUs that reflected the educational distribution of their province.

A second, implicit stratification was used in the systematic selection of households for each sample. The highest level of education of each adult in the household, as recorded in the Census Frame, was used to determine the household's dominant class among four broad levels: 1) less than high school, 2) high school graduate or some post-secondary education, 3) college graduate, and 4) university graduate. Formal educational attainment is not the only determinant of performance in evaluations of literacy, but it is the main one (OECD and Statistics Canada, 2000). Ordering the households by education within geographic regions before sample selection increased the ability to represent a range of educational backgrounds.
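The effect of ordering households before a systematic draw can be illustrated with a minimal Python sketch. It is not the survey's actual selection program: the frame, the field names (`region`, `hh_id`, `education`), and the function `systematic_sample` are hypothetical, and the education codes 1 to 4 simply stand for the four broad levels listed above.

```python
import random
from collections import Counter


def systematic_sample(ordered_units, n, seed=None):
    """Draw n units from an ordered list using a random start and a fixed interval."""
    if not 0 < n <= len(ordered_units):
        raise ValueError("n must be between 1 and the number of units")
    interval = len(ordered_units) / n
    start = random.Random(seed).random() * interval
    last = len(ordered_units) - 1
    # min() guards against a floating-point result landing just past the end.
    return [ordered_units[min(int(start + i * interval), last)] for i in range(n)]


# Hypothetical frame: one region, 100 households at each of the four education levels.
frame = [{"region": "R1", "hh_id": i, "education": 1 + i % 4} for i in range(400)]

# Implicit stratification: sort by region, then by education, before selecting.
ordered = sorted(frame, key=lambda hh: (hh["region"], hh["education"]))

sample = systematic_sample(ordered, n=40, seed=1)
counts = Counter(hh["education"] for hh in sample)
print(counts)
```

Because the fixed interval steps evenly through the sorted list, each education level in this toy frame contributes 10 of the 40 selected households, which is the proportional spread that the ordering is intended to encourage.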

The sample was allocated among strata using a Neyman allocation, incorporating a conservative design effect of 2 for the rural stratum and 1.5 for the urban stratum. After allocation, it became apparent that several PSUs in the rural strata were sufficiently important that they were effectively being sampled with certainty. These PSUs were moved to a new pseudo-urban stratum, which was treated similarly to the urban stratum in terms of sample selection.
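A Neyman allocation assigns sample to stratum h in proportion to N_h × S_h, where N_h is the stratum population and S_h the stratum standard deviation of the variable of interest. The text does not say how the design effects entered the calculation, so the sketch below makes an assumption: each stratum variance is multiplied by its design effect, which scales S_h by the square root of the design effect. The function name and all population figures are illustrative only.

```python
import math


def neyman_allocation(total_n, strata):
    """Allocate total_n across strata in proportion to N_h * S_h * sqrt(deff_h).

    strata maps a stratum name to a tuple (N_h, S_h, deff_h).
    Scaling S_h by sqrt(deff_h) is an assumption about how the design
    effect is incorporated; it is not taken from the survey documentation.
    """
    weights = {
        name: size * std_dev * math.sqrt(deff)
        for name, (size, std_dev, deff) in strata.items()
    }
    total_weight = sum(weights.values())
    return {name: total_n * w / total_weight for name, w in weights.items()}


# Hypothetical province: population counts and standard deviations are made up;
# only the design effects (1.5 urban, 2 rural) come from the text.
example_strata = {
    "urban": (60_000, 50.0, 1.5),
    "rural": (40_000, 50.0, 2.0),
}
allocation = neyman_allocation(1000, example_strata)
for name, n_h in allocation.items():
    print(f"{name}: {n_h:.1f}")
```

With equal standard deviations, the larger rural design effect shifts the rural share of the sample above its 40 percent share of the population in this example.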

As a final step before sample selection, the negotiated sample sizes were inflated to account for an international target minimum response rate of 70 percent and for mobility in terms of the characteristics of interest for each subpopulation covered by a supplementary sample. A blended mobility rate was calculated using the reported 1-year and 5-year mobility variables from the Census as proxy variables, and was applied to the time lag between the Census and the start of collection in March 2003. These rates were adjusted downward in each stratum to reflect the expected replacement of movers by others with the same target characteristics for each supplementary sample.
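The arithmetic of this inflation step can be sketched as follows. The exact blending formula is not given in the text, so the code assumes a linear interpolation between the 1-year and 5-year mobility rates, and it assumes the negotiated size counts completed respondents. The function names, the mobility rates, the replacement share, and the 1.8-year lag are all hypothetical; only the 70 percent response rate comes from the text.

```python
import math


def blended_mobility(rate_1yr, rate_5yr, lag_years):
    """Interpolate linearly between the 1-year and 5-year mobility rates (assumed blend)."""
    if not 1.0 <= lag_years <= 5.0:
        raise ValueError("lag_years must lie between 1 and 5")
    return rate_1yr + (rate_5yr - rate_1yr) * (lag_years - 1.0) / 4.0


def inflate_sample_size(target_n, response_rate, mobility_rate, replacement_share=0.0):
    """Inflate a target number of respondents for nonresponse and mobility losses.

    replacement_share is the assumed fraction of movers replaced by others
    with the same target characteristics; it lowers the effective loss.
    """
    effective_loss = mobility_rate * (1.0 - replacement_share)
    raw = target_n / (response_rate * (1.0 - effective_loss))
    # Subtract a tiny tolerance so floating-point noise does not add a unit.
    return math.ceil(raw - 1e-9)


# Illustrative values only.
mobility = blended_mobility(rate_1yr=0.14, rate_5yr=0.42, lag_years=1.8)
selected = inflate_sample_size(
    target_n=700, response_rate=0.70, mobility_rate=mobility, replacement_share=0.25
)
print(round(mobility, 3), selected)
```

Setting `replacement_share` above zero plays the role of the downward adjustment described above: the more movers are expected to be replaced by similar households, the smaller the inflation needed.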