Is there any alternative to learning by rules?
One model of possible circuitry is the ‘pattern associator’. If learning is associative, and if processing really is parallel and distributed, so that enormous quantities of information can be handled independently, and in different ways, at different sites at the same time, then it is possible that we learn everything simply in terms of associated patterns, and that we very easily learn very large numbers of such associations and patterns of associations. After a little pattern learning the mind will begin to behave in a ‘rule-bound’ way: to behave as if it knew, and consulted, rules before acting. In the scenario sketched here this is an illusion, as there are no rules in there at all. Learning is possibly simply the arrangement of connections between cells into pattern associator nets. This ‘...allows a network of simple units to act as though it knew the rules’ (Rumelhart & McClelland 1986, p.32, their emphasis).
So here we all perhaps are, a colossal collection of pattern associators in a vast and elaborately interconnected network of networks. Everything we have learned has been built into the circuitry as we learned it, without the need for a single rule to be specified. In fact it is much simpler and more robust for not having any rules to which the associators need refer. It is also more flexible, much better able to cope with a new idea than a rule-based system would be. It can readily assign a likely meaning even to imperfect inputs. This is a stunning, shocking notion which, once you are used to it, is profoundly elegant, even luxurious. We operate with so many pattern associators that we produce behaviour apparently meticulously controlled by rules, with absolutely no knowledge of any such thing. Nowhere in the system need any spelling rules be described in order for us to produce excellent spelling! Since pattern associators are able to do their associating even when input is imperfect, we can even read, for example, handwriting or misspellings without difficulty or the need to revise any previously learned ‘rules’.
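The idea of an associator that holds no rules, yet still copes with imperfect input, can be made concrete with a toy example. What follows is only a minimal sketch, assuming a simple Hebbian (outer-product) associator over +1/-1 patterns; it is not claimed to be the specific network Rumelhart & McClelland describe, and the patterns, sizes and function names (`train`, `recall`) are invented purely for illustration.

```python
# Toy linear pattern associator (illustrative assumption, not the cited model).
# Learning is nothing more than strengthening connections between co-active
# input and output units; no rule is stored anywhere.

def train(pairs):
    """Build a weight matrix by Hebbian learning: w[i][j] += output_i * input_j."""
    n_in = len(pairs[0][0])
    n_out = len(pairs[0][1])
    weights = [[0] * n_in for _ in range(n_out)]
    for inputs, outputs in pairs:
        for i in range(n_out):
            for j in range(n_in):
                weights[i][j] += outputs[i] * inputs[j]
    return weights

def recall(weights, inputs):
    """Each output unit sums its weighted inputs and fires +1 or -1."""
    result = []
    for row in weights:
        total = sum(w * x for w, x in zip(row, inputs))
        result.append(1 if total >= 0 else -1)
    return result

# Three arbitrary input patterns (mutually orthogonal) and their associated outputs.
pairs = [
    ([1, 1, 1, 1, -1, -1, -1, -1], [1, -1, 1, -1]),
    ([1, 1, -1, -1, 1, 1, -1, -1], [1, 1, -1, -1]),
    ([1, -1, 1, -1, 1, -1, 1, -1], [-1, 1, 1, -1]),
]

weights = train(pairs)

# A 'misspelled' version of the first input: one element is wrong.
noisy = [-1, 1, 1, 1, -1, -1, -1, -1]
print(recall(weights, pairs[0][0]))  # clean input
print(recall(weights, noisy))        # imperfect input
```

In this toy setting both calls print the output pattern associated with the first input, and the same holds for any single corrupted element of any of the three inputs. The network behaves as though it 'knew' which output goes with which input, yet all it contains is a table of connection strengths.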
And, of course, it turns out that we all ‘knew’ this anyway. We all, for example, produce language of complexity and grace, and acknowledge that we do this without the slightest knowledge of any of the ‘rules’ of grammar or syntax. We are all, more or less dimly, aware that grammar is complicated and ramifies into the distance almost as far as the eye can see. So it is with spelling rules. We are, initially, shocked by the idea that we might not know any rules at all. But perhaps, having some idea how Byzantine and exception-ridden English spelling rules are, we ought to have been more astonished at the idea that we could learn, and so easily apply, a system so complex by deploying rules. Is that idea any less outrageous than the idea that we learn by simply wiring up little pattern associator circuits (albeit in large numbers) completely innocent of rules? And pattern associators have one enormous advantage: they do their work in our subconscious; we do not have consciously to ‘think’ in order to operate them. Indeed, perhaps the more we try to think about them, to influence them, the less well they perform.
While learned pattern association goes on in the unfathomable and practically unlimited subconscious, thinking about rules has to take place in the hectic and very limited conscious mind. If there is the slightest chance of getting the subconscious to look after any process for us we should gratefully seize it. (This is, of course, precisely what practice achieves.) To learn a rule before the patterns to which it will apparently apply have been ‘overlearned’, simply as patterns, to subconscious standard, is to risk those patterns forever thereafter having to be referred up to conscious thought for decision and execution whenever they occur. A student who is taught spelling rules before patterns has, in effect, their capacity ‘just to write’ spiked.
This is the principle on which the best conversational language courses now function and we should take a leaf from their book. Modern courses deliver grammatical fluency astonishingly quickly and securely. They do this, paradoxically, by eschewing any teaching, indeed any mention, of grammar per se. Grammar is introduced stepwise and logically, but covertly. Typically, a session will be built around a single, but unmentioned, grammatical reality (a particular tense for example) and the ostensible targets of tuition - vocabulary, idiom, intonation etc. - presented, and practised, within the setting of that grammatical reality. Students learn, very rapidly and very dependably, how the particular grammatical construction they didn't realise they were learning ‘feels’ or ‘tastes’ in the new language. Job done, at this point. Such internalised learning ‘makes sense’; it will stick. It will also, of course, remain painlessly, yet absolutely reliably, in the subconscious.