Over the next few days, I will be posting three articles on this blog as a short series. The series examines an important problem in education: how do we find out what individuals are capable of, and what they are learning? I will look at three aspects of this problem. First I will discuss assessing intelligence, then I'll move on to assessing learning, and finally I will look at assessing aptitude. It's ambitious to attempt this in blog posts, because these subjects normally fill entire library sections, so please treat the posts simply as a short introduction to each topic, a kind of taster.
I’m going to start with a brief investigation into IQ tests. First of all I will describe the main types of intelligence test that have been in common use over the last century, in chronological order of development. I will also discuss the uses and limitations of IQ tests of western origin.
It's widely recognised that measuring intelligence is controversial, as tests can be culturally specific. In other words, how well you do on a test can depend on how similar your background is to that of the people who set it. As most psychologists in the early part of the 20th century were white and middle class, their tests were often based on ideas and situations they would have found familiar, in a US or Northern European context. This disadvantaged non-native speakers, those from a challenging educational background, and those from outside the US or Europe. Although there have been attempts to remedy this by making tests as culturally neutral as possible, it is still difficult in some cases to get to the absolute core of what constitutes intelligence. Today we are going to look at some of the debates surrounding that.
Starting with the Western psychological context, we tend to use IQ as a measure of intellectual potential (IQ stands for "intelligence quotient"): a figure representing your performance on particular timed tests relative to your age. This concept of the test being fixed in time, both in terms of the clock and in terms of biological age, remains central to assessing intelligence.
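To make the arithmetic concrete: the original "quotient" was literally mental age divided by chronological age, multiplied by 100, while modern tests instead use a deviation IQ, rescaling performance against the norming sample's mean (set to 100) and standard deviation (set to 15). A minimal sketch in Python, with invented sample figures:

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Historical ratio IQ: (mental age / chronological age) * 100."""
    return 100.0 * mental_age / chronological_age

def deviation_iq(raw_score: float, mean: float, sd: float) -> float:
    """Modern deviation IQ: z-score rescaled to mean 100, SD 15."""
    z = (raw_score - mean) / sd
    return 100.0 + 15.0 * z

# A ten-year-old performing like a typical twelve-year-old:
print(ratio_iq(12, 10))          # 120.0
# A raw score one standard deviation above the norming mean:
print(deviation_iq(65, 50, 15))  # 115.0
```

The numbers here (a norming mean of 50, SD of 15) are purely illustrative; real tests publish their own norming tables.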
The first widely used intelligence test was the Binet-Simon test (1905), used by the French government to help identify children who would need help at school. As explained above, it assessed what "average" performance would be, and calculated a child's "mental age" accordingly. By 1916 it had been revised (as the Stanford-Binet) and extended to include adults.
The next significant development in intelligence testing came from David Wechsler, beginning with the Wechsler-Bellevue scale (1939), which evolved into the Wechsler Adult Intelligence Scale (WAIS, 1955) and the Wechsler Intelligence Scale for Children (WISC, 1949). These included both verbal and non-verbal test items, such as the manipulation of blocks and pictures. It was felt that non-verbal items would help make the tests more culturally neutral.
Clearly there are particular difficulties in assessing the intelligence of very young children, and to this end the Bayley Scales of Infant Development were developed in 1969 for children under two. These are used in adapted form by health visitors today, for example at children's two-year checks; a typical task might be building a tower from three cubes.
There has also been widespread use of another test, the British Ability Scales (BAS) (1979). These were designed to measure development and moral reasoning, and to be less US-centric.
A further factor in the development of IQ tests is that they have become increasingly complex over time, and the kits for psychologists are very expensive indeed – upwards of £600 now. A test does not just consist of a cheaply printed question sheet and a marking sheet. There is big money in developing tests, which are commercially produced by monopoly providers. An important consequence of this expense is that in some cash-strapped educational or health-related contexts, pieces of the kit get lost or worn, parts are transferred from one kit to another, and photocopies are made. All of this means that test administration is not as strictly controlled as it should be, with consequences for the results – despite the British Psychological Society requiring testers to attend accredited courses on best practice in test delivery, to ensure standardisation. Real life, in short, is a factor in how well a test is given and how accurate its results might be.
So we have considered a number of IQ test materials and practices. I'm going to move on now to examine how we judge the predictive ability of IQ tests. There are three key questions for psychologists in determining how useful a test is. First, is the test reliable – does it give consistent scores if repeated over time? Second, is the test valid – does it correlate with future academic achievement? And finally, are test scores stable – does the IQ of individuals change over time? In many cases the answers are favourable, but as I argued above, there have been problems with IQ tests in the past. One of these is cultural discrimination. As I explained, you are at a distinct advantage if you happen to be a white American or European. Even within that context, the British Ability Scales were introduced to counter the US-centrism of tests such as the Wechsler Intelligence Scale for Children. There has been particular criticism of the use of tests with indigenous populations, for example Aboriginal Australians: low scores were used as grounds for persistent discrimination, yet it was subsequently discovered that the tests did not measure intelligence sufficiently well for these groups.
Another source of test bias was the risk of discriminating according to the educational background of the test taker. In other words, if you had attended school, could read well, and had learnt how to tackle abstract problems in a systematic manner, then you were at a distinct advantage over those who had not benefited from such experiences. The environment in which people take the tests can also play a role. Tests are designed to be done under laboratory conditions, and noise or interruptions can reduce the overall score. Similarly, the mood and motivation of test takers plays a role. I am sure that every health visitor in the land can recall a two-year-old who has refused to co-operate with testing at his or her two-year check. (One of my own children did this, and consequently spent the entire appointment carrying out a comparison of all the different weighing scales in the room to see if they came up with the same reading. So much more interesting than building a little tunnel out of three blocks and pushing a pencil under it!) Another phenomenon in all kinds of psychological testing is the desire to please an authority figure by second-guessing the expected answers. This may lead the person being tested to give a wrong answer they would not otherwise have given. Finally, we must bear in mind the effect of coaching.
W H Smith is absolutely full of books containing IQ tests for middle-class parents to buy for their children, which explicitly teach the techniques needed for success in analysing problems. Again, this is bound to affect results and the overall reliability and validity of tests. The availability of test coaching materials may be one factor in why IQ results seem to rise over time. This is not because of some evolutionary change making us all cleverer – it is much too quick for that. It is because we are all getting better at doing the tests. That is why test manufacturers have to keep renorming them, so that the average score of 100 corresponds to a higher number of correct answers.
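Renorming works by re-basing raw scores on a fresh standardisation sample, so that average performance in that sample always maps back to an IQ of 100. A toy sketch, with invented raw scores standing in for the old and new norming samples:

```python
from statistics import mean, stdev

def renorm(raw_scores):
    """Return a function mapping a raw score to a deviation IQ
    (mean 100, SD 15) based on this norming sample."""
    m, s = mean(raw_scores), stdev(raw_scores)
    return lambda raw: 100.0 + 15.0 * (raw - m) / s

# Hypothetical 1980s norming sample vs a later, better-coached one:
old_norms = renorm([38, 42, 45, 48, 52, 55, 58, 62])
new_norms = renorm([44, 48, 51, 54, 58, 61, 64, 68])

# The same raw score of 55 looks above average on the old norms,
# but roughly average on the new ones:
print(round(old_norms(55)))  # 109
print(round(new_norms(55)))  # 98
```

The samples are far too small to be realistic, of course; real standardisation samples run to thousands of test takers, but the principle is the same.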
Many tests work on the assumption that if you are intelligent in one respect, this is likely to apply across different cognitive domains. For example, the British psychologist Charles Spearman (1863-1945) described a concept he referred to as "general intelligence", or the g factor. After using a technique known as factor analysis to examine a number of mental aptitude tests, Spearman concluded that scores on these tests were strongly correlated: people who performed well on one cognitive test tended to perform well on others, while those who scored badly on one test tended to score badly on others. He concluded that intelligence is a general cognitive ability that can be measured and numerically expressed.
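Spearman's observation, sometimes called the "positive manifold", is simply that scores on different cognitive tests correlate positively with one another. Here is a small illustration with invented scores for five test takers on three tests, using a hand-rolled Pearson correlation:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented scores for five test takers on three cognitive tests:
verbal  = [55, 62, 48, 70, 65]
spatial = [50, 60, 45, 68, 66]
numeric = [58, 64, 50, 72, 63]

# All pairwise correlations come out strongly positive --
# Spearman's positive manifold:
print(round(pearson(verbal, spatial), 2))  # 0.98
print(round(pearson(verbal, numeric), 2))  # 0.97
print(round(pearson(spatial, numeric), 2)) # 0.92
```

Spearman's factor analysis goes a step further, extracting the single factor (g) that best accounts for this shared variance, but the correlations are the raw material it works from.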
However, individuals can display varying degrees of intelligence depending on what they are trying to do. I may be able to write this blog post, for example, but ask me to navigate around IKEA to find a remote flatpack, and you will see someone who needs serious help.
In evolutionary studies, the debate is whether g evolved as a multipurpose tool or whether the mind has domains, like a ‘Swiss army knife’, with cognitive ‘tools’ evolving to answer specific challenges. This does not map exactly onto other theories of multiple intelligence, as we will see in a minute, but it does tend to overlap. This is all within the ongoing dog-fight in cognitive psychology: is the mind domain-general in function or not? Is my inability to cope with IKEA’s store layout a function of the quality of my whole brain, or just a particular part of it that isn’t quite up to the job?
When I am not worrying about flatpack retrieval, another area of forty-something personal concern is how far I am turning into my own parents. Clues as to the likelihood of this can be found in research on the heritability of IQ and other test scores. This is of course a minefield. Medium to high heritabilities (written h² by academics in this field) are generally found in twin and adoption studies, and the heritability of g appears to increase over the lifespan: around 20% in infancy, 40% in childhood and 60% in adulthood (for more, see Plomin et al. (2003) Behavioral Genetics in the Postgenomic Era, APA, Washington DC). More interesting than heritability itself is the influence of shared and non-shared environment on g. This is hard to interpret, as it is becoming increasingly clear that there is hardly any such thing as a "shared environment", even for siblings, and measurement error also falls into this pot. As a result of these doubts, the heritability of g is no longer the main focus of research. Instead, scientists are more interested in how genes and environment interact, as it becomes clear (as in psychology generally) that the response to environmental factors depends on genetic predisposition. Shared environment appears to have less influence after adolescence as h² increases (so we really do turn into our mothers!)
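For readers wondering where such heritability figures come from: the classic approach compares identical (monozygotic) and fraternal (dizygotic) twin correlations. Under Falconer's formula, heritability is h² = 2(rMZ − rDZ), shared environment is c² = 2rDZ − rMZ, and the remainder is non-shared environment plus measurement error. A sketch with illustrative, invented twin correlations:

```python
def falconer(r_mz: float, r_dz: float):
    """Falconer's decomposition from twin correlations:
    returns (heritability h2, shared environment c2,
    non-shared environment + error e2)."""
    h2 = 2 * (r_mz - r_dz)  # heritability
    c2 = 2 * r_dz - r_mz    # shared environment
    e2 = 1 - r_mz           # non-shared environment + error
    return h2, c2, e2

# Invented round figures in the region reported for adult IQ:
h2, c2, e2 = falconer(r_mz=0.80, r_dz=0.50)
print(round(h2, 2), round(c2, 2), round(e2, 2))  # 0.6 0.2 0.2
```

These particular correlations are assumptions chosen to land near the ~60% adult heritability mentioned above; the formula also relies on simplifying assumptions (equal environments, no gene-environment interaction) that are themselves contested, which is exactly the minefield described in the text.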
There are a number of alternative views of intelligence in addition to Spearman's, as I mentioned previously. Perhaps the most widely recognised is that of Howard Gardner, who argued that there are multiple intelligences. The theory was first laid out in Gardner's 1983 book, Frames of Mind: The Theory of Multiple Intelligences, and has been refined in subsequent years. The intelligences he proposed are as follows:
- Linguistic (reading, writing, speaking, listening)
- Logical-mathematical (numerical skills)
- Spatial (driving, playing chess)
- Musical (singing, playing an instrument)
- Bodily kinaesthetic (dance, athletics)
- Interpersonal (understanding others)
- Intrapersonal (understanding self)
In a sense, this idea of multiple intelligences rather resembles Thurstone's multifactor theory of seven primary mental abilities (in the manner of personality traits), which dates from 1938, so it wasn't entirely new. However, Gardner argued that intelligence defined in the manner I have described fails to take into account all the different aspects of ability present in humans. For example, a child who can memorise multiplication tables easily (in the manner approved of by the Victorians) is not necessarily more intelligent than one who struggles. The second child may be stronger in another kind of intelligence, and may indeed have higher potential mathematical intelligence than the one who simply memorises tables. This suggests that schools should take pains to identify different strengths and weaknesses among individual pupils, and tailor the curriculum accordingly, using a range of approaches to teaching.
Gardner's criteria for identifying a distinct intelligence were: case studies of individuals exhibiting unusual talents in a given field (such as child prodigies or autistic savants); neurological evidence of specialised areas of the brain (often including studies of people whose brain damage has affected a specific capacity); the evolutionary relevance of the various capacities; psychometric studies; and the existence of a symbolic notation (e.g. written language, musical notation, choreography). His theory has been heavily criticised on the grounds that there is little empirical evidence for separate intelligences, and because they pertain more closely to personality types than to any latent "intelligence". Psychologists have also pointed out that in other intelligence tests the different areas of intelligence correlate with each other, which makes it unlikely that someone could be outstanding in one area and in no others. Despite such criticism, the theory has been widely adopted by teachers, in the same way that schools were very quick to catch onto the idea of aural, visual and kinaesthetic learning styles, even though many such concepts lack empirical foundation and are not properly understood. They are, however, cheap, quick and easy to implement, and reinforce teachers' self-identity as socially equitable educators.
Moving on, the latest development in trying to understand and classify intelligence is probably Sternberg's triarchic theory of intelligence (1985). It prioritises the following aspects of the human condition:
- Mental mechanisms that underlie intelligent behaviour
- Adaptation to external environment via use of these mechanisms
- Role of life experience in linking internal and external worlds
It moves away from psychometric testing of individuals and towards a more cognitive approach, and was based on Sternberg's observations of graduate students. This approach has also been criticised, on the grounds that many of the mechanisms listed above are simply new descriptions of cognitive skills already tracked by existing tests – tests which are thought to correlate well with personal and professional success in middle age and beyond, suggesting they already have validity.
I’ll finish this blog post with an interesting quotation about the role of IQ in creating scientific success. I hope this disabuses you of any notion that your aspirations should be limited by any idea that you need a special level of IQ to achieve anything.
‘Even within science, IQ is only weakly related to achievement among people who are smart enough to become scientists. Research has shown, for example, that a scientist who has an IQ of 130 is just as likely to win a Nobel Prize as a scientist whose IQ is 180.’
Hudson, L. (1966) Contrary Imaginations: A Psychological Study of the English Schoolboy. London: Methuen, p. 104; cited in Sulloway, F. J. (1996) Born to Rebel: Birth Order, Family Dynamics and Creative Lives. New York: Pantheon, p. 357.