The Flynn effect is the sustained, year-on-year rise in IQ test scores, observed in most parts of the world, although at greatly varying rates. It is named after the New Zealand political scientist James Robert Flynn, who documented it. The Flynn effect is a perplexing phenomenon for those who believe that IQ tests are a true measure of human intelligence, as it would suggest that people today are in general considerably more intelligent than those of previous generations. Flynn himself does not believe this to be the case. It is conceivable that something about modern society (the greater need for abstract thinking, the presence of computers, and a more visually oriented culture) is responsible. Better nutrition has been proposed as a factor. However, there is evidence from Scandinavian countries that IQ scores rose even more, by 20 points per generation, following the austerity of occupation during World War II. Another possible explanation is that people are maturing faster, so that, for example, a ten-year-old today may have the mental age that a twelve-year-old had sixty years ago, although this too may ultimately be due to nutrition.
In 2001, William T. Dickens and James Robert Flynn presented a mechanism by which environmental effects on IQ may be magnified by feedback effects, in the paper “Heritability Estimates Versus Large Environmental Effects: The IQ Paradox Resolved,” published in Psychological Review. Colom et al. (2005) presented data supporting the nutrition hypothesis, which predicts that gains in IQ will occur predominantly at the low end of the distribution, where nutritional deprivation is most severe. Two large samples of Spanish children were assessed with a 30-year gap. Comparison of the IQ distributions indicated that 1) the mean IQ had increased by 9.7 points (the Flynn effect), 2) the gains were concentrated in the lower half of the distribution and negligible in the top half, and 3) the gains gradually decreased from low to high IQ. Possibly related to the Flynn effect is a change in cranial vault size and shape during the last 150 years in the US. These changes must occur by early childhood because of the early development of the vault. In the end, a number of varied phenomena may be contributing to the Flynn effect.
The Flynn effect is the increase in raw scores on IQ tests over time. When new norms are calculated for an IQ test, the number of items answered correctly by test-takers in the norming sample is typically higher than in previous norming samples. Similar improvements have been reported for other cognitive abilities, such as semantic and episodic memory. The effect has been observed in most parts of the world at different rates. Because an IQ of 100 is defined as the median score of the norming sample, current test-takers taking tests with older norms tend to have inflated scores. Standard IQ tests administered to successive generations have shown a sustained and roughly linear increase in average test performance. The test scores were normalized for every study conducted; normalization gives the average score for a particular group of people. The same test was then administered to the next generation and the normalized result was compared with that of the previous one. The results have confirmed higher intelligence quotient (IQ) scores. According to Flynn, these effects are due to a combination of factors that change drastically with each successive generation.
Each succeeding generation grows up with more stimulation for abstract thought, and hence develops a better ability to interpret and assimilate abstract ideas. This demands a great deal of thinking and reasoning from the average human brain. A simple example is technological advancement, which has undergone a sea change. A person now in his 40s had limited access to technological inventions, the web, or mobile communication in his childhood. In stark contrast, consider his son born in the 1990s, who is adept and comfortable using these advancements. Even though the son uses these technologies without conscious effort, his brain takes in more facts than his father’s did at the same age, and the average effort it puts into understanding a particular system is higher. This can be due to a variety of reasons, such as better nutrition, large-scale exposure to many concepts at a relatively tender age, interactive media and so on. The Flynn effect is said to be more evident in a rapidly developing country like India. The prior generation had relatively easier access to the country’s premier educational institutions, as the number of applicants for the seats was smaller. India’s economy, health facilities, exposure to new facets of development and various other such parameters have risen at a much faster pace in the past decade. This has created a huge demand for skilled professionals and an increased awareness among its burgeoning middle class of the importance of getting into premier institutes. Thus, although the intake has increased only negligibly, there has been an astounding rise in the number of students clearing the tests. This has happened even though the entrance tests have become more difficult, which suggests that the general level of ability has risen in this population.
The Flynn Effect is one of the most surprising, most intriguing, and potentially most important findings in the recent psychology research literature. Flynn (1984) used patterns in 73 studies to suggest the existence of “massive gains” in IQ in the US. He calibrated the level of gain at around 0.33 IQ points per year in the US from 1932 to 1978, an overall increase of around 15 IQ points over this period. Flynn (1987) followed with an analysis of IQ scores from 14 economically developed countries around the world, and found similar patterns that also supported IQ gains. These gains appear to reflect abstract problem-solving ability more than other intellectual abilities that involve learned material; patterns from Raven’s Progressive Matrices, a relatively culture-free IQ test that involves a great deal of problem solving, provide the strongest support for the Flynn Effect across countries. In fact, Flynn (1987) suggested that there may have been declines in abilities related to learned content, and that these have been suppressing rather than contributing to IQ gains. Thus, he suggests that gains in specific problem-solving abilities may be, if anything, even greater than those he documented for general IQ. To recap a little history, in 1984 James R. Flynn, a New Zealand political scientist, published a long paper showing that there had been large increases in mean IQ scores in the USA between 1932 and 1978. In 1987 he published a further major paper showing that the same trend could be observed in more than a dozen other developed countries. Flynn gathered test data from Europe, North America and Asia, around thirty countries in all, and found that, for as far back as data were available, average IQ test scores had risen about 3 points per decade, and in some cases more. Only recently, in some Scandinavian countries, do the gains appear to be levelling off (see, for example, Sundet 2004; Teasdale and Owen 2005).
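The arithmetic behind these figures is easy to check. The following is a minimal sketch in Python, assuming a constant linear rate of gain (a simplification); the function name `cumulative_gain` is illustrative and does not come from Flynn's papers. It shows that about 0.33 points per year over 1932 to 1978 accumulates to roughly 15 points, and that the same yearly rate is close to the 3 points per decade quoted above.

```python
def cumulative_gain(rate_per_year: float, start_year: int, end_year: int) -> float:
    """Total IQ-point gain between two years under an assumed constant yearly rate."""
    return rate_per_year * (end_year - start_year)


# Flynn (1984): about 0.33 points per year in the US, 1932 to 1978 (46 years).
us_gain = cumulative_gain(0.33, 1932, 1978)
print(round(us_gain, 1))      # 15.2, i.e. "around 15 IQ points"

# The same yearly rate expressed per decade.
print(round(0.33 * 10, 1))    # 3.3, i.e. roughly 3 points per decade
```

The linear model is only a convenient summary; the text itself notes that rates differ by country and test and may be levelling off in places.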
These two papers have stimulated a great deal of research and discussion, including at least one book. In 1994 Herrnstein and Murray’s The Bell Curve discussed the rising trend in IQ and said as follows: ‘We call it “the Flynn Effect” because of psychologist James Flynn’s pivotal role in focusing attention on it, but the phenomenon itself was noticed in the 1930s when testers began to notice that IQ scores often rose with every successive year after a test was first standardised’. Herrnstein and Murray’s term ‘the Flynn Effect’ has been generally adopted, for example by Arthur Jensen, who says ‘This upward trend in the population’s mean test scores has been aptly dubbed the “Flynn Effect”.’ The average rate of rise seems to be around three IQ points per decade. But before the effect is taken seriously by the community of social science researchers, its very existence should be beyond question. In other words, research addressing the legitimacy and meaning of the effect should precede research testing for and evaluating causes of the effect. To be sure, Flynn’s analysis was complete and painstaking, and his writing is clear and appropriately self-critical. In fact, he was very careful to self-evaluate the strength of his arguments; for example, in Flynn (1987) he classified how much support data from each country offered to the “Massive Gains” hypothesis, and also classified the legitimacy of his interpretations. However, I will argue that Flynn’s arguments contain methodological weaknesses of which he was unaware, and of which the community of researchers has not been sufficiently critical. Because his self-evaluation was also blind to these weaknesses, it tends to overstate the confidence we should have in the status of the Flynn Effect.
Certainly, scrutiny by independent investigators of the logic leading to the claim is necessary before we will be able to understand what the Flynn Effect is, and ultimately to identify what cause or causes lie behind it. Thus, the purpose of this critique is not to resolve the issue of the meaningfulness of the Flynn Effect or to specify the causes of the effect. Neither is the purpose to present extensive empirical analysis to provide further data or evidence concerning its legitimacy (although one suggestive empirical study will be presented). Rather, the purpose is to frame an approach to studying the Flynn Effect by defining a set of questions, criticisms and a research agenda. This critique opens discussion over what the nature of the Flynn Effect is, and of whether the Flynn Effect is real or a methodological artifact (or some combination). Other interpretations besides Flynn’s and the ones presented here certainly exist as well, and should also be subjected to logical and empirical scrutiny.
The Flynn effect is named for James R. Flynn, who did much to document it and promote awareness of its implications. The term itself was coined by the authors of The Bell Curve. The rise has been continuous and approximately linear from the earliest years of testing to the present. There are numerous proposed explanations of the Flynn effect and also some skepticism about its implications. The Flynn effect may have ended in at least a few developed nations, which could allow national differences in IQ scores to diminish if the effect continues in nations with lower average national IQs. There are many different kinds of IQ tests using a wide variety of methods. Some tests are visual, some are verbal, some consist only of abstract-reasoning problems, and some concentrate on arithmetic, spatial imagery, reading, vocabulary, memory or general knowledge. Early in the twentieth century, the psychologist Charles Spearman made the first formal factor analysis of correlations between the tests. He found that a single common factor accounted for the positive correlations among tests. This argument is still accepted in principle by many psychometricians. Spearman named the factor g, for “general intelligence factor.” In any collection of IQ tests, by definition the test that best measures g is the one that has the highest correlations with all the others. Most of these g-loaded tests involve some form of abstract reasoning. Spearman and others have therefore regarded g as the real, and perhaps genetically determined, essence of intelligence. This is still a common but unproven view. Other factor analyses of the data, with different results, are possible. Some psychometricians regard g as a statistical artifact. Raven’s Progressive Matrices, a test of visual reasoning, is widely accepted as the best measure of g.
Because children attend school longer now and have become much more familiar with the testing of school-related material, one might expect the greatest gains to occur on such school content-related tests as vocabulary, arithmetic or general information. Just the opposite is the case: abilities such as these have experienced relatively small gains and even occasional decreases over the years. The greatest Flynn effects occur instead for g-loaded tests such as Raven’s Progressive Matrices. For example, Dutch conscripts gained 21 points during only 30 years, or 7 points per decade, between 1952 and 1982. Some studies have found that the Flynn effect has not caused gains in g. This effect has been observed across cultures, although in varying degrees. Many of us have come across people who remark, or have noticed ourselves, that children nowadays seem more intelligent or quicker to absorb a new concept. This noticeable difference between generations has been documented scientifically by Flynn and several other researchers. Since then, the so-called “Flynn effect” has been confirmed by numerous studies. The same pattern, an average increase of over three IQ points per decade, was found for virtually every type of intelligence test, delivered to virtually every type of group. The pattern applied to some 20 countries for which data were available, including the USA, Canada and different European nations, although the rate of increase varied somewhat according to country and type of test. The increase was highest, 20 points per generation (30 years), in Belgium, Holland and Israel, and lowest, 10 points per generation, in Denmark and Sweden. Although the data are limited, it also seems that the increase is accelerating. In Holland, for example, scores went up most (over 8 points) for the last measured period, 1972 to 1982. For one type of test, Raven’s Progressive Matrices, Flynn found data that spanned a complete century. 
He concluded that someone who scored among the best 10% a hundred years ago would nowadays be categorized among the weakest 5%. That means that someone who would have been considered bright a century ago would now be considered a moron! Such a result has unexpected implications for the relation between intelligence and age. Older people tend to have lower scores on IQ tests than younger people. Until now, it was always assumed that this means that intelligence diminishes with age. However, this observation can equally be explained by noting that older people were raised in a period when the general level of intelligence was lower. Flynn showed that if people’s IQ is evaluated with tests calibrated for the period during which they grew up, an old person scores as well as a young one. The reason that older people do less well on IQ tests is not that they have become more stupid with age, but that the younger generation simply got a head start. One might expect that the Flynn effect would be clearer for tests that emphasize culture or education. The opposite is true, however: the increase is most striking for tests measuring the ability to recognize abstract, non-verbal patterns. Tests emphasizing traditional school knowledge show much less progress. This means that something more profound than mere accumulation of data is happening inside people’s heads. None of the scientists who have studied the effect can offer a simple explanation. Flynn himself admits that he is baffled by the results, and that he finds it hard to believe that his generation is significantly more intelligent than that of his parents. He proposes the following argument. Compared to the previous generation, the number of people who score high enough to be classified as “genius” has increased more than 20 times. This means that we should now be witnessing, in Flynn’s own words, “a cultural renaissance too great to be overlooked”. 
Because he finds this conclusion implausible, he suggests that what has risen is not intelligence itself but some kind of “abstract problem solving ability”. But if we look at the ever accelerating production of scientific discoveries, technological innovations and cultural developments in general, the “cultural renaissance” no longer seems such an absurd idea. And whether you call the factor that has risen “intelligence” or “abstract problem solving ability”, the conclusion that people have become intellectually more capable remains the same. IQ tests are re-normalized periodically to hold the average score for an age group at 100. This normalization gave Flynn a first indication that IQ was changing over time. The revised versions are standardized on new samples and scored with respect to those samples only. The only way to compare the difficulty of two versions is to have a group of people take both tests. Such comparisons confirm IQ gains over time. The average rate of rise seems to be around three IQ points per decade. Today, children go to school for a longer time. They have also become more familiar with testing. It might therefore be expected that the biggest gains occur with school-related content, such as vocabulary, arithmetic or general information. Just the opposite is the case: abilities such as these have experienced relatively small gains and even occasional declines over the years. The largest changes attributed to the Flynn effect appear instead on tests that load heavily on the general intelligence factor (g-loaded tests), such as Raven’s Progressive Matrices. For example, Dutch soldiers gained 21 points in only 30 years, or 7 points per decade, between 1952 and 1982. Some studies focused on the distribution of scores have found that the Flynn effect mainly occurs with lower scores. 
Teasdale and Owen (1987), for example, found the effect primarily reduced the number of low-end scores, resulting in an increased number of moderately high scores, with no increase in very high scores. However, Raven (2000) found that a great deal of data must be re-interpreted with respect to date of birth. Previously, these data had been interpreted to show that many abilities decrease as people get older. They must now be interpreted to show that many abilities have in fact increased dramatically across generations, as Flynn predicted. On many tests this occurs at all levels of ability. Colom et al. (2005) assessed two large samples of Spanish children with a 30-year gap. Comparison of the IQ distributions indicated that:
The mean IQ had increased by 9.7 points (the Flynn effect),
The gains were concentrated in the lower half of the distribution and negligible in the top half, and
The gains gradually decreased from low to high IQ.
Some scientists believe these changes are very large. One of them is Ulric Neisser, who in 1995 headed a task force of the American Psychological Association charged with writing a statement on the state of intelligence research. He estimates that if American children of 1932 could take an IQ test normed in 1997, their average IQ would be only about 80. In other words, half of the children in 1932 would be classified as having borderline mental retardation or worse in 1997. Looking at Raven’s, Neisser estimates that if one extrapolates beyond the data, which show a 21-point gain between 1952 and 1982, an even larger gain of 35 IQ points can be argued. Arthur Jensen warns that extrapolating beyond the data leads to results such as an IQ of -1000 for Aristotle (even assuming he would have scored 200 in his day). The effect is usually discussed in terms of rising IQ. A similar effect has been found for semantic and episodic memory.
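Neisser's figure can be roughly reproduced with a linear model. The sketch below is in Python and assumes a constant drift of 0.3 points per year (the "three points per decade" cited in this article); the function name and the year used for Aristotle are illustrative assumptions, not taken from Neisser or Jensen. It re-expresses a score earned against older norms in terms of later norms, and then shows why Jensen cautions against pushing the same line far beyond the data.

```python
def score_on_later_norms(score: float, norm_year: int, later_norm_year: int,
                         rate_per_year: float = 0.3) -> float:
    """Re-express a score from older norms against later norms.

    Assumes norms drift upward linearly at rate_per_year (an assumption),
    so the same raw performance earns fewer IQ points on newer norms.
    """
    return score - rate_per_year * (later_norm_year - norm_year)


# An average (IQ 100) child of 1932, scored against 1997 norms.
print(round(score_on_later_norms(100, 1932, 1997), 1))   # 80.5, close to Neisser's "about 80"

# Extrapolating far outside the data: a hypothetical score of 200 around 350 BC.
aristotle = score_on_later_norms(200, -350, 1997)
print(aristotle < 0)   # True: the linear model yields an impossible negative IQ
```

The second call does not reproduce Jensen's specific number, which depends on the rate and dates he assumed; it only illustrates that a straight line fitted to a few decades of data breaks down over centuries.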
FOUR PARADOXES OF THE FLYNN EFFECT
In What Is Intelligence?, Flynn identifies four paradoxes that arise from the steady increase in average IQ test scores, given the predominant understanding of ‘intelligence.’
The factor analysis paradox: Prior research suggested that a single factor, ‘general intelligence’ or ‘g,’ underlies IQ. The Flynn Effect, however, does not affect all sections of the WISC and other intelligence tests to the same degree; that is, if we’re getting smarter, some parts of our intelligence are getting smarter faster, undermining our confidence in ‘g.’
The intelligence paradox: The Flynn Effect suggests that we are getting smarter relatively quickly, but it’s not obvious (and some would say flies in the face of certain evidence) that kids today are so much smarter than their parents or grandparents (except perhaps when it comes to home electronics). As Flynn writes:
“If huge IQ gains are intelligence gains, why are we not struck by the extraordinary subtlety of our children’s conversation? Why do we not have to make allowances for the limitations of our parents? A difference of some 18 points in the average IQ over two generations ought to be highly visible.”
The mental retardation paradox: If the rate of change in IQ is extrapolated backwards, it suggests that people in 1900 had a mean IQ score somewhere between 50 and 70 judged by today’s standards. An IQ of around 70 to 75 or below is typically considered ‘mentally retarded.’ Flynn puts this one nicely, too: ‘Either today’s children are so bright that they should run circles around us, or their grandparents were so dull that it is surprising that they could keep a modern society ticking over.’
The identical twins paradox: Twins raised apart tend to have very similar IQ scores, typically considered strong evidence for a genetic basis for differences in IQ. The Flynn Effect instead suggests that intelligence, if it is being measured by IQ, is more malleable and subject to environmental effects.
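To see why the mental retardation paradox is so striking, it helps to put numbers on it. The following is a rough sketch in Python, assuming IQ scores are normally distributed with a standard deviation of 15 and using a cutoff of 70; both the cutoff parameter and the function name are assumptions made for illustration and can be changed. It estimates what share of a past population would fall below the cutoff if its mean, judged by today's norms, were between 50 and 70.

```python
from statistics import NormalDist


def share_below_cutoff(mean_on_todays_norms: float,
                       cutoff: float = 70.0,
                       sd: float = 15.0) -> float:
    """Fraction of a normally distributed population scoring below cutoff.

    The normal shape, sd=15 and cutoff=70 are assumptions for illustration.
    """
    return NormalDist(mu=mean_on_todays_norms, sigma=sd).cdf(cutoff)


# Backward-extrapolated means for 1900, judged by today's standards.
for mean in (50, 60, 70):
    print(mean, round(share_below_cutoff(mean), 2))
# Roughly 0.91, 0.75 and 0.5 of the population would fall below the cutoff.
```

Under these assumptions, somewhere between half and nine tenths of people in 1900 would be classified as intellectually disabled, which is the implausible conclusion Flynn's quotation points to.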
SO DO WE GIVE UP ON IQ TESTS?
Flynn doesn’t think gains in IQ scores are either trivial or cause for discounting standardized testing; the tests are not fatally flawed, although our understanding of the results may be. Flynn argues that we really need to understand what these changes are measuring. People are not growing ‘smarter,’ they are growing better at very specific cognitive skills:
“This solution to our paradox does not imply that massive IQ gains over time are trivial. Aside from the escalation in lateral thinking, they represent nothing less than liberation of the human mind. The scientific world-view, with its vocabulary, taxonomies, and detachment of logic and the hypothetical from concrete referents, has begun to permeate the minds of post-industrial people. This has paved the way for mass education on the university level and the emergence of an intellectual cadre without whom our present civilization would be inconceivable.” (From the Cambridge talk.)
Flynn surveys a number of studies suggesting that the ability to engage in formal or abstract reasoning correlates strongly, not with intelligence, but with the level of schooling. Since, in the West, people stay in school longer if they are more intelligent, it makes sense that the correlation between intelligence and the ability to engage in abstract reasoning would be especially strong in industrialized countries. That is, if schooling affects the intellectual abilities measured by these tests, at the same time that the tests provide greater access to formal educational opportunities, then the tests would measure intelligence really well, but not just because they transparently reflect some underlying intelligence. Rather, it would be because the tests themselves are tied up in a self-reinforcing cycle of talent (or its lack) being compounded by opportunities, confidence, and motivation (or their opposite).
CASES WHERE THE FLYNN EFFECT HAS BEEN DISCUSSED
In Maldonado v. Thaler, 2010 U.S. App. LEXIS 17735, the state’s expert witness Dr. Denkowski failed to take the “Flynn Effect” into account when calculating petitioner Maldonado’s IQ score.
In Thomas v. Allen, 607 F.3d 749, the State took issue with the district court’s employment of the Flynn effect. The question was not whether the district court’s application of the Flynn effect to lower Thomas’s IQ scores was mandatory, but whether the district court’s application of it in this case was clearly erroneous. The court could not say that it was.
In Pierce v. Thaler, 604 F.3d 197, the United States Court of Appeals for the Fifth Circuit found that the flaws of the Flynn effect were not in issue.
In Walker v. Kelly, 593 F.3d 319, the United States Court of Appeals for the Fourth Circuit found that the District Court erred in not considering the Flynn Effect.
In Winston v. Kelly, 592 F.3d 535, the most that could be reasonably inferred from the holding was that the Virginia Supreme Court was unconvinced by Winston’s evidence concerning the Flynn effect.
In Thomas v. Quarterman, 335 Fed. Appx. 386, the state court concluded that “it is not a generally accepted professional practice to automatically adjust individual IQ scores to accommodate the group statistical concept known as the Flynn Effect.”
In Holladay v. Allen, 555 F.3d 1346, the United States Court of Appeals for the Eleventh Circuit did not find the district court in error in not considering the Flynn effect.
In In re Mathis, 483 F.3d 395, the United States Court of Appeals for the Fifth Circuit noted that the Flynn effect had not been accepted in that Circuit as scientifically valid.
In In re Salazar, 443 F.3d 430, the court opined “even assuming that the Flynn Effect is a valid scientific theory and is applicable to Salazar’s individual I.Q. score–and we express no opinion as to whether this is actually the case–Salazar’s score readjusted to account for score inflation is still above the cutoff for mental retardation.”
James R. Flynn: The mean IQ of Americans. massive gains. New York: Harper and Row 1984
James R. Flynn: Massive IQ gains in 14 nations: what IQ tests really measure. Psychological Bulletin 101: 171-191, 1987
Ulric Neisser: Rising Scores on Intelligence Tests, American Scientist, September – October 1997
APA Task Force Examines the Knowns and Unknowns of Intelligence
Flynn’s Effect, Scientific American
Dickens, William T., and James R. Flynn. 2001. Heritability Estimates Versus Large Environmental Effects: The IQ Paradox Resolved. Psychological Review 108(2): 346-369. doi:10.1037/0033-295X.108.2.346
Flynn, James R. 1984. The mean IQ of Americans: Massive gains 1932 to 1978. Psychological Bulletin 95: 29-51.
_____. 1987. Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin 101: 171-191.
_____. 1998. IQ gains over time: Toward finding the causes. In U. Neisser, ed., The rising curve: Long-term gains in IQ and related measures. Pp. 25 – 66. Washington, DC: American Psychological Association.
_____. 2000. IQ gains, WISC subtests, and fluid g: g theory and the relevance of Spearman’s hypothesis to race (with Discussion). In G. R. Bock, J. A. Goode, & K. Webb, eds., The nature of intelligence. Pp. 222-223. Novartis Foundation Symposium 233. New York: Wiley.
_____. 2007. What Is Intelligence?: Beyond the Flynn Effect. Cambridge: Cambridge University Press.
Teasdale, T. W., & Owen, D. R. (1989): Continuing secular increases in intelligence and a stable prevalence of high intelligence levels. Intelligence, 13, 255-262
Kanaya, T., Scullin, M. H., & Ceci, S. J. (2003): The Flynn effect and U.S. policies: The impact of rising IQ scores on American society via mental retardation diagnoses. American Psychologist, 58, 778-790.
Neisser, U. (1997). Rising scores on intelligence tests. American Scientist, 85, 440-447.
Teasdale, Thomas W., and David R. Owen. (1987). ‘National secular trends in intelligence and education: a twenty year cross-sectional study’, Nature, 325, 119-21.
Raven, J. (2000). The Raven’s Progressive Matrices: Change and stability over culture and time. Cognitive Psychology, 41, 1-48.
Colom, R., Lluis-Font, J.M., and Andres-Pueyo, A. (2005). “The generational intelligence gains are caused by decreasing variance in the lower half of the distribution: Supporting evidence for the nutrition hypothesis”. Intelligence 33: 83-91. doi:10.1016/j.intell.2004.07.010.
Jensen, Arthur. The g Factor, p. 328.
Ronnlund M, Nilsson LG. (2009). Flynn effects on sub-factors of episodic and semantic memory: parallel gains over time and the same set of determining factors. Neuropsychologia. 47(11):2174-80. doi:10.1016/j.neuropsychologia.2009.05.001 PMID 19056409