The Raven’s Educational Standard Progressive Matrices

Raven’s Progressive Matrices come in three main forms: the Standard Progressive Matrices (SPM), the Colored Progressive Matrices (CPM) and the Advanced Progressive Matrices (APM). The forms are intended for individuals with different levels of ability. As the most basic form of Raven’s Progressive Matrices, the SPM is used for individuals with average ability (O’Leary, Rusch & Guastello, 1991). It has been used in most countries of the world for more than 70 years, and a large number of studies have demonstrated its suitability for cross-cultural and cross-sectional research (Raven, 2009).

The SPM+ was developed in response to the observation that scores on all of the Raven Progressive Matrices tests have, unexpectedly, increased dramatically over the years in all cultures. It is intended to provide a parallel form of the test with a higher ceiling and greater discriminative power (Raven, 2009). The Raven’s SPM+ is a nonverbal test that uses visual patterns and shapes to assess general ability in children aged 7 to 18 years (Raven, 2008). Because of its nonverbal format, the test can be used with children, the elderly and patient populations for whom the processing of language may need to be minimized (O’Leary, Rusch & Guastello, 1991).

The Raven’s SPM+ is divided into five sets of 12 matrix problems (Sets A, B, C, D and E). Each problem is presented as a pattern or sequence of diagrammatic puzzles with one piece missing. The task is to complete the pattern or sequence by choosing the correct missing piece from a list of options. The problems become progressively more difficult as the test taker proceeds through the test (Raven, 2008). This version of the test is available in both computer and pencil-and-paper form. The pencil-and-paper form has no time limit and can be administered individually or in groups (Raven, 2008). During the assessment, each examinee is given one spiral-bound stimulus booklet and a record form, and writes the selected answers on the record form. The total score is the number of matrices completed correctly, so the test is scored out of 60.
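The scoring rule described above (five sets of 12 items, one point per correct answer, maximum 60) can be sketched in a few lines of Python. This is only an illustration: the function name `score_record_form`, the data layout and the answer key are assumptions made for the example, and the key shown is made up rather than taken from the published test.

```python
from typing import Dict, List, Optional

SETS = ("A", "B", "C", "D", "E")   # five sets, as described above
ITEMS_PER_SET = 12                  # 12 matrix problems per set


def score_record_form(
    responses: Dict[str, List[Optional[int]]],
    answer_key: Dict[str, List[int]],
) -> Dict[str, int]:
    """Return the number of correct answers per set plus the total (0 to 60)."""
    scores: Dict[str, int] = {}
    for set_name in SETS:
        given = responses[set_name]
        correct = answer_key[set_name]
        if len(given) != ITEMS_PER_SET or len(correct) != ITEMS_PER_SET:
            raise ValueError(f"Set {set_name} must have {ITEMS_PER_SET} items")
        # An unanswered item (None) never equals the key, so it earns no point.
        scores[set_name] = sum(g == c for g, c in zip(given, correct))
    scores["total"] = sum(scores[s] for s in SETS)
    return scores


# Made-up key and responses, for illustration only (not the real SPM+ key).
demo_key = {s: [1] * ITEMS_PER_SET for s in SETS}
demo_responses = {s: [1] * ITEMS_PER_SET for s in SETS}
demo_responses["E"] = [1] * 6 + [2] * 5 + [None]   # 6 correct, 5 wrong, 1 blank
demo_scores = score_record_form(demo_responses, demo_key)
```

Keeping the per-set counts alongside the total is a design choice for the sketch; the total raw score is the figure that the text above describes as the test score.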

The current version of the Raven’s SPM+ was developed in 1998. It was re-standardized in the United Kingdom (UK) in 2008 on 926 children aged 7 to 18 years (Raven, 2008). The published norms are presented as total raw scores, standard scores and percentile ranks, and are available in yearly age intervals for children aged from 7 years to 18 years 11 months. The child’s actual age on the day of testing is used to compute the standard score and percentile rank.
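Selecting the right norm table depends on the child’s age on the day of testing. The sketch below computes age in completed years and months and maps it to a yearly band. It assumes that each yearly interval is defined by completed years (for example, 9 years 0 months to 9 years 11 months); the manual’s exact band boundaries are not given here, and the function names are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple

MIN_AGE_YEARS = 7    # norms start at 7 years 0 months
MAX_AGE_YEARS = 18   # norms end at 18 years 11 months


def age_on_test_day(birth: date, test: date) -> Tuple[int, int]:
    """Return age as (completed years, completed months) on the test date."""
    if test < birth:
        raise ValueError("Test date is earlier than date of birth")
    months = (test.year - birth.year) * 12 + (test.month - birth.month)
    if test.day < birth.day:
        months -= 1   # the current month has not been completed yet
    return divmod(months, 12)


def norm_age_band(birth: date, test: date) -> Optional[int]:
    """Return the yearly age band (assumed: completed years), or None if
    the child falls outside the 7 years to 18 years 11 months range."""
    years, _months = age_on_test_day(birth, test)
    if MIN_AGE_YEARS <= years <= MAX_AGE_YEARS:
        return years
    return None


# Example: a child born on 15 June 2000 and tested on 14 June 2010 is
# 9 years 11 months old and falls in the (assumed) 9-year band.
example_band = norm_age_band(date(2000, 6, 15), date(2010, 6, 14))
```

The band returned here would then be used to look up the standard score and percentile rank for the child’s total raw score in the published tables, which are not reproduced in this sketch.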


Construct Validity

Based on Spearman’s two-factor theory of intelligence, the Raven’s SPM+ purports to assess a unitary trait, namely ‘g’, or more specifically the eductive component of intelligence. This type of intelligence has also been referred to as analytic intelligence, or the ability to deal with novelty and to adapt one’s thinking to a new cognitive perspective (Carpenter, Just and Shell, 1990). The administration manual of the Raven’s SPM+ describes Raven’s Progressive Matrices as ‘one of the purest and best measure of g’ (Raven, 2008, p. 59). Evidence from several factor-analytic studies using the Classic form of Raven’s Standard Progressive Matrices (SPM-C) is cited to support this claim. In those studies, the g loading of the SPM-C ranged from .81 to .86 in adult and child samples with different cultural and demographic backgrounds (Raven, 2008).

However, other studies suggest that the Raven’s SPM-C measures factors in addition to general ability (g). For example, Banks and Sinha (1951) found that the Raven’s SPM had a low g loading of 36%, together with a small loading on a visual-spatial factor.

Criterion-related Validity

The current version of the SPM+ has been found to have high concurrent validity with other versions of the Raven’s SPM. The pooled correlation is reported as .797 between the SPM-C and the SPM+, and .830 between the SPM+ and the parallel version of the SPM-C (SPM-P) (Raven, 2008).

Studies of the concurrent validity of the Raven SPM with other measures of general intelligence show considerable variability. Several studies have reported moderate to high correlations between the SPM and the various versions of the Wechsler intelligence scales for adults (Hall, 1957; Mcloed & Rubin, 1962; Burke, 1985; O’Leary, Rusch, & Guastello, 1991). A study of 288 participants from various age groups found that SPM scores correlated about .74 to .84 with Full Scale IQ on the Wechsler Adult Intelligence Scale-Revised (O’Leary, Rusch, & Guastello, 1991). Inter-test correlations in child populations are similar in magnitude and pattern to those reported for adults, especially for English-speaking children (Raven, 2008). Barratt (1956) found a correlation of .75 with Full Scale IQ on the Wechsler Intelligence Scale for Children. Roger and Holmes (1987) reported correlations of about .83 to .92 between the SPM-C and the WISC-R in a stratified sample of Canadian children aged 7 to 11 years. Nevertheless, it is important to note that studies of the inter-test correlations between the SPM and the Wechsler intelligence scales in non-English-speaking children have reported significant but more variable correlations, ranging from .30 to .68 (Raven, 2008).

Predictive Validity

Predictive validity, i.e., the correlation between the SPM and performance on achievement and scholastic aptitude tests, is reported to be relatively consistent across studies with English-speaking and non-English-speaking children, with correlations ranging up to about .70 (Raven, 2008). In this section, evidence on the prediction of school success and scholastic achievement in child populations is reviewed, as it is most relevant to the subject of the present study.

Applied studies have reported favorable findings on the use of the Raven’s SPM in identifying children for gifted programs (Baska, 1986; Saccuzzo & Johnson, 1995; Touron, Reparaz, & Peralta, 1999). In these studies, the SPM was found to be predictive of school success. In a study involving 175 children from diverse ethnic backgrounds, the Raven SPM correlated moderately with the Naglieri Nonverbal Ability Test (.52) and the Iowa Test of Basic Skills (.48) (Lewis, Decamp-Fritson, Ramage, McFarland, & Archwamety, 2007). Compared with other intelligence tests, e.g. the Wechsler Intelligence Scale for Children – Revised (WISC-R), the Raven SPM was also found to have equal predictive validity in a study of 16,985 subjects from diverse ethnic backgrounds (Saccuzzo & Johnson, 1995).

In contrast to these favorable findings, several studies have shown that the predictive power of the Raven SPM varies across school subjects. An Icelandic study of children aged 6 to 16 years found higher correlations between the SPM and mathematics scores, and lower correlations with language subjects (Pind, Gunnarsdottir, & Johannesson, 2003). Overall, validity estimates tend to be higher for mathematics and science skills than for language skills or overall academic achievement (Raven, 2008).


The split-half reliability of the SPM+ is .936 (N=924), based on the UK standardisation sample. The test-retest reliability of the SPM+ (from individual to group administration and vice versa) is reported as .833. However, the Administration Manual reports that a previous individual administration of the SPM+ increases scores by 2 raw score points when the test is subsequently administered in a group (t=3.146, p<.001), whereas a previous group administration has no effect on scores when the test is subsequently administered individually (t=.852, p=.399) (Raven, 2008).
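For readers unfamiliar with the statistic, a split-half reliability coefficient is commonly obtained by splitting the items into two halves, correlating the half scores across examinees, and applying the Spearman-Brown correction to estimate the reliability of the full-length test. The Python sketch below follows that common procedure with an odd-even split. It is an assumption that the manual used this particular split and correction, and the function names and the small data set are made up for illustration.

```python
from math import sqrt
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores: List[List[int]]) -> float:
    """item_scores[i][j] is 1 if examinee i answered item j correctly, else 0.
    Uses an odd-even item split (an assumption, not the manual's stated method)."""
    odd_half = [sum(row[0::2]) for row in item_scores]    # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]   # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_half, even_half))


# Tiny made-up data set: three examinees, four items each.
demo_items = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
demo_reliability = split_half_reliability(demo_items)
```

In the made-up data the two halves rank the examinees identically, so the coefficient reaches its maximum of 1.0; real data, such as the standardisation sample described above, would yield a value below that.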

No significant sex differences were found in the SPM+ scores of males and females in the standardised groups (t=.171, p=.864) (Raven, 2008).

Theoretical foundation of Raven SPM

The theoretical foundations of the Raven’s Progressive Matrices series of tests lie in the work of Spearman, who developed what he termed the ‘two-factor theory of intelligence’ (Raven, 2008). In the early 1900s, Spearman made the important observation that all tests of mental ability are positively correlated. For instance, he noticed that people who scored high on IQ or mental ability tests usually scored higher on other types of tests, and people who scored lower generally had lower scores on other tests (Jensen, 1998). After observing these positive correlations between various mental tests, Spearman used the psychometric principles of factor analysis to develop a more comprehensive theory of intelligence, which focused on accounting for the differences observed in individuals’ intellectual abilities. His ‘two-factor’ theory assumed that all possible measures of intelligence could be divided into two separate components. The general or ‘g’ component was defined by Spearman as a measure of the common intellective function, and the specific or ‘s’ component as being specific to each measure (Jensen, 1998). In addition, Spearman argued that the ‘g’ factor itself comprised two distinct components, namely eductive and reproductive ability. These were defined as follows: (a) eductive ability, the ability to make meaning out of confusion and to generate high-level, usually nonverbal, schemata which make it easy to handle complexity; and (b) reproductive ability, the ability to absorb, recall, and reproduce information that has been made explicit and communicated from one person to another (Raven, 2000).

In 1938, Raven utilized Spearman’s theory as a rationale for test construction in the development of his Progressive Matrices test, which was designed in order to assess as accurately as possible the eductive ability component of ‘g’ in individuals, i.e., the capacity to think clearly and make sense of complex data (Raven, 2008).

The factorial structure of the Raven SPM has long been a subject of investigation and debate. Raven (2008) described the Progressive Matrices as one of the purest and best measures of g, or general intellectual functioning, drawing on the test results reported by Raven (1939). More recently, after conducting a series of confirmatory studies, Jensen (1998) contended that the total variance of Raven scores comprises virtually nothing besides g and random measurement error.

However, it has not been universally accepted that the Progressive Matrices is a pure measure of reasoning ability and g (Lynn, Allik & Irwing, 2004). Hertzog and Carter (1988) concluded that the SPM contains two factors, which they designated verbal intelligence and spatial visualization. A similar finding emerged from a simulation study: using computer simulation, Carpenter, Just and Shell (1990) studied the cognitive processes that account for problem solving in the Raven APM, and concluded that the test measures the ability to induce abstract relations and the ability to dynamically manage a large set of problem-solving goals in working memory. Lynn, Allik and Irwing (2004) found that the SPM loaded on three factors, namely gestalt continuation, verbal-analytic reasoning and visuospatial ability. A further analysis of the three factors, however, showed a higher-order factor identifiable as g.

In sum, empirical research has provided evidence that the Progressive Matrices is largely a measure of g as conceptualized by Spearman. On the other hand, factor-analytic studies show that it also contains a small or fairly small visualization or spatial factor.

Strength of Raven’s SPM

Despite the debate over its factor loadings, the Raven SPM is recognized as perhaps the best measure of general intelligence (Spearman’s g factor), and it has shown the highest loading on the g factor when compared with other intelligence tests (Jensen, 1998). The Raven SPM has also gained acceptance among researchers because of its psychometric characteristics: a large body of research has confirmed that it has good psychometric properties (see the discussion of validity and reliability elsewhere in this review).

As mentioned previously, the Raven’s SPM is a series of matrix problems. This appealing question format and the use of figural stimuli have made the test attractive for clinical and research applications. A meta-analysis of cross-cultural performance on the Raven’s SPM showed that the Raven is the second most used test after the Wechsler Intelligence Scales for Children (Brouwers, Van de Vijver, & Van Hemert, 2008). Since its publication in the 1930s, the SPM has been used as a cognitive screening tool for individuals with a range of ability levels, including those with neuropsychological disorders (Caffarra, Vezzadini, Zonado, Copelli, & Venneri, 2003), gifted populations (Touron, Reparaz & Peralta, 1999) and individuals with physical disabilities (Armfield, 1985). In recent years, it has also been used to screen and identify individuals with intellectual disabilities (see the section on the application of Raven’s SPM in intellectual disability).

The Raven SPM is also well recognized for its cross-cultural application (Brouwers, Van de Vijver, & Van Hemert, 2008). An extensive body of evidence now supports the validity, reliability and suitability of the SPM in many cultures (Baraheni, 1974). To date, the Raven SPM has been normed and used in both developed and developing countries (Raven, 2000). For instance, it has been used effectively in China (Armfield, 1985; Stone, Wong & Lo, 2000), Spain (Touron, Reparaz & Peralta, 1999), Canada (Vanderpool & Catano, 2008), Iran (Baraheni, 1974), and African countries such as Kenya (Daley, Whaley, Sigman, Espinosa, & Neumann, 2003). Rushton, Cvorovic and Bons (2007) also contended that the item properties of Raven’s tests show remarkable cross-cultural generalizability across South Asians, Europeans, and sub-Saharan Africans.

Weakness of Raven’s SPM

Despite its popularity, scholars and clinicians often hold polarized views on the suitability of the SPM for determining cognitive abilities in cross-cultural populations. The central question is whether the SPM measures cross-cultural differences in intelligence that are not confounded by other cultural or national differences, such as education and affluence (Brouwers, Van de Vijver, & Van Hemert, 2008). Various studies have reported its suitability as a tool to screen (Armfield, 1985) or assess (Caffarra, Vezzadini, Zonado, Copelli, & Venneri, 2003) the general ability of individuals. Other studies, however, have questioned its suitability for representing the general ability of individuals from certain cultural backgrounds (Wicherts, Dolan, Carlson, & van der Maas, 2010). The main issues surrounding this controversy are presented below.

The first controversy stems from differences in SPM norms across cultures and geographical areas. Norming studies suggest that the norms for different populations are usually similar at a given point in time; however, studies from different cultural and geographical areas often reveal broadly different norms between countries (Raven, 2000). As summarized in Raven (2000), some Chinese studies have reported higher norms than those obtained in Britain, while lower norms are usually observed for rural and isolated communities. Some scholars have also suggested that the average IQ of the populations of sub-Saharan Africa, in terms of UK norms, lies below 70 (Rushton & Jensen, 2005). Such differences in norms across populations often lead to controversy over the item properties of the test for individuals from diverse cultural backgrounds, and over the psychometric meaning of Raven’s test scores as measures of general intelligence within a culture. Raven (2000) concluded that the SPM has equal predictive validity and item difficulty among individuals with different ethnic backgrounds. Some researchers, such as Wicherts, Dolan, Carlson, and Van der Maas (2010), claim on the other hand that the Raven SPM may require a set of reasoning skills that are better suited to affluent western cultures. Their meta-analysis of the Raven SPM and CPM, covering 40 samples of Africans, concluded that although reliability and predictive validity are comparable, the convergent validity of the Raven’s tests with other intelligence measures among Africans is poor, especially in rural samples (ranging from .20 to .30). They therefore suggested that Raven’s tests are not as strongly g-loaded in Africa as they are in western samples. This view is further supported by their factor-analytic study, in which the Raven tests showed low suitability as an assessment of general ability in African populations, with a criterion validity coefficient of only .265.

The second controversy concerns whether the test is free from bias among individuals with similar cultural backgrounds. Although a large body of published data supports its suitability for clinical and educational use, researchers still argue that the Raven’s SPM is not as free from social bias as claimed (Desert, Preaux & Jund, 2009). In line with the conclusion of Wicherts, Dolan, Carlson, and Van der Maas (2010), Desert, Preaux and Jund (2009) administered the SPM to 153 students in France and concluded that children of different socio-economic status (SES) perceive the standard test instructions in the SPM manual differently. In particular, children with low SES perceive additional evaluative pressure when they are told that the aim of the task is to identify their strengths and weaknesses in several domains.

In sum, a review of the literature shows that the Raven SPM is accepted as a valid screening tool for cognitive ability. Its suitability, reliability and validity as a research and screening tool in cross-cultural populations, especially Asian populations, are supported by a large body of published research.

Application of Raven’s SPM in Asian culture

The review by Brouwers, Van de Vijver and Van Hemert (2008) of cross-cultural applications of the Raven SPM demonstrated its popularity in both western and non-western cultures, including Asian cultures. Specifically, the SPM has been used as a measure of general mental ability when investigating differences in cognitive abilities (Rushton, Cvorovic & Bons, 2007). The Raven’s SPM has also been a popular tool in school psychology research (Armfield, 1985; Stone, Wong & Lo, 2000). For instance, Armfield (1985) used the Raven’s SPM to screen the ability levels of two groups of primary school pupils in Guangzhou, China: one with special needs, such as deafness or mutism, and one without special needs. He concluded that the Raven SPM appears to be sensitive and appropriate to the performance of pupils, and that students’ performance on the Raven was consistent with their school performance and their teachers’ observations. The SPM may therefore be an effective tool to help teachers plan their pupils’ education according to ability level.

In spite of the suitability and popularity of the Raven’s SPM in Asian countries, a review of the literature reveals a surprising paucity of research on the Raven’s SPM in Singapore. The review by Brouwers, Van de Vijver and Van Hemert (2008) classified Singapore among the countries with fewer than ten studies on the Raven SPM over a span of 60 years. A careful search of the available literature found two local studies that employed the Raven’s SPM as a research tool. The first, by Goh and Feldhusen (1994), used the Raven SPM as a measure of adolescents’ intelligence. The second was conducted more recently by a group of psychiatrists and psychologists, who chose the Raven’s SPM as part of a neuropsychological assessment battery to indicate the mental capacities of patients with schizophrenia (Chan, Chia, Yang, Woon, Sitoh, et al., 2009). Although both studies employed the Raven’s SPM as a measure of mental ability, no local study has investigated its potential as a screening tool for intellectual disability. Furthermore, there are no local data on the correlation between the WISC-IV and the Raven’s SPM in a population with lower than average intellectual ability.

Application of Raven’s SPM in intellectual disability (ID)

Besides their acceptance as a culture-reduced test suitable for individuals from diverse social and cultural backgrounds, Raven’s Matrices are also accepted as a test of choice for individuals with ID or a deprived linguistic background (Kline, 2000). Jensen (1980) asserted that the ‘main value of Raven’s tests in applied psychology is assessing the potential of those with deprived linguistic background’ (cited in Kline, 2000, p. 463). Compared with the widely accepted criterion test of intelligence, the WISC, Raven’s Matrices have the advantages of less reliance on language skills, relative ease of administration and shorter administration time, which often lead to better compliance, motivation and performance (Kline, 2000). This is true even for those with more profound ID. A study by Dawson, Soulieres, Gernsbacher and Mottron (2007) showed that the WISC-III underestimates intelligence in children with Autism Spectrum Disorder (ASD). They found that the scores of 38 children with ASD were on average 30 percentile points higher on the Raven’s Progressive Matrices than on the WISC-III, whereas no such difference was found for typically developing children. Raven’s Matrices may therefore be a more suitable alternative when screening a large group of students with low language proficiency.

In fact, Raven’s Matrices, especially the Colored Progressive Matrices (CPM), are being used increasingly with children with ID, including those with co-morbid conditions such as ASD (Koegel, Koegel & Smith, 1997), Down Syndrome (Brock & Jarrold, 2005) and cerebral palsy (Pueyo, Junque, Vendrell, Narberhaus, & Segarra, 2008); in research settings to control for non-verbal mentation (Todman & Gibb, 1985; Standen & Ip, 2002; Brock & Jarrold, 2005); and in educational and medical settings, as part of a battery of tests, to determine level of functioning and treatment progress (Armfield, 1985; Bello, Goharpey, Crewther & Crewther, 2008).

Studies that evaluate Raven’s Matrices for assessing the mental abilities of individuals with intellectual disabilities usually endorse their appropriateness for identifying or assessing the level of intellectual difficulty. Pueyo, Junque, Vendrell, Narberhaus, and Segarra (2008) administered the Raven’s CPM to 30 individuals with different levels of cognitive dysfunction attributable to cerebral palsy, and concluded that the Raven’s CPM is a fast, easy-to-administer test that yields a measure related to linguistic, visuoperceptual and memory functioning.

This finding is further supported by research in educational settings: Armfield’s (1985) study shows that the SPM appears to be a sensitive and appropriate tool for predicting the school performance of pupils, even those with ‘low abilities’. A study from Australia produced consistent results when evaluating the validity of a puzzle form of the Raven CPM as a measure of intelligence for individuals with severe ID (Bello, Goharpey, Crewther & Crewther, 2008). Its findings suggest that the Raven’s CPM is a good alternative for assessing individuals with severe ID. However, the puzzle form of the CPM is a more reliable measure of the non-verbal mentation of children with severe ID than the book form, because the puzzle form demands sensory-motor attention and limits distraction, and thus improves test performance.

To conclude, Raven’s Matrices, mainly the CPM, have been increasingly used in research on and assessment of ID across settings. Although some researchers have used them only as a control for, or measure of, non-verbal mentation, scholars also highly recommend them as an assessment for individuals with linguistic and mental disadvantages, as in the case of ID.
