What effect does word superiority have?

This study examined the word superiority effect (WSE), which was first described by James Cattell in 1886. The WSE is the finding that a letter is recognised more accurately when it is presented as part of a word than when it is presented as part of a non-word or as an isolated letter. The study tested 97 participants in an attempt to replicate the effect. On each trial, participants saw a briefly presented stimulus that was then masked, and chose which of 2 alternative letters had appeared in it. Results indicated that an effect was apparent when comparing words to non-words, but no significant difference was found when comparing words and single letters. Overall, mean accuracy rates were too similar across conditions to fully support the theory tested. Possible flaws leading to the inconclusive results, such as the easy nature of the task, are discussed, along with suggestions for future study procedures.

What effect does word superiority have?


The word superiority effect (WSE) was first identified as a phenomenon by James Cattell in 1886 (Reicher, 1969). According to Cattell (1886), a letter presented as part of a stimulus is recognised more successfully when it appears in a word than when it appears in a non-word or as an isolated letter.

Reicher (1969) constructed an experiment to test Cattell’s theory. However, Reicher hypothesised that the effect Cattell observed may have arisen because individuals inferred missing letters from the letters surrounding them. To control for this redundancy effect, his experiment offered two alternative letters, either of which formed a known word when chosen. In his study, Reicher presented participants with random stimuli that were 4-letter words, 4-letter non-words or single letters. He found that when the stimulus was masked after presentation, participants chose the correct letter more accurately when it had been presented as part of a word than as part of a non-word or as a single letter, supporting Cattell’s theory.

According to Sperling (1963, 1967), when participants are presented with these words and letters, they store the information in the visual information storage system (VIS). He further states that this storage decays quickly and that information is lost while participants respond to the alternative choices, which results in participants choosing the wrong letter from the alternatives presented. Estes and Taylor (1966) elaborated on Sperling’s observations in their own study, which found that the accuracy of this stored information was more affected when the target stimulus was presented in longer strings of letters.

Rumelhart and McClelland (1982) described the pattern-recognition process used to identify these letters in a model referred to here as the parallel distributed processing (PDP) model. This model states that visual information is processed by the brain in a series of hierarchical levels. When participants first see the letter “A”, the individual features of the letter (e.g. / -) excite the first level, called the feature level. Once excited, the feature level passes processing forward to the next level, termed the letter level. In this case the features / - are recognised as being part of the letter “A”. Once the “A” has been established, it excites the higher level of processing, termed the word level. The “A” is then recognised as being part of words stored at the word level, such as “BANK” or “TANK”.

These hierarchical levels interact with each other through excitatory and inhibitory connections. If the characteristics at one level match the requirements of the level above, processing at the next level is excited. If they do not match, the features elicit an inhibitory signal. The combination of these inhibitory and excitatory connections helps to recognise the displayed letters. The theory behind the PDP model has also been supported by other researchers (Sperling, 1967). However, Estes and Taylor (1966) add that even in a PDP model, there are still limits to the quantity of information that can be processed.
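The interplay of excitatory and inhibitory connections described above can be illustrated with a toy sketch. This is not the model as published: the four-word lexicon, the connection weights and the simple additive update are all assumptions chosen for illustration. Each perceived letter excites words that contain it in the same position and inhibits words that do not, and active words then feed support back to candidate letters in an unseen position.

```python
# Toy sketch of letter-to-word excitation/inhibition and word-to-letter feedback.
# LEXICON, EXCITE and INHIBIT are hypothetical values, not published parameters.
LEXICON = ["BANK", "TANK", "BAND", "TASK"]
EXCITE = 1.0   # assumed weight for a matching letter
INHIBIT = 1.0  # assumed weight for a mismatching letter


def word_activations(letters):
    """Return the activation of each word given the perceived letters.

    `letters` holds one entry per position; None marks a position
    that was not perceived and therefore sends no signal.
    """
    activations = {}
    for word in LEXICON:
        total = 0.0
        for seen, expected in zip(letters, word):
            if seen is None:
                continue
            total += EXCITE if seen == expected else -INHIBIT
        activations[word] = total
    return activations


def letter_support(letters, position, candidates):
    """Sum top-down support from active words for each candidate letter."""
    support = {candidate: 0.0 for candidate in candidates}
    for word, activation in word_activations(letters).items():
        # Only words with positive activation send feedback.
        if activation > 0 and word[position] in support:
            support[word[position]] += activation
    return support


if __name__ == "__main__":
    # Word context: "TAN?" gives the candidate "K" strong top-down support.
    print(letter_support(["T", "A", "N", None], 3, ["K", "D"]))
    # Non-word context: "XQZ?" activates no word, so neither candidate gets support.
    print(letter_support(["X", "Q", "Z", None], 3, ["K", "D"]))
```

In this sketch a word context adds top-down support for a letter while a non-word context adds none, which is one informal way to picture why letters in words might be identified more accurately.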

An extension to the PDP model, as stated by McClelland (1976), is that in relation to the WSE the model does not necessarily require the presented words to be complete. McClelland stated that once the feature level recognised the main nodes being presented, the higher letter and word levels were able to establish the pattern that best resembled stored letters and words.

In this study, the WSE was tested by manipulating the letter cases that were presented. It was hypothesised that even when letters were presented in mixed case, known words would be recognised better than pseudowords. The results showed that accuracy was higher for mixed-case known words than for mixed-case pseudowords. They also showed that same-case known words were recognised much more accurately than mixed-case known words. This supported the WSE by showing that although the visual configuration was disrupted, the feature and letter levels of the PDP model were not affected.

Other factors proposed to be important are word frequency and word inferiority. According to Peterzell, Sinclair, Healy and Bourne (1990), words that are common in everyday language are identified more accurately. They call such words high frequency words. Another phenomenon they described was the word inferiority effect. They explain that letters which form part of an incorrectly spelled word are recognised and remembered more accurately than letters that are part of correctly spelled words.

In their study they tested the relation between the WSE and the word frequency effect. Accepting Cattell’s and Reicher’s theory that letters are identified more accurately as part of known words than as part of non-words or as single letters, they theorised that a further level of WSE exists between common and uncommon known words.

They tested participants’ accuracy in recognising the letters (h, e, i, o) that occurred in the middle of the stimulus words (the, tee, tie and toe). It was hypothesised that the letter “h” in “the” would be identified most accurately, because “the” is one of the most common words in everyday language. Results supported their hypothesis, leading to the conclusion that common words such as “the” are processed as single units before the individual letters are processed.

Based on Cattell’s theory that letters are recognised more accurately when presented as part of a word than as a single letter or as part of a non-word, this experiment was constructed to test that prediction. Two hypotheses are examined in this study. First, it is hypothesised that participants will identify letters more accurately when they are presented in the context of a word than in the context of a non-word. Second, it is hypothesised that participants will identify letters more accurately when they are presented in the context of a word than as a single letter.



Participants

Participants comprised 97 volunteer 2nd-year university students who attended designated experimental classes. Age and gender varied across the sample, as no selection criteria were imposed.

Material and Design

A computer program implementing a WSE task was used to record participants’ responses. The experiment used a repeated measures design in which stimuli from the conditions of the independent variable were presented in random order across 48 trials. To control for the effects of fatigue and practice, all 3 stimulus conditions were intermixed within a single testing session. The independent variable was the target stimulus, which had 3 conditions: a 4-letter word, a 4-letter non-word or a single letter. The dependent variable was accuracy in choosing, from 2 given alternatives, the letter that had been presented as part of the stimulus.


Procedure

Using the computer program, participants were presented with a total of 48 stimuli, with each stimulus condition presented 16 times in random order. Each stimulus was a 4-letter word, a 4-letter non-word or a single letter. Each stimulus was presented for 100 ms and then masked with hash symbols. On top of the masked stimulus, 2 single letters were then presented. One of the letters had been part of the masked stimulus, while the other had not. Participants had to choose the letter that had appeared in the original stimulus by pressing the button corresponding to the upper or lower letter. Upon completion, the number of correct choices in each condition of the IV was calculated, with a maximum of 16 correct choices per condition. Ethical requirements regarding participants’ consent and debriefing were also met.
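The trial structure described above can be sketched in a few lines. This is only an illustration of the counts given in the text (3 conditions, 16 trials each, random order); the function names, the condition labels and the use of a seed are assumptions, and the actual program used in the study is not described in enough detail to reproduce.

```python
import random

# Condition labels and function names are hypothetical; the counts come from the text.
CONDITIONS = ("word", "non-word", "single letter")
TRIALS_PER_CONDITION = 16


def build_schedule(seed=None):
    """Return a randomly ordered list of 48 condition labels, 16 per condition."""
    rng = random.Random(seed)
    schedule = [c for c in CONDITIONS for _ in range(TRIALS_PER_CONDITION)]
    rng.shuffle(schedule)
    return schedule


def score(schedule, responses_correct):
    """Count correct responses per condition; each count lies between 0 and 16."""
    counts = {condition: 0 for condition in CONDITIONS}
    for condition, correct in zip(schedule, responses_correct):
        if correct:
            counts[condition] += 1
    return counts


if __name__ == "__main__":
    schedule = build_schedule(seed=1)
    # A participant who answers every trial correctly scores 16 in each condition.
    print(score(schedule, [True] * len(schedule)))
```

Intermixing the conditions in one shuffled list, rather than running them in blocks, is what spreads any fatigue or practice effects evenly across the three conditions.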


Results

Table 1.

Mean Number of Correctly Identified Letters per Condition

Stimulus condition     Mean number of correct answers     Standard deviation
4-letter word          15.08                              1.31
4-letter non-word      13.81                              1.86
Single letter          15.18                              1.16

Note: N = 97

As shown in Table 1, the data consist of the mean number of correctly identified letters for each stimulus condition. In general, the mean number of correctly identified letters was similar across conditions, and variability between participants within each condition was small. Within-groups analysis confirmed that there was a significant difference in accuracy between words and non-words, t(96) = 7.82, p < .05, along with a significant difference between single letters and non-words, t(96) = 7.29, p < .05. In contrast, no significant difference in accuracy was found between words and single letters, t(96) = -.621, p > .05.
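The reported statistics have 96 degrees of freedom, which is what a paired comparison across 97 participants would give (n - 1). As a minimal sketch of how such a statistic is computed, assuming each participant contributes one score per condition, the function below calculates a paired t value from two lists of scores. The function name and the example scores are hypothetical; the study’s raw data are not available, so the reported values cannot be reproduced here.

```python
import math
import statistics


def paired_t(scores_a, scores_b):
    """Return (t, degrees of freedom) for a paired comparison of two conditions.

    Each position in the two lists must belong to the same participant.
    Raises ZeroDivisionError if every participant has the same difference.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("Both conditions need one score per participant")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample standard deviation
    t = mean_diff / (sd_diff / math.sqrt(n))   # mean difference over its standard error
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical scores out of 16 for five participants (not data from the study).
    word = [16, 15, 14, 16, 15]
    non_word = [14, 14, 13, 13, 14]
    print(paired_t(word, non_word))
```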


Discussion

Although the results supported the hypothesis that letters are recognised more accurately in the context of a word than a non-word, they were inconclusive as to whether letters are recognised more accurately in the context of a word than as a single letter. One possible explanation comes from Sperling (1963), who states that, as part of the VIS, any image seen by the eye remains available in the mind for a fraction of a second after the stimulus is removed. Although the 4-letter known words in this study may have been high frequency words and should have been recognised more easily than the other stimuli, each stimulus image may have persisted in the VIS after masking for the few moments needed for recall. A solution for future studies would be to insert a distracter item, such as a number display, between the mask and the presentation of the alternatives.

Another possible reason why the results did not support the hypothesis is that the task was too easy. With mean scores close to perfect, the task was easily mastered by participants, resulting in a ceiling effect.
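The ceiling effect can be made concrete from the values in Table 1. The sketch below converts each mean to a proportion of the 16 possible correct answers and expresses the remaining headroom in standard deviation units. The headroom measure and the function name are illustrative choices, not an analysis from the study.

```python
MAX_SCORE = 16  # maximum correct answers per condition, as stated in the procedure

# (mean, standard deviation) per condition, copied from Table 1.
TABLE_1 = {
    "4-letter word": (15.08, 1.31),
    "4-letter non-word": (13.81, 1.86),
    "single letter": (15.18, 1.16),
}


def ceiling_summary(table, max_score):
    """Return proportion correct and headroom (in SD units) for each condition."""
    summary = {}
    for condition, (mean, sd) in table.items():
        summary[condition] = {
            "proportion_correct": mean / max_score,
            # Distance from the mean to the maximum score, in standard deviations.
            "headroom_in_sd": (max_score - mean) / sd,
        }
    return summary


if __name__ == "__main__":
    for condition, values in ceiling_summary(TABLE_1, MAX_SCORE).items():
        print(condition, values)
```

For the word and single-letter conditions the mean lies less than one standard deviation below the maximum score, so any true advantage for words over single letters would have had little room to show.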

Future studies could benefit from making the task more difficult. One possible method is to present stimulus words and non-words that are longer than 4 letters. This would force participants to judge whether the alternatives had appeared in a longer string rather than in just 4 letters. Alternatively, presenting more than the current 2 alternatives could make the task more appropriate, as it would force participants to compare more letters against the presented stimulus.

However, the easiest way to increase the sensitivity of the study without changing any other characteristic of the trials is simply to increase the number of trials in each condition, although this may introduce a confound due to fatigue effects.

To determine the effectiveness of the above recommendations, a pilot study could be used. Pilot studies could help determine the appropriate word length, number of alternative answers and number of trials per condition needed to achieve a challenging yet fair experiment for the general population. In conclusion, the WSE is an area that warrants further investigation. Although this study did not fully support its hypotheses, numerous previous studies have provided strong support for the WSE and have helped to explain the intricate processes involved in visual processing.
