Effects of Teacher Training on New Instructional Behaviour

Abstract

This paper is an academic critique of an article by de Jager, Reezigt, and Creemers (2002) titled The effects of teacher training on new instructional behaviour in reading comprehension. The authors undertook a research study to examine the results of teacher inservicing on practical teacher behaviours. My examination systematically focuses on specific aspects of the article in terms of the process and validity of its research methods and results. I have attempted to develop a cohesive and unified explanation which not only expounds the particulars of the research but also formulates a clear interpretation of that research throughout. I suggest that the size of the sample and the method of selecting subjects for the experimental groups render the research externally invalid, greatly reducing generalizability to the ultimate, and perhaps even to the immediate, population.

Quantitative Research Article Critique

In their article, The effects of teacher training on new instructional behaviour in reading comprehension, de Jager, Reezigt, and Creemers (2002) outline a quasi-experimental research design involving three sample groups (two experimental and one control) which were drawn from an immediate population of eighty-three primary school teachers in the northern part of the Netherlands. In their introduction to the article, the authors state that “teachers need suitable instructional models that provide them with guidelines for new types of instruction and they must have access to inservice training, which helps them to successfully implement these models in their regular lessons” (de Jager et al., 2002, p. 832). This statement outlines the premise behind the research – it is not a research question but a statement of belief on the part of the authors, a statement upon which they draw in framing the purpose and focus of their research. The authors articulate their recognition of the fact that education must be centred on the concepts of constructive, active, and student-based learning, where the teacher facilitates and guides his/her pupils to their own understandings. They also recognize that while educational theory has progressed to meet the higher demands of the current paradigm, educational practice is often less than up to date. The research itself is based on exploring the possibility of reconciling educational theory and educational practice pragmatically.

Research Problem

In their research, de Jager et al. (2002) focus on answering a specific research question which is outlined clearly in the study on page 832. They offer their problem as a statement rather than as an interrogative, so, for clarity, I shall paraphrase: Can teachers in primary schools be trained in using the models of either cognitive apprenticeship or direct instruction? Of particular importance to the authors is the integration of metacognitive skills (in reading) into these teaching models. They note that, through research such as that conducted by Muijs & Reynolds (2001) and others, it has been proven that training in Direct Instruction (DI) is effective for enhancing the development of basic skills. Additionally, they indicate that lab experiments have proven the effectiveness of Cognitive Apprenticeship (CA) on the development of metacognitive skills in controlled situations (de Jager et al., 2002, p. 832). Given these facts, it is realistic to believe that similar experiments (and perhaps similar results) can be conducted and analysed in real classroom situations. Indeed, the act of replicating methods (even – and perhaps necessarily – with some changes) is at the heart of small-scale quantitative research: “Quantitative research assumes the possibility of replication” (Cohen, Manion & Morrison, 2003). Thus, the possibility of training and implementation is researchable, as it builds upon pre-existing research.

The authors propose a practical necessity for this research problem to be explored. As suggested above, a theory base for this research is pre-existing and, as such, the authors clearly intend to provide significant results in the form of practical application of those theoretical concepts. Just as it is indicated that the research is pragmatic in its nature and implications, it is clear that the research is also significant. Essentially, the research is based on changing the instructional methodology of educators to meet current outcomes – outcomes which seem to be on par with those of the current system in Newfoundland. That is, they seem to be student centered and focused on the development of metacognitive thinking skills. The authors refer to De Corte (2000) regarding this matter: “Learning is a constructive, cumulative, self-regulated, goal-oriented, situated, collaborative, and individually different process of knowledge building and meaning construction” (de Jager et al., 2002, p. 831). Educators must be dynamic and able to adapt their methods and teaching styles to accommodate the shifts in the curricula and pedagogy inherent in the current paradigm. Therefore, research into the possibility of adapting to new instructional models is very significant – indeed, what could be more significant?

The authors explicitly state that their research study is “a quasi-experiment with a pre-test, post-test, control group design” (de Jager et al., 2002, p. 835). Presumably, the ultimate population of the study will be all educators in the entire world. The immediate population consists of 83 primary school teachers in the northern part of the Netherlands. In addition to being identifiable by locale and occupation, the population is also particular in that each individual within the group was previously familiar with and had been making use of I Know What I Read – a specific curriculum for reading comprehension (de Jager et al., 2002, p. 835). This is a significant factor which could introduce a possible problem in terms of external validity. That is, assuming that the authors wish to make their findings generalizable beyond the immediate population (whether through subsequent studies of their own or others), the previous use of this specific curriculum could be viewed as an independent variable. In any case, the immediate population for the study is clearly indicated by the authors.

The identification of independent and dependent variables is of extreme importance in research; it is the foundation of any experiment or similar activity. De Jager et al. (2002) researched the possibility of changing educational methods through training. As such, the independent variables in this study are the training of the two experimental groups (one group in DI and one in CA), and the dependent variable is, therefore, the inclusion of instructional methods which are clearly designed to develop metacognitive reading skills. These variables, however, are not indicated clearly within the study; nowhere does the language of independent and dependent variables occur explicitly. Additionally, beyond these variables, other significant extraneous variables exist which could be viewed as independent variables, such as the pre-existing familiarity with a specific curriculum model (mentioned above), the age of the participants, the educational training and experience of the immediate population and research sample, the use of alternate curriculum materials, etc. I will deal with these topics in another section.

Review of the Literature

The authors draw on pre-existing research to formulate the purpose of their own study, and they seem to have drawn on a comprehensive list of sources throughout. For example, under section 3 of their report, they include an extensive background for the development of their inservicing models. During the implementation of the independent variables (training in DI and CA), they provide evidence which indicates that they closely followed the findings outlined in the literature on inservice training. At the same time, they note a particular difficulty inherent in this mass of research literature, namely that no information exists to indicate whether teachers should be inserviced via the methods that they are expected to employ (e.g., should the CA group have been inserviced using the CA model?).

Beyond their extensive use of source referencing in terms of the preparations for the study, the authors also rely heavily on pre-existing literature in the development of the observational instrument and in the justification of their sampling methods. In terms of the latter, the support drawn from their reference to Anders, Hoffman, & Duffy (2000) is fallacious in that it does not truly validate their method of sampling (I deal with this explicitly in the Selection of Subjects section below). In the former, the development of this tool in two sections (Low Inference Observation and High Inference Observation) relied heavily on the work of Booij, Houtveen, & Overmars (1995); Sleipen & Reitsma (1993); and Burgess et al. (1993). Specifically, in their use of High Inference Observation, which could possibly have introduced problems of internal validity, the authors reference Gower’s congruence measure (1971) in order to dispel any notion of problems with interrater reliability (the possibility of unintentional subjective bias on the part of trained observers).

Generally, the review of literature seems comprehensive. The authors reference previous studies on the adaptation of instructional models, specifically on teacher training in the implementation of the DI model (de Jager et al., 2002, p. 832). Somewhat troubling is the fact that no clear reference is offered for the lab-testing of the CA model. Resnick (1989) is cited in terms of introducing the concept of “constructivist models such as cognitive apprenticeship” (de Jager et al., 2002, p. 832) but the actual experiments are not cited.

No emphasis seems to be placed on primary sources. Interviews, artifacts, etc. were not involved in the study or its development beyond the structure of the inservice training where participants were free to discuss and interact openly. As this is not a source but an active and inclusive part of the study, this cannot be viewed as a primary source.

In terms of being up-to-date, the review of the literature seems valid. The article includes a well-organized bibliographical reference list of 35 studies, 13 of which were published during or after 1995, and the vast majority of which took place after 1990. Additionally, as can be seen from the discussion above, the literature seems to be directly related to the development of this study and is involved in the development of the research hypotheses (discussed explicitly in the next section).

Research Hypothesis

Unlike the independent and dependent variables, the hypotheses regarding the outcomes of the research are clearly, concisely, and explicitly indicated by the authors.

Specifically, these hypotheses are as follows:

After appropriate training in which they learn how to implement one of these instructional models, teachers will increasingly show the main characteristics of cognitive apprenticeship or direct instruction.

Teachers in both experimental groups will improve the general quality of their instructional behaviour.

The teachers in both experimental groups will learn to focus more on comprehension skills and metacognitive skills (thus they will spend more lesson time on these skills) (de Jager et al., 2002, p. 834).

These hypotheses clearly follow from the literature cited within the section of the article dealing with the theoretical background of the Direct Instruction and Cognitive Apprenticeship models of instruction. Implicit within these hypotheses is the suggestion that a true causal relationship will exist between the independent and dependent variables. That is, the training in CA and DI will result in teachers being capable of implementing these models. While this seems a simplistic suggestion at first glance, if proven true, the implications of this study for the inservicing of new curricula and instructional methods will be significant. In essence, the study will prove the validity of the notion that teachers not only need inservicing but that they can benefit from it. What is perhaps most valuable about these hypotheses is that they are clearly testable via the observational tool developed for the study. Characteristics of lessons can be identified, and lesson time/focus can be measured. The second of these three hypotheses is the only one which really presents a difficulty in terms of measurement, as it relies on an analysis of a very subjective topic – general instructional quality. Even so, the observational tool includes 16 items of high inference observation (though it is arguable that this is not enough). Thus, the authors have made specific provision for testing all three hypotheses.

Selection of Subjects

The immediate population of this research is clearly and explicitly identified by the authors in section 4.1 of the article (I have indicated the immediate population of this study in the Introduction and Research Problem sections of this paper). Of the 83 individuals in the immediate population, three sample groups were generated from 20 volunteer participants/subjects. Of these three groups, two were experimental and one was used as a control group. The first of the two experimental groups consisted of 8 teachers who were inserviced in CA; the second experimental group consisted of 5 teachers who were inserviced in DI; the control group consisted of the remaining 7 teachers. The subjects are shown to be equivalent in terms of years of teaching experience (average 22.1, SD = 6) and likewise in their experience with the specific reading curriculum (average 2.8, SD = 1.9).

Much discussion regarding these facts has occurred among my colleagues in Education 6100. While the sample groups seem to be equivalent, there are some serious questions of validity which must be raised here. First of all, one must note that roughly 95% of the subjects will statistically have ten or more years of teaching experience. This is significant because the nature of the research study is such that it attempts to measure teachers’ ability to change their instructional methods. In their section on teacher training, the authors cite Galton & Moon (1994), where it is stated that “More experienced teachers may find it even more difficult to change than novice teachers” (de Jager et al., 2002, p. 834). If this is indeed the case, it seems as though the sample groups (being quite experienced) will find it particularly difficult to alter their methods and, as such, this could distort the research findings. An attempt ought to have been made to include teachers of various experience levels in order to make the findings more generalizable and therefore more externally valid.
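
To see where this 95% figure comes from, one can assume (as the reported summary statistics invite) that teaching experience is roughly normally distributed. A minimal sketch of the calculation, in which the mean and standard deviation come from the article but the normality assumption is my own:

```python
from scipy.stats import norm

mean_exp, sd_exp = 22.1, 6.0  # reported by de Jager et al. (2002, p. 835)

# Probability that a subject has ten or more years of experience,
# assuming experience is approximately normally distributed.
z = (10 - mean_exp) / sd_exp   # z is about -2.02
p_ten_plus = norm.sf(z)        # survival function: P(Z > z)
print(f"P(experience >= 10 years) = {p_ten_plus:.3f}")  # about 0.978
```

Under normality the exact figure is closer to 98%; the 95% cited above is the familiar two-standard-deviation heuristic (22.1 - 2 × 6 ≈ 10) and is, if anything, conservative.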

In addition, the method of sampling employed by the authors seems less than valid. That is, the authors made no attempt to control the parsing of subjects into the three sample groups. What is worse, they allowed the subjects themselves to choose between the groups. The authors cite Anders, Hoffman, & Duffy (2000) and their comments on voluntary participation as a positive influence on inservice teacher training (de Jager et al., 2002, p. 835). There is, however, a profound difference between accepting volunteers and allowing those volunteers to choose their experimental sample groups. As a teacher myself, I find it clear that those teachers who ‘volunteered’ for the CA inservice training were predisposed (or at the very least, more open) to that method of instruction; likewise with those for the DI group. Beyond this, 20 volunteers is hardly a reliable sample from 83 individuals – according to Box 4.1 on page 94 of the course text, were this a random sample, a population size of 80 subjects would require a sample size of 66 subjects (Cohen et al., 2003). Thus, the sample size is quite disproportionate to the immediate population from which it came. Besides this, the subjects who agree to the research project will necessarily have more in common with each other than they do with the remaining members of the immediate population (if nothing else, the fact that they are volunteers in the study). It is in the sampling of the experimental and control groups that the inherent flaw of this study rests. The authors freely admit that this method (if one could call it that) was employed for pragmatic reasons (de Jager et al., 2002, p. 835). It was clearly a calculated compromise, but one which, in my opinion, invalidates their work.
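
Sample-size tables like the one cited from Cohen et al. (2003) are typically generated from a simplified formula such as Yamane's, n = N / (1 + N·e²), at a 95% confidence level with a 5% margin of error. I cannot confirm that Cohen et al. derive their Box 4.1 from exactly this formula, but a quick sketch shows that it reproduces the cited figure:

```python
import math

def required_sample(N: int, e: float = 0.05) -> int:
    """Yamane's simplified sample-size formula: n = N / (1 + N * e**2).

    N is the population size; e is the tolerated margin of error (0.05 = 5%).
    """
    return math.ceil(N / (1 + N * e**2))

print(required_sample(80))  # 67 -- consistent with the ~66 cited from Cohen et al.
print(required_sample(83))  # 69 -- for the study's actual immediate population of 83
```

By either figure, the 20 volunteers actually obtained fall far short of what a defensible random sample would require.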

Instrumentation

All experimental and quasi-experimental research must include some method(s) or tool(s) for evaluating the effects of the independent variable(s) on the dependent variable(s) – otherwise, there would be no point to the study. In terms of the method of evaluation in this particular study, the authors developed a survey-style observation tool which was employed four times for each experimental group and only twice for the control group. As the instrument was designed specifically for the measurement of these groups, it falls into the category of non-parametric testing. This statement is verified by the fact that survey/questionnaire research methods yield non-parametric data (Cohen et al., 2003, p. 77) and also by the fact that the equivalence of the groups before treatment was “tested with the Kruskal-Wallis One-way analysis of variance for independent groups” – an ordinal ranking test (de Jager et al., 2002, p. 837). Ordinal scales yield non-parametric data (Cohen et al., 2003, p. 77). Another ordinal scaling test (the Mann-Whitney U test) was employed by the authors in the evaluation of the data generated by both the low and the high inference portions of the observation tool. Thus there can be no doubt that the observation tool is non-parametric. This is significant because, like so many other elements of this study, it creates problems of external validity, as non-parametric tests “do not make any assumptions about how normal, even and regular the distributions of scores will be” (Cohen et al., 2003, p. 318).
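
For readers unfamiliar with these two tests, the sketch below shows how they would typically be applied to data of this shape. The group sizes match the study (8 CA, 5 DI, 7 control), but the scores themselves are invented placeholders, since the raw observational data are not published:

```python
from scipy.stats import kruskal, mannwhitneyu

# Illustrative (invented) ordinal pre-test scores for the three groups.
ca_group = [3, 4, 2, 5, 3, 4, 3, 4]   # 8 CA teachers
di_group = [4, 3, 3, 5, 2]            # 5 DI teachers
control  = [3, 4, 4, 2, 3, 5, 3]      # 7 control teachers

# Kruskal-Wallis: were the three groups equivalent before treatment?
h_stat, p_kw = kruskal(ca_group, di_group, control)
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_kw:.3f}")

# Mann-Whitney U: a pairwise comparison, e.g. one experimental group
# against the control group on a given indicator.
u_stat, p_mw = mannwhitneyu(ca_group, control, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_mw:.3f}")
```

Both tests operate on ranks rather than raw values, which is why they suit the ordinal data this instrument produces.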

The internal validity of the test seems quite sound. Indeed, the researchers take great pains to ensure this, and they explicitly outline their efforts in the development of their observation tool, especially regarding the use of low and high inference observations. Additionally, they indicate that while five individuals were trained to employ the observational tool, the interrater reliability was quite high (0.81).
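
The article reports this agreement figure via Gower’s (1971) congruence measure, which I will not attempt to reconstruct here. Purely as an illustration of what an interrater reliability check looks like in practice, the sketch below uses Cohen's kappa – a different and more common statistic, substituted only for demonstration – on hypothetical codings by two observers of the same ten two-minute intervals:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical category codings (labels are invented, not the study's)
# by two trained observers watching the same lesson segments.
observer_a = ["read", "discuss", "read", "group", "group",
              "read", "discuss", "read", "group", "read"]
observer_b = ["read", "discuss", "read", "group", "read",
              "read", "discuss", "read", "group", "read"]

kappa = cohen_kappa_score(observer_a, observer_b)
print(f"Cohen's kappa = {kappa:.2f}")  # ~0.83; values above 0.8 indicate strong agreement
```

Whatever the measure used, a coefficient of 0.81 is conventionally read as strong agreement, which supports the authors' claim.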

The observation tool was developed in two sections. The low inference section seems to have consisted of a checklist which was scored at two-minute intervals. This list simply indicated which activities were taking place at each interval; while useful for gathering quantitative data, it did not allow for qualitative observation. Therefore, the second section of the observational tool consisted of a high inference evaluation in the form of a Likert scale (a rating tool with a gradable/comparable range of responses to a prompt for observation – usually on a scale of 1-5). The use of the Likert (rating) scale could introduce a problem in terms of internal validity in that observers may not wish to indicate extremes (e.g., circling a 1 or a 5) but may instead stick to the mid-range of the scale, since committing to an extreme value on a ranking scale feels closer to making a dichotomous (binary) judgement.
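
This central-tendency effect has a measurable cost: ratings clustered at the midpoint carry less variance, which makes genuine differences between lessons harder to detect. A toy illustration with hypothetical ratings of a single high inference item:

```python
import statistics

# Hypothetical 1-5 ratings of one item: observers using the full scale
# versus observers who avoid the extremes.
full_range    = [1, 2, 3, 4, 5, 4, 2, 5, 1, 3]
mid_clustered = [3, 3, 2, 4, 3, 3, 4, 2, 3, 3]

print(statistics.stdev(full_range))     # about 1.49
print(statistics.stdev(mid_clustered))  # about 0.67 -- far less spread
```

Both samples have the same mean (3.0), but the compressed spread of the second would mask real differences between teachers.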

The primary advantage of this sort of evaluation in terms of internal validity seems to be the fact that the subject does not complete the survey; instead, s/he is observed by an (ideally) objective third party. Of additional significance here is the fact that the observers are just that – they are not the researchers. The chance of observation bias (the possibility of inadvertently – or otherwise – ‘doctoring’ the results in an attempt to verify research hypotheses) is greatly reduced. Even so, the method and extent of observer training are not indicated; it is simply stated that five persons were trained (de Jager et al., 2002, p. 837).

Design

The design of this research study is explicitly indicated as “being quasi-experimental with a pre-test, post-test control group design” (de Jager et al., 2002, p. 835). The independent variables in this study are clearly manipulated in that the inservicing of both experimental groups was methodically undertaken. Additionally, each treatment was conducted independently of the other, and while both dealt with metacognitive skills, each was developed to achieve those skills using very different methods. The control group received absolutely no training or teaching aids, and was not influenced by the researchers beyond the fact that during two lessons in a single school year, each member was observed during the delivery of a lesson on reading comprehension. Nevertheless, there is a wealth of extraneous variables which are not (and cannot be) taken into account here. Firstly, the experience of the subjects in the sample groups cannot be denied as being significant (see above). Additionally, the fact that the two experimental groups were given completely different materials has serious repercussions for the validity of the research findings. The study was supposedly based on the possibility of adapting instructional methods. It would have been more valid if the sample groups had all made use of the same curriculum materials because, as it stands, we cannot be sure whether any differing results in the observations of the groups are a result of the training or of the use of different (perhaps superior) materials. Who is to say that the control group, equipped with the alternate materials, would not have had similar or equivalent results?

Another extraneous problem, beyond sampling and differing curricula, rests in the fact that while the study involves teacher training, it necessarily includes student involvement (and perhaps achievement) as well. The students are not indicated as being equivalent in this study. With a group of 20 teachers, we could be talking about a sample population of up to 500 (or more) students. While it may be argued that this bears no relevance to the matter, I hold that it does. What can be accomplished with one class may be a nightmarish impossibility with another; what comes easily for one group may be quite difficult for the second. Consider the notion of group work, which is included in the observation tool developed for this study. Some classes are more apt to accept this model of instruction than others. In short, the students ought to have been considered.

Results

The results of this research article are presented as proving the authors’ hypotheses correct. A relative change in instructional strategy occurred within both experimental groups and is attributed by the authors to the independent variables: “Differences in the observation at the end of the school year can be attributed to the experimental treatments” (de Jager et al., 2002, p. 837). The statistical results of the quasi-experiment for the two experimental sample groups are described and represented in separate sections and separate tables. This emphasizes the fact that there was no intent on the part of the researchers to compare the DI and CA models of instruction (although they do clearly state that both were equally difficult/easy to implement). Nonetheless, it would perhaps have been useful to see the results combined in a single chart; with two independent variables and two experimental groups, one cannot help but wonder about relationships among the effects on the dependent variables. However, what is perhaps most troubling about the representation of the data in these charts is that the charts do not correspond. I suppose that this is due to the fact that they represent the implementation of two distinct instructional methods, but the amount of concurrence in terms of high inference seems, in my opinion, to be quite low.

Whereas the authors state that significant differences exist between the pre-test and post-test of the experimental groups, these differences are rather limited in scope. In terms of the CA group, “only four of the 13 indicators show significant differences and with the DI group only four differences on the high inference indicators are significant” (de Jager et al., 2002, p. 838). While this is true, it is also reported that on most indicators the teachers of both experimental groups show more favourable behaviours than the control group. As stated in the previous section, however, this could be attributed to extraneous variables and not necessarily to the independent variables (i.e., teacher training).

On the whole, the results of the research do not seem to be very valid or indicative. Yes, there are some differences, but there is no certainty as to whether these can be attributed to the inservice training in either the DI or CA instructional models. Beyond this, as outlined above, an insufficient number of subjects was sampled from the immediate population for the results to be statistically valid. While appropriate statistical tests, such as the Mann-Whitney U test, were employed, the fact that the numbers are insufficient limits the significance of these tests. By and large, the significance of the differences which are purportedly the result of the independent variables seems to be overstated.

Discussion and Conclusion

For the most part, the conclusion of this article takes the ideas presented in the Results section and makes value statements based upon them. Thus, the interpretation of the results is reserved for this section (although some interpretation is undertaken in the previous section). Generally, the results are not discussed in relation to previous research studies. It is stated that the teachers who were trained in DI and CA showed more characteristics of these models than the teachers who were not inserviced in the implementation of these models – this is obviously the case. Showing characteristics of a model of instruction, however, does not constitute success.

In terms of success, it is perhaps useful to reiterate the research question for this article: Can teachers in primary schools be trained in using the models of either cognitive apprenticeship or direct instruction? The authors indicate that the project was a success: “…teachers, when they are appropriately trained and coached, can change their behaviour in accordance with relatively new ideas about learning and teaching based on constructivist theories” (de Jager et al., 2002, p. 839). This clearly indicates that the results of the research conducted by the authors are consistent with the findings of both the previous research on the DI model of instruction and the lab experiments with the CA model of instruction. At the same time, the authors also indicate that the scores of the experimental groups “did not differ significantly from the control teachers” (de Jager et al., 2002, p. 839). This indicates that the researchers, while holding on to the notion of success, acknowledge that their results are inconclusive in and of themselves. They recommend amendments to the research methods which they employed throughout their study and call on others to replicate the experiment.

Personal Analytic Statement

As a whole, I found this article to be lacking in merit in terms of its wider applicability – its external validity. That is, while it is interesting to see how training in alternate instructional methods can affect the teaching methods of regular classroom teachers, the study is relatively superfluous in that it does not offer any real generalizable results. The sample groups were poorly selected and did not represent the immediate population. As such, the findings of the research are particular to the individuals studied. Of course, as regular teachers who were shown to be equivalent (though, in my opinion, not convincingly so), the samples can be considered possibly representative of other individuals, hence the authors’ call for replication of the research. Beyond this, the results themselves do not seem to be conclusive, as there is no real general significant difference between the sample and the control groups. Of particular interest to me was the fact that, on the one hand, the authors clearly indicate that educators who are very experienced in the profession will have a more difficult time accepting and incorporating changes to their instructional methods while, on the other hand, they select a group in which statistically 95% of the subjects enter the experiment with 10 or more years of experience. If this was intentional (i.e., working with those most resistant to change in order to show the power of inservicing), the authors fail to indicate their intentions.
