Academic Exchange Quarterly    Spring 2004: Volume 8, Issue 3



Using Story Re-tell in Bilingual Assessment

Christina Schelletter, University of Hertfordshire, UK

Tim Parke, University of Hertfordshire, UK


Christina Schelletter, Ph.D. and Tim Parke, Ph.D. both lecture in English Language and Communication at the University of Hertfordshire.



Narrative has been used in the assessment of children’s language skills for some time but rarely with bilingual children (though see Gutiérrez-Clellen 2002). This paper examines narratives of a sample of German/English bilingual children in terms of standard measures and differences in the children’s retellings of a story. Whereas on the standard measures the bilinguals seem similar to monolinguals, the retellings show differences between the English- and German-dominant informants. These differences highlight the significance of examining discrete skills when profiling the language competences of bilingual children.





Narrative has been used to assess both the global skill of reconstructing a story and a range of different sub-skills of children’s language. Regarding sub-skills, narratives are a good indicator of linguistic complexity, often requiring the use of subordinate clauses to specify the cause or purpose of a particular action alongside the description of the action itself. In addition, narratives can give an indication of children’s discourse skills, in particular the introduction of referents, topic maintenance, location of an action in time, use of connectives, etc. (Hickman 2003).

The global skill of understanding and reconstructing a story has been found to be linked closely to the development of literacy skills, both in terms of children’s understanding of texts (Gutiérrez-Clellen 2002) and in terms of their writing skills (Shrubshall 1997).


Due to the different task demands within a specific context, narratives have been used as an assessment tool both in the classroom and in a clinical setting. Within the classroom, they enable teachers to assess children at different levels of language and to make a judgment about their ability to construct a story, as well as their pronunciation and vocabulary (Parke 2001). Within a clinical setting, narratives can reveal difficulties at different language levels and also highlight problems with story comprehension and discourse skills. However, such an assessment requires norms based on normally developing peers at the same mental or language age.


The subject group targeted in the present paper consists of children who come to the fore as an issue not in general developmental terms, as language-disordered children do, but in educational terms. Children who have acquired more than one language from birth are often seen to be at a higher risk for difficulties in academic performance at school, particularly where the language taught at school is the child’s second language (L2). It is regrettable to put these two populations together, as it may seem to perpetuate the ancient prejudice against bilingualism as a kind of disorder, but the authors do so purely from a methodological perspective.


Studies investigating narratives in bilingual children have found them to be less advanced than matched monolingual children on a variety of measures (Shrubshall 1997) and to employ different strategies from monolingual children when lexical difficulties arise (Parke 2001). Comparing narratives in both languages of Spanish-English bilinguals, Gutiérrez-Clellen (2002) found differences in the recall and comprehension of a story, such that the children showed better performance in the language used in the classroom (L2) as opposed to their L1.


The narrative tasks employed in the studies outlined above include narrative re-tells, where the child is given a story model that has to be re-produced, and spontaneous narratives. Gutiérrez-Clellen uses both types of narratives with her informants. She reports that all children found the narrative re-tell of a story that they heard in Spanish more difficult, whereas the level of language in the spontaneous narrative (the Frog Story) tended to be much higher, especially in terms of the coherence of narrative form. This might be due to the high degree of narrative support through the given pictures. On the other hand, the narrative in the provided transcript (Child # 315, Gutiérrez-Clellen 2002: 187-188) only includes main clauses linked with ‘And’ or ‘And then’, and no deeper-level links between propositions, such as cause and consequence. Nonetheless, on the basis of the variation of the richness of the narrative structure in both tasks, it becomes clear that the performance of bilinguals depends very much on the task they are given. A teacher who relies on the evidence of one language only would, therefore, have a very inadequate view of the overall language capacity of a particular child. Bilingualism is, as Gutiérrez-Clellen claims, a continuum of skills: the full profile of language skills emerges only when several measures are applied.


In the present study, a narrative re-tell task was employed that has been standardised and used for the assessment of language for some time: the Bus Story (Catherine Renfrew, 1969, originally published by Collins & Co. Ltd). This assessment is routinely used by speech therapists as a fairly natural tool for the assessment of language, yet there are not many studies that report findings for normally developing or non-normally developing children. It is an assessment of narrative recall, in which the children are told the story by a researcher (or therapist) alongside a set of 12 pictures, and are asked to retell it afterwards, using the pictures as cues. As published, the test provides details of calculating measures such as an information score (IS) and a sentence length score which is based on the mean number of words of the five longest utterances (A5SL). It also provides normative data on these measures, ranging from 3 years to 8 years. To our knowledge, the Bus Story has not hitherto been used with bilingual children.
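The sentence length score described above can be illustrated with a minimal sketch. It assumes that words are counted as runs of letters (with internal apostrophes allowed), which may differ from the scoring conventions in the published test manual, and the sample utterances are invented for illustration rather than taken from any child’s retelling.

```python
import re

# Assumption: a "word" is a run of Unicode letters, optionally joined by
# apostrophes (e.g. "didn't"). The published manual may count differently.
WORD = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")

def word_count(utterance: str) -> int:
    """Number of words in one utterance under the assumption above."""
    return len(WORD.findall(utterance))

def a5sl(utterances: list[str]) -> float:
    """Mean number of words in the five longest utterances.

    If fewer than five utterances are available, all of them are used.
    """
    counts = sorted((word_count(u) for u in utterances), reverse=True)[:5]
    if not counts:
        raise ValueError("at least one utterance is required")
    return sum(counts) / len(counts)

# Hypothetical retelling, invented for illustration only.
retelling = [
    "once there was a naughty bus",       # 6 words
    "the driver was trying to mend it",   # 7 words
    "it ran away",                        # 3 words
    "it made funny faces at the train",   # 7 words
    "the policeman blew his whistle",     # 5 words
    "and then it went into a tunnel",     # 7 words
    "stop",                               # 1 word
]
score = a5sl(retelling)  # (7 + 7 + 7 + 6 + 5) / 5
```

Because only the five longest utterances enter the score, short fragments such as one-word responses do not pull the measure down, unlike an MLU computed over all utterances.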


Howlin and Kendall (1991) include the Bus Story along with other common tests used by therapists to assess the language skills of 28 language-disordered children with a mean age of 8;4. In particular, they found a significant correlation between children’s results on the Word Finding Vocabulary Test for English (Renfrew 1995) and both Bus Story measures (r = 0.63 for Word Finding and Bus Story Information, and r = 0.53 for Word Finding and Bus Story MLU), as well as a significant correlation between the two Bus Story measures (r = 0.59). These findings are corroborated in a study by Adams and Gathercole (1996), who use the Bus Story measures in conjunction with others to assess the relationship between phonological working memory and spoken language in normally developing children aged 5. They found a significant correlation between the two Bus Story measures (r = 0.799) as well as between the Bus Story measures and a combined receptive and productive vocabulary score (r = 0.38 for the vocabulary score and the Bus Story information score, and r = 0.42 for the vocabulary score and the Bus Story MLU).


Botting (2002) compares narrative skills in 7-8 year-old children with a severe pragmatic impairment (PLI) and children with a specific language impairment (SLI). The Bus Story and the Frog Story are contrasted in terms of measures of length (number of words in the story), errors (tense errors) and the use of evaluative devices (Bamberg and Damrad-Frye 1991). For the Bus Story, Botting found a discrepancy for both groups of children: information scores were within the normal range, whereas sentence length and the number of subordinate clauses were below it.


Our aim was to assess both languages of a group of English/German bilingual children by using the Bus Story. For this purpose, the original story was split into an English part (based on the first 6 pictures) and a German part (based on the last 6 pictures). For the German part, the English original was translated into German by a native speaker. In particular, we wanted to see what differences in performance, if any, exist between each child’s retelling in German and in English, what the nature of the possible differences in the two retellings is, and whether these possible differences correlate with other measures such as assessments of vocabulary and of MLU.







A total of 16 subjects took part in the study. Their mean age was 8;9, with an age range of 7;3 to 10;2. They were all attending the primary section of a German-medium school in London and had at least one German parent. They were all judged by their class teacher to have a good command of both languages, although there was variation in the length of time they had been living in the UK. On the basis of their productive vocabulary score, children were assigned to be German dominant or English dominant. Each group contained 8 subjects, 4 girls and 4 boys, with a mean age of 8;8 (range 7;6 – 9;10) for the English dominant group and a mean age of 8;6 (range 7;3 – 10;2) for the German dominant group.




In order to assess children’s lexical skills, each informant was first given the Word Finding Vocabulary Test for English (Renfrew 1995) and the ‘Aktiver Wortschatztest’ for German (Kiese and Kozielski, 1996). Both tests measure productive vocabulary. Then subjects were told the Bus Story. The first part of the story (6 pictures) was read to the children in English and they were asked to retell it. The story was then continued in German (6 pictures). In terms of the information score, the first 6 pictures of the story make up 40 % of the overall information score, whereas the last 6 pictures make up 60 %. The informants were audio-recorded while retelling the story and their responses transcribed and analysed using the CHILDES format (MacWhinney 1998). It is in this format that examples from our data are presented here.




A first analysis was conducted giving the MLU (in words) for each language, the information score (IS) for each language as well as the combined MLU of the five longest utterances (A5SL) and the combined information score (COIS). Table 1 gives an overview of the Bus Story measures for both languages, as well as an overview of the results of the productive vocabulary measure.
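As an illustration of how an MLU in words might be derived from transcripts of this kind, the following sketch extracts the child’s main-tier lines and averages their word counts. It is a simplified stand-in rather than the CHILDES tooling: it assumes one utterance per `*CHI:` line and counts runs of letters as words. The two sample lines are utterances quoted in this paper from two different children, so the resulting value only demonstrates the arithmetic and is not a score for any informant.

```python
import re

def child_utterances(transcript: str, speaker: str = "CHI") -> list[str]:
    """Return the text of main-tier lines for one speaker.

    Dependent tiers (lines starting with '%', such as %eng glosses)
    are skipped because they do not start with the speaker prefix.
    """
    prefix = f"*{speaker}:"
    return [
        line.strip()[len(prefix):].strip()
        for line in transcript.splitlines()
        if line.strip().startswith(prefix)
    ]

def mlu_words(utterances: list[str]) -> float:
    """Mean length of utterance in words (assumption: word = run of letters)."""
    if not utterances:
        raise ValueError("at least one utterance is required")
    counts = [len(re.findall(r"[^\W\d_]+", u)) for u in utterances]
    return sum(counts) / len(counts)

# Illustrative input only: two utterances quoted in the paper,
# produced by two different children.
sample = """
*CHI: Dann rollte er den Berg runter.
%eng: then he rolled down the mountain.
*CHI: Als der Bus sah, dass unten Wasser is, probierte er, zu bremsen.
%eng: when the bus saw that there was water below, he tried to brake.
"""

german_utterances = child_utterances(sample)
german_mlu = mlu_words(german_utterances)  # (6 + 12) / 2
```

Running the same two functions separately over the English and the German portion of a child’s retelling would give the per-language MLU values of the kind reported here.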


            (Table 1 about here)


Table 1 shows that while the two groups of children differ with regard to their productive vocabulary score, there is no difference between languages or groups in terms of MLU or information score. It is surprising, though, that the German dominant children display a slightly higher MLU in the less dominant language. Overall, the children’s average sentence length score (A5SL) and combined information score (COIS) are well within the range found by Renfrew (1969) for monolingual English children of this age group. As in Howlin and Kendall (1991) and Adams and Gathercole (1996), there is a correlation between the combined Bus Story measures (r = 0.638, p < 0.01), but in this study there is no relation between the Bus Story measures and the vocabulary scores. Thus, on the measures presented so far, there are no real differences in the language performance of the two sets of informants. We also looked for gender differences among the subjects. We found that on all measures, girls scored slightly higher than boys, but the differences were not significant. On this basis, gender differences are not further discussed here.
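The reported significance level for the correlation between the two Bus Story measures can be checked with a short calculation. The sketch below assumes that the coefficient is a Pearson correlation, that all 16 children contributed to it, and that the test is two-tailed; none of these details is stated explicitly in the text.

```python
import math

def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Assumptions: Pearson r, n = 16 informants, two-tailed test.
t_value = t_from_r(0.638, 16)   # approximately 3.10

# Two-tailed critical value of t for df = 14 at the 0.01 level,
# taken from standard t tables.
T_CRITICAL_DF14_P01 = 2.977

is_significant = t_value > T_CRITICAL_DF14_P01
```

Under these assumptions the statistic exceeds the tabled critical value, which is consistent with the reported p < 0.01. With a sample of this size, a coefficient needs to be fairly large (roughly 0.62 or above) to reach that level, which is worth bearing in mind when interpreting the absence of a correlation with the vocabulary scores.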


A second analysis focused particularly on the key words of the Bus Story, namely nouns and verbs. In respect of other word classes, only two adjectives occur in the English portion of the text, ‘funny’ (faces) and ‘naughty’ (bus), while none occur in the German portion. Table 2 gives the proportion of nouns and verbs in both languages that were taken up by the children from the original story, as well as additional nouns and verbs that the children included in their retelling.


                        (Table 2 about here)


Table 2 shows a higher uptake of nouns than of verbs from the given text for all children and for both language contexts. This difference is significant (t = 5.7, p < 0.01 for German; t = 12.6, p < 0.01 for English). This result is probably not surprising since the agents in the story (bus, train, driver, tunnel, policeman, cow) could not be described using a different lexical item. However, the descriptions of the actions allow synonyms to be used to convey the same meaning. For example, the driver can ‘mend’, ‘fix’ or ‘repair’ the bus and the bus can ‘run’ or ‘drive away’ or even ‘escape’.


Synonyms as alternatives for given verbs were used more extensively by the German dominant children in the German context, thereby resulting in a higher mean number of new verbs. The difference between the two groups in the use of new verbs in German is close to being significant. Examples are given below.


(1)   BilingM3

      *CHI: Dann rollte er den Berg runter.
      %eng: then he rolled down the mountain.


(2)   BilingF9

      *CHI: Als der Bus sah, dass unten Wasser is, probierte er, zu bremsen.
      %eng: when the bus saw that there was water below, he tried to brake.


In example (1), the story text includes the more general verb fahren (go). The child uses the verb rollen (roll) which is appropriate in the story, given that the bus has wheels. Similarly, in example (2), the original story uses versuchen (try) which is synonymous with probieren and also anhalten (stop) which is semantically close to bremsen (brake) and appropriate in the context of the story.


On the other hand, some word variations by the English dominant children in the German context, though within the same semantic field as the original word, have a meaning which is different from the lexical item used in the original text. An example of this is seen in the variations of the word See (lake) as ‘stream’, ‘sea’ and ‘Thames’.


(3)   BilingM4

      *CHI: Als der Bus sah, dass da unten ein Bach war, dann wollte er bremsen.
      %eng: when the bus saw that there was a stream at the bottom, he wanted to brake.



(4)   BilingM5

      *CHI: Als der Bus sah, dass er ein Berg runterrollte und in ein Meer fallte, da is der Fahrer den Bus wieder gefahrn.
      %eng: when the bus saw that he was rolling down a mountain and fell in a sea, the driver drove the bus again.


(5)   BilingF10

      *CHI: Also ist er in die Themse gegangen.
      %eng: therefore he went into the Thames.


A final analysis included measures used in other studies, such as the mean total number of word types in both stories, the mean number of subordinate clauses and the number of errors in both languages. The results are given in Table 3.


                        (Table 3 about here)


The total number of word types in the original story was 55 for the English part and 83 for the German part. This difference in length is reflected in the children’s narratives. There was no difference in the number of word types supplied between the two groups of children for either context.


As far as the number of subordinate clauses is concerned, both language portions contained four complex clauses. The children either copied the complex clauses, modified them or omitted them altogether. The English dominant children produced a slightly higher mean number of complex clauses in the German story-retell. At the same time, girls outperformed boys in the supply of complex clauses, particularly in the German context. Among the complex clause types, causation was the type most often included, whereas relatives were the type most frequently omitted.


An area where the language dominance of the children does make a difference, though, is language errors, particularly in a language like German that is morphologically richer than English. Errors involve word order, case, gender and the form of the participle. While even German dominant bilingual children produced errors, these were far more frequent in the stories of English dominant children.




The present study has compared two groups of bilingual children (English dominant and German dominant) with regard to their story-retell in both languages using a standardised procedure, the Bus Story.


No differences between the groups were found in terms of general measures, such as information score, MLU and the number of word types used for each story. Both groups were also equally able to reproduce complex clauses in their own narrative, either as a copy of the model provided, or a modification of the input.


Differences between the two groups of children were found in the German context, where German dominant children outperformed English dominant children in terms of their ability to use synonyms of verbs, as well as in terms of errors. These differences are fairly subtle, but nevertheless highlight a need for additional practice in a classroom situation that is based on a curriculum for monolingual German primary school children.


A further result of this study is the lack of a correlation between the Bus Story measures and vocabulary measures in the children tested. This correlation was found for monolingual children, but it seems to be absent in the bilingual case. This suggests that vocabulary skills in bilingual children do not predict syntactic ability. Even where a bilingual child is more restricted in word choice, this need not affect their syntactic abilities. In the present study, most children were able to retell the stories adequately, incorporating a good level of complexity in both languages, even if their vocabulary score for one language was well below that of the other.


Overall, the results found in this study both support and differ from those of Gutiérrez-Clellen (2002). We did not find here the same differences as she reports: one intriguing difference is the much greater degree to which her informants departed from the ‘input text’. On the other hand, our findings concur with hers, and with her overall conclusion, in that they support the position that bilingualism is a continuum of skills. Single language measures of young bilingual children are inherently unreliable in making a rounded assessment of their skills. And it is even more dangerous to infer one measure from another, e.g. to take a vocabulary score as any kind of indicator of syntactic competence.


We should also, however, acknowledge the particular circumstances of the children studied here. All children live in the UK and have had contact with the language of the country, while their other language is supported through at least one parent and also the school environment. These represent significant contrasts with the Spanish-English speaking children studied by Gutiérrez-Clellen.




