Where does an extended piece of writing start – with sentences or the whole text? Thinking about this has important implications for how we approach teaching English. Paradoxically, I think it’s possible to claim it’s the latter. Texts are, of course, built up sentence by sentence, but the starting point of composition is an overall sense of the whole, the concept of what is going to be written in its entirety that sits in the writer’s head, no matter how loosely formed. It’s shaped by intentions, aims, a sense of purpose and audience – what do I want to write, why and who for? Don’t take my word for it. Take the simulated word of the machines that are shaking up the world of language and beyond – large language models (LLMs), often referred to as AI chatbots, most prominently in the form of ChatGPT, though there are numerous other models available. If you’re wondering what on earth I’m talking about, then please read on!
I’m not denigrating the importance of sentence-level work in English. But in the rush to apply pedagogical approaches that frame all learning as moving from the component to the composite (such as Mastery and Rosenshine’s Principles), an assumption has crept into some English teaching that writing should overwhelmingly focus on the small parts – generally sentences, but sometimes individual words, or paragraphs – before moving on to something more substantial.
This approach might play out in several ways in the classroom, such as:
- Students writing their own versions of model sentences, as instructed by the teacher.
- Students presenting the same piece of information or idea in 3-4 different types of sentence.
- Students identifying a sentence in a piece of their own writing and redrafting it.
- Students identifying a sentence in their work that they are particularly pleased with and explaining why.
- Students ‘exploding’ a sentence from their own reading (i.e. close analysis of a single sentence) in order to understand how to construct their own sentences.
In and of themselves, there’s nothing wrong with these activities. But if they are intended as the foundation for more substantial writing, or even for improving extended writing, then we begin to hit problems.
Think about how we construct almost any piece of extended writing from scratch and it’s likely that isolated sentences are not at the forefront of our minds. We’re likely to start with a general idea of what we want to say. It will sit in our heads as some kind of whole, a cloud waiting to shed a shower of words, to paraphrase Vygotsky. How we give form to that shower of words will depend quite significantly on our personal writing processes. Some of us will plan, chunking our thinking up into sections, possibly paragraph areas; some of us will just start writing, with a rough idea of where we’re headed. Whichever of these two broad options we take, sentences will emerge in what we write. But pre-writing almost always takes the form of thoughts, ideas and general areas for exploration, not sentences. Once writing starts, the sentences will come. At this point, we’ll try to draw on an appropriate number of different sentence constructions, depending on what we’re writing and who we’re writing for. We may draft and re-draft as we go, re-wording or structuring thoughts differently along the way, even adding in phrases or whole sentences to clarify or alter the ideas. We’ll also make sure to include paragraph breaks where appropriate. Of key importance is that we hold that picture of the whole in our heads during the writing process, even if the whole shifts its shape as we write. This concept of the whole includes a sense of what we’ve already written, what we’re writing in the moment, and what we want to write next.
Lots of students, of course, are not secure in forming and demarcating sentences or paragraphs. Wouldn’t they, then, benefit from starting at the component level of the sentence and building this up to the composite whole? I’d argue not, at least not for lots of the time. All students need the opportunity to develop their thinking in extended texts if they are to improve their writing. Often, an extended piece, even with insecure use of sentence grammar and punctuation, is of real value to teacher and student alike. It gives them both something to work with in identifying how to move forward, not just at a technical level, but in terms of the ideas explored. The opposite approach – tightly scaffolding writing at all stages from the sentence upwards – risks eliminating the need for deep thinking, with students parroting ideas they have been fed in a limited number of sentences. Both student and teacher are then left with little idea about what they can actually do and what they need to do to improve – at the level of both idea generation and writing competence. Learning to write can, of course, sometimes rely on tight instruction about how and what to write; but it can also start with the student, with input about how to develop further coming after the event. Writing as a genuinely formative tool.
One powerful approach that allows writing to be this genuinely formative tool is free writing, which is the opposite of crafting perfectly formed sentences, one at a time. In free writing you just write your thoughts as they come and then only think more about honing and crafting when you have something down on paper. Often, you can be surprised by where your thoughts have taken you and even the extent of early crafting that emerges from semi-conscious intentions. Professor Debra Myhill, an expert in talk for writing, demonstrated this approach with teachers at a recent EMC conference on ‘Linguistically-Informed Oracy’, where she showed the development from free writing to thinking carefully about linguistic choices. Stylistic ‘choices’ are at the heart of this – not being told by someone else what you have to write, but consciously weighing choices for yourself. Choices not rules.
What's all this got to do with LLMs?
It might now seem a huge leap to start talking about AI chatbots, but believe it or not we can learn a lot about the human writing process and the relationship between component and composite by thinking about how these machines work. This isn’t to say that chatbots process language in the same way as humans, but it is to suggest that there are similarities, most strikingly, perhaps, in the composition of whole texts and the need to start from the composite whole, or the ‘big picture’, in order to construct complex, meaningful texts.
Early attempts to teach computers to generate language were ‘rule-based’. Computer scientists attempted to replicate the ‘rules’ of language into computer code. Put crudely (and my limited knowledge of computer science means this really is crude), they attempted to code language grammar in order to train computers to replicate the parts of speech and the word order that make human language comprehensible. They tried to teach computers to construct texts sentence by sentence. From a linguist’s perspective, they were taking the rationalist approach most famously exemplified by Chomsky’s theory of generative grammar. Chomsky holds that humans are born with an innate sense of grammar, which means that we instinctively know how to construct meaningful utterances. The computer scientists were trying to replicate this innate, universal grammar in code.
The results were limited to say the least. Even if a machine could be trained to write a grammatically correct sentence, it struggled to write anything meaningful. The problems multiplied when it was required to respond to prompts or to combine sentences into meaningful text. Unless operating in a highly prescribed area, performing a very specific language function, any text produced was largely meaningless and bore no comparison to a human’s. While language is rule-based, it is also incredibly flexible and context-dependent, far too much so for a ‘rules-first’ approach to work for anything beyond a straightforward sentence or two.
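For readers who like to see an idea in code, here is a minimal sketch of what a ‘rules-first’ sentence generator might look like. It is my own toy illustration in Python, not a reconstruction of any historical system: the tiny grammar, the word lists and the function names are all invented for the purpose. It shows how a handful of rules can produce sentences that are grammatical in form while having nothing to say.

```python
import random

# Hypothetical toy grammar, invented for illustration only.
# Each symbol maps to the list of ways it can be expanded.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "Adj", "N"]],
    "VP": [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "Adj": [["green"], ["silent"], ["furious"]],
    "N": [["dog"], ["sentence"], ["tree"], ["thought"]],
    "V": [["chases"], ["writes"], ["plants"]],
}


def generate(symbol, rng):
    """Expand a symbol by recursively applying a randomly chosen rule."""
    if symbol not in GRAMMAR:
        return [symbol]  # a plain word, nothing left to expand
    expansion = rng.choice(GRAMMAR[symbol])
    words = []
    for part in expansion:
        words.extend(generate(part, rng))
    return words


def sentence(seed):
    """Build one sentence; the same seed always gives the same sentence."""
    rng = random.Random(seed)
    return " ".join(generate("S", rng)) + "."


if __name__ == "__main__":
    for seed in range(3):
        print(sentence(seed))
```

Every sentence this produces follows the same pattern of determiner, noun and verb, so each one is ‘correct’. But nothing in the rules knows what the previous sentence said or what the text as a whole is for, which is the limitation described above.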
The most significant problem for a rules-based approach was how to hold on to a picture of a whole text – whether reading or writing. This did not just apply to adjacent sentences, but to aspects of a text that might have occurred several paragraphs, or even pages, earlier, but which still needed to be absorbed in memory in order for the writing process (and the subsequent reading process) to make sense. There are no standardised input-output rules as to how humans do this, just general conventions that can occur in multiple ways. Think about the foreshadowing that takes place at the start of most novels, for example. Clues are dropped in that are relevant to the whole book, but which might not be explained fully for another hundred pages or more.
Solving the problem (a problem of ‘attention’ for the computer scientist; of ‘semantic memory’ for the linguist) was the great leap forward for the development of LLMs. The current generation of LLMs do not learn from language in component parts; nor do they write building up from the component to the whole. They swallow the whole, coded to pay attention to how all the different parts fit together. Trained on billions of texts, they pick up language patterns rather than being coded to replicate them. So, for example, they come to learn grammar and syntax, how sentences are structured, including subject-verb agreement and word order, through seeing how it is done over and over again, not by being taught rules; they come to understand context, how words fit alongside other words, to the extent that they can easily tell if a word like ‘bark’ refers to a dog or a tree, or even to a human, without code having to tell them directly that this is the case; and they absorb a multitude of facts relative to the prompts they are given.
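To make the idea of ‘attention’ a little more concrete, here is a heavily simplified sketch in Python. Everything in it is an assumption made for illustration: the two-number word vectors are hand-picked by me (the first number standing for ‘animal-ness’, the second for ‘plant-ness’), whereas a real LLM learns vectors with thousands of dimensions from its training texts and passes them through many layers of learned transformations. The sketch only shows the core move: a word’s representation is blended with those of the words around it, weighted by how closely they relate.

```python
import math

# Hypothetical hand-picked vectors: [animal-ness, plant-ness].
# A real model learns its vectors from text; nothing here is learned.
EMBEDDINGS = {
    "the": [0.0, 0.0],
    "dog": [1.0, 0.0],
    "tree": [0.0, 1.0],
    "bark": [0.5, 0.5],  # ambiguous when seen on its own
}


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query_word, sentence):
    """Return (weights, context vector) for one word attending to a sentence.

    Simplification: the same vectors act as queries, keys and values.
    """
    query = EMBEDDINGS[query_word]
    keys = [EMBEDDINGS[word] for word in sentence]
    scale = math.sqrt(len(query))
    # Score each word in the sentence by its similarity to the query word.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Blend the sentence's vectors according to those weights.
    context = [
        sum(w * key[i] for w, key in zip(weights, keys))
        for i in range(len(query))
    ]
    return weights, context


_, dog_context = attend("bark", ["the", "dog", "bark"])
_, tree_context = attend("bark", ["the", "tree", "bark"])

if __name__ == "__main__":
    print("bark near dog: ", dog_context)
    print("bark near tree:", tree_context)
```

In this toy version, ‘bark’ alongside ‘dog’ ends up leaning towards the animal side, and ‘bark’ alongside ‘tree’ towards the plant side, without any rule saying which meaning applies. The meaning comes from the company the word keeps.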
This isn’t all that happens. Once LLMs have been trained on gargantuan amounts of text, developers can tweak their outputs. Essentially, they can add rules after the event, just as a teacher can advise a student on how to make their work better after they’ve seen a substantial attempt. For example, at the moment ChatGPT and other LLMs sometimes make up quotations, wrongly attributing them to a real person, or making up the source entirely. Developers are working on that and are almost certain to solve it soon. It’s like the student who doesn’t use paragraphs and needs advice about how to do so – they will get there in the end, but writing interesting whole texts with insecure paragraphs in the meantime is no bad thing. (For the students at least; in the case of LLMs there are significant ethical issues at stake in relation to the kinds of errors they make.)
I’ll return to what I said several paragraphs previously (a good example of ‘attention’/semantic memory) and stress that humans and machines do not process language in exactly the same way. Nor do I want to offer a utopian vision of what this new technology offers. Wondrous as it is, it’s also terrifying in its potential to be used for the wrong purposes and to undermine some of the fundamental aspects of our current ways of doing things, not least human linguistic creativity. But in thinking about how LLMs work, particularly their reliance on attention and semantic memory, it’s impossible not to think about the importance of those concepts in human language use, including in writing. Writing discrete sentences, even perfectly formed ones (as if there can ever be said to be such a thing), does nothing to develop a student’s semantic memory; nor does longer writing that is overly scaffolded and relies too much on a model. This can only happen if students are given lots of space to write substantial pieces of their own that demonstrate their own thinking and draw on their own existing language resources. Fundamentally, as Myhill shows in looking at how to teach grammar to students to improve their writing, it’s all about choices – students making their own choices and thinking about the impact on their writing.
Producing extended work is not easy for lots of students. It’s understandable and desirable that teachers stage learning in small chunks in order to guarantee that such students can produce something. But it’s our experience at EMC, in working with lots of schools, and in our own teaching, that nearly all students can produce extended pieces if the right conditions are in place. These pieces might be messier and less fully formed than carefully scaffolded work, but they are also more developmentally productive, more authentic, providing a truer picture of what students can do and so what they need to do more of to develop further. They teach students how to be writers, in the fullest sense of that word. They can also take up less classroom time, and certainly enable the different students in a class to focus on aspects of their own writing, rather than concentrate on elements that they might already be familiar with. Here are some of the things that generally need to be in place to make extended writing accessible for the majority of students:
- Students have a strong knowledge of what they are to write about and some intentionality (‘I want to write about this’). This might include a strong knowledge of a text or subject area; it might mean making room for students to write about their own interests and experiences, thoughts and feelings, or responses to texts.
- Students are given space, advice and thinking time (including talking to other students about their ideas) before writing, or after a first phase of ‘free writing’.
- There is a shared understanding that the writing is an opportunity primarily to express their thoughts, feelings and ideas, certainly in the first draft. Obviously, there is an expectation that they use grammar and punctuation to the best of their ability, but this is not the primary purpose of the writing. The primary purpose is to express and craft ideas.
- Work is done after the writing of a first draft to identify how it can be developed further. This might involve elements of self, peer and teacher assessment.
- Lots of the writing is low-stakes in nature, a chance to try out ideas and develop writing skills, rather than something to be judged and given a mark.
It was brought home to us that some students have limited opportunities for extended writing, particularly self-directed extended writing, when we carried out our project on assessment. Our team found that the dominant assessment model involved guiding students towards a final assessed piece by requiring them to tackle the learning in small chunks lesson by lesson. There’s certainly a logic to chunking in terms of enabling some elements of knowledge to be taught. But when applied to writing, which is too complex to divide simply into component parts, it meant that students were rarely writing more than a paragraph in one go, often just a sentence or two. When they did do their extended assessment piece, modelling and scaffolding had sometimes been too prescriptive, so that the responses could not provide a genuine indication of what students could do. There was also a flattening in the range of outcomes. High-fliers tended to underperform, held back by the constraints; low-attainers tended to ‘overperform’, a welcome boost to esteem, perhaps, but not indicative of where they were working, what they might achieve under less heavily scaffolded conditions and so ultimately of limited use to them and their teachers.
This model (what is sometimes referred to as doing ‘the formative’ and then ‘the summative’) is akin to rules-based approaches to teaching computers to generate language. The outcomes are formulaic, only work in a very particular context, and sometimes do not make enough sense. It’s the consequence of the same back-to-front thinking that thwarted early attempts to teach computers language.
Lots of schools still teach to alternative models, their students' books packed with different kinds of extended writing, some of it teacher-directed, some from the students’ own thinking. But the messages pushed on schools by various agencies in the past decade or more have made this increasingly difficult, reducing all learning to a set of routines and formulae. This is particularly damaging in English because language does not start at word and sentence level, building up to the whole. It starts with thinking. The articulation of that thinking is then a process of integrating constituent parts into the whole, with the whole always in mind.
It’s troubling enough that computers can generate human-like text; let’s at least make sure that our students have opportunities to engage in some meaningful, uninhibited text generation of their own.
___________________________________________________________________________________________
Thanks to my colleague Barbara Bleiman for providing the comments on Debra Myhill’s workshop on free writing.
The main source for my thinking about AI and language was These Strange New Minds, by Christopher Summerfield. Highly recommended!