
Building assessment into the curriculum plan



This is the fourth blog I've posted in recent weeks about curriculum planning, and this one, like the previous two, draws on the book Language Curriculum Design (Nation and Macalister, 2010). The subject this time is assessment and how this can be built into a successful curriculum. Gianfranco Conti and I have covered some similar ground in The Language Teacher Toolkit (2016) and Breaking the Sound Barrier (2019), but these important issues are worth spelling out again as many languages departments evaluate their curricula.

So I shall summarise some points referred to in Chapter 7 of Nation and Macalister (2010), adding a few observations of my own.

What is good assessment?

Assessment needs to be reliable, valid and practical. Let's look at these three aspects:

Reliability

A reliable test gives results which are not greatly affected by conditions the test was not intended to measure. If the same person sat the test twice, you would expect them to get more or less the same result. A test is more reliable if it is always given under the same conditions, is marked consistently, has as many questions as possible, and has questions and instructions that are as clear as possible. An unreliable test cannot be valid.

(I would add a couple of other issues here. One very obvious one is that pupils should not have the opportunity to copy each other. If classroom layout means pupils are sat close to one another the temptation to cheat is too great for a good number of students, so steps need to be taken to ensure it is virtually impossible to see a neighbour's work. My solution was to have pupils place a bag in the middle of their shared table so that it was extremely difficult to see their partner's work.

A second point is that the type of question has a significant influence on reliability. So-called objective questions such as true/false or multi-choice are inherently more reliable than essay-style questions or oral tests where a level-based mark scheme has to be applied. If you set free writing, for example, you need to decide precisely what you are assessing and to apply any mark scheme as consistently as possible. Research shows that this is not easy, as teachers bring their own bias to the marking of pupils' work. As far as speaking and writing are concerned, you need to decide what you are assessing. Is it accuracy? Relevance to the question title? Pronunciation accuracy? Range and complexity of language?)

Validity

A valid test measures what it is supposed to measure. A valid achievement test measures what has been learned on the course, not language to which students have not been exposed. A valid listening test measures skill at listening. To ensure this, teachers need to consider face validity. If it's a test of reading, does it look like one? If it's a vocabulary test, does it actually test the spelling and meaning of words and phrases? But as well as face validity, we have content validity. When you analyse a test, does it actually test what it's supposed to? For example, if you wish to test reading or listening alone, it would be unwise to include questions which require written target language answers, since these involve knowledge of writing as well as reading.

(I would add that this presents a challenge. If you decide to separate out the four skills and test each individually, this may end up having an effect on the way you teach, owing to what's called the backwash effect (aka washback). Let's suppose you want to test listening by asking questions in English (so as to avoid using target language writing). The danger is that, in the run-up to the test and in other prior lessons, you use English questions when doing listening work. This, in turn, reduces the amount of target language (comprehensible input) you use in lessons. In this case the backwash effect is negative, since we know that maximising comprehensible input is likely to lead to more acquisition.

On the other hand, if you design assessments to reflect the way you would like to teach, then the backwash effect of tests can be positive. If you believe that teaching is best when skills are integrated in lessons (listening reinforcing reading, reading reinforcing speaking, speaking reinforcing writing, and so on), then a good test would reflect this practice by including mixed-skill assessment, e.g. a TL text with questions in the TL.

This has important implications when we look at GCSE and A-level in England, Wales and Northern Ireland. At GCSE, Ofqual/DfE decided long ago that, in general, the four skills need to be tested separately. Yet they also reduce the validity of the tests by including TL answers in listening and reading papers. At A-level, mixed-skill testing has long been taken for granted.

The backwash effect is very powerful and encourages, for example, schools to use GCSE-style assessment even as early as Y7, which I find very unwise. These issues need to be kept in mind when designing unit tests and end of year exams.

In sum, I would argue for designing tests to resemble well-conceived lesson activities. You then end up with positive backwash and fairness, as students are asked to do well-designed tasks with which they are familiar. The test thus becomes a natural extension of the teaching. Remember too that research clearly shows students do better on tests whose form resembles what they have previously done in class.)

Practicality

When you consider practicality you need to look at factors such as cost, time taken to sit the test, time needed to mark it, the number of people needed to mark it and the ease in interpreting the results.

Practicality is aided when you can reuse tests year on year, or when a course book provides well-designed unit tests. Tests are more practical when they can be quickly marked, e.g. true/false or multi-choice, as opposed to composition or question-answer. But keep in mind that reliability and validity should take precedence over practicality. A multi-choice test is quick to mark, may be reliable to a limited extent, but certainly lacks validity in important areas.

(I would add that one way in which practicality comes in is to do with speaking tests. If you spend too much time on these during the year you risk sacrificing other areas of teaching. One formal oral assessment a year is probably enough, as long as you are assessing oral performance informally the rest of the time. Some might argue that a formal oral assessment is not worth doing at all in the early years since it is stressful and you already have enough information from lessons to assess spoken ability. But my experience is that you need at least one formal spoken assessment a year for the benefit of quieter, less confident learners who are reluctant to perform in normal classroom tasks.)


In conclusion, as Nation and Macalister point out: "Assessment also contributes significantly to the teacher's and learner's sense of achievement in a course and thus is important for motivation. It is often neglected in curriculum design and courses are less effective as a result" (p.120). So I conclude with these few general pointers:
  • There's nothing wrong with testing - it provides opportunities for review and retrieval.
  • Design a regular programme of low-stakes assessments covering all the skills in an integrated way, as far as possible.
  • If you decide to test skills discretely (separately), don't let this have negative backwash on lesson design.
  • Only test what pupils have learned, keeping input comprehensible.
  • Don't spend too long on tests.
  • Make sure tests can be marked quickly.
  • Record results carefully to enable you to have an idea of a student's progress.
  • Make sure you use test results and analysis to help guide your future teaching.
  • Change tests if they are found not to work, e.g. they are too easy, too hard, or unclear.
  • All classroom activities, including tests, involve retrieval practice - if a test is just like a regular classroom task it will be less stressful and produce positive backwash.
  • If the course book tests need adapting, do so, e.g. by adding more repetitions of audio tracks or reading them aloud.
  • When setting and marking end of year exams, spread the workload fairly across the department.
  • If you decide to do regular vocabulary testing, remember what research says about vocabulary acquisition - learning isolated words in written form has severe limitations. "Knowing words" is much more than knowing what they look like and what they mean.
  • Raising the status of tests can have a motivating effect (many pupils learn hard for them), but keep things in proportion!
