Assessing listening skills
In many ways, the consideration of testing and assessing listening ability parallels that of assessing reading. Both are receptive skills and both can be broken down in similar ways. For that reason, you should read this guide after or in conjunction with the guide to assessing reading ability.
The essential difference between the skills is that the listener cannot move backwards and forwards through the text at will but must listen for the data in the order and at the speed at which the speaker chooses to deliver them.
In common with the assessment of reading skills, that of listening skills is, perforce, indirect. When someone speaks or writes, there is a discernible and assessable product. Merely watching people listen often tells us little or nothing about the level of comprehension they are achieving or the skills they are deploying. This accounts for the fact that both listening and speaking skills are often assessed simultaneously. In real life, listening is rarely practised in isolation and the listener's response to what is heard is a reliable way to assess how much has been comprehended.
Rarely, however, does not mean never and there are a number of times when listening is an isolated process. For example, listening to the radio or TV, a lecture or a station announcement are all tasks which allow no interruption or feedback from the listener to gain clarification or ask questions. One can, of course, allow the listener access to a recording which he or she can replay as frequently as is needed to understand a text but, as this cannot be said to represent a common real-life task, we'll exclude it from what follows.
We can test some underlying skills discretely. For example:
- we can test learners' abilities to understand lexical items through, e.g., matching or multiple choice exercises
- we can assess the ability to recognise individual phonemes by, for example, getting learners to match minimal pairs of words to written forms
and so on.
However, before we do any of that, we need to define what listening skills we want to test and why. For more on the subskills of listening, see the guide to understanding listening skills. The following is premised on the assumption that you are familiar with the content of that guide.
The aims of the teaching programme
All assessment starts (or should start) from a consideration of the aims of instruction. With listening skills, as with reading skills, however, it is notoriously difficult to identify specific skills which are linked to specific purposes. An argument can almost always be made that the following are key macro listening skills whatever the setting, whatever the purpose and whatever the topic and text type:
- Listening to locate specific data is required:
- by general listeners to locate items of interest in, e.g., announcements and news programmes
- for academic purposes to locate the part of a lecture or address or programme which focuses on what needs to be learned
- in the workplace to make the identification and absorption of heard data efficient and focused
- Listening to obtain the gist is needed:
- by general listeners who simply want to get the gist of a text and don't need detailed understanding
- by students to judge whether a comment or section of a lecture is relevant to their studies and current concerns
- by busy people in their occupations so they can judge whether something they are hearing is relevant or ignorable in part or whole
- Following directions and instructions is needed:
- by general listeners needing to know what to do or where to go in response to an enquiry which may be as simple as Where's the toilet? or much more complicated
- by students to find what to do, what to read and when to submit work
- by people in the workplace to allow them to follow an instruction and organise their working time
Underlying these three macro skills are a number of micro listening skills without which few texts can be properly understood. These will include, for example:
- Recognising the sounds of English, especially those which are allophones in English but full phonemes in the learners' first language(s) and vice versa
- Identifying lexemes and word boundaries
- Using context and co-text to infer meaning (including visual information)
- Understanding intonation and recognising attitude
- Recognising the communicative functions of utterances: questions, instructions, responses, initiations etc.
Three basic tenets
- We have to use assessment tasks which focus on the kinds of texts the learners will hear in 'the real world'.
- We need to design tasks which accurately show the learners' ability.
- We need to have a reliable way to score the learners' performance.
These three factors are to do with ensuring reliability and
validity. For more on those two concepts, see
the guide to testing, assessment and evaluation.
The rest of this guide assumes basic familiarity with the content of
that guide.
Fulfilling all three criteria adequately requires a little care.
Identifying listening text types
The first step is to find out what sorts of texts the learners
will need to access and what strategies are appropriate for the
purposes of listening. This is by no means an easy undertaking,
especially if the course is one in General English (also known as ENAP
[English for No Apparent Purpose]), when it is almost impossible to
predict what sorts of texts the learners may one day need to access,
and for what purposes (see below for a generic check-list of skills).
On courses for very specific purposes, it is easier to identify the
sorts of texts the learners will encounter and the purposes for which
they will listen to them but there is no related set of subskills
that we can identify with confidence that will allow them easy
access to texts in particular topic areas. We can, however, look at
the types of texts and identify key listening strategies to focus on.
For example:
| Situation | Skills needed |
| --- | --- |
| ANNOUNCEMENTS | Good monitoring skills to decide on relevance (Is this my flight?) and the ability to extract vital data (gate numbers, platforms etc.) |
| LECTURES | Listening for signposting (sequences, itemisation, prioritisation, importance etc.) |
| RADIO AND TV | Gist listening to entertainment to follow a plot; monitoring for relevance in a news broadcast; using visual clues to understand TV programmes |
| INSTRUCTIONS AND DIRECTIONS | Intensive listening for detailed understanding |
| MEETINGS AND SEMINARS | Intensive listening to understand detail and locate relevance; on-going monitoring to identify questions and invitations to comment |
| DIALOGUES | Gist listening to follow a conversation; intensive listening if the listener is a (potential) participant |
When we know the kinds of settings in which our learners will need to operate, we can get on with designing tests which assess how well they are able to deploy the skills they will need.
A general listening-skills check-list
It may be the case that you find yourself teaching General
English rather than English for Specific Purposes. If that is so, you
need a general-purpose check-list of abilities at various levels
against which to assess your learners' abilities to listen.
Here's one:
The abilities and text types are, of course, cumulative. At, e.g., B2 level, a learner should be able to handle everything from A1 to B1 already.
Designing tasks
Now we know what sorts of thing we want to assess, the text types
we are targeting, the purposes of listening, the subskills deployed
and so on, we can get on and
design some assessment procedures.
There are some generic guidelines for all tasks.
If you have followed the guide to testing, assessment and
evaluation (see above), you will know that this is something of a
balancing act because there are three main issues to contend with:
- Reliability:
A reliable test is one which will produce the same result if it is administered again (and again). In other words, it is not affected by the learners' mood, level of tiredness, attitude etc.
This is a challenging area in the case of assessing listening because the skill requires high levels of concentration, especially if more than gist is to be gleaned.
We need to be aware that very long listening tasks will result in fatigue and that may overwhelm learners who are otherwise good listeners. Unless there is a good reason for using a long text (e.g., when preparing people for study in English), a range of short tasks focused as far as possible on micro skills is a better way forward in most circumstances.
Assessment outcomes are often in written form and the listening text itself is often recorded and repeatable, so marking can be quite reliable.
- Validity:
Two questions here:
- Do the texts represent the sorts of texts the learners are likely to encounter?
For example, if we set out to test someone's ability to understand a lecture, we need to ensure that the topic area is valid for them.
On the other hand, if we know that our learners will rarely, if ever, encounter the need to listen to extended monologues from native speakers but will need to understand what they are told in service and informational encounters, then we have to match the texts we use for assessment of their abilities.
- Do we have enough tasks to target all the skills we want to assess?
For example, if we want to test the ability to use context and co-text to infer meaning, do we have a task or tasks focused explicitly and discretely on that skill? If we want to test the ability to monitor a series of announcements for crucial data, do we have a test that requires that skill?
- Practicality:
Against the two main factors, we have to balance practicality.
It may be advisable to set as many different tasks as possible to ensure reliability and to try to measure as many of the subskills as possible in the same assessment procedure to ensure validity but in the real world, time is often limited and concentration spans are not infinite.
Practicality applies to both learners and assessors:
- For learners, the issue is often one of test fatigue. Too many tests over too short a time may result in learners losing commitment to the process. On shorter courses, in particular, testing too much can be perceived as a waste of learning time.
- For the assessors, too many time-consuming tests which need careful assessment and concentration may put an impractical load on time and resources. Assessors may become tired and unreliable.
- The third issue concerns technology. If we know, for example, that our learners will rarely have to understand audio-only, disembodied text, then providing context and clues through the use of video recordings should be considered. Even settings which are heavily text laden (such as lectures) are accompanied by gesture, expression and visual data that cannot be excluded from a valid test of the skills.
Examples may help
Say we have a short (150-hour) course for motivated B2-level
learners who will need to operate comfortably in an English-speaking
culture where they will live and work. They will need,
therefore, to be able to understand a wide and unpredictable range of texts
for a similarly wide range of purposes so we
need to focus our assessment on generic, recognisable listening
skills.
We have around three hours of the course earmarked for this
assessment.
What sorts of items could we include, bearing in mind reliability,
validity and practicality?
Evaluate the following ideas based on these principles.
Get the learners to watch a 20-minute news broadcast and give them a worksheet designed to get them to identify, from a set of six or so, two essential facts about three of the items.
Get the students to listen to a range of 4 short texts, each targeting a different listening skill:
Give the learners either:
Ask the students to summarise, in writing, the similarities and differences between the two texts.
Designing anything in life involves striking a balance between competing priorities.
The kind of evaluative thinking that you have used here can be applied to any kind of assessment procedure, regardless of topic, level and task types.
Other listening-skill assessment task types
It may be that your circumstances allow for very simple
listening
tasks such as those requiring the learners to respond spontaneously
to a set of prepared initiations. This kind of test can be
done in the classroom or a language laboratory or even on the
telephone or video link.
Those are all legitimate
tasks, providing the task type and content suit the purposes of
assessment.
There are other ways.
No list can be complete, but here are some other ideas for ways to set listening tasks for assessment purposes. The content of any task will, of course, depend on all the factors discussed so far.
- Monitoring tasks
The station-announcement task above is an example of this kind of procedure. Longer texts can be used as well, asking the learners, for example, to identify the linguistic signal a speaker uses to signpost a summary of key points. Such tasks can be graded even if the text is ostensibly beyond the learners' level. Just locating a gate number or a name in an otherwise complex and indistinct recording is a good test of the ability to monitor and ignore the unnecessary.
- Compare and contrast tasks
- Matching tasks
Getting people to match a short audio description to a picture (or a series of similar pictures where only one represents the content of the text) is a good test of detailed understanding.
- Multiple-choice tests
These tests can be carefully targeted on particular items in the text to test the ability to listen for detail, infer likely meaning of lexemes, understand tense relationships and so on. They can also be targeted at the ability to listen for gist and identify key words and phrases.
The great disadvantage in terms of listening-skills assessment is that the learners need to hold all the alternatives in their heads while they are also being required to focus on the text itself. Alternatives need to be kept short if cognitive overload is to be avoided.
- Directions and instructions
In these tests, learners may be required to listen and follow instructions. Such tasks, because of their artificiality, have limited uses but they do test intensive listening skills. Popular topics are origami and following directions to locate something. They can be motivating and intriguing tests.
- Labelling tasks
In these tasks, the learners are given a diagram of something fairly complicated and asked to match the descriptions of various labels (A, B, C ...) to the parts of the diagram that the listening text refers to. This is an important academic skill for some learners but of limited utility in other settings.
- Note-taking tasks
In these tasks, the usual procedure is to require learners to take notes as they listen and then, when the address or lecture is over, they are presented with questions to answer on what they have heard and use the notes to respond. Providing there is a level playing field, i.e., that some of the learners are not able to answer the questions without any reference to notes because they are familiar with the topic, this can be a valid and reliable test of the skill.
- Dictation
Dictation wanders in and out of fashion but is still seen as a reliable if not especially authentic way of testing listening ability. The text has to be carefully chosen to be relevant. The problem with this sort of test is that it isn't always clear what's being tested because a good deal depends on the learners' ability to deploy grammatical knowledge and logic to infer what the text should be.
It is a fairly flexible procedure because we can require learners to start with a blank piece of paper or have them fill gaps in a text. The latter can be quite finely targeted.
Measuring outcomes
If you can't measure it, you can't improve it
Unlike writing and speaking tests, in which holistic, impression
marking can be done, listening tests are normally marked analytically.
This involves breaking down the tasks and being specific about
the criteria you are using to judge success. Any amount of
weighting can be applied to whichever of the micro skills you judge
to be most important.
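By way of illustration only, the arithmetic of such weighting might be sketched as follows; the micro-skill names, weights and scores here are invented for the example rather than taken from any prescribed scheme:

```python
# A minimal sketch of weighted analytic marking (illustrative, invented values only).
weights = {"monitoring": 2, "gist": 1, "detail": 1, "inference": 1}      # relative importance of each micro skill
raw_scores = {"monitoring": 0.75, "gist": 0.50, "detail": 0.75, "inference": 1.00}  # proportion correct per task

# Weighted average: skills judged more important contribute more to the final mark.
overall = sum(weights[s] * raw_scores[s] for s in weights) / sum(weights.values())
print(f"Overall score: {overall:.0%}")  # Overall score: 75%
```

However the weighting is expressed, the point is that the marking scheme, not the marker's impression, determines how much each micro skill contributes to the result.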
Normally, the results of a listening test are permanent in some
way (short answers, multiple choice responses and so on). Even
the success of a Directions / Instructions task (above) can be objectively marked.
This means that marking can be objective (and the test is reliable,
to that degree) but, unless the test items target recognisable and
definable micro skills, validity is always problematic.
The summary
| Related guides | |
| --- | --- |
| assessing reading | for the guide |
| assessing speaking | for the guide |
| assessing writing | for the guide |
| assessment in general | for the general guide |
| the in-service skills index | for associated guides |