Consonants

If anything in the first part of this guide is unfamiliar to you, you should probably take a little time to refresh your memory concerning the essential concepts in phonology. You can open that guide in a new tab by clicking here.

Two questions:

Can you define 'consonant'?
What are the consonant sounds of English?

When you produce a sound by completely or partially blocking the flow of air through the vocal tract, you produce a consonant. For example, if you block and then release air through pressing your lips together, you will produce the sound /p/. If you block the back of your mouth by raising your tongue, you will produce /k/.
Here's the list with examples. There are 21 consonant letters in English but 24 basic consonant sounds (excluding any allophones):

/p/	peach	/b/	bang	/t/	top
/d/	do	/k/	cough	/ɡ/	good
/tʃ/	chair	/dʒ/	jumper	/f/	food
/v/	value	/θ/	path	/ð/	the
/s/	sack	/z/	zoo	/ʃ/	sugar
/ʒ/	leisure	/h/	happy	/m/	man
/n/	nice	/ŋ/	ring	/l/	love
/r/	roll	/j/	yacht	/w/	war

Only 7 of the 24 sounds need a special symbol to represent them. It is quite a simple matter to learn how to read and write the phonemic script for consonant sounds. The only ones which differ from the letters of the Latin alphabet are:

/ʃ/ which is represented by the letters sh in, e.g., show (/ʃəʊ/)
/tʃ/ also appears in the transcription of the letters ch in, e.g., chuck (/tʃʌk/), i.e. /t/ combined with /ʃ/
/ʒ/ which is represented by the letter s in, e.g., measure (/ˈme.ʒə/)
/dʒ/ also appears in the transcription of the letters dg in, e.g., badger (/ˈbæ.dʒə/), i.e. /d/ combined with /ʒ/
/θ/ which is represented by the letters th in, e.g., thank (/θæŋk/)
/ð/ which is represented by the letters th in, e.g., this (/ðɪs/)
/ŋ/ which is represented by the letters ng in, e.g., sang (/sæŋ/)

You will need to learn these seven to follow the rest of this guide easily.

There are some other things to note:

Both the sounds represented by 'th' are considered full phonemes in English. However, it is rare to find a loss of meaning if /θ/ is replaced with /ð/ and vice versa (although it can happen: consider the noun teeth and the verb to teethe). These sounds do, however, appear in minimal pairs with other sounds in the list, notably /s/, /z/ and /f/ for many learners. Because the sounds are not common across languages, learners will often simply substitute more familiar sounds for them, pronouncing, for example:
think as fink, sink or zinc (i.e., substituting /fɪŋk/, /sɪŋk/ or /zɪŋk/ for /θɪŋk/).
Two letters (not phonemes) act as both consonants and vowels:
W is a consonant in will but a vowel in how. (The transcription is /wɪl/ vs. /haʊ/.) The sound some speakers produce for the 'wh' in which, for example, sounding the /h/, is sometimes transcribed as /hw/ or /ʍ/ (/hwɪtʃ/ or /ʍɪtʃ/ rather than /wɪtʃ/).
Y is a consonant at the beginning of yesterday, transcribed as /j/, but a vowel at the end in /ˈjest.əd.i/ or /ˈjest.ədeɪ/. It is also a vowel at the end of potty, transcribed as /i/ in /ˈpɒ.ti/.
There are some important allophones of some of the sounds (for more, see below).
- /p/ is aspirated to /pʰ/ and /t/ produced as /tʰ/ in some circumstances: compare the sounds in top, pat, spin and pin. If you hold a thin piece of paper in front of your mouth when saying the words, it will move only (or more) on the aspirated sounds. The same considerations apply to /k/ and /kʰ/ (say cake, pack, luck, cackle, bicker).
- /l/ also appears as [ɫ] (the so-called 'dark l'). Compare the two pronunciations in lull (light for the first, dark for the second). Feel what the tip of the tongue is doing.
  The dark 'l' is velarised with the tongue much further back in the mouth whereas the light version is pronounced with the tongue tip touching the alveolar ridge behind the top teeth.
  The /l/ is velarised when it is in the final position in a word or when it comes immediately before a consonant.
What are phonemes in English are, if they exist at all, often allophones in other languages and vice versa. Turkish, for example, has dark and light /l/ sounds as full phonemes, Mandarin does the same with /t/ and /tʰ/ and most varieties of Arabic consider /p/ and /b/ to be the same phoneme.
Hearing and correctly identifying consonants is very important. You can remove vowels (e.g., in a txt msg) but taking out the consonants produces nonsense. You can try it for yourself:
Can you understand /tr t f jslf/ with the vowels removed?
Can you understand /aɪ ɪ ə ɔː e/ with the consonants removed?
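The experiment can also be run mechanically. What follows is only a minimal sketch: it assumes the sentence is try it for yourself transcribed as /traɪ ɪt fə jɔːself/, and the VOWEL_SYMBOLS set and both function names are illustrative assumptions rather than anything defined in this guide.

```python
# Vowel symbols (plus the length mark) needed for the sample sentence.
# Assumption: this is an illustrative set, not a complete vowel inventory.
VOWEL_SYMBOLS = set("aeiɪʊəɔɒʌæɜɑuː")


def strip_vowels(transcription: str) -> str:
    """Return the transcription with every vowel symbol removed."""
    return "".join(ch for ch in transcription if ch not in VOWEL_SYMBOLS)


def strip_consonants(transcription: str) -> str:
    """Return the transcription keeping only vowel symbols and spaces."""
    return "".join(
        ch for ch in transcription if ch in VOWEL_SYMBOLS or ch == " "
    )


sample = "traɪ ɪt fə jɔːself"  # assumed transcription of 'try it for yourself'
print(strip_vowels(sample))      # the consonant skeleton
print(strip_consonants(sample))  # the vowels alone
```

The consonant skeleton is usually still recognisable as the original sentence; the string of vowels is not.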
One consonant often heard in regional varieties of English (such as Scots, Welsh and South African) is represented by /x/ and is the sound made in Scots of the 'ch' at the end of loch and elsewhere. It is a voiceless velar fricative. In Welsh, the sound is represented by 'ch' and is slightly differently pronounced, with the place of articulation further back in the mouth, but the phonemic (rather than phonetic) symbol is the same.

Now we can go on to /ˈklæ.sɪ.faɪ.ɪŋ ˈkɒn.sə.nənts/.

Classifying consonants

There are three areas to consider when classifying consonant sounds:

Voicing
Place of articulation
Manner of articulation

Voicing

Voicing describes how phonemes may be different depending on whether the vocal cords vibrate or not at the time of pronunciation. It is sometimes referred to as sonorisation.
For example, the /k/ sound is made without voicing but the /ɡ/ sound is made with the mouth parts in the same place but with voice added. If you put your hand on your throat and say the words sue and zoo, you will see what is meant and feel a slight vibration on the second word (/s/ is unvoiced but /z/ is voiced).
The same phenomenon is noticeable when saying log vs. lock (although the voicing of the /ɡ/ in the first is less obvious).

Sixteen of the consonant phonemes form voiced / unvoiced pairings:

Unvoiced	Voiced	Minimal pairs
/p/	/b/	pat vs. bat
/tʃ/	/dʒ/	chin vs. gin
/f/	/v/	fan vs. van
/s/	/z/	sip vs. zip
/k/	/ɡ/	cut vs. gut
/t/	/d/	tab vs. dab
/θ/	/ð/	loath vs. loathe
/ʃ/	/ʒ/	leash on vs. lesion
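The pairings in the table can be treated as a small lookup. The sketch below is illustrative only: it assumes words are supplied as lists of phoneme symbols, and the names VOICING_PAIRS and differ_only_in_voicing are hypothetical, chosen for this example.

```python
# The eight unvoiced/voiced pairings from the table, keyed by the unvoiced member.
VOICING_PAIRS = {
    "p": "b", "tʃ": "dʒ", "f": "v", "s": "z",
    "k": "ɡ", "t": "d", "θ": "ð", "ʃ": "ʒ",
}


def differ_only_in_voicing(word_a: list[str], word_b: list[str]) -> bool:
    """True if the two phoneme lists differ in exactly one position and
    the two differing phonemes form one of the voiced/unvoiced pairings."""
    if len(word_a) != len(word_b):
        return False
    differences = [(a, b) for a, b in zip(word_a, word_b) if a != b]
    if len(differences) != 1:
        return False
    a, b = differences[0]
    return VOICING_PAIRS.get(a) == b or VOICING_PAIRS.get(b) == a


print(differ_only_in_voicing(["p", "æ", "t"], ["b", "æ", "t"]))  # pat vs. bat
print(differ_only_in_voicing(["p", "æ", "t"], ["m", "æ", "t"]))  # pat vs. mat
```

The first comparison is a minimal pair distinguished by voicing alone; the second is a minimal pair, but not one based on voicing.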

Voicing is not a digital, on-off phenomenon; it exists on a cline from fully voiced to fully unvoiced. In some circumstances, the consonants normally considered voiced are only partially voiced and, more rarely and in very rapid speech, not voiced at all.
In initial and final positions, as in words like had, sob, dig, do, be and go, the consonants /d/, /b/ and /ɡ/ are only partially voiced but in the mid-position, as in words like ladder, rubber and bigger, voicing is more pronounced.
This variation in the level of voicing has led some to use two different terms for the phenomenon:

fortis (meaning strong) which alludes to the fact that unvoiced consonants are, allegedly, pronounced with more energy. The consonants /p/, /t/ and /k/ are described as fortis consonants.
lenis (meaning weak) alludes to the opposite phenomenon of the consonants /b/, /d/ and /ɡ/ which are variable in the amount of voicing they take and often produced with little force.

What this implies is that phonemes are a way of digitalising the information. Although a sound may, in fact, be very variably pronounced (shouted, whispered, mumbled etc.) and may be affected by its environment vis-à-vis other sounds, it will still be instantly recognisable by a native speaker of a language. Phonemes are, in other words, sets of allophones, not simple sounds.

There are two other things to know about any consonant:

Where is it pronounced? This is called place of articulation
How is it pronounced? This is called manner of articulation

Place of articulation

To figure this out, we need to do a bit of physiology to get the terms right. As you read the following guide, move your tongue around to identify the parts we are talking about. Technically, the various parts you identify are called articulators.

Start at the front of your mouth, where it meets the outside world and you have found your lips. Sounds which require the use of your lips are called labial. Sounds which require both lips are called bilabial. An example is the /m/ sound in member.
Behind your lips are your teeth and sounds produced here are, unsurprisingly, called dental. An example is the th (/ð/) sound in that.
Behind your top front teeth, there is a bony ridge called the alveolar ridge and sounds produced here are called alveolar. An example is the /t/ sound in teeth.
Behind that, the roof of the mouth has two sections:
1. the hard palate (where palatal sounds are made) to the front. An example is the sh (/ʃ/) sound in ship.
2. the soft palate or velum to the rear (where we make velar sounds). An example is the /k/ sound in cake.
Your tongue can reach no further but pause to note that the tongue has three areas: the tip, the front and the back.
At the back of your mouth is a teardrop-shaped fleshy part called the uvula. It is, unsurprisingly, where uvular sounds are made but there are no uvular consonants in standard varieties of English.
Right at the back of the mouth is the glottis where we make glottal sounds. The only true glottal in English is the /h/ in, for example, horse. In rapid speech and some varieties of English, however, there is also the glottal stop (/ʔ/) that appears when a consonant is dropped as in, e.g., the Scots and Southern British pronunciation of better as be'er (/ˈbe.ʔə/ rather than /ˈbe.tə/).
The nasal cavity is connected to the mouth and is involved in nasal sounds. An example is the sound of ng (/ŋ/) in swing.

Here's a picture:

vocal tract

A copy of that diagram is available. Download it here.

Now pronounce some consonants and see if you can identify which parts of the mouth are involved in making the sounds. Can you put the following sounds in the table?
/s/ as in seem
/t/ as in tent
/f/ as in fine
/ɡ/ as in gone
/θ/ as in think
/l/ as in link
/ŋ/ as in sing
/w/ as in went
/h/ as in happy
/p/ as in pin
/ʃ/ as in shine
In the third column, put in your best guess at the adjective for the type of sound.
You can download a printable version of this and the next activity here.

Click on the table to get the right answer.

place of articulation table

If you would like to see the place of articulation in a diagram, here it is:
consonants place

The diagram and all that we have discussed so far is only the first dimension of the sounds of consonants.
Now we can describe where the sounds are pronounced but we still need a way to distinguish between them.
For example, /s/ and /t/, /d/ and /z/ are all alveolar sounds but they are very different so we need to understand how they are pronounced: the manner of articulation.

Manner of articulation

There is, unfortunately, no universally recognised system to describe how sounds are produced. However, English sounds are all produced pulmonically (i.e., by expelling air) and by restricting the airflow in some way.

Stops or plosives

These sounds are produced by completely blocking the air flow and then releasing the blockage. For example, to produce a /p/ sound, we close both lips, let a little breath build up and then release it by opening the lips. These sounds can't be made continuously.
There are four phases to their production:

the articulators are closed (e.g., the lips are pressed together for /p/)
the air behind the articulators is compressed
the articulators are moved apart to allow the air to be released
the air, once released, often makes an audible sound or aspiration. That is the difference between the sound of the 't' and 'p' in top [/tʰɒp/] and pat [/pʰæt/].

English has seven plosive consonants: /p/, /b/, /t/, /d/, /k/, /ɡ/ and /ʔ/. The last of these is called the glottal plosive and is often an alternative to /p/, /t/ and /k/.

/p/ and /b/ are bilabial, formed by both lips, and the second is voiced. For example:
paper bill /ˈpeɪ.pə.bɪl/
/t/ and /d/ are alveolar, formed by the tongue pressing against the alveolar ridge (not the teeth), and the second is voiced. For example:
train delay /treɪn.dɪ.ˈleɪ/
/k/ and /ɡ/ are velar, formed by the back of the tongue pressing against the juncture of the hard and soft palate, and the second is voiced. For example:
cream gateau /kriːm.ˈɡæ.təʊ/
The glottal plosive (/ʔ/) is also known as a glottal stop (because the airflow is entirely blocked) and is voiceless. It has to be voiceless because it is formed by compressing the vocal tract entirely and holding the vocal folds rigid. It occurs in many words, often replacing a plosive as in the London and Scots pronunciation of butter which may be transcribed as /ˈbʌʔ.ə/ with the /t/ plosive replaced by the glottal.
It is not always a signal of non-standard speech patterns as, in rapid speech, the stop is commonly used.

Nasals

To make these sounds, we close off the airflow through the mouth (as we do for plosives) but allow the air to flow out through the nasal cavity instead.
There are three nasal consonants in English:

/m/ – as in
map /mæp/
ham /hæm/
lamb /læm/
milk /mɪlk/
In these, the oral passage is blocked by closing both lips so the place of articulation is described as bilabial.
This consonant causes few problems for most learners.
/n/ – as in
nut /nʌt/
bun /bʌn/
nil /nɪl/
can /kæn/
In these, the oral passage is blocked by pressing the tip of the tongue against the alveolar ridge (just behind the teeth) so the place of articulation is described as alveolar.
This consonant causes few problems for most learners.
/ŋ/ – as in
doing /ˈduːɪŋ/
bringing /ˈbrɪŋɪŋ/
sing /sɪŋ/
ping /pɪŋ/
In these, the oral passage is blocked by raising the back of the tongue to contact the velum at the back of the mouth, forcing the air through the nose so the place of articulation is described as velar.
This consonant can cause difficulties because it is quite unusual.
It never occurs initially in English.
It occurs frequently in mid-position but is only pronounced as /ŋ/ alone when the morphology of the word allows it. It is pronounced /ŋ/ in bringer [/ˈbrɪŋ.ə/] because the word is formed from bring + er but it is not pronounced that way in finger [/ˈfɪŋ.ɡə/] because the word is morphologically different, and not formed from fing + er.
In other words, when it occurs at the end of a morpheme 'ng' is pronounced as /ŋ/ but in other circumstances, 'ng' is pronounced /ŋɡ/.
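The morpheme rule can be expressed as a short function. This is a rough sketch under stated assumptions: the word arrives already split into morphemes (the function does no morphological analysis itself), the name ng_pronunciations is hypothetical, and spellings in which 'ng' stands for something else entirely (as in danger) are ignored.

```python
def ng_pronunciations(morphemes: list[str]) -> list[str]:
    """For each spelled 'ng' in a word (given as a list of morphemes),
    return 'ŋ' if it ends a morpheme and 'ŋɡ' otherwise.

    Assumption: the caller supplies the morpheme boundaries.
    """
    sounds = []
    for morpheme in morphemes:
        start = morpheme.find("ng")
        while start != -1:
            at_morpheme_end = start + 2 == len(morpheme)
            sounds.append("ŋ" if at_morpheme_end else "ŋɡ")
            start = morpheme.find("ng", start + 2)
    return sounds


print(ng_pronunciations(["bring", "er"]))  # bring + er
print(ng_pronunciations(["finger"]))       # a single morpheme
```

Splitting bringer as bring + er puts 'ng' at a morpheme boundary, whereas finger is one morpheme with 'ng' in the middle, which is exactly the distinction the rule relies on.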

Fricatives

To make these sounds, the air flow is not completely cut but is restricted with air flowing continuously and turbulently between two mouth parts. What you hear is the result of friction, hence the name. The term sibilant is used to refer to the sounds such as /s/ and /z/ which are produced by allowing the air to flow across the tip of the tongue between it and the alveolar ridge.
To demonstrate to yourself, make a /t/ sound by completely blocking and then releasing the air and then make the /s/ sound by allowing air to seep out between articulators.
The nine fricatives in English are:

labiodental fricatives
/f/ and /v/ formed by the lips and top teeth. The second is voiced. For example:
fine /faɪn/
vine /vaɪn/
dental fricatives
/θ/ and /ð/ formed by the tongue touching the teeth. The second is voiced. For example:
breath /breθ/
breathe /briːð/
alveolar fricatives
/s/ and /z/ formed as sibilants with the air compressed between the tongue and the alveolar ridge. The second is voiced. For example:
bus /bʌs/
buzz /bʌz/
palatal or post-alveolar fricatives
/ʃ/ and /ʒ/ formed by the tongue compressing the air slightly further back on the palate or just behind the alveolar ridge. The second is voiced. For example:
mesh /meʃ/
measure /ˈme.ʒə/
glottal fricative
/h/ formed by air compressed in the glottis at the back of the throat. For example:
household /ˈhaʊs.həʊld/
hope /həʊp/
It is not voiced in English but a voiced equivalent exists in some languages, including Basque, Chinese, Czech, Finnish, Korean, Polish, Portuguese, Romanian, Slovak and Slovene. The sound is usually transcribed as [ɦ] and occurs, incidentally, in some South African speakers' production. Speakers of those languages may be tempted to insert it into English and, although this rarely causes comprehension issues, it contributes to a foreign accent.
In English, the airflow is only very minimally disrupted when forming this sound which leads some to assert that it isn't really a consonant at all.
velar and uvular fricatives
The fricatives /x/ and /ɣ/ are not usually included in a list of standard English consonants but appear in words of Scots and Welsh origin and in many German words. They also occur in South African varieties where the words are borrowed from Afrikaans or Xhosa.
The /x/ is voiceless and occurs in the Scots word:
loch /lɒx/
English speakers will often substitute /k/.
Other languages (such as Dutch) have a voiced version of the /x/ which is transcribed as /ɣ/ and is often spelt as 'g', for example, in:
's-Hertogenbosch /ˌsɛrtoʊɣə(m)ˈbɔs/
There are no uvular consonants in standard varieties of English.

Affricatives

These are formed as a combination of a plosive and a fricative. First there is closure of the airflow but release is allowed in a restricted way, extending the sounds. There are two affricative sounds in English, /tʃ/ and /dʒ/, and both are described as palatal or post-alveolar, being formed with the tongue obstructing the air flow further back than the alveolar ridge (where /t/ and /d/ are formed). The first of these is unvoiced and the second voiced:

/tʃ/ in, for example
chop /tʃɒp/
/dʒ/ in, for example
bridge /brɪdʒ/

Approximants

These sounds are all voiced and are produced by small obstructions of the airflow. They are formed by bringing certain mouth parts quite close together without letting them touch, hence the name.
There are four of these in English and the first two are often referred to as glides or semi-vowels while the second two are referred to as liquid sounds:

velar /w/ with the back of the tongue slightly raised towards the velum as in
would wait /wʊd.weɪt/
palatal /j/ with the tongue raised towards (but not very close to) the palate as in
yellow yacht /ˈje.ləʊ.jɒt/
/l/ which is sometimes placed in a class of its own as the only lateral in English. In this, the sound is formed by using the tongue to stop air moving directly forward and out and forcing it to run along the side of the tongue. For example
lullaby /ˈlʌ.lə.baɪ/
/r/ which is the only rhotic sound in English formed with a palatal airflow rather than a lateral flow of air. For example
real rarity /rɪəl.ˈreə.rɪ.ti/

Two more distinctions

One, rather simple way to divide consonant sounds is to refer to two overarching categories:

Obstruents
are sounds made by obstructing the airflow completely or partially and include
- stops or plosives (such as /b/ and /p/)
- fricatives (such as /f/, /v/, /ʃ/ and /ʒ/)
- affricatives such as /tʃ/
Sonorants
are sounds made with continuous, non-turbulent airflow (and include all vowels by some definitions) and include
- nasals such as /m/ and /n/
- lateral (/l/) (a liquid sound)
- rhotic (/r/) (another liquid sound)
- glides (/w/ and /j/)
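The two-way split amounts to a mapping from manner of articulation to a broader class. The sketch below illustrates that idea only: the dictionary covers just the example phonemes named in the lists above, and the names MANNER_OF and broad_class are hypothetical.

```python
# Manners of articulation grouped into the two overarching categories.
OBSTRUENT_MANNERS = {"plosive", "fricative", "affricative"}
SONORANT_MANNERS = {"nasal", "lateral", "rhotic", "glide"}

# Only the example phonemes mentioned in the lists are included here.
MANNER_OF = {
    "b": "plosive", "p": "plosive",
    "f": "fricative", "v": "fricative", "ʃ": "fricative", "ʒ": "fricative",
    "tʃ": "affricative",
    "m": "nasal", "n": "nasal",
    "l": "lateral", "r": "rhotic",
    "w": "glide", "j": "glide",
}


def broad_class(phoneme: str) -> str:
    """Return 'obstruent' or 'sonorant' for a phoneme listed in MANNER_OF."""
    manner = MANNER_OF[phoneme]  # raises KeyError for unlisted phonemes
    if manner in OBSTRUENT_MANNERS:
        return "obstruent"
    if manner in SONORANT_MANNERS:
        return "sonorant"
    raise ValueError(f"unclassified manner: {manner}")


for symbol in ("p", "tʃ", "m", "l"):
    print(symbol, broad_class(symbol))
```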

There are some other ways to make sounds and languages are quite inventive. These include trills (the Spanish rolled /r/), in which the tongue vibrates, and flaps (for example, the 'dd' sound in madder in US English), in which the airflow is momentarily interrupted. Some African languages make extensive use of click sounds which occur in English in expressions such as tsk tsk and also when people try to imitate the sound of horses' hooves (clip clop). Transcription varies because there are at least five ways to make the sounds.

Retroflex sounds

Retroflex sounds are formed in many languages with the tongue concave and/or curled back on itself to block the air flow, like this:

(Image adapted from Wikipedia)

For example:
    Russian and Polish have a retroflex /z/, transcribed as [ʐ].
    Hindi and other Indian languages have a retroflex /t/ transcribed as [ʈ].
    Swedish has both a retroflex /n/ transcribed as [ɳ] and a retroflex /d/ transcribed as [ɖ].
    Chinese languages have a retroflex /s/ transcribed as [ʂ].
If speakers of these languages import the retroflex sounds into English, it contributes greatly to a foreign accent.
It is usually helpful to make learners aware of the differences.

Markedness and phonemic substitution

Markedness in this sense refers to how widely consonants are represented in the world's languages. That, it is sometimes averred, is a measure of how hard they are to acquire. The common sounds will give few problems but consonants which are not represented in the learners' first language(s) will, understandably, cause significant problems.

There is evidence to suggest that the unvoiced consonant sounds, especially, /t/, /s/, /p/, and /k/ are common to nearly all languages and are, therefore, considered unmarked. They should cause few learners any trouble at all except in terms of their allophonic varieties (with and without aspiration, retroflex or not).
The consonant /n/ is also an unmarked form which appears in many languages.
On the other hand, the equivalent voiced sounds, /d/, /z/, /b/ and /ɡ/ are marked in that they do not universally occur with anything like the same frequency so they require more attention, as does the nasal /ŋ/ which is also less common and causes some learners a good deal of difficulty.

Where a sound may occur also plays a role. Final voiced consonants are rare in many languages, including German and Dutch, for example and this may tempt learners of those backgrounds to pronounce dog as dock, cab as cap, cadge as catch and so on.
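The substitution described here can be modelled as a simple rule applied to the last phoneme of a word. The sketch assumes words are given as lists of phoneme symbols; DEVOICED and devoice_final are hypothetical names, and the rule is deliberately cruder than what real learners do.

```python
# Voiced obstruents mapped to their unvoiced counterparts.
DEVOICED = {
    "b": "p", "d": "t", "ɡ": "k", "dʒ": "tʃ",
    "v": "f", "z": "s", "ð": "θ", "ʒ": "ʃ",
}


def devoice_final(phonemes: list[str]) -> list[str]:
    """Return a copy of the word with a final voiced obstruent devoiced."""
    if phonemes and phonemes[-1] in DEVOICED:
        return phonemes[:-1] + [DEVOICED[phonemes[-1]]]
    return list(phonemes)


print(devoice_final(["d", "ɒ", "ɡ"]))   # dog
print(devoice_final(["k", "æ", "b"]))   # cab
print(devoice_final(["k", "æ", "dʒ"]))  # cadge
```

Applied to dog, cab and cadge, the rule yields the phoneme sequences of dock, cap and catch, which is why the error can cost meaning rather than merely sounding foreign.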

There is a guide on this site to teaching troublesome sounds (new tab) which considers many of the more marked, i.e., less common, vowel sounds.

Allophones, reductions and regional variations

individuals vary

No two speakers pronounce all consonants in exactly the same way. Individual speakers will also pronounce some consonants slightly differently depending on how they feel, how carefully they wish to speak and how quickly. So, for example, we might pronounce take in:
I want to take it home
as /teɪk/
with no aspiration on the /t/ sound and on another occasion, we might pronounce the word in
Take it!
as /tʰeɪk/
with the aspiration on the /t/ prominent.
/t/, /p/ and /k/ are all variously aspirated depending on the phonological environment in which they occur and the speaker's attitude.
Voicing, too, is variable with some individuals using more (because there is a cline from unvoiced to voiced, not an either-or distinction). So, for example, some speakers may pronounce pub as /pʌb/ with a clearly voiced final consonant but others may reduce the amount of voicing until the word approximates to /pʌp/. Some may even remove the final consonant and substitute a glottal stop as in /pʌʔ/. No-one will mistake the word, however it is pronounced, so we are dealing with allophonic variation.

the positions of consonants vary

Where a consonant occurs in a word may also affect how it is pronounced. For example:

/b/, /d/, /dʒ/ and /ɡ/ which are all voiced in most transcriptions may become wholly or partially de-voiced when they fall at the end of a word or phrase so, for example
    It's my job
may be transcribed as
    /ɪts.maɪ.dʒɒp/ or /ɪts.maɪ.dʒɒb/
    I'll be the judge
may be transcribed as
    /aɪl.bi.ðə.dʒʌtʃ/ or /aɪl.bi.ðə.dʒʌdʒ/
The /ɡ/ sound is clearly voiced in, e.g.
    bigger
but much less so in
    big
so the first may be transcribed as
    /ˈbɪ.ɡə/
and the second is nearer to (but not identical with)
    /bɪk/
and so on.
The /l/ sound also exhibits variations in what is called velarization (the degree to which the back of the tongue is raised towards the velum at the back of the mouth). So for, e.g.:
    Let me go
the transcription would be
    /let.miː.ɡəʊ/
but the transcription of
    Let me fall
will be:
    /let.miː.fɔːɫ/
with a velarized final consonant (the so-called dark [ɫ]).
In standard BrE, the sound is light (/l/) before a vowel and dark elsewhere but that disguises changes in connected speech because the sound will be light in pull it (/pʊl.ɪt/) but dark in pull that (/pʊɫ.ðæt/). The transcription may safely be left as /l/ in all cases because it is simpler and we have a rule for the pronunciation of the allophones.
Most native speakers of English are unaware of the two pronunciations of /l/ because they make no phonemic difference. A Turkish speaker, in whose language the sounds are phonemes, will be much more aware of the distinction having been trained since childhood to recognise it.
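The rule for the two allophones (light before a vowel, dark elsewhere, including across word boundaries in connected speech) can be sketched as follows. The function name mark_dark_l and the VOWELS and SEPARATORS sets are assumptions made for this illustration; the vowel set covers only the symbols used in this guide's examples.

```python
# Characters that begin a vowel or diphthong in the guide's transcriptions.
VOWELS = set("aeiouɪʊəɒʌæɜɑɔ")
# Marks that separate syllables or words and are not sounds themselves.
SEPARATORS = set(".ˈˌ ")


def mark_dark_l(transcription: str) -> str:
    """Replace /l/ with [ɫ] wherever the next sound is not a vowel."""
    out = []
    for i, ch in enumerate(transcription):
        if ch == "l":
            # Look past syllable dots, stress marks and spaces for the next sound.
            following = next(
                (c for c in transcription[i + 1:] if c not in SEPARATORS), None
            )
            if following not in VOWELS:
                ch = "ɫ"
        out.append(ch)
    return "".join(out)


print(mark_dark_l("pʊl.ɪt"))        # pull it
print(mark_dark_l("pʊl.ðæt"))       # pull that
print(mark_dark_l("let.miː.fɔːl"))  # let me fall
```

Because the allophone is fully predictable in this way, a phonemic transcription loses nothing by writing /l/ throughout.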
The /t/ sound often becomes glottalized when it occurs finally. In other words, it is replaced by the stop /ʔ/. So the transcription of
    I got it
is either
    /ˈaɪ.ˈɡɒt.ɪt/
or
    /ˈaɪ.ˈɡɒt.ɪʔ/
or, even
    /ˈaɪ.ˈɡɒʔ.ɪʔ/
The amount of aspiration is also dependent on the position of the consonant vis-à-vis other sounds. We saw above that this aspiration affects /t/, /k/ and /p/ in particular. When these sounds are the first in a word or the first in a stressed syllable, they are aspirated so the sounds followed by the elevated /ʰ/ in the following will be aspirated:
    Peter /ˈpʰiː.tə/
    tap /tʰæp/
    kill /kʰɪl/
but will remain unaspirated in these:
    couple /ˈkʌp.l̩/
    hate /heɪt/
    sicken /ˈsɪkən/
Because the sounds are not full phonemes in English, most speakers are unaware of the differences in pronunciation and may be surprised to discover it but to speakers of languages (such as Mandarin) where aspiration is a phonemic characteristic, the change in pronunciation will be very obvious because they have been brought up to recognise it.
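The positional rule (aspirate /p/, /t/ and /k/ when they begin a word or a stressed syllable) can be sketched in a few lines. The sketch assumes syllables are separated by dots and primary stress is marked with ˈ, as in this guide's transcriptions; the function name add_aspiration is hypothetical, and secondary stress is not handled.

```python
def add_aspiration(transcription: str) -> str:
    """Insert ʰ after /p/, /t/ or /k/ at the start of a word or of a
    syllable carrying primary stress (syllables are dot-separated)."""
    result = []
    for index, syllable in enumerate(transcription.split(".")):
        stressed = syllable.startswith("ˈ")
        body = syllable[1:] if stressed else syllable
        word_initial = index == 0
        if (
            body
            and body[0] in ("p", "t", "k")
            and not body.startswith("tʃ")  # leave the affricative alone
            and (stressed or word_initial)
        ):
            body = body[0] + "ʰ" + body[1:]
        result.append(("ˈ" if stressed else "") + body)
    return ".".join(result)


for word in ("ˈpiː.tə", "tæp", "kɪl", "spɪn", "heɪt", "ˈsɪk.ən"):
    print(word, "->", add_aspiration(word))
```

In spin the /p/ is not the first sound in the word, and in sicken the /k/ closes the stressed syllable instead of opening it, so neither is marked.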
(In fact the phoneme /t/ has six possible pronunciations in English:
At the end of a hat it is called an unreleased /t/ and transcribed phonetically as [t̚].
At the beginning of task it is aspirated [tʰ].
It may be glottalised in, e.g., butter and got [ʔ].
It may be flapped as in the AmE later [ɾ].
It may be nasalised and flapped as in the AmE counter [ɾ̃] (because it is following a nasal consonant /n/).
It may just be a plain [t] sound as in stitch.)

reductions and elisions of consonants and clusters

When consonants occur in clusters such as at the end of a word like clothes (/kləʊðz/) there is a tendency in English to elide one of the consonants so the pronunciation is often as /kləʊz/ with the elision of the /ð/. (If learners always say it that way, they will never be misunderstood and it's a good deal easier for them.)
Some clusters such as the one at the end of sixths, are simply difficult to pronounce. The result is usually something like /sɪkθs/ or even /sɪkfs/. Learners whose languages do not allow the same clusters as English are often tempted to use cluster reduction inappropriately, for example, pronouncing crisps as /krɪps/ rather than /krɪsps/.
It is usually /t/, /d/, /p/ and /k/ which are elided in this respect, so, for example:
    text message becomes /teks.ˈme.sɪdʒ/
    midst becomes /mɪst/
    glimpse becomes /ɡlɪms/
    and asked can be pronounced /ˈɑːst/.
The same phenomenon is observable with the unvoiced /θ/ sound so asthma is pronounced as /ˈæ.smə/.
Occasionally, elision can become fixed in the language so, for example, the confection now known as ice cream was originally iced cream but the /t/ sound of the letter 'd' was routinely elided and the phrase took on its current spelling.

accents vary

Where people come from may also have a significant effect. In some parts of Britain, for example, a final letter 'r' will be pronounced quite obviously so, e.g.
    My father is
will be pronounced as
    /maɪ.ˈfɑːð.ə.rɪz/
by lots of people because the /r/ precedes the vowel, but many people will pronounce it as
    /maɪ.ˈfɑːð.ə.ɪz/
without the /r/ sound. However, even those who do pronounce the /r/ would not pronounce
    He is my father
as
    /hi.z.maɪ.ˈfɑːð.ər/
preferring
    /hi.z.maɪ.ˈfɑːð.ə/
because there is no following vowel.
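The behaviour of the final /r/ in these British examples depends only on what follows it, which makes it easy to sketch. The assumptions here are that the word is transcribed with its potential final /r/ written out and that the next word (if any) is supplied by the caller; the name realise_final_r is hypothetical.

```python
from typing import Optional

VOWELS = set("aeiouɪʊəɒʌæɜɑɔ")


def realise_final_r(word: str, next_word: Optional[str]) -> str:
    """Keep a word-final /r/ only when the next word begins with a vowel.

    `word` is a transcription written with its potential final /r/;
    `next_word` is the following word's transcription, or None at a pause.
    """
    if not word.endswith("r"):
        return word
    # First real sound of the next word, ignoring stress marks and dots.
    first_sound = next(
        (c for c in (next_word or "") if c not in "ˈˌ."), None
    )
    return word if first_sound in VOWELS else word[:-1]


print(realise_final_r("ˈfɑːð.ər", "ɪz"))  # my father is
print(realise_final_r("ˈfɑːð.ər", None))  # he is my father
```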
In Standard AmE, the /r/ is usually produced so the transcription is
    /hi.z.ˈmaɪ.ˈfɑːð.r̩/
with a syllabic /r/ as the final consonant and no preceding schwa.
Alternatively, the transcription appends a tiny /r/ to the vowel so we have, e.g., nurse transcribed not as /nɜːs/ but as /nɝːs/.
Another significant difference between Standard American and British is the pronunciation of the letter 't' when it occurs in the middle of words so, for example, we find:

Word	British	American
butter	/ˈbʌt.ə/	/ˈbʌd.r̩/
Peter	/ˈpiː.tə/	/ˈpiː.dər/

There are a few other significant (and some not very significant) variations in how consonants are pronounced between BrE and AmE. For a list of the differences, see either the guide to teaching yourself to transcribe or download the PDF document for this area. (Both those links open in new tabs.)
A regional difference in parts of Britain is that the central /t/ sound may be replaced by a glottal stop (/ˈbʌʔ.ə/ and /ˈpiː.ʔə/, respectively).

/hw/ vs. /w/

Now almost extinct, except in some varieties of English spoken in Scotland, parts of Ireland and the southern United States, is a variant of /w/ usually transcribed as /hw/ (or you may see it as [ʍ]). It appears at the beginning of words spelled wh- but has for almost all speakers of English now merged with /w/. The result is that, apart from a small minority of speakers, there is no distinction in pronunciation between weather and whether, wine and whine etc. The merger is generally called the whine-wine merger.

A summary and test

Now we have all three ways to classify the consonants and can describe them properly. These three ways are:

Voicing
Place of articulation
Manner of articulation

Can you complete this chart? If you have your downloaded and printed activity sheet to hand, do it there. If you would like to download that now, click here. When you have filled in all the consonant sounds, click on the chart to reveal the answer.

consonants

The voiced consonants are in bold.
Notice, too, that /t/ and /d/ are alveolar stops in English, not dental sounds as they are in a range of other languages. Making them dental sounds contributes to a foreign accent in English.
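The three-way classification lends itself to a small lookup table. The one below is a sketch and not a copy of the chart: the names Consonant, CONSONANTS and describe are hypothetical, /ʃ/, /ʒ/, /tʃ/ and /dʒ/ are labelled post-alveolar (this guide also calls them palatal), /w/ is labelled velar as above, and the post-alveolar label for /r/ is an assumption taken from common textbook descriptions.

```python
from typing import NamedTuple


class Consonant(NamedTuple):
    voiced: bool
    place: str
    manner: str


# Labels follow this guide where it gives them; see the lead-in for assumptions.
CONSONANTS = {
    "p": Consonant(False, "bilabial", "plosive"),
    "b": Consonant(True, "bilabial", "plosive"),
    "t": Consonant(False, "alveolar", "plosive"),
    "d": Consonant(True, "alveolar", "plosive"),
    "k": Consonant(False, "velar", "plosive"),
    "ɡ": Consonant(True, "velar", "plosive"),
    "tʃ": Consonant(False, "post-alveolar", "affricative"),
    "dʒ": Consonant(True, "post-alveolar", "affricative"),
    "f": Consonant(False, "labiodental", "fricative"),
    "v": Consonant(True, "labiodental", "fricative"),
    "θ": Consonant(False, "dental", "fricative"),
    "ð": Consonant(True, "dental", "fricative"),
    "s": Consonant(False, "alveolar", "fricative"),
    "z": Consonant(True, "alveolar", "fricative"),
    "ʃ": Consonant(False, "post-alveolar", "fricative"),
    "ʒ": Consonant(True, "post-alveolar", "fricative"),
    "h": Consonant(False, "glottal", "fricative"),
    "m": Consonant(True, "bilabial", "nasal"),
    "n": Consonant(True, "alveolar", "nasal"),
    "ŋ": Consonant(True, "velar", "nasal"),
    "l": Consonant(True, "alveolar", "lateral approximant"),
    "r": Consonant(True, "post-alveolar", "approximant"),  # assumed place label
    "j": Consonant(True, "palatal", "approximant"),
    "w": Consonant(True, "velar", "approximant"),
}


def describe(symbol: str) -> str:
    """Return a 'voicing place manner' description for a consonant symbol."""
    c = CONSONANTS[symbol]
    voicing = "voiced" if c.voiced else "unvoiced"
    return f"{voicing} {c.place} {c.manner}"


print(describe("p"))
print(describe("z"))
print(describe("ŋ"))
```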

If you would like to hear these sounds, the ideal place to go has been kindly provided by the British Council.

Of course there's a test (two, to be honest) on what has been covered up to now.

Consonant clusters and phonotactic rules

English allows a range of consonants to occur together. In this guide, we will call them clusters although you may hear talk of consonant sequences, consonant compounds and consonant blends.
Clusters can occur initially (as in spray [/spreɪ/]), medially (as in hopscotch [/ˈhɒp.skɒtʃ/]) or finally (as in cups [/kʌps/]) but there are restrictions concerning which clusters can occur where. The rules are referred to as phonotactic, signalling that they concern the contact points of consonants.
The clusters which are allowed in the initial position of a syllable (not necessarily a word) in English can be listed:

Cluster	Example	Cluster	Example	Cluster	Example	Cluster	Example	Cluster	Example
/s/ + /p/	speak	/sp/ + /r/	spray	/b/ + /l/	blow	/f/ + /r/	frog	/k/ + /j/	cute
/s/ + /t/	stop	/st/ + /r/	street	/ɡ/ + /l/	glow	/θ/ + /r/	throw	/b/ + /j/	beauty
/s/ + /k/	scope	/sk/ + /l/	sclerosis	/f/ + /l/	flow	/ʃ/ + /r/	shrink	/d/ + /j/	duty
/s/ + /f/	sphere	/sk/ + /r/	screech	/s/ + /l/	slow	/t/ + /w/	twin	/f/ + /j/	future
/s/ + /m/	smile	/sk/ + /w/	squeal	/p/ + /r/	pray	/k/ + /w/	quick	/h/ + /j/	huge
/s/ + /n/	snip	/sk/ + /j/	skew	/t/ + /r/	tray	/d/ + /w/	dwell	/v/ + /j/	view
/s/ + /l/	slip	/st/ + /j/	stew	/k/ + /r/	cry	/θ/ + /w/	thwack	/m/ + /j/	mew
/s/ + /w/	swim	/sp/ + /j/	spurious	/b/ + /r/	brow	/s/ + /w/	swell	/n/ + /j/	new
/s/ + /j/	suit	/p/ + /l/	play	/d/ + /r/	drag	/p/ + /j/	pew	/l/ + /j/	lewd
/sp/ + /l/	splay	/k/ + /l/	clay	/ɡ/ + /r/	grow	/t/ + /j/	tube

Three consonants is the maximum that is allowable in English in the initial position.
Some of the above (e.g., /sk/ + /l/, /θ/ + /w/, /sp/ + /j/ and /p/ + /j/) are very rare and some, such as /l/ + /j/ only occur in the dialects of some English speakers.
Others, such as /s/ + /f/ occur only in words derived from other languages (Greek in this case).
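A table of permitted initial clusters is, in effect, a set that candidate clusters can be checked against. The sketch below encodes the clusters from the table above; the names LEGAL_ONSETS and is_legal_onset are hypothetical, and the check treats rare, dialectal and borrowed clusters as permitted simply because the table lists them.

```python
# Initial clusters from the table, written as space-separated phonemes.
LEGAL_ONSETS = {
    tuple(cluster.split())
    for cluster in [
        "s p", "s t", "s k", "s f", "s m", "s n", "s l", "s w", "s j",
        "p l", "k l", "b l", "ɡ l", "f l",
        "p r", "t r", "k r", "b r", "d r", "ɡ r", "f r", "θ r", "ʃ r",
        "t w", "k w", "d w", "θ w",
        "p j", "t j", "k j", "b j", "d j", "f j", "h j", "v j",
        "m j", "n j", "l j",
        "s p r", "s t r", "s k l", "s k r", "s k w", "s k j",
        "s t j", "s p j", "s p l",
    ]
}


def is_legal_onset(cluster: list[str]) -> bool:
    """True if the cluster of two or three phonemes appears in the table."""
    return tuple(cluster) in LEGAL_ONSETS


print(is_legal_onset(["s", "p", "r"]))  # as in spray
print(is_legal_onset(["s", "b"]))       # not in the table
print(is_legal_onset(["t", "l"]))       # not in the table
```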

Equally, we can identify clusters which are permitted in the final position and see that there are phonotactic rules for final consonants in English.
Here's another list:

In forming plurals and verb inflexions such as past tenses and other structures, English has the final consonant followed by /s/ (as in lots [/lɒts/]), /z/ (as in lads [/lædz/]), /t/ (as in sacked [/sækt/]) or /θ/ (as in seventh [/ˈsevn̩θ/]). In these cases, the /s/, /z/, /t/ and /θ/ are the only four allowable post-final consonants.
There are five pre-final consonants appearing in clusters.
/m/, /n/, /ŋ/, /l/ and /s/ are the only ones which can precede the final consonant. For example:
lumps, banks, ringed, belt, last (/lʌmps/, /bæŋks/, /rɪŋd/, /belt/, /lɑːst/)
A few words in English end in clusters of four consonants and these cause many learners real trouble. Examples are glimpsed (/ɡlɪmpst/) and texts (/teksts/).
In BrE, the 'r' in words like marks (/mɑːks/), carts (/kɑːts/) and lords (/lɔːdz/) is not sounded so these are, in fact, two-, not three-consonant clusters.
Only one word in English, and a few derivatives of it, ends in /mt/: dreamt (/dremt/).
No syllables can end with more than four consonants (and more than three is vanishingly rare). We can allow sevenths (/ˈsevnθs/) with four final consonants in a cluster but that is the limit.
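The pre-final, final and post-final pattern described above can be sketched as a small labelling routine. This is a deliberately simplified illustration in Python: the function name label_coda is invented here, the consonants are assumed to arrive as a list of phoneme symbols, and the sets are just the ones given in this guide. Real analyses are more nuanced, so treat the result as a rough check rather than a definitive parse.

```python
# Sets taken from the description above; the names are illustrative only.
PRE_FINAL = {"m", "n", "ŋ", "l", "s"}
POST_FINAL = {"s", "z", "t", "θ"}


def label_coda(consonants):
    """Label each consonant in a final cluster, or return None if the
    cluster does not fit the simplified pattern
    (pre-final) + final + up to three post-final consonants."""
    consonants = list(consonants)
    if not 1 <= len(consonants) <= 4:
        return None  # empty, or longer than the four-consonant limit
    labels = []
    rest = consonants
    # A pre-final consonant is only assumed when something follows it.
    if len(rest) >= 2 and rest[0] in PRE_FINAL:
        labels.append((rest[0], "pre-final"))
        rest = rest[1:]
    labels.append((rest[0], "final"))
    for consonant in rest[1:]:
        if consonant not in POST_FINAL:
            return None
        labels.append((consonant, "post-final"))
    return labels
```

For lumps, label_coda(["m", "p", "s"]) labels /m/ as pre-final, /p/ as final and /s/ as post-final. For texts, the /k/ is final and the following /s/, /t/ and /s/ are all post-final.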

When we consider the medial position, life is slightly more complicated because some analysts will only allow a cluster to appear within a single syllable so, for example, mixture will be said to contain only /ks/ and /tj/, but others will allow it to contain /kstj/ as a cluster. The first analysis is more consistent with the phonotactic rules of English.

It is clear from the above that certain combinations of consonants are not allowed in English at all. Here's a short list:

/sb/, /sd/, /sɡ/, /sθ/, /ss/, /sʃ/, /sh/, /sv/, /sð/, /sz/, /sʒ/ and /sŋ/ cannot occur initially as a cluster in an English word.

This is the situation before /l/ in the initial position:

Allowed	Forbidden
/p/ /k/ /b/ /ɡ/ /f/ /s/	/d/ /tʃ/ /v/ /ʒ/ /r/ /dʒ/	/θ/ /z/ /h/ /ŋ/ /j/ /t/	/ð/ /ʃ/ /m/ /l/ /w/

This is the situation before /r/ in the initial position:

Allowed	Forbidden
/p/ /t/ /k/ /b/ /d/ /ɡ/ /f/ /θ/ /ʃ/	/tʃ/ /v/ /s/ /ʒ/ /n/ /r/ /dʒ/ /z/ /h/	/ŋ/ /j/ /ð/ /m/ /l/ /w/

This is the situation before /w/ in the initial position:

Allowed	Forbidden
/t/ /k/ /d/ /θ/ /s/	/p/ /tʃ/ /v/ /ʒ/ /n/	/r/ /b/ /dʒ/ /z/ /h/	/ŋ/ /j/ /ɡ/ /f/ /ð/	/ʃ/ /m/ /l/ /w/

This is the situation before /j/ in the initial position:

Allowed		Forbidden
/p/ /t/ /k/ /b/ /d/ /f/	/s/ /h/ /v/ /m/ /n/ /l/	/tʃ/ /ʒ/ /r/ /dʒ/ /θ/ /z/	/ŋ/ /j/ /ɡ/ /ð/ /ʃ/ /w/

There is no obvious reason for this and it is not to do with certain clusters being unpronounceable. English speakers, for example, have little or no difficulty pronouncing Gwen but /ɡ/ + /w/ is not allowed in English words. Equally, there is no obvious reason why English forbids an initial cluster of /ðr/ while allowing /θr/, but it does.

This matters because English is at the forgiving end of the spectrum in allowing a wide range of clusters to occur (although not all of the possibilities, as we have seen). Other languages do things differently and here's a short list of the commonest problems caused by clusters:

Standard Arabic forbids initial consonant clusters altogether and never allows more than two consecutive consonants anywhere.
Japanese allows a very limited range of clusters and forbids any unvoiced consonant following a nasal so /nd/ is allowed but /nt/ is forbidden and /mb/ is allowable but not /mp/.
Spanish allows no cluster beginning with /s/ in initial position so speakers may insert an intrusive /ə/ or /e/ sound before the cluster in English producing, e.g., eschool for school (/eskuːl/ or /əskuːl/ not /skuːl/).
French allows /vr/ as an initial cluster and French speakers may carry this over into English words beginning with /v/.
In Chinese the clusters /kl/, /st/ and /rs/ are forbidden and speakers may insert a /ə/ between the consonants.
Additionally, and the language has this in common with, e.g., Thai, there are no final consonants barring /ŋ/ in most dialects. The result is often that speakers of these languages will simply fail to produce final consonants at all.
Final consonant clusters, which may, in English, be made up of up to four consonants, are even more problematic.
In Italian the consonant clusters of /pl/ or /kl/ are not allowed.
In German more initial clusters are allowed (/ʃl/ is very common) and /pf/ occurs both initially and in other positions but is not allowed at all in English.
Greek allows no fewer than 32 two-consonant clusters at the beginnings of words which are forbidden in English.
In Russian and other Slavic languages, many initial clusters are permitted which in English are forbidden. These include /pt/, /bd/, /tk/, /kt/ and /ɡd/, for example.

Phonotactic rules are not easily discernible to learners of the language, so the temptation is often to fall back on the clusters of the first language. French and Russian speakers, for example, may produce clusters which English forbids.
Speakers of languages which have no or a very limited range of clusters may break up clusters which are unfamiliar and produce, e.g., screw as sekeru (/skruː/ pronounced as /sekəruː/) and that is evident in the production of speakers of Japanese, Chinese languages and Arabic.
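The two repair strategies mentioned above, adding a vowel before an initial /s/ cluster and breaking a cluster up with inserted vowels, can be illustrated with a short Python sketch. The function names and the choice of inserted vowel are assumptions made for the sake of the example. Real learner production varies, and the inserted vowels need not all be the same, as the sekeru example shows.

```python
# The 24 consonant phonemes listed in this guide.
CONSONANTS = {
    "p", "b", "t", "d", "k", "ɡ", "tʃ", "dʒ", "f", "v", "θ", "ð",
    "s", "z", "ʃ", "ʒ", "h", "m", "n", "ŋ", "l", "r", "j", "w",
}


def add_initial_vowel(phonemes, vowel="e"):
    """Put a vowel before an initial /s/ + consonant cluster,
    as in eschool for school."""
    phonemes = list(phonemes)
    if len(phonemes) >= 2 and phonemes[0] == "s" and phonemes[1] in CONSONANTS:
        return [vowel] + phonemes
    return phonemes


def break_up_clusters(phonemes, vowel="ə"):
    """Insert a vowel between every pair of adjacent consonants."""
    result = []
    for phoneme in phonemes:
        if result and result[-1] in CONSONANTS and phoneme in CONSONANTS:
            result.append(vowel)
        result.append(phoneme)
    return result
```

For example, add_initial_vowel(["s", "k", "uː", "l"]) gives the phonemes of /eskuːl/, and break_up_clusters(["s", "k", "r", "uː"]) gives /səkəruː/, which is close to, but not identical with, the /sekəruː/ cited above.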

To help a little, we need to recall (or become suddenly aware of the fact) that native speakers routinely simplify final consonant clusters, especially in rapid speech, so it is unnecessary to trouble learners with the full pronunciation of words like products or camped because the /t/ and the /p/ are not usually sounded by native speakers (so we have /ˈprɒ.dʌks/ not /ˈprɒ.dʌkts/ and /kæmt/ not /kæmpt/).
The middle consonant in clusters such as /kts/, /mps/, /mpt/, /nts/, /ndz/ and /skt/ is usually left out or sounded very weakly. Examples are:
    impacts which can be pronounced as /ɪm.ˈpækts/ or /ɪm.ˈpæks/
    dumps which can be pronounced as /dʌmps/ or /dʌms/
    dumped which can be pronounced as /dʌmpt/ or /dʌmt/
    pints which can be pronounced as /paɪnts/ or /paɪns/
    funds which can be pronounced as /fʌndz/ or /fʌnz/
    tasked which can be pronounced as /tɑːskt/ or /tɑːst/
That is helpful for teaching purposes, especially for learners whose first languages do not allow or allow a more limited range of final consonant clusters.
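The simplification just described, dropping the middle consonant of the six listed clusters, is regular enough to sketch in a few lines of Python. The names SIMPLIFIED and simplify_final_cluster are invented for illustration, and the sketch assumes a transcription written as a plain string without the enclosing slashes.

```python
# Full cluster -> simplified form with the middle consonant dropped.
SIMPLIFIED = {
    "kts": "ks",
    "mps": "ms",
    "mpt": "mt",
    "nts": "ns",
    "ndz": "nz",
    "skt": "st",
}


def simplify_final_cluster(transcription):
    """Return the transcription with a listed word-final cluster reduced.
    Transcriptions ending in any other way are returned unchanged."""
    for full, reduced in SIMPLIFIED.items():
        if transcription.endswith(full):
            return transcription[: -len(full)] + reduced
    return transcription
```

So simplify_final_cluster("dʌmps") returns "dʌms" and simplify_final_cluster("tɑːskt") returns "tɑːst", while a word such as cups ("kʌps") is left alone.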
The troublesome /ð/ in clothes is also often ignored by native speakers and learners can take the same route, saying /kləʊz/, not /kləʊðz/. Nobody will misunderstand and few would notice.

If you yearn for more help in this area, try the guide to syllables and phonotactics accessible from the pronunciation index linked below.

Spelling consonant sounds

What follows is a guide to how the consonant sounds of English are realised in its orthography. If you have followed the general guide to spelling in English, you will be aware that English is often described, sometimes despairingly, as a wholly inconsistently spelled language with no discernible connections between sound and spelling. You will also be aware that that is only very partially true.
In the case of consonant sounds, there are clear consistencies and these are teachable.

The following takes each consonant in turn and suggests the commonest ways that the sounds are realised in spelling, as well as noting some rarities, often loan words from other languages, which have to be learned individually.
A silent final 'e' has been ignored in this list and the ordering is as for the list of consonants in the table above.

Sound	Common spellings	Rarities and varieties	Sound	Common spellings	Rarities and varieties
/p/	p or pp: pepper pill people	gh: hiccough (unique)	/z/	z, zz, or s: zoology puzzle please grabs	cz: czarina (also tsarina) x: xylem
/d/	d, dd or ed: did peddle framed	dh: dharma AmE: tt: matter	*/h/	h or wh: he whom	j: fajita ch: chutzpah x: Quixote
/tʃ/	ch, tch or t: chum match nature righteous tch is never initial	c: cello cz: Czech tsch: putsch	/ŋ/	ng, n or ngue: sang think tongue	nd: handkerchief
/v/	v, vv or f: volume navvy of	ph: Stephen w: weltanschauung	/j/	y or i: young bunion	j: hallelujah r: February
/s/	s, ss or c: sad less since	cc: flaccid ps: psalter	/t/	t, tt, bt, ght or ed: tense debt butt fight pressed	cht: yacht pt: pterosaur th: thyme
/ʒ/	g, j or s: genre bijou leisure	si: division ti: equation z: seizure	/ɡ/	g, gue or gh: gone dialogue (BrE) ghost	gg: egg ckg: blackguard
/n/	n, nn or kn: noise inner knock kn is only initial nn is only medial	dn: Wednesday gn: gnome mn: mnemonic nd: handsome	/f/	f, ff, gh or ph: find ruffle rough phantom	pph: sapphire u: lieutenant (BrE)
/r/	r, rr or wr: rise furrow wrong	l: colonel rh: rhythm	/ð/	th: that	-
/b/	b or bb: bar oblong obey abbot gobble	bh: bhang pb: cupboard	/ʃ/	sh, s, ss, c, ce, ch or ti: shave sugar mission special ocean machine mention	chsi: fuchsia sc: crescendo sch: schlepp
*/k/	c, k, ck, kk or cc: cab kick trekked accountant ck is never initial	ch: chord q: liquor	/m/	m, mm or mb: money hammer comb	mn: autumn
/dʒ/	g, j, dg or dj: magic judge graduate adjourn	ch: sandwich gg: veggie	*/l/	l or ll: limb fellow	sl: aisle
/θ/	th: think	tth: Matthew	/w/	w, wh or u: wall when persuasion	o: choir

* The /h/, /k/ or /l/ sounds are often the ways in which the /x/ sound in loch, chutzpah, llyn and other loan words is rendered in Standard English. Many speakers of Standard English do, however, make the effort to produce /x/ in these cases.

This is the index of other guides in the in-service pronunciation section.
the overview of pronunciation	connected speech	consonants
intonation	minimal pairs (PDF)	minimal pairs transcription test
sentence stress	syllables and phonotactics	teach yourself transcription
teaching pronunciation IP	teaching troublesome sounds	verb and noun inflexions IP
vowels	word stress	identifying word-stress IP
Guides marked IP are in the initial plus section.