Representing the sounds of Australian Indigenous languages

Sasha Wilmoth, 21 May 2023

Linguists don’t divide languages into ‘oral’ and ‘written’ languages; nothing special changes about a language itself once a writing system is developed or used. As children, we learn the sounds and grammar of language through hearing it spoken to us and around us (or signed, for people who grow up with access to a sign language). We have a fairly mature grasp of the structure of our first language or languages by the time we start learning to read and write. Unlike speaking and hearing, reading and writing have to be learned through explicit instruction. Language exists independently of writing; writing is simply a technology and a practice that some societies use and others do not. It is a technology that is relatively new to this continent, where, at the time of colonisation, hundreds of incredibly diverse languages were spoken but not written (although there were and still are other types of complex visual semiotic systems).

The grammar and sounds of Australian Indigenous languages posed challenges for those European colonisers who tried to write them down without any professional linguistic training. This has left a legacy of hugely variable spelling in manuscripts that can be difficult to interpret for communities who are on a journey of language revitalisation and reclamation. I’ve demonstrated some of this variation in the images following, which display some of the historically attested spellings of names for languages and peoples from the AIATSIS Austlang database.¹ This variation is mostly due to English speakers being unable to distinguish the sounds they hear, and not knowing a consistent way to write them down. However, Australian Indigenous languages are no less systematic than any other language, and well-designed writing systems can represent their sounds consistently.

An example of the considerable variation in spelling.

First, I’ll discuss vowels and how they’re usually written, then consonants. I’ll use slashes around phonetic symbols from the International Phonetic Alphabet that represent the sounds themselves, angled brackets around the letters used to represent those sounds, and example words in italics. For example, the sound /ŋ/ can be represented as either ⟨ŋ⟩ or ⟨ng⟩. Most Australian Indigenous languages use the Latin alphabet without any special diacritics or glyphs. Those diacritics and special characters that are used are listed below. Beyond providing that information, my aim here is to give a basic overview of the sounds of Australian Indigenous languages, and how they have been represented using the Latin alphabet. While Aboriginal languages are incredibly numerous and diverse, there are some commonalities in their sound systems across the continent. I’ll make some generalisations which hold for the majority of Aboriginal languages, but not all. I’ll also mention some relevant features of Torres Strait Islander languages. This information is true to the best of my knowledge, but it’s not possible to know what writing conventions are being used by every language community.

Vowels

If you ask the average person how many vowels there are in English, they will probably say five: a, e, i, o, and u. In fact, Australian English has 14 vowels, depending on the analysis, and a further six vowel sounds that involve a transition from one vowel to another (that is, diphthongs like the /æɪ/ in face). Because of the alphabet we’ve inherited, people have had to find ways to represent these 20 different sounds using only five letters; it’s no wonder that English spelling is a mess. Luckily, most languages of the world, including Aboriginal languages, don’t have so many vowels. Representing the vowel sounds of Aboriginal languages is relatively straightforward, although asking people to unlearn their English associations with the letters and pronounce the words correctly is a much greater challenge.

Many of the world’s languages only have three vowels, and there is a global tendency that if a language has three vowels, they will be /a/, /i/, and /u/. These are pronounced something like the ⟨u⟩ in putt (phonetically /a/), the ⟨ea⟩ in peat, and the ⟨u⟩ in put. It is very common for Aboriginal languages to have three vowels, in which case these are /a/, /i/, and /u/. These three vowels are usually written with ⟨a, i, u⟩, although some languages, like Miriwoong and other languages in the Kimberley region, use ⟨oo⟩ for /u/.

Some languages add a fourth vowel sound, such as a schwa /ə/, like the unstressed first syllable in about. English uses schwa all the time but we don’t have a way of writing it consistently in different contexts. In Eastern and Central Arrernte, for example, this vowel is written with an ⟨e⟩.

Many Aboriginal languages have a five-vowel system, which is also common across the world’s languages. There is a global tendency that if a language has five vowels, they will be /a, e, i, o, u/. Spanish and Japanese are examples of this (with subtle differences in precisely how these vowels are produced). Again, representing these is straightforward: Australian languages typically use ⟨a, e, i, o, u⟩ (except for some languages that use ⟨oo⟩ for /u/).

Some languages have additional vowels. One example is Dalabon, which uses the letter ⟨û⟩ to represent /ɨ/, which is in between /i/ and /u/. Meriam Mir, a Torres Strait Islander language which is unrelated to languages of the mainland, uses the grave accent to make some further vowel distinctions: ⟨ì, ò, ù⟩. Rembarrnga and Kala Lagaw Ya (a language of the Western Torres Strait which is historically related to mainland Aboriginal languages) both use ⟨œ⟩.

Many languages have pairs of short and long vowels (like the difference in Australian English between cut and cart), in which case the vowel is usually written as doubled, i.e. ⟨aa, ii, uu⟩. In Yolŋu languages, the solution for representing long vowels is slightly different. Short /a/, /i/ and /u/ are represented with ⟨a, i, u⟩, but the respective long vowels are represented with ⟨ä, e, o⟩. Historical sources for many languages use a macron: ⟨ā, ī, ū⟩, etc; some Aboriginal people and communities may use these, but they are not typically used in the orthographies developed by professional linguists. Some sources use a colon to mark a long vowel, e.g. ⟨a:, i:, u:⟩, as this resembles the phonetic symbol used to mark length (ː, U+02D0). This convention is not used in most contemporary spelling systems. Some languages, like Ndjébbana, use the acute accent to indicate stressed vowels, like in Spanish. I also note that the Victorian Aboriginal Corporation for Languages (VACL) uses ⟨ŭ⟩ in the language name Gunnai / Kŭrnai. (The typeface used on their website as of March 2023 does not support this character.)
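For anyone comparing sources that mark vowel length in different ways, the conventions above can be converted mechanically. Here is a minimal Python sketch, assuming a target orthography that writes long vowels as doubled letters, and assuming that a colon directly after a vowel only ever marks length. The function name to_doubled_vowels and the sample strings in the note after the code are made up for illustration; they are not words from any language.

```python
import re
import unicodedata

# A vowel followed by a combining macron (U+0304), an ASCII colon,
# or the IPA length mark (U+02D0).
LENGTH_MARKED_VOWEL = re.compile(r"([aeiouAEIOU])[\u0304:\u02D0]")


def to_doubled_vowels(text: str) -> str:
    """Rewrite macron (ā) and colon (a:) length marks as doubled vowels (aa).

    Assumption: a colon right after a vowel is a length mark, not punctuation.
    """
    # Decompose first, so that a precomposed ā becomes a + combining macron.
    decomposed = unicodedata.normalize("NFD", text)
    # Replace the length mark with a lowercase copy of the vowel.
    doubled = LENGTH_MARKED_VOWEL.sub(
        lambda m: m.group(1) + m.group(1).lower(), decomposed
    )
    # Recompose any other diacritics that were split apart by NFD.
    return unicodedata.normalize("NFC", doubled)
```

With the invented strings "mākū" and "ma:ku:", the function returns "maakuu" in both cases. It would not be appropriate for an orthography like the Yolŋu one, where long vowels have their own letters, and any automatic conversion of a historical source should be checked against the conventions the relevant community actually uses.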

Consonants

Aboriginal languages typically have a large number of consonant sounds, with some distinctive features compared to the world’s languages. Many consonant sounds are distinguished that are not distinguished in English. Particularly, Aboriginal languages are famous among linguists for having many ‘places of articulation’. This refers to the different places where the speech organs make contact with each other within the vocal tract, such as the lips touching, or the tongue touching the teeth, or curled back further in the mouth, and so on. I’ll discuss this in more detail in the sections below, which are divided according to their ‘manner of articulation’, as linguists call it, that is, the precise mechanism by which sound is produced in the vocal tract.

Another common feature of Aboriginal languages (although not all) is the lack of a voicing distinction. That is, in most Aboriginal languages it doesn’t make a difference to the meaning of the word either way if the vocal cords are switched on and vibrating while a consonant is being pronounced. This is very different to English, where there’s a difference in meaning between pat and bat. While some Aboriginal languages do have a similar distinction, and so use both ⟨p⟩ and ⟨b⟩ (and ⟨t⟩ and ⟨d⟩, etc.), most Aboriginal language spelling systems use either ⟨p⟩ or ⟨b⟩, ⟨t⟩ or ⟨d⟩, etc.

Only a tiny minority of Aboriginal languages have fricative sounds. These are sounds produced by creating a very narrow opening in the vocal tract through which air flows. This creates a sort of hissing sound, like /f, v, s, z/ and the sounds represented by ⟨sh⟩ and ⟨th⟩ in English. Languages of the Torres Strait do, however, have the fricative sounds /s/ and /z/, represented by ⟨s⟩ and ⟨z⟩ respectively.

Stops

These are sounds like /p, t, k/ where there is a total closure in the vocal tract, stopping airflow completely. English has three different places of articulation in its stops: bilabial /p/, alveolar /t/, and velar /k/ (as well as their voiced equivalents /b, d, g/). Bilabial refers to the closure between the two lips. Alveolar refers to the alveolar ridge, the hard ridge behind the teeth where the tongue touches to make a /t/ sound. Velar refers to the velum, the soft palate, which the back part of the tongue makes contact with during the /k/ sound. Aboriginal languages have these sounds too, and due to the lack of voicing distinction in many languages, usually opt for either ⟨p, t, k⟩ or ⟨b, d, g⟩ in their writing systems. However, while English has only these three places of articulation in its stops, Aboriginal languages can have many more, either four, five, six, or even seven (in the case of Yanyuwa). The rest of this section describes some of the more common sounds and how they’re represented.

The location of human vocal organs and possible places of articulation used for speech.

Some languages have a distinct laminodental stop /t̪/ or /d̪/, where the tip of the tongue, and the flat part just behind the tip, is between or directly touching the teeth. These are often spelled using a digraph with the letter ⟨h⟩; think of how /θ/ and /ð/, both represented by ⟨th⟩ in English, are pronounced with the tongue between the teeth. The digraphs for dental stops are usually ⟨th⟩ or ⟨dh⟩ in Aboriginal languages; these do not represent the fricative /θ, ð/ sounds that English uses ⟨th⟩ for.

Some languages have a palatal/alveo-palatal stop /c/ or /ɟ/. To English speakers, this can sound like a ⟨ch⟩ or ⟨j⟩ sound, like in chump or jump. However, while ⟨ch⟩ (phonetically /ʧ/) is produced with the tip of the tongue touching the roof of the mouth behind the alveolar ridge, the alveo-palatal stop is produced with the flat, front part of the tongue touching the roof of the mouth. This sound is usually represented with a ⟨j⟩ or a digraph such as ⟨ty, dy, tj, dj⟩.

Retroflex stops /ʈ, ɖ/ are produced with the tongue curled back, or bunched up, with the tongue tip touching the roof of the mouth, further back than /t/ or /d/ (these are often called postalveolar). These sounds might be familiar from South Asian languages such as Hindi and Urdu. They are usually represented with a digraph ⟨rt⟩ or ⟨rd⟩ in Aboriginal languages. Some languages, including Yolŋu Matha and Pitjantjatjara, use the underlined letters ⟨ṯ, ḏ⟩ for retroflex consonants. The underlines on these characters differ from the underline which is added through formatting. If possible, the line should be narrower than the width of the letter, should not touch the underline on an adjacent letter, and should appear above the line used in underline formatting. There is a Yolŋu keyboard layout available that contains the underlined letters, as well as ⟨ŋ⟩.
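In digital text, an underlined letter can be stored in more than one way: as a single precomposed character (for example U+1E0F for ⟨ḏ⟩), or as a plain letter followed by the combining character U+0331 (combining macron below). The two look the same but do not compare as equal unless the text is normalised. The following is a small Python sketch of that normalisation using the standard unicodedata module; the function names and the sample strings in the note after the code are my own and purely illustrative.

```python
import unicodedata


def to_precomposed(text: str) -> str:
    """Return text in Unicode Normalization Form C (NFC).

    In NFC, a letter followed by U+0331 (combining macron below) is
    replaced by the single 'with line below' character where one exists,
    e.g. d + U+0331 becomes U+1E0F.
    """
    return unicodedata.normalize("NFC", text)


def same_spelling(a: str, b: str) -> bool:
    """Compare two strings while ignoring precomposed/combining differences."""
    return to_precomposed(a) == to_precomposed(b)
```

For instance, to_precomposed("d\u0331") gives the single character "\u1E0F", and same_spelling treats the invented strings "wa\u1E49a" and "wan\u0331a" as identical. A letter followed by U+0332 (combining low line), which is a different character, is left unchanged by this step, so text typed that way would need to be corrected separately.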

Some languages have a glottal stop, like a British pronunciation of wha’ever, or in the Hawaiian pronunciation of Hawai’i. This is usually represented with ⟨h⟩, like in the language name Gundjeihmi, or an apostrophe, like in the name of the Yolŋu community Galiwin’ku. It is sometimes represented with a question mark ⟨?⟩ (this is used in academic documentation of Kamu, for example), as this resembles the phonetic symbol /ʔ/.

Nasals and Laterals

Now that we’ve covered the anatomical details above, the representation of nasal and lateral consonants easily follows. Nasal sounds are produced by forming a closure in the mouth and redirecting air through the nose. Lateral sounds are produced when there is a partial closure in the mouth, but the air can still pass around the sides of the tongue. English has three nasal consonants (/m/, /n/, /ŋ/), and one lateral (/l/). As with the stops above, there are more places of articulation in most Aboriginal languages.

Laminodental sounds are usually represented with a digraph with ⟨h⟩. By analogy with ⟨th⟩ and ⟨dh⟩, languages that have these sounds usually use ⟨nh⟩ and ⟨lh⟩, like the town of Nhulunbuy in the Northern Territory.

Likewise, palatal nasals and laterals use either ⟨y⟩ or ⟨j⟩ in digraphs: ⟨ny, nj, ly⟩. Some sources may use ⟨ñ⟩ for the palatal nasal, although this is not widely used. I don’t know of any Aboriginal languages that use ⟨lj⟩.

Retroflex nasals and laterals are also usually represented with digraphs ⟨rn, rl⟩, except those languages which use underlined letters: ⟨ṉ, ḻ⟩.

Finally, velar nasals are usually represented with the digraph ⟨ng⟩, although a minority of languages use ⟨ŋ⟩. The uppercase form of this letter should be based on the uppercase N with a descending hook, rather than an enlarged form of the lowercase ⟨ŋ⟩.

Rhotics

Rhotics are sounds which… well, they don’t really have a clear feature in common in terms of how they’re produced across languages, but linguists basically use the term rhotic to mean sounds written with ⟨r⟩. Very scientific.

Most Aboriginal languages distinguish between two rhotic sounds. One is an approximant, roughly like an English /ɹ/. The other might be pronounced like a trill /r/ sometimes, at other times like a tap /ɾ/. The tap /ɾ/ is like the t-sound in butter when you’re speaking at a normal pace in Australian English. Typically, the approximant is represented with a single ⟨r⟩, while the tap/trill is represented with a double ⟨rr⟩. In some languages which use underlines, the single ⟨r⟩ is used for the tap/trill, and the underlined ⟨ṟ⟩ represents the approximant.

Summary & Other Considerations

The table below summarises the most common ways that consonant sounds are represented in Aboriginal languages. The columns designate the place of articulation (where the tongue, lips, teeth, and other parts of the mouth meet), and the rows designate the manner of articulation (how the sound is produced). Not all Aboriginal languages have all these sounds, and some languages have more sounds; however, these are the most common consonants across the continent (and the Torres Strait). There’s also a row for ⟨w⟩ and ⟨y⟩, which are the same as in English.

Some languages use punctuation symbols within words to make it clear that letters represent separate sounds. Specifically, the letters ⟨ng⟩ might be ambiguous between the nasal sound /ŋ/, or a sequence /ng/ (think of the difference between singer and finger in English). The sequence /ng/ is therefore spelled as ⟨n’g⟩ or ⟨n.g⟩ in some languages to make it clear that this is two sounds, not one. Examples of this are the language name Ngan’gityemerri, or the word ran.gu which means ‘moon’ in Burarra.
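The role of the apostrophe or full stop can be seen by splitting a word into its letters and digraphs. The sketch below is a simple longest-match splitter in Python. It rests on some loud assumptions: the DIGRAPHS list is a hypothetical inventory pooled from the digraphs mentioned in this article, not the spelling system of any one language, and the function name split_graphemes is my own. A real tool would need the inventory of the specific orthography being processed.

```python
# Hypothetical inventory pooled from digraphs mentioned in this article.
# Every real orthography has its own inventory; this is only an example.
DIGRAPHS = ["ng", "ny", "nh", "rn", "rl", "rr", "rt", "rd",
            "th", "dh", "tj", "dj", "ty", "dy", "ly", "lh"]

# Characters that some orthographies place between letters to show that
# they are separate sounds: straight apostrophe, curly apostrophe, full stop.
SEPARATORS = {"'", "\u2019", "."}


def split_graphemes(word, inventory=DIGRAPHS):
    """Split a word into letters and digraphs, preferring the longest match."""
    units = sorted(inventory, key=len, reverse=True)
    result = []
    i = 0
    while i < len(word):
        if word[i] in SEPARATORS:
            # A separator ends the previous unit and is not a sound itself.
            i += 1
            continue
        for unit in units:
            if word[i:i + len(unit)].lower() == unit:
                result.append(word[i:i + len(unit)])
                i += len(unit)
                break
        else:
            # No digraph starts here, so this is a single letter.
            result.append(word[i])
            i += 1
    return result
```

With this inventory, split_graphemes("ran.gu") returns ['r', 'a', 'n', 'g', 'u'], whereas the same letters without the full stop, "rangu", come out as ['r', 'a', 'ng', 'u']. The separator is what keeps ⟨n⟩ and ⟨g⟩ from being read as the single sound /ŋ/.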

Apart from these common digraphs, some languages use different digraphs, or have more complex sounds that require multiple letters to represent faithfully. Here are just a handful of examples: ⟨pb, rtd, kg, tjj, rdd, djdj, nyng, nyk, yk, rnd, ngkw, kngw, thnw, nthw⟩.

Digraphs (and longer combinations of letters) may appear at the start of words in ways that look unfamiliar to English speakers. For example, in Anmatyerre artist Emily Kame Kngwarreye’s name, ⟨kngw⟩ represents a single sound, the rounded pre-stopped nasal /ᵏŋʷ/. This is characteristic of the complex sounds of the Arandic language family.

Special Characters & Diacritics

The following diacritics may be used with any vowels:

´ Acute accent
` Grave accent
¨ Umlaut
¯ Macron

The circumflex, breve, tilde, and underline diacritics may be used for these combinations in particular:

Û U+00DB Latin capital letter U with circumflex
û U+00FB Latin small letter u with circumflex
Ŭ U+016C Latin capital letter U with breve
ŭ U+016D Latin small letter u with breve
Ḏ U+1E0E Latin capital letter D with line below
ḏ U+1E0F Latin small letter d with line below
Ḻ U+1E3A Latin capital letter L with line below
ḻ U+1E3B Latin small letter l with line below
Ṉ U+1E48 Latin capital letter N with line below
ṉ U+1E49 Latin small letter n with line below
Ṟ U+1E5E Latin capital letter R with line below
ṟ U+1E5F Latin small letter r with line below
Ṯ U+1E6E Latin capital letter T with line below
ṯ U+1E6F Latin small letter t with line below
Ñ U+00D1 Latin capital letter N with tilde
ñ U+00F1 Latin small letter n with tilde

Special (non-diacritic) letters:

Œ U+0152 Latin capital ligature OE
œ U+0153 Latin small ligature oe
Ŋ U+014A Latin capital letter Eng
ŋ U+014B Latin small letter eng
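When preparing text or checking a typeface, it can help to confirm that a character really is the code point you expect, and that software maps its lowercase form to the right uppercase form. The following Python sketch does this with the standard unicodedata module. The code point pairs are copied from the lists above (more can be added in the same way); the names CASE_PAIRS, describe and mismatched_case_pairs are my own, and the script says nothing about whether a particular font contains these glyphs.

```python
import unicodedata

# (uppercase, lowercase) code points copied from the lists above.
CASE_PAIRS = [
    (0x00DB, 0x00FB),  # U with circumflex
    (0x016C, 0x016D),  # U with breve
    (0x1E0E, 0x1E0F),  # D with line below
    (0x1E3A, 0x1E3B),  # L with line below
    (0x1E48, 0x1E49),  # N with line below
    (0x1E5E, 0x1E5F),  # R with line below
    (0x00D1, 0x00F1),  # N with tilde
    (0x0152, 0x0153),  # ligature OE
    (0x014A, 0x014B),  # Eng
]


def describe(ch: str) -> str:
    """Return the character, its code point, and its official Unicode name."""
    return f"{ch} U+{ord(ch):04X} {unicodedata.name(ch)}"


def mismatched_case_pairs(pairs=CASE_PAIRS):
    """Return the pairs whose upper/lower mapping is not what the list expects."""
    return [
        (upper, lower)
        for upper, lower in pairs
        if chr(lower).upper() != chr(upper) or chr(upper).lower() != chr(lower)
    ]


if __name__ == "__main__":
    for upper, lower in CASE_PAIRS:
        print(describe(chr(upper)))
        print(describe(chr(lower)))
```

For example, describe("ŋ") returns "ŋ U+014B LATIN SMALL LETTER ENG", and mismatched_case_pairs() returns an empty list when every pair maps correctly in both directions.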

Author’s note: I am a non-Indigenous academic linguist at the University of Melbourne who has had the privilege to research Australian Indigenous languages over a number of years, particularly Pitjantjatjara. I live and work on Wurundjeri Woiwurrung country and have been welcomed into Pitjantjatjara communities for the purposes of academic research as well as supporting bilingual education. I do not claim any authority or ownership over the languages I discuss; they are the intellectual property of the language custodians.

References

1 — https://collection.aiatsis.gov.au/austlang/search

FURTHER RESOURCES

https://www.firstlanguages.org.au/

First Languages Australia is the peak body for Aboriginal and Torres Strait Islander languages. Their website links to many more resources to learn more about Australian Indigenous languages, as well as a map with local language centres and other organisations.

https://collection.aiatsis.gov.au/austlang/search

The Australian Institute of Aboriginal and Torres Strait Islander Studies maintains a database of information about Indigenous languages, including available resources for the languages, history of documentation, variant names, and more.

https://gambay.com.au/

This is a language map being developed by First Languages Australia in collaboration with local communities. It is an updated version of the well-known but problematic Tindale map.

https://50words.online/

This is an initiative by the Research Unit for Indigenous Language at the University of Melbourne. Communities around the country have contributed up to 50 words from their languages. You can read and hear these words being spoken in these languages.

http://learnline.cdu.edu.au/yolngustudies/resourcesKeyboard.html

This is a downloadable keyboard layout for the Yolŋu alphabet, which includes the underlined characters, ⟨ŋ⟩, and ⟨ä⟩.

Words of Wonder: Endangered Languages and What They Tell Us is a book for a general audience by non-Indigenous linguist Nicholas Evans, who has spent decades working with Indigenous communities in Northern Australia to document their languages.

https://nyingarn.net/

Nyingarn (‘echidna’ in Nyoongar) aims to digitise manuscripts containing Indigenous language, and to transcribe them and make them accessible and searchable for communities and language custodians. You can sign up as a volunteer transcriber here: https://nyingarn.net/transcription-methods/transcribe-with-us/

https://aboriginalbibles.org.au/

You can access the Bible in many languages here; this might be useful as placeholder text to see how different languages will look in a particular typeface or layout.