Transcribing speech is never neutral. It shapes power and bias
Earlier this year I gave a talk about my research at Oxford’s All Souls College, and worked with a chef to design an accompanying menu.
Thinking about my work in southwest Western Australia, I typed “Boorloo”, the Nyungar name for the City of Perth.
Autocorrect had other ideas. It replaced it with “Barolo” – which, I thought, made for a fitting wine choice on the night.
It was an amusing moment, but also a revealing one. The system’s dictionary, trained largely on mainstream English data, didn’t know what Boorloo was, so it reached for a more familiar alternative. This seemingly minor miscorrection offers a glimpse into how language technologies are shaped – including which words they recognise, and which they overlook.
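To make the mechanism concrete, here is a minimal sketch of how a dictionary-based autocorrect might behave. It is not how any particular phone or word processor works. The tiny vocabulary, the `autocorrect` function and the similarity cutoff are all assumptions made for illustration, and the matching relies on Python's standard `difflib` module rather than a real autocorrect engine.

```python
from difflib import get_close_matches

# Hypothetical toy vocabulary standing in for a dictionary built mostly from
# mainstream English text. "boorloo" is deliberately absent.
VOCABULARY = ["barolo", "perth", "menu", "wine"]


def autocorrect(word, vocabulary=VOCABULARY, cutoff=0.5):
    """Keep a known word; otherwise swap in the most similar known word."""
    lowered = word.lower()
    if lowered in vocabulary:
        return lowered
    # The cutoff of 0.5 is an assumed similarity threshold, not a real setting.
    matches = get_close_matches(lowered, vocabulary, n=1, cutoff=cutoff)
    # If nothing is similar enough, leave the word as typed.
    return matches[0] if matches else word


# With the toy vocabulary, the unfamiliar place name is replaced by a wine.
print(autocorrect("Boorloo"))

# Once the dictionary includes the Nyungar name, it is left alone.
print(autocorrect("Boorloo", VOCABULARY + ["boorloo"]))
```

Under these assumptions the first call returns "barolo" and the second returns "boorloo". The only difference between the two is which words the dictionary was given in the first place.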
Why does this happen?
Part of the answer lies in how technologies such as automatic speech recognition convert spoken language into text. Transcription is often presented as a straightforward technical exercise: you listen, you write down what was said.
But every transcription protocol carries........
