“Today’s episode examines the shortcomings of LLM translation, particularly for marginalised languages. Dr. Jaime Hoerricks, the author of the source article, uses a personal anecdote about translating a Scots Gaelic phrase to illustrate how ChatGPT’s predictive modelling can distort a phrase by prioritising statistical likelihood over intended meaning. Dr. Hoerricks argues that LLM ‘translation’ is not neutral, but instead reinforces the colonial biases and power structures present in training data. They highlight the risks of erasing nuance, reinforcing linguistic biases, and losing linguistic autonomy. Dr. Hoerricks advocates caution, transparency, cross-checking with fluent speakers, and ethical dataset development when using LLMs for translation. Ultimately, they urge users to engage critically with LLM translation to protect the integrity of marginalised languages and ensure they are represented accurately.”
Here’s the link to the source article: https://open.substack.com/pub/autside/p/tha-obair-a-moladh-a-bhan-cheard
Let me know what you think.