If You Can’t Make Sense of Us, You Silence Us: Vocal Stereotypy, Behavioural Science, and the Erasure of Autistic Voice
Reclaiming Autistic Voice from the Silencing Logic of Behavioural Science.
This article rebuts a 2025 ABA study that frames autistic vocalisations as “non-functional,” exposing the behavioural model’s silencing logic and reclaiming echolalia, scripting, and stimming as valid, embodied autistic communication.
Introduction
Molina, P. B., & Elias, N. C. (2025). Effects of Non-Contingent Music in the Emission of Vocal Stereotypies in Children with Autism Spectrum Disorder. Psicologia: Teoria E Pesquisa, 41, e41402. https://doi.org/10.1590/0102.3772e41402.en
There are studies one reads with detachment, others with academic curiosity. Then there are those that land somewhere deeper: beneath the skin, in the chest, in the place where language lives before it finds words. Reading the 2025 Brazilian study Effects of Non-Contingent Music in the Emission of Vocal Stereotypies in Children with Autism Spectrum Disorder was not an intellectual exercise. It was an act of endurance. The language itself is clinical, sedated, composed: terms like “problem behaviour,” “non-functional vocalisations,” “nonsensical whispers,” “screams,” and “decontextualised speech” are deployed with the untroubled certainty of behavioural science. But beneath each phrase is the unmistakable hum of something more violent. These are not just descriptions; they are judgments. And not of the behaviour alone, but of the child. Of the voice. Of the way that voice resists translation.
The method is simple: when a child makes sounds deemed inappropriate—too loud, too strange, too unplaceable—music is played to interrupt, to displace, to reroute the instinct. If the child vocalises again, the music is withheld. Silence is rewarded. Vocal expression becomes a condition to overcome. This is framed as intervention. Evidence-based. Effective. But what I see—what I feel in the gut—is the quiet brutality of containment. Not containment as safety, but as control. The kind that reorders the sensory world not for the benefit of the child, but for the comfort of the adults measuring them.
What unsettles me most is not just that this paper was written, but that it was written so effortlessly. That it was reviewed, published, circulated—as if the premise did not need questioning. As if autistic children’s voices are by default suspect, and their reduction a moral good. There is no room here for interpretation, no pause to ask what these vocalisations mean to the children themselves. No recognition that “nonsensical whispers” may be a form of co-regulation, that “repetitive sounds” may carry memory, rhythm, or emotional charge. The study never asks why the children vocalise—it asks only how to make them stop.
And so I return to that sense—not of anger, but of deep, bone-level grief. Because what this paper makes clear is that behavioural autism research is not, at its core, about listening. It is about managing the dissonance of autistic expression in a world unwilling to slow down and attune. This isn’t an isolated flaw in one paper’s methodology. It is the logical conclusion of a field built on Lovaas’ legacy—a legacy that never sought to understand us, only to render us palatable. We were never expected to be heard. Only handled. Adjusted. Made invisible, if necessary. And this, I think, is the most enduring violence: not that our speech is misinterpreted, but that our right to speak in the first place is questioned. That our voices are studied only to be suppressed. That our songs are heard only when they fall silent.
The Diagnostic Weapon: What Is “Vocal Stereotypy”?
If you haven’t heard the term vocal stereotypy before, there’s a reason for that. Autistic people don’t use it to describe ourselves. We don’t speak this way—not in our communities, not in our support groups, not in the sacred spaces where we piece together language that fits the shape of our lives. Vocal stereotypy is not our word. It is a behaviourist invention, a term born in the sterile confines of observation labs and intervention protocols. It exists not to reveal something about us, but to classify what makes others uncomfortable. To mark our voices—our scripting, our echolalia, our tonal play, our stimming—as fundamentally disruptive because they do not map cleanly onto neurotypical expectations of speech. Because they move sideways, in rhythms that resist direction.
In the literature, the definition is consistent across decades: repetitive, non-functional vocalisations (Ahearn et al., 2007; Taylor et al., 2005; Rapp et al., 2009). That phrase—non-functional—is doing enormous, unexamined work. Whose function? By what metric? What counts as “functional” when communication is relational, sensory, and sometimes entirely interior? What is missed when language is judged only by its external utility and not by its internal resonance—its rhythm, its familiarity, its power to soothe or to tether or to remember?
To name something vocal stereotypy is not a neutral act. It is to fix it—like type in a mould—before it has even been heard. The word itself carries the history of its function. Stereotypy originates in printing: a stereotype was a solid metal plate used to make identical impressions, over and over again. From stereos, meaning “solid,” and typos, “impression” or “mark,” the term was first used not in psychiatry, but in the mechanical reproduction of text. There is something telling in that origin—something that prefigures how the term would later be turned on us.
When psychology adopted stereotypy to describe repetitive behaviour, it imported that same metaphor: of fixity, of unchangeability, of duplication without variation. In the behavioural sciences, vocal stereotypy came to mean patterned vocalisations that repeat without social function. But embedded in this definition is a powerful, unspoken judgment: that language only matters when it is flexible, goal-directed, and legible to others. That repetition is evidence of deficit. That patterns without explanation are dangerous. And so, our voices—especially when they arrive in loops, in echoes, in phrases pulled from memory or sound—are treated not as communication, but as malfunction.
This framing does more than describe; it disciplines. It marks certain ways of speaking as excess—noise to be tracked, timed, interrupted, reduced. The term becomes a tool for management, a way of justifying intervention not because the behaviour is harmful, but because it resists external meaning. But what if the repetition isn’t meaningless? What if it’s regulatory, narrative, sensory, sacred? Behaviourism does not pause to ask. It privileges what can be observed, measured, and modified. It privileges surface over interior, and outcomes over context. To call something vocal stereotypy is to flatten it—to render it solid, fixed, without depth—just as the original stereotype plate pressed words into paper without ever considering the reader’s need.
This is how diagnostic weapons are forged. Not through cruelty necessarily, but through a clinical detachment so total that it forgets the subject altogether. It sees form, but not feeling. It hears sound, but not voice. And this is precisely why so many of us—especially those who process language gesturally, relationally, musically—have had to build our own frameworks from the ground up. Because to accept this naming, this mould, is to be set in it. To use their terms is to surrender our fluency. To be pressed into shape by someone else’s image of what language should be. We are not noise. We are not malfunction. We are not broken records. We are speaking—again and again, because sometimes that is how truth takes form.
Suppression as Method: ABA’s Intervention Playbook
The behavioural interventions surrounding vocal stereotypy do not attempt to understand autistic communication—they seek to eradicate it. Each so-called treatment operates on the same underlying assumption: that autistic vocal expression is undesirable unless it mimics the norms of the neuro-majority. There is no curiosity, no attunement, no pause to ask what the sound might mean to the person making it. Only the relentless pursuit of silence (“children should be seen and not heard”), or at least the performance of acceptable noise. The clinical language might vary—“increase alternative behaviour,” “decrease frequency,” “promote compliance”—but the underlying logic remains constant: if a voice cannot be made to serve external goals, it should be interrupted, redirected, or overwritten.
The most common method, Response Interruption and Redirection (RIRD), is exactly what it sounds like: an adult repeatedly stops an autistic person mid-vocalisation and demands an immediate, preferred alternative—often a word, a label, a request. It has been praised as effective in reducing vocal stereotypy across multiple studies (Ahearn et al., 2007; Liu-Gitz & Banda, 2009; Wunderlich & Vollmer, 2015; Wells et al., 2016), but effective at what? At silencing. At training us to replace comfort with compliance. RIRD doesn’t teach language; it conditions substitution. The original vocal act—whether a quote, a hum, or a self-soothing phrase—is treated not as language but as noise pollution.
Then there is Non-Contingent Music (NCM), the method featured in the study that first prompted this reflection. Here, music is used not to join the child in their sensory world, but to override it. The goal is not connection, but displacement: to flood the auditory environment so that the child no longer vocalises. Studies have shown this method to “reduce” vocal stereotypy (Gibbs et al., 2018; Lanovaz & Sladeczek, 2011), but again, reduction is not the same as understanding. It is noise-cancellation, not listening.
Stimulus Control techniques go one step further: teaching the child when vocalising is “allowed.” Green card: you may vocalise. Red card: be silent. It is the regimenting of voice, the literal policing of sound. What kind of world requires a child to seek visual permission to speak to themselves, to comfort themselves, to make sense of their environment? These methods (Esposito et al., 2021; Dunlop, 2012) are celebrated for their efficacy, but their ethical cost is never reckoned with. There is no inquiry into what is being lost—what emotional, developmental, or sensory functions are sacrificed in the name of control.
And now, we have the rise of AI Tracking—programmes designed to detect and monitor vocal stereotypy in real time (Dufour et al., 2020; Min & Fetzner, 2018). Surveillance dressed as care. The machine does not question whether a vocalisation is meaningful; it simply records deviation. The logic is chilling in its consistency: the goal is not to translate autistic expression, but to algorithmically contain it. To build a system in which silence—or its approximation—can be scaled, measured, automated.
What unites all of these interventions is not their method, but their motive: the suppression of autistic voice under the guise of therapeutic progress. None of these strategies ask why we vocalise the way we do. None attempt to meet us where we are. They begin with a premise, that autistic vocalisation is problematic, and build entire systems of intervention around that flawed premise. But when your voice is always framed as a problem to be solved, there is no space left for authenticity, no space for resonance. Just the slow erosion of self, reshaped in the image of someone else’s comfort. These are not treatments. They are behavioural silencings, systematised and sanctioned.
The Eugenics We Don’t Name: Lovaas and ABA’s Colonial Legacy
If you’ve ever wondered why so many autistic people distrust ABA, or why the term “vocal stereotypy” feels more like a muzzle than a mirror, it’s not simply because of bad practitioners or outdated methods. It’s because the entire field emerged from a legacy not of liberation, but of control.
The roots of behaviourism are not benign. Ole Ivar Lovaas—often cited as the father of ABA—did not merely invent a therapy. He engineered a system of normalisation. His methods were shaped not only by the science of the post-war period, but by the political ideologies he was raised in. As the 2025 study by Gjerde reveals, Lovaas was a teenage leader in Nasjonal Samling, the fascist youth party aligned with Nazi Germany during the occupation of Norway. This isn’t an incidental footnote. It’s formative.
And yet, the field of ABA has never truly reckoned with this origin. His legacy is still celebrated in conference keynotes and training manuals, as if the fascist scaffolding of his worldview can be safely cordoned off from the methods he developed. But the ideology wasn’t an accessory. It was a blueprint.
Lovaas’s most infamous quote—that his goal was to make autistic children “indistinguishable from their peers”—is often treated as a well-intentioned goal poorly phrased. But what if it was precisely phrased? What if indistinguishability was the point—not inclusion, but erasure? What if the metric of success wasn’t flourishing, but conformity? ABA’s foundational logic is not curiosity about difference; it is the elimination of it.
That logic remains intact today. We see it in every data sheet that tallies “undesirable” behaviour without asking what it means. In every intervention that replaces echolalia with forced requests. In every funding structure that rewards compliance but not connection. ABA, as it is practised and studied, does not seek to understand autistic voice. It seeks to overwrite it.
When viewed through the Power Threat Meaning Framework (PTMF), the motives behind these methods come into focus. PTMF asks not “What’s wrong with you?” but “What’s happened to you?”, “How did it affect you?”, “What meanings did you make of it?”, and “What did you do to survive?” In contrast, ABA asks none of these. It treats survival strategies as dysfunction. It reduces meaning to frequency. It decouples behaviour from context, emotion, and history—and in doing so, it reproduces the colonial logics of control.
Because this is colonialism—clinical, domestic, polite. It is the imposition of one communicative norm over another. A settler epistemology enacted in therapeutic form. To declare a child’s natural language “non-functional” is to refuse to hear them on their own terms. To frame tonal play as “vocal stereotypy” is to flatten nuance into deviance. This is how settler-colonial thinking works: not always through overt violence, but through the quiet removal of interpretive power.
ABA research continues this pattern. The metrics are clear: behaviours are measured not to be understood, but to be removed. The goal is not to join the child in their sensory-affective experience, but to prune it into something tolerable to observers. The success of an intervention is gauged not by the child’s sense of agency, but by their reduction in visible difference. It’s not hard to see the eugenic logic still embedded here—just softened by euphemism, hidden in spreadsheets.
This is what diagnostic weapons look like. Not knives, but clipboards. Not brute force, but fidelity checklists. And yet, the function is the same: containment. The behaviourist does not need to hate us to erase us. All they need is a methodology that privileges observation over relationship, outcome over meaning, indistinguishability over selfhood.
And for those of us who process language gesturally, musically, relationally—those of us who speak in patterns, pulses, or quotes—this erasure is not theoretical. It is daily. It is intimate. The thing they call “non-functional vocalisation” is the same thing we call regulation, resonance, memory, meaning-making. And when that is suppressed, it is not just noise that is lost. It is identity. It is kinship. It is a mode of self-repair passed down across autistic lifelines.
The cruelty here is not accidental—it is algorithmic. And the silence it leaves behind is not peace. It is disappearance.
The Double Empathy Problem Becomes Epistemic Erasure
At some point, the refusal to recognise autistic communication for what it is ceases to be an oversight—and becomes epistemic violence.
The double empathy problem, as theorised by Damian Milton (2012), tells us something deceptively simple: communication breakdowns between autistic and non-autistic people are mutual, not one-sided. The issue is not a lack of empathy within autistic people, but a bidirectional disconnect—compounded by power. And that power, when institutionalised, becomes doctrine. It decides which forms of communication count, and which are labelled as disordered, meaningless, or “non-functional.”
That doctrine is alive and well in the field of ABA. And its consequences for gestalt language processors (GLPs) and natural language acquisition (NLA) pathways are profound.
It’s not that the science isn’t there. It is. Research has shown, across decades, that echolalia and scripting are meaningful. That they serve communicative, regulatory, and affiliative functions. Paccia and Curcio (1982) demonstrated that echolalia can reflect comprehension. Pruccoli et al. (2021) affirmed its pragmatic uses. Prizant (1983) and Manning & Katz (1989) outlined echolalia’s developmental role in gestalt-based language emergence. Blanc et al. (2023) have continued that lineage, centring NLA as a valid and trackable path, not a deviation. These findings are not marginal—they are foundational for those of us who have lived them.
And yet, behaviourist literature rarely cites them. Instead, we get studies that measure “problem vocalisations” in decibels and frequency. That chart the reduction of scripting as a therapeutic success. That treat our voices—our repeated lines, our tonal play, our layered meaning-making—as something to suppress rather than understand.
When critics like Hutchins et al. (2024) or Venker & Lorang (2024) dismiss GLP as “ill-defined” or “anecdotal,” they’re not just questioning a model. They’re reinforcing a hierarchy of knowledge—one that privileges the frameworks of the neuro-majority and pathologises autistic insight. That renders our internal logic invalid simply because it does not map onto theirs. This is what epistemic erasure looks like: not just disbelief, but disqualification. The insistence that unless something is observable through a behaviourist lens, it is not real.
But what they call anecdote, we call pattern. What they call idiosyncratic, we call ancestral. What they call scripting, we call storytelling—fragmented, yes, but no less true. We build meaning through resonance. Through repetition. Through echoes that land differently depending on the context, the relationship, the need. This is not a deficit. It is a grammar.
The question is not whether our communication has function. It’s whose function the system cares about.
Because for many of us, especially those of us who are multiply marginalised, language has never been a simple tool. It has been a contested space. A site of survival. And when we are told that our mode of speaking, of relating, of constructing self, is “non-functional,” the harm is not just academic. It’s existential. The system does not merely fail to see us. It denies that there is anything to see.
ABA does not grapple with this. It doesn’t pause to ask what is being communicated, or what need underlies the repetition. It doesn’t interrogate its own assumptions about language as linear, atomised, one-directional. Instead, it responds to our voices with protocols. Redirection. Data points. Silence.
Milton’s theory reveals the gap. But lived experience names the cost.
And that cost is borne by children who are told their joy is disruptive. By teens who are punished for quoting their favourite show in moments of overwhelm. By adults who have spent years trying to unlearn the shame etched into them by therapists who only ever taught them to be smaller, quieter, more interpretable.
We are not uninterpretable. We are uninterrogated. Because to truly hear us, one would have to stop correcting long enough to listen.
And so, we echo. We repeat. We stitch fragments into something whole—not because we lack creativity, but because this is how we remember ourselves. How we restore coherence in a world that refuses to grant it.
What behaviourists miss, GLPs live.
And we’re not waiting for permission to be real.
What Vocal Stimming Really Is: A PTMF Reframing
If the behavioural model dismisses autistic speech as “non-functional,” the PTMF offers a vital counterpoint. It asks not what is wrong with a person, but what has happened to them—and how their responses to those experiences are meaningful. Through this lens, vocal stimming is not a symptom of disorder, but a survival strategy: a way of navigating environments that have punished or ignored our natural rhythms of expression.
Many of us who are GLPs have been immersed in settings that treat our communicative instincts as errors to be corrected. We’ve had our scripting redirected, our tonal play pathologised, our voice flattened into compliance. But these behaviours—these so-called “vocal stereotypies”—are not meaningless noise. They are embodied responses to real threats: emotional overwhelm, sensory distress, social misattunement, and the unspoken expectation that we perform neurotypicality just to be allowed to remain in the room.
In this context, our vocal expressions serve a clear purpose. They regulate our nervous systems, helping us process and release tension. They store and replay moments that mattered, looping back through memory to find safety or resonance in the present. They communicate affect when words fail, and they affirm identity in a world that often denies it. These are not disruptions. They are declarations.
To stim vocally is to reach for coherence when the world feels jagged. It is to insist on rhythm when surrounded by dissonance. And it is, perhaps most profoundly, a way of staying present in a body and a moment that might otherwise feel too much. When we echo a line from a film, hum a familiar note, or repeat a word with deliberate cadence, we are not malfunctioning—we are anchoring. We are weaving meaning from sound, marking our place in time, and making ourselves legible on our own terms.
The questions PTMF offers—what has happened, what is the threat, how does this behaviour serve the person, and what does it reveal about their needs—pull us away from clinical detachment and toward embodied understanding. They allow us to see that what we do with our voices is not evidence of damage, but of adaptation. We stim because it helps. Because it holds us. Because it lets something rise to the surface that can’t be forced into conventional speech.
And the reasons we do it—whether to soothe ourselves, signal to others, or simply stay connected—don’t require validation from those who were never taught to listen. They’re valid because we say they are. Because our bodies remember what they need, even when the frameworks around us pretend not to.
Stimming with sound isn’t a failure of function. It’s the nervous system reaching for rhythm. It’s memory reaching for language. It’s a kind of sensory incantation—gestalt, emotional, resonant. And when ABA charts these vocalisations as behaviours to be extinguished, it doesn’t just misunderstand us. It disrespects us. It turns poetry into pathology and presence into problem.
PTMF gives us a way back—not toward being fixed, but toward being witnessed. It invites us to stop asking how to make people stop stimming and start asking what they are saying. Because we are saying something. Always. And if you listen—not with a clipboard, but with care—you might just hear what our bodies have been trying to tell you all along.
Toward Epistemic Justice and Autistic Sovereignty
What we need now is not a gentler method of silencing, but a complete refusal to participate in the logic of behavioural suppression. The category of “vocal stereotypy” must be retired—not revised, not softened, but abandoned. It was never built to recognise meaning. It was built to erase it.
In its place, we can name what’s actually happening. Vocal stimming. Gestalt utterance. Echolalic communication. These are not disruptions in need of redirection—they are modes of expression, of presence, of self-regulation and social signal. They are languages rooted in rhythm and resonance, spoken in tones that don’t always conform to the majority’s norms, but which carry depth, intent, and history. And they deserve not just protection, but platforms.
This is where the work turns: toward narrative translation, not behavioural compliance. Toward models of support that listen for what our voices mean, rather than charting how often they occur. Toward therapeutic approaches that build with us, not against us—like those explored by Valentino and colleagues (2012), or the early interventions sketched out by Demaine (2012), where the goal is not normalisation, but connection.
We need research that begins from autistic knowledge, not merely includes it as a token afterthought. That means deepening and validating NLA frameworks, not through the gatekeeping gaze of behaviourists, but through autistic-led, trauma-informed, participatory study. It means centring lived experience and embodied insight. And it means accepting that the ABA industry, in all its institutionalised defensiveness, will likely never read that research—because it cannot afford to confront the double empathy problem at the heart of its practice. It needs us to remain illegible in order to justify its interventions.
But we do not need it.
We do not need to be fixed, translated, or rendered palatable. We do not need to make ourselves small enough to fit inside their charts. What we need is to be met where we are—with our loops and our scripts and our songs. With our need for rhythm, our hunger for resonance, our refusal to abandon ourselves for the comfort of others.
The future we deserve isn’t one of compliance, but of coherence. Of voice that isn’t pathologised, but honoured. Of stimming that is not interrupted, but understood.
We were never too loud. The frameworks were too brittle. And it’s time we stopped asking them to hear us—and started building the spaces where our voices already belong.
Closing: We Were Always Speaking
You can measure our voices by the second. You can call them inappropriate, disruptive, nonsensical. But we were always speaking. The tragedy isn’t that you didn’t hear us. It’s that you thought you had the right to decide what our voices were for.
Like anyone, I’ve grown as I’ve grown older. I’ve learned how to translate myself more fluently, to navigate systems that were never designed with me in mind. My words have found new forms—books, essays, poems, songs. But the growth of my expressive capacity has not meant a loss of the earlier modes. I am still autistic. Still a Level 2, by the framework’s reckoning. I still have significant support needs. I still stim. I still echo. I still press phrases to the roof of my mouth until they click into meaning. I still reach for language like a lifeline, sometimes out of joy, sometimes grief, sometimes an overwhelm too big for any sentence to hold.
And I honour that. Because those gestures—those loops, those echolalic invocations—aren’t relics of a past self to be outgrown. They are the very texture of who I am. They are memory, emotion, survival, song. They are how I hold time still, how I stitch myself to a place, a moment, a presence. They are not disordered. They are not meaningless. They are ancestral.
If someone had measured me only by my early presentation—by the scripts, the silences, the stims—they might’ve written my future off entirely. Assumed that this voice, this body of work, this life I now live would never come to be. And yet here I am. Writing deep dives, not behaviour charts. Weaving poems instead of data sheets. Singing still—joyous, defiant songs like my Stone-Borne ancestors, who raised their voices not to be productive, but to belong.
They had no need for “functional speech” as defined by systems of extraction. Nor do I. My language, like theirs, is a birthright—not a project.
So hear this clearly: we were never broken. We were always speaking. And we will not let you measure us into silence.