The Role of Audio Design in Voice‑Enabled Self-Service Kiosks
Enabling a kiosk to truly “hear” a customer takes more than adding a microphone. These devices operate in challenging environments, with background noise in busy restaurants, voices from nearby users, and similar distractions. Even the kiosk’s own loudspeaker output can interfere with its microphones.
A Growing Trend
Self-service kiosks are now a familiar presence in restaurants, airports, supermarkets, and other public spaces. Until recently, interaction was mainly through touchscreens. The trend now is towards a more natural approach: talking to the kiosk, just as we would to a human employee.
Voice interaction brings clear benefits: speed, convenience, accessibility for people with visual or mobility limitations, and greater hygiene by reducing physical contact. Beyond helping organizations comply with accessibility regulations, speech interfaces can make kiosks more inclusive by enabling natural interaction for a broader range of users, including those who may struggle with touchscreens. With advances in artificial intelligence, kiosks can now understand complex requests, process natural speech, and respond with clear, natural-sounding voices.
Ensuring the Kiosk Hears Clearly

For voice interaction to work reliably, the audio system must be carefully designed. Key challenges include:
– Filtering out background noise.
– Distinguishing the user’s voice from nearby conversations.
– Canceling echo to avoid the kiosk hearing itself.
– Accurately capturing a wide range of accents and speech styles.
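To make the echo-cancellation challenge above more concrete, here is a minimal Python sketch of a normalized least-mean-squares (NLMS) adaptive filter, a common textbook technique for estimating the loudspeaker echo and subtracting it from the microphone signal. It is an illustration only, not the method used by any product named in this article; the function name, the tap count, and the step size are assumptions chosen for the example, and the simulated echo path is invented.

```python
import random


def nlms_echo_cancel(mic, speaker, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic signal.

    mic and speaker are equal-rate sample lists; taps, mu, and eps are
    illustrative values, not tuned for any real kiosk.
    """
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent speaker samples, newest first
    cleaned = []
    for mic_sample, spk_sample in zip(mic, speaker):
        history = [spk_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate          # echo-reduced output
        energy = sum(x * x for x in history) + eps  # normalization term
        weights = [w + mu * error * x / energy
                   for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned


# Simulated scenario (assumed): the kiosk plays noise-like audio and the
# microphone picks up two delayed, attenuated copies of it.
rng = random.Random(0)
speaker = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
mic = [0.6 * (speaker[n - 2] if n >= 2 else 0.0)
       + 0.3 * (speaker[n - 5] if n >= 5 else 0.0)
       for n in range(3000)]
cleaned = nlms_echo_cancel(mic, speaker)
```

In this simulation the residual in `cleaned` shrinks as the filter adapts. A real deployment would also have to handle moments when the customer and the kiosk speak at the same time, which this sketch does not address.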
Some past attempts to introduce voice interaction in kiosks have failed to meet expectations, often because of shortcomings in audio design. These cases highlight how poor microphone placement or inadequate processing can result in frustrating experiences and low adoption rates.
The Importance of Audio Design Expertise
It is important to seek out expertise in audio design. Specialists can help ensure that the hardware layout and system architecture provide the best possible input to processing algorithms, for example by supplying signals from multiple microphones and exposing relevant acoustic information. This expertise makes the difference between a functional solution and one that consistently delivers optimal results.
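One reason signals from multiple microphones are valuable can be shown with delay-and-sum beamforming, a standard textbook technique. The Python sketch below assumes the arrival delay of the customer's voice at each microphone is already known in whole samples; the function name and this simplification are assumptions made for illustration, not a description of any specific product.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its known delay, then average.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   arrival delay of the target voice at each microphone,
              in whole samples (assumed known for this sketch).
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    # Only keep samples for which every aligned channel has data.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]
```

After alignment, the target voice adds up coherently across channels, while noise that is uncorrelated between microphones partially averages out. This is why the physical placement of the microphones, which determines the delays, matters to the processing that follows.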
Solutions in the Market
There are different approaches to address these challenges:
1. Custom audio design with specialized software: the preferred approach for new kiosks or drive-through systems, where designers can select quality microphones and speakers and apply advanced audio processing algorithms. One example is Cerence’s Speech Signal Enhancement (SSE), proven in millions of vehicles worldwide to deliver reliable voice interaction despite engine noise, road noise, and other sounds in the car. Applied to kiosks and drive-throughs, it enables precise sound filtering that focuses on the customer’s voice.
2. Retrofitting existing systems: not all fleets can be replaced at once. Retrofit solutions use external pre-built microphone arrays and speakers. An important design decision in these cases is whether to rely on the pre-built microphone array’s own audio processing algorithms or to bypass them and feed the raw microphone signals into third-party signal processing solutions such as Cerence’s SSE.
Conclusion
Voice is set to transform the self-service kiosk experience. But success depends not only on AI: it requires kiosks to be designed with high-quality audio in mind.
Companies that choose proven solutions will be able to provide faster, more natural, and more inclusive interactions. In doing so, they will help kiosks evolve from simple machines into reliable assistants.
To Know More
For further information about audio design for self-service kiosks, please contact Code Factory at .
Founded in 1998 and headquartered in the Barcelona area in Spain, Code Factory is committed to providing solutions that improve the way people interact with technology.
