WhatsApp Voice Note Support
Overview

Your customers are busy, and many would rather speak than type. Now your WhatsApp chatbot understands voice messages just as well as text. Customers can chat hands-free, in their own language, and your bot gets them the help they need faster.

What is the feature?

Voice note support allows users to interact with the chatbot using WhatsApp voice messages instead of typing. The voice note is converted into text using a speech-to-text (STT) provider and processed by the chatbot like any other text message. The feature supports any language, as well as multilingual conversations (if enabled on the bot), and can leverage AI to identify user intents from natural spoken queries.

Benefits

- Improves accessibility and user convenience
- Enables hands-free interactions, similar to our web bot
- Supports multilingual voice queries
- Handles natural and conversational user requests
- Improves intent detection for longer spoken queries using AI (AI Intent Assist)

Prerequisites

- An active chatbot configured and published on WhatsApp
- An updated STT provider configuration
- Multilingual languages enabled (as per requirements)
- The AI Intent Assist feature enabled

Channels supported

- Web
- WhatsApp

How to enable

1. Enable the voice note feature for the bot from the bot admin console. If you don't have access, raise a request with your account manager / project manager; they will enable the voice note toggle from the admin console (limited to internal users).
2. Select the preferred STT provider.
3. Configure the required STT settings.
4. Important: go to Bot Settings > Solution Settings and enable AI Intent Assist.
5. Save and publish the bot configuration.

STT provider

Select the speech-to-text provider used for audio transcription. We support:

- Sarvam (best for latency and regional languages, but average for international languages)
- Whisper (best for international languages, but average for regional languages and latency)
- Gemini (better across all languages and cost-efficient, but high latency)

Failure message
Configure the message displayed when a voice note cannot be processed due to transcription failure, unsupported audio, or other processing issues.

Maximum audio duration

Voice notes longer than 1 minute are not supported and will trigger the configured failure message.

How does it work?

1. The user sends a voice note on WhatsApp.
2. The WhatsApp audio is received by our chatbot system and sent to the configured STT provider.
3. The STT provider detects the language and transcribes the audio into text.
4. The chatbot processes the transcript as a regular text message.
5. NLP attempts to identify the intent.
6. If NLP cannot identify the intent, AI Intent Assist attempts to map the query to the closest matching intent using AI.
7. The bot continues the conversation using the identified intent and configured flow.

Example

- User voice note: "Mujhe meri policy renew karni hai" (Hindi for "I want to renew my policy")
- Transcribed text: "Mujhe meri policy renew karni hai"
- Detected intent (via AI Intent Assist): Renew policy
- Bot response: The bot initiates the policy renewal journey.

Known limitations

- Background noise and poor audio quality may reduce transcription accuracy.
- AI Intent Assist must be enabled for the feature to work properly.
- Mixed-language inputs (e.g., Hinglish) are supported on a best-effort basis, and accuracy may be lower.
- Voice notes longer than 1 minute are not supported.
- Structured inputs such as PAN, Aadhaar, policy numbers, OTPs, and email addresses may not transcribe accurately and should be collected via text. For example, the LLM may transcribe "9930…" as "nine nine three zero…".
- Long conversational voice notes may reduce intent detection accuracy.
- Multiple requests within a single voice note may result in failure.
- Language detection and transcription quality depend on the configured STT provider.
- AI Intent Assist helps identify new or conversational intents, but may not assist during structured data collection steps within an active flow.
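
For readers who want a mental model, the flow described under "How does it work?" can be sketched roughly as below. This is an illustrative sketch only, not the platform's actual implementation: `VoiceNote`, `BotReply`, `handle_voice_note`, and the three callables (`transcribe`, `nlp_intent`, `ai_intent_assist`) are hypothetical names standing in for the STT provider, NLP, and AI Intent Assist. What happens when no intent is found at all is not covered in this article, so the placeholder reply in that branch is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_DURATION_SECONDS = 60  # voice notes longer than 1 minute are not supported


@dataclass
class VoiceNote:
    audio: bytes
    duration_seconds: float


@dataclass
class BotReply:
    intent: Optional[str]
    transcript: Optional[str]
    message: str
    resolved_by: Optional[str] = None  # "nlp", "ai_intent_assist", or None


def handle_voice_note(
    note: VoiceNote,
    transcribe: Callable[[bytes], str],                    # hypothetical STT provider call
    nlp_intent: Callable[[str], Optional[str]],            # hypothetical NLP intent lookup
    ai_intent_assist: Callable[[str], Optional[str]],      # hypothetical AI fallback
    failure_message: str,
) -> BotReply:
    # Over-length audio triggers the configured failure message.
    if note.duration_seconds > MAX_DURATION_SECONDS:
        return BotReply(None, None, failure_message)

    # A transcription error or an empty transcript also yields the failure message.
    try:
        transcript = transcribe(note.audio)
    except Exception:
        return BotReply(None, None, failure_message)
    if not transcript.strip():
        return BotReply(None, None, failure_message)

    # NLP is tried first; AI Intent Assist is consulted only if NLP finds nothing.
    intent = nlp_intent(transcript)
    resolved_by = "nlp"
    if intent is None:
        intent = ai_intent_assist(transcript)
        resolved_by = "ai_intent_assist"

    if intent is None:
        # Assumption: the bot's usual text fallback applies here.
        return BotReply(None, transcript, "(bot's regular fallback response)")

    return BotReply(intent, transcript, f"Starting flow for intent: {intent}", resolved_by)


# Usage with stub callables that mirror the example in this article.
reply = handle_voice_note(
    VoiceNote(audio=b"...", duration_seconds=8),
    transcribe=lambda audio: "mujhe meri policy renew karni hai",
    nlp_intent=lambda text: None,              # NLP does not recognise the query
    ai_intent_assist=lambda text: "renew_policy",
    failure_message="Sorry, I couldn't process that voice note.",
)
print(reply.intent, reply.resolved_by)
```

With these stubs, NLP returns nothing and the intent is resolved by the AI fallback, matching the renewal example. Passing the three stages in as callables keeps the sketch independent of any particular STT provider.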
