Flow and Algorithm
This document describes the core algorithm behind Answering Machine Detection (AMD) on the Asterisk communication platform. Different AMD configurations exist, but the underlying algorithm is the same in all of them.

Flow

AMD is a standalone application that communicates with Asterisk, a communication platform, over a local socket. The interaction between AMD and Asterisk works as follows.

AMD Setup and Configuration

Loading configurable parameters: When AMD starts, it reads its configurable parameters from the file "/dacx/var/ameyo/dacxdata/etc/amds/1.6/amd.conf". This path is hardcoded in the AMD code, so it cannot be changed once AMD is compiled.

From Asterisk's Perspective

Socket communication: When Asterisk encounters an AMD extension, it opens a socket connection to AMD using the predefined path.

Sending frames: Once the socket connection is established, Asterisk sends an Asterisk frame from the channel to AMD.

AMD's response: AMD processes the received frame and responds to Asterisk with one of the following outcomes:
- AM (answering machine): AMD identifies the call recipient as an answering machine.
- HUMAN: AMD detects that a live human has answered the call.
- CONTINUE: The result is not yet clear, so AMD requests additional frames from Asterisk to continue analyzing the call.

Extension handling: Asterisk acts on the AMD response. If it receives AM or HUMAN, it moves on to the next extension. If it receives CONTINUE, it sends more frames to AMD until a conclusive response (AM or HUMAN) is received or a timeout occurs.

From AMD's Perspective

Socket listening: When AMD starts, it creates a local socket at the path "/dacx/var/ameyo/dacxdata/var/amds/1.6/amd.ctl". Like the configuration path, this socket path is hardcoded and cannot be changed.

Thread creation: AMD creates a new thread for each request it receives from Asterisk. It can therefore handle multiple requests concurrently, running as many threads as there are detection requests.

Answering machine detection: AMD then starts its answering machine detection process for each connected client (Asterisk).

Polling and conclusion: During detection, AMD polls the incoming frames from the socket until it concludes whether the call recipient is an answering machine or a human.

Timeout handling: If AMD cannot make a clear determination within the timeout, it assumes a human has answered the call and responds accordingly.

[Flow diagram]

Answering Machine Detection (AMD) Algorithm

Fetching and computing data: AMD fetches each unit of audio data (packet) from the socket and calculates its energy level. If it cannot confidently determine the result from the data fetched so far, it reads more data from the socket.

Energy level thresholds: AMD divides energy levels into three regions: noise, signal, and intermediate.
- If the energy of a data unit is below the noise threshold, AMD classifies it as noise.
- If the energy is above the signal threshold, AMD classifies it as signal (i.e., not noise).
- If the energy lies between the noise and signal thresholds, the data is temporarily categorized as "intermediate".

Circular buffer for packet states: For every packet, AMD stores the computed state (noise, signal, or intermediate) in a circular buffer. The buffer lets AMD detect a continuous block of high-energy packets, referred to as an "utterance" (akin to a single word).

Detecting utterances: After processing each packet, AMD checks the circular buffer to determine whether an utterance has started.

Determining utterance boundaries: Once an utterance has started, AMD looks for a low-energy packet to mark its end. The utterance start and end times are stored in AMD's data structure.

Smoothing noise and signals: After processing each packet, AMD smooths the noise and signal states to get more accurate results. Smoothing takes the minimum utterance duration into account, which improves word recognition.

Decision making: With the utterance start and end times, AMD makes decisions based on specific criteria. Examples include:
- If AMD detects too many words, it returns "answering machine".
- If the number of words per given time interval is high, it returns "answering machine".
- If the length of a word exceeds the maximum allowed threshold, it returns "answering machine".
- If the gap between the end of the first word and the start of the second word is greater than the "hello silence threshold", it returns "human".
- If the first word starts very late, it returns "answering machine".
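
The energy classification and utterance detection steps can be illustrated with a short Python sketch. It is not AMD's actual code. The threshold values, the 20 ms packet length, the use of mean squared amplitude as "energy", and the rule that an utterance starts after three consecutive signal packets are all assumptions made for illustration. Smoothing is left out.

```python
from collections import deque
from enum import Enum


class State(Enum):
    NOISE = "noise"
    INTERMEDIATE = "intermediate"
    SIGNAL = "signal"


# Hypothetical thresholds; the real values come from AMD's configuration.
NOISE_THRESHOLD = 100.0
SIGNAL_THRESHOLD = 1000.0


def packet_energy(samples):
    """Mean squared amplitude of one packet (an assumed energy measure)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def classify(energy, noise=NOISE_THRESHOLD, signal=SIGNAL_THRESHOLD):
    """Map an energy value to one of the three regions."""
    if energy < noise:
        return State.NOISE
    if energy > signal:
        return State.SIGNAL
    return State.INTERMEDIATE


def find_utterances(states, packet_ms=20, window=3):
    """Return (start_ms, end_ms) pairs for completed utterances.

    Assumed rules: an utterance starts once the circular buffer holds
    `window` consecutive SIGNAL states, and ends at the next NOISE packet.
    INTERMEDIATE packets neither start nor end an utterance here.
    """
    recent = deque(maxlen=window)  # circular buffer of packet states
    utterances = []
    start = None
    for i, state in enumerate(states):
        recent.append(state)
        if start is None:
            if len(recent) == window and all(s is State.SIGNAL for s in recent):
                start = (i - window + 1) * packet_ms
        elif state is State.NOISE:
            utterances.append((start, i * packet_ms))
            start = None
            recent.clear()
    return utterances
```

An utterance that is still open when the input ends is not reported, which matches the idea that AMD needs more packets before it can close a word.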
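
The decision criteria listed above can be written as a small rule function over a list of `(start_ms, end_ms)` utterances. This is a sketch only. The `Limits` fields, their default values, the way "words per time interval" is measured, and the order in which the rules are checked are all hypothetical and not taken from AMD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    # All values are illustrative assumptions, in milliseconds or counts.
    max_words: int = 5
    rate_window_ms: int = 1000
    max_words_per_window: int = 3
    max_word_ms: int = 2000
    hello_silence_ms: int = 1000
    max_first_word_start_ms: int = 2500


def decide(utterances, limits=Limits()):
    """Return "AM", "HUMAN", or "CONTINUE" for the utterances seen so far."""
    if not utterances:
        return "CONTINUE"
    starts = [start for start, _ in utterances]

    # The first word starts very late.
    if starts[0] > limits.max_first_word_start_ms:
        return "AM"
    # Too many words overall.
    if len(utterances) > limits.max_words:
        return "AM"
    # Too many words starting inside any one time window.
    for s in starts:
        in_window = sum(1 for t in starts if s <= t < s + limits.rate_window_ms)
        if in_window > limits.max_words_per_window:
            return "AM"
    # A single word longer than the allowed maximum.
    if any(end - start > limits.max_word_ms for start, end in utterances):
        return "AM"
    # Long pause after the first word ("hello", then silence).
    if len(utterances) >= 2:
        gap = utterances[1][0] - utterances[0][1]
        if gap > limits.hello_silence_ms:
            return "HUMAN"
    return "CONTINUE"
```

For example, `decide([(200, 600), (2000, 2400)])` yields `"HUMAN"` because the 1400 ms pause after the first word exceeds the assumed 1000 ms hello silence threshold.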
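
The polling and timeout behaviour described in the Flow section can be sketched in the same way. The function below stands in for one AMD worker thread. `frames`, `analyze`, and `timeout_s` are hypothetical names, and the real implementation reads frames from the local socket instead of an in-memory iterable. Treating a closed connection like a timeout is also an assumption.

```python
import time


def detect(frames, analyze, timeout_s, clock=time.monotonic):
    """Poll frames until a verdict is reached or the timeout expires.

    `analyze(frame)` is assumed to return "AM", "HUMAN", or "CONTINUE".
    On timeout, or when the frames run out, the result defaults to "HUMAN".
    """
    deadline = clock() + timeout_s
    for frame in frames:
        if clock() >= deadline:
            break  # no clear determination within the timeout
        verdict = analyze(frame)
        if verdict in ("AM", "HUMAN"):
            return verdict
        # "CONTINUE": keep reading frames
    return "HUMAN"
```

Passing the clock as a parameter keeps the sketch easy to test with a fake time source. In production code the default `time.monotonic` would be used.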