## Introduction

You might want to create a routing application which allows your callers to use speech recognition to perform various tasks, such as telling the system what they want or who they want to be connected to. There are three ways of doing this with the jtel system.

## Simple Word or Phrase Spotting

In this mode, the caller can utter a sentence, and the system will try to extract words or phrases which correspond to a particular action in the call flow. For example, suppose the caller utters "I have a problem with my system and would like to speak to someone from technical support". You could then define the words "problem, support" as the keywords which will take the caller to this action. Call flows like this can be built using the object Input Menu DTMF ASR.
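The matching logic behind word spotting can be pictured as a simple keyword lookup. The following is only an illustrative sketch, not jtel code; the route names and keyword lists (`ROUTES`, `route_transcript`) are assumptions for the example:

```python
# Hypothetical word/phrase spotting: map keywords to call-flow actions
# and route the caller based on the first match in the transcript.
ROUTES = {
    "technical_support": ["problem", "support", "broken"],
    "sales": ["buy", "order", "price"],
}

def route_transcript(transcript):
    """Return the first route whose keywords appear in the transcript, else None."""
    text = transcript.lower()
    for route, keywords in ROUTES.items():
        if any(kw in text for kw in keywords):
            return route
    return None

print(route_transcript("I have a problem with my system"))  # technical_support
```

In a real call flow, the transcript would come from the ASR engine and the matched route would select the next object in the flow.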
## Entity Extraction

To complete an automated process, you might need several pieces of information from the caller. For example, for a newspaper delivery complaint, you might need:

- the caller's customer number
- the date on which the delivery was missed
- the action the caller wants taken (for example, redelivery)
We are sure you have seen plenty of voice bot demos where the salesperson demonstrates the scenario above by saying something like: "My customer number is 12345678 and I didn't get my newspaper on Monday morning and I want it to be delivered again". Nice... and yes, you can build those kinds of applications with jtel. But it's a far cry from how human beings actually interact with automated systems. Try it out: just read the sentence above aloud. How likely is it that you would actually say that? More likely, callers will say something like "I didn't get my newspaper", and you will have to continue from that point.

Also bear in mind that your process will not work unless you get all three pieces of information required by your backend REST API or interface, in the right format to pass to the API or interface concerned. Using LLMs will not give you a 100% result here, so be aware that if you go for the LLM approach, you will sometimes get unpredictable results. Once you know what the caller wants, it is often better to use a step-by-step call flow, leading the caller through the inputs you require until you have them all and can execute the process on the backend. Call flows like this are built using the Input ASR object and specific extractors for the type of information you want at a particular point in the call flow.

## LLM-Based Call Flows

The third way of producing a voice bot is to use speech recognition in combination with an LLM and crafted prompts, which instruct the LLM how to query the caller for the information required to complete a process. Taking our example above, you might build an LLM prompt something like this:
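The exact wording of such a prompt is up to you; the following is only an illustrative sketch for the newspaper example, and the field names and phrasing are assumptions, not part of the jtel system:

```
You are a voice assistant for a newspaper delivery service.
Ask the caller, one question at a time, for:
  1. their customer number,
  2. the date on which the delivery was missed,
  3. what they would like done (for example, redelivery).
When you have all three answers, reply ONLY with a JSON object:
  {"customer_number": "...", "missed_date": "...", "requested_action": "..."}
Until then, reply with your next question in plain language.
```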
Call flows like this are built using the Input ASR object and the "Any Text" extractor. The results are passed to the LLM (including any previous interactions from the conversation) for processing. Once you detect a JSON structure containing all of the information you need, you can complete the process. Or, at any point in the process, you can break out of the LLM-based approach and return to entity extraction.
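Detecting a complete JSON structure in the LLM's reply can be sketched as follows. This is a minimal illustration, not jtel code; the function name and the required field names are assumptions matching the newspaper example:

```python
import json

# Hypothetical set of fields the backend process needs before it can run.
REQUIRED_FIELDS = {"customer_number", "missed_date", "requested_action"}

def extract_complete_json(reply):
    """Return the JSON object embedded in an LLM reply if it contains
    every required field; otherwise return None (keep the dialogue going)."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        return None  # no JSON object in this reply
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None  # braces present, but not valid JSON
    if isinstance(data, dict) and REQUIRED_FIELDS <= data.keys():
        return data  # all fields present: complete the process
    return None
```

A reply like a plain question ("What is your customer number?") yields `None` and the conversation continues; once the LLM emits the full JSON object, the extracted dictionary can be handed to the backend.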