Gateway to the IBM Speech To Text (STT) cloud service > com.ibm.streamsx.sttgateway 2.2.3 > com.ibm.streamsx.sttgateway.watson > IBMVoiceGatewaySource
The IBMVoiceGatewaySource is a server-based operator that hosts a C++ WebSocket server. It is designed to ingest speech data from the IBM Voice Gateway product, version 1.0.3.0 or above. The speech data arrives in binary format over the WebSocket interface, in multiple fragments, directly from an ongoing live voice call. This operator can receive speech data from multiple calls that happen at the same time between different pairs of speakers.
For every voice call it handles in real time, the IBM Voice Gateway opens two WebSocket connections into this operator and starts sending the live speech data on both of them. One connection carries the speech data of the agent and the other carries the speech data of the customer. This operator forwards the audio chunks received on those two WebSocket connections via its output stream for consumption by the downstream operators. At the end of any given call, the IBM Voice Gateway closes the two WebSocket connections it opened into this operator.
This operator can be configured with an optional WebSocket server port number. If the user doesn't specify one, the default port number of 443 is used.
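The port defaults can be summarized with a small sketch. This is only an illustration in Python, not the operator's C++ implementation: the function name `listening_endpoints` is hypothetical, the argument names mirror the tlsPort, nonTlsEndpointNeeded and nonTlsPort parameters documented on this page, and it assumes that the TLS endpoint is always opened while the plain endpoint is opened only on request.

```python
def listening_endpoints(tls_port=443, non_tls_endpoint_needed=False, non_tls_port=80):
    """Return the (scheme, port) pairs that would be served for the given settings.

    Hypothetical helper: defaults follow the documented parameter defaults;
    the "always serve TLS" behavior is an assumption of this sketch.
    """
    endpoints = [("wss", tls_port)]  # wss = WebSocket over TLS
    if non_tls_endpoint_needed:
        endpoints.append(("ws", non_tls_port))  # ws = plain WebSocket
    return endpoints
```

Called with no arguments, the sketch yields a single TLS endpoint on port 443. Enabling the plain endpoint adds a second entry on port 80 unless another port is given.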
See the samples folder inside this toolkit for two real-world working examples (VoiceGatewayToStreamsToWatsonS2T and VoiceGatewayToStreamsToWatsonSTT) that show how to use this operator. Those examples show how to ingest real-time voice call speech data, feed it to an IBM Watson Speech To Text engine for transcription, and then distribute the transcribed data in different ways for further analytics. In addition, they show how to do live voice call recording and how to replay pre-recorded calls. Many other vendors provide proprietary black-box solutions for call recording at a hefty price tag, with either no call replay facility or only a minimal one. This toolkit provides both features for free, in an open and flexible manner. Customers control where the recorded data gets stored, in a standard Mu-Law format, and can access and use that data for their other purposes. Combined, the IBM Voice Gateway, IBM Streams and IBM Watson Speech To Text offerings put the customer in the driver's seat to gather real-time intelligence from their voice infrastructure.
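As an illustration of reusing the recorded data elsewhere, the following Python sketch decodes raw Mu-Law bytes into 16-bit linear PCM using the standard G.711 expansion and wraps the result in a WAV container. It makes assumptions that this page does not state: the recording is taken to be headerless, single-channel Mu-Law sampled at 8000 Hz, and the function names are hypothetical.

```python
import io
import wave


def ulaw_byte_to_pcm16(u):
    """Decode one G.711 Mu-Law byte into a signed 16-bit linear sample."""
    u = ~u & 0xFF                      # Mu-Law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_wav_bytes(ulaw_data, sample_rate=8000):
    """Convert raw Mu-Law bytes into an in-memory mono 16-bit PCM WAV file.

    The 8000 Hz default is an assumption of this sketch.
    """
    pcm = bytearray()
    for b in ulaw_data:
        pcm += ulaw_byte_to_pcm16(b).to_bytes(2, "little", signed=True)
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)              # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(bytes(pcm))
    return out.getvalue()
```

The returned bytes can be written to a file with a .wav extension and opened in common audio tools, provided the assumed sample rate matches the recording.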
For detailed documentation about the requirements, operator design, usage patterns and in-depth technical details, please refer to the official STT Gateway toolkit documentation available at this URL:
The default function for output attributes. This function assigns the output attribute to the value of the input attribute with the same name.
Returns an rstring value indicating the IBM Voice Gateway session id that corresponds to the current output tuple.
Returns a boolean value to indicate if this is a customer's speech data or not.
Returns an int32 value indicating the total number of output tuples emitted so far for the given channel in an IBM Voice Gateway session id.
Returns an int32 value indicating the total number of speech data bytes received so far for the given channel in an IBM Voice Gateway session id.
Returns an int32 value indicating the voice channel number in which the speech data bytes were received for an IBM Voice Gateway session id.
Returns an rstring value with details about the agent's phone number.
Returns an rstring value with details about the caller's phone number.
Returns an rstring value with the call start date and time, i.e. the system clock time.
This port produces the output tuples that carry the binary speech data received from the IBM Voice Gateway. The schema for this port must have its first attribute named speech, with a blob data type, to hold the speech data. The remaining attributes can be of any type based on the needs of the application. The speech data sent in these output tuples can represent multiple fragments of a full conversation happening in a live voice call. This operator can send out speech data from multiple calls that happen at the same time between different pairs of speakers. The IBM Voice Gateway always sends the speech data in two voice channels: one channel carries the speech data of a customer and the other carries the speech data of an agent. Use the custom output functions provided by this operator to query the voice call metadata and assign it to other optional attributes in this output port.
There are multiple available output functions, and output attributes can also be assigned values with any SPL expression that evaluates to the proper type.
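To illustrate how a downstream consumer might use such metadata attributes, the Python sketch below reassembles the fragments of each speaker by keying on the session id and the voice channel number. The `SpeechTuple` class and its field names are hypothetical stand-ins for an application-defined output schema. Only the leading speech attribute is mandated by this operator.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SpeechTuple:
    """Hypothetical mirror of one output tuple; only `speech` is required by the port."""
    speech: bytes
    vgw_session_id: str
    is_customer_speech_data: bool
    vgw_voice_channel_number: int


def group_by_call_and_channel(tuples):
    """Concatenate speech fragments per (session id, channel) in arrival order."""
    buffers = defaultdict(bytearray)
    for t in tuples:
        buffers[(t.vgw_session_id, t.vgw_voice_channel_number)] += t.speech
    return {key: bytes(buf) for key, buf in buffers.items()}
```

Because fragments from concurrent calls are interleaved on the same stream, keying on both values keeps the agent and customer audio of every call separate.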
This port produces periodic output tuples that indicate the end of a specific speaker (i.e. channel) in a voice call that was in progress moments ago for the given IBM Voice Gateway session id. The schema for this port must have these three attributes with their correct data types as shown here:
rstring vgwSessionId, boolean isCustomerSpeechData, int32 vgwVoiceChannelNumber
This source operator sets the appropriate values for these attributes to indicate which particular speaker (i.e. voice channel number) of a given voice call (i.e. session id) just ended the conversation. The isCustomerSpeechData attribute tells whether the recently ended voice channel carried the speech data of a customer or an agent. Downstream operators can make use of this "End Of Voice Call" signal as they see fit.
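Since the signal is emitted per voice channel, a downstream consumer that needs to know when a whole call is over has to combine the signals of both channels. The Python sketch below shows one way to do that. The class and method names are hypothetical, and it relies on the statement above that every call uses exactly two voice channels.

```python
class EndOfCallTracker:
    """Tracks per-channel end signals and reports when a whole call has ended."""

    CHANNELS_PER_CALL = 2  # one agent channel and one customer channel

    def __init__(self):
        self._ended_channels = {}  # session id -> set of ended channel numbers

    def on_end_signal(self, vgw_session_id, vgw_voice_channel_number):
        """Record one end signal; return True once both channels of the call ended."""
        ended = self._ended_channels.setdefault(vgw_session_id, set())
        ended.add(vgw_voice_channel_number)
        if len(ended) >= self.CHANNELS_PER_CALL:
            del self._ended_channels[vgw_session_id]  # release per-call state
            return True
        return False
```

A repeated signal for the same channel does not complete the call, because channel numbers are kept in a set.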
Optional: certificateFileName, initDelay, ipv6Available, nonTlsEndpointNeeded, nonTlsPort, tlsPort, vgwLiveMetricsUpdateNeeded, vgwSessionLoggingNeeded, vgwStaleSessionPurgeInterval, websocketLoggingNeeded
certificateFileName: This parameter specifies the full path of the WebSocket server PEM certificate file. The default is to read ws-server.pem from the etc sub-directory of the application.
initDelay: This parameter specifies a one-time delay, in seconds, for which this source operator waits before it starts generating its first tuple. The default delay is 0.0.
ipv6Available: This parameter indicates whether the IPv6 protocol stack is available on the Linux machine where the IBMVoiceGatewaySource operator is running. (Default is true)
nonTlsEndpointNeeded: This parameter specifies whether a plain (non-TLS) WebSocket endpoint is needed. (Default is false)
nonTlsPort: This parameter specifies the plain (non-TLS) WebSocket port number. The default port number is 80.
tlsPort: This parameter specifies the WebSocket TLS port number. The default port number is 443.
vgwLiveMetricsUpdateNeeded: This parameter specifies whether live updates of this operator's custom metrics are needed. (Default is true)
vgwSessionLoggingNeeded: This parameter specifies whether logging is needed while an IBM Voice Gateway session is in progress with this operator. (Default is false)
vgwStaleSessionPurgeInterval: This parameter specifies the periodic time interval, in seconds, at which any stale Voice Gateway sessions are purged to free up memory. (Default is 3*60*60 seconds)
websocketLoggingNeeded: This parameter specifies whether logging is needed from the WebSocket library. (Default is false)
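The idea behind a periodic stale-session purge can be sketched as follows. This Python illustration is not the operator's actual logic: the function name is hypothetical, and it assumes that a session counts as stale when it has shown no activity for at least one purge interval, a definition this page does not give.

```python
def purge_stale_sessions(last_activity, now, purge_interval_secs):
    """Drop sessions idle for at least purge_interval_secs.

    last_activity maps a session id to the time (in seconds) of its latest
    activity. The dict is modified in place and the ids of the purged
    sessions are returned. The staleness rule is an assumption of this sketch.
    """
    stale = [sid for sid, seen in last_activity.items()
             if now - seen >= purge_interval_secs]
    for sid in stale:
        del last_activity[sid]
    return stale
```

Running such a check once per interval bounds the memory held for calls whose connections were never closed cleanly.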
Non-TLS port number configured for this operator.
Indicates whether the user configured this operator to exchange data via a non-TLS port.
Number of output tuples sent by this operator instance.
NOTE: This metric is only updated if parameter vgwLiveMetricsUpdateNeeded is true.
Total number of speech data bytes received by this operator instance.
NOTE: This metric is only updated if parameter vgwLiveMetricsUpdateNeeded is true.
TLS port number configured for this operator.
Number of voice calls processed by this operator instance.
NOTE: This metric is only updated if parameter vgwLiveMetricsUpdateNeeded is true.