- Application code
- Web server
- VoiceXML browser
- VoiceXML platform
- Telephony hardware, VOIP stack
SpeakRight lives in application code layer, typically in a servlet. The SpeakRight runtime dynamically generates VoiceXML pages, one per HTTP request. Between requests, the runtime is stateless, in the same sense of a "stateless bean". State is saved in the servlet session, and restored on each HTTP request.
The SpeakRight framework is a set of Java classes specifically designed for writing speech rec applications. Although VoiceXML uses a similar web architecture as HTML, the needs of a speech app are very different (see Why Speech is Hard TBD).
SpeakRight has a Model-View-Controller architecture (MVC) similar to GUI frameworks. In GUIs, a control represents the view and controller. Controls can be combined using nesting to produce larger GUI elements. In SpeakRight, a flow object represents the view and controller. Flow objects can be combined using nesting to produce larger GUI elements. Flow objects can be customized by setting their properties (getter/setter methods), and extended through inheritance and extension points. For instance, the confirmation strategy used by a flow object is represented by another flow object. Various types of confirmation can be plugged-in.
Flow objects contain sub-flow objects. The application is simply the top-level flow object.
Flow objects implement the IFlow interface. The basics of this interface are
IFlow getFirst();getFirst returns the first flow object to be run. A flow object with sub-flows would return its first sub-flow object. A leaf object (one with no sub-flows) returns itself. (See also Optional Sub-Flow Objects)
IFlow getNext(IFlow current, SRResults results);
void execute(ExecutionContext context);
getNext returns the next flow object to be run. It is passed the results of the previous flow object to help it decide. The results contain user input and other events sent by the VoiceXML platform.
In the execute method, the flow object renders itself into a VoiceXML page. (see also StringTemplate template engine).
Execution uses a flow stack. An application starts by pushing the application flow object (the outer-most flow object) onto the stack. Pushing a flow object is known as activation. If the application object's getFirst returns a sub-flow then the sub-flow is pushed onto the stack. This process continues until a leaf object is encountered. At this point all the flow objects on the stack are considered "active". Now the runtime executes the top-most stack object, calling its execute method. The rendered content (a VoiceXML page) is sent to the VoiceXML platform.
When the results of the VoiceXML page are returned, the runtime gives them to the top-most flow object in the stack, by calling its getNext method. This method can do one of three things:
- return null to indicate it has finished. A finished flow object is popped off the stack and the next flow-object is executed.
- return itself to indicate it wants to execute again.
- return a sub-flow, which is activated (pushed onto the stack).\