IBM Speech Markup Language: SpeechML




Ceponkus, Hoodbhoy - Applied XML - Toolkit for Programmers

IBM’s alphaWorks (a Web site devoted to introducing IBM’s research work to developers) 
introduced SpeechML in February 1999. SpeechML provides a framework in which 
Web-based applications can integrate interactive speech capabilities.
SpeechML provides tags for defining spoken output and input as well as tags for 
triggering actions to be taken on a given spoken input (such as “open window” or “copy to 
disk”). SpeechML elements are identified and linked by URLs in an attempt to keep 
everything familiar to HTML developers. It builds on Java Speech Markup Language 
(JSML) for spoken output (that is, text to speech) and uses a combination of markup and 
Java Speech Grammar Format (JSGF) to define the spoken input (that is, speech to 
text).
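To make the idea concrete, here is a minimal sketch in Python of how markup can declare both spoken output and the spoken commands that trigger actions. The element and attribute names (page, say, listen, phrase, action) are invented for illustration and are not the actual SpeechML vocabulary; the sketch also assumes that a speech recognizer has already turned the user's utterance into text.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these tag and attribute names are made up for this
# sketch and are NOT the real SpeechML vocabulary.
PAGE = """
<page>
  <say>Welcome. Say a command.</say>
  <listen>
    <phrase action="open_window">open window</phrase>
    <phrase action="copy_to_disk">copy to disk</phrase>
  </listen>
</page>
"""


def load_page(xml_text):
    """Return (prompts, commands) extracted from the markup.

    prompts  -- list of strings a text-to-speech engine would read aloud
    commands -- dict mapping a spoken phrase to the name of an action
    """
    root = ET.fromstring(xml_text.strip())
    prompts = [el.text.strip() for el in root.iter("say")]
    commands = {
        el.text.strip().lower(): el.get("action")
        for el in root.iter("phrase")
    }
    return prompts, commands


def dispatch(commands, recognized_text):
    """Map recognized speech (already converted to text) to an action name.

    Returns None when the phrase is not one the page declared.
    """
    return commands.get(recognized_text.strip().lower())


prompts, commands = load_page(PAGE)
```

A real conversational browser would pass the prompts to a speech synthesizer and feed recognizer results to something like dispatch; here both ends are left out so that only the role of the markup is shown.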
IBM would like developers to be able to use tags to add interactive speech capability to 
Web sites without being experts in speech technology. Just as you’d mark up a 
paragraph with a tag to make its contents bold, using a vocabulary of tags defined in 
SpeechML, you’d mark up sections of text with particular tags to make them audible.
The alphaWorks Web site includes a downloadable conversational browser that is written 
in Java and builds on Java’s Speech API and IBM’s XML parser.
This is exciting stuff. The realm of possibilities is virtually endless here. Imagine Web 
sites that give you their content without your having to look at the screen. You could 
multitask like never before: listen to information from one page and type a 
report about it as you listen.
In and of itself, text-to-speech capability built into a browser is pretty cool; however, other 
companies have tried the same. SpeechML adds the extra component of accepting voice 
input from users, which makes things really interesting. Imagine being able to talk to your 
browser, to tell it how to navigate for information, and to have it speak back to you as you 
surf the net. Things can get pretty wild.
The SpeechML implementation, as it currently exists, works only through an aural 
browser (called the conversational browser), which provides the actual text-to-speech 
functionality. For it to be a successful technology, SpeechML needs to be approved by 
the W3C. IBM intends to formally propose SpeechML to the W3C.


The spin-offs of voice markup languages are really cool. Once a voice markup language is 
approved as a standard (we don’t see any reason for it not to be; if not SpeechML, some 
similar standard will be approved), implementations of its interpreters will extend far 
beyond the conventional PC browser. Imagine browsing the Web through your telephone, 
having driving instructions read to you as you drive, or having your 
refrigerator and other home appliances talk to you. Granted, now we’re bordering on 
being really geeky, but these are real possibilities, and extremely 
exciting ones.
To be fair, we should mention that at least two other similar voice applications of XML 
that we know about are in the pipeline: Motorola’s VoxML and AT&T’s VXML. All of these 
have similar intentions and all are XML-based. Our one fear is that a standards war may 
break out. But even if it does, once the dust settles a single standard will emerge, and 
developers will be dreaming up hundreds of thousands of applications for it.
Figure 2.16 shows what voice markup languages look like from a systems-level 
perspective.
