IBM Speech Markup Language: SpeechML

Ceponkus, Hoodbhoy - Applied XML - Toolkit for Programmers

IBM’s alphaWorks (a Web site devoted to introducing IBM’s research work to developers)
introduced SpeechML in February 1999. SpeechML provides a framework in which Web-
based applications can integrate interactive speech capabilities.
SpeechML provides tags for defining spoken output and input as well as tags for
triggering actions to be taken on a given spoken input (such as “open window” or “copy to
disk”). SpeechML elements are identified and linked by URLs in an attempt to keep
everything familiar to HTML developers. It builds on Java Speech Markup Language
(JSML) for spoken output (that is, text to speech) and uses a combination of markup and
Java Speech Grammar Format (JSGF) to define the spoken input (that is, speech to
text).
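To make the idea of "a grammar for spoken input plus actions triggered by it" concrete, here is a minimal Python sketch. It is not SpeechML markup (this section does not show SpeechML's actual tags) and it does no speech recognition. It only assumes a small JSGF grammar listing the two example phrases, and maps already-recognized text to made-up actions. The grammar name `demo`, the rule name `<command>`, and the helper names `phrases_from_rule` and `handle_utterance` are all hypothetical.

```python
import re

# A tiny JSGF grammar: one public rule listing the accepted phrases.
# The grammar name "demo" and rule name <command> are invented for illustration.
GRAMMAR = """#JSGF V1.0;
grammar demo;
public <command> = open window | copy to disk;
"""


def phrases_from_rule(grammar, rule):
    """Return the alternatives of a flat `public <rule> = a | b;` definition.

    Handles plain alternation only, not full JSGF syntax
    (no grouping, optional parts, weights, or rule references).
    """
    pattern = r"public\s+<%s>\s*=\s*([^;]*);" % re.escape(rule)
    match = re.search(pattern, grammar)
    if match is None:
        raise ValueError("rule <%s> not found" % rule)
    # Split on "|" and collapse whitespace inside each alternative.
    return [" ".join(alt.split()) for alt in match.group(1).split("|")]


# Hypothetical actions; a real browser would open a window or copy a file.
ACTIONS = {
    "open window": lambda: "window opened",
    "copy to disk": lambda: "copied to disk",
}


def handle_utterance(text):
    """Dispatch recognized text to its action, or report that it was not understood."""
    phrase = " ".join(text.lower().split())
    if phrase in phrases_from_rule(GRAMMAR, "command") and phrase in ACTIONS:
        return ACTIONS[phrase]()
    return "not understood"
```

In a real system the recognizer would use the grammar to constrain what it listens for, and the markup would bind each accepted phrase to an action. Here that binding is just a dictionary lookup.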
IBM would like developers to be able to use tags to add interactive speech capability to
Web sites without being experts in speech technology. Just as you’d mark up a
paragraph with a tag to make its contents bold, using a vocabulary of tags defined in
SpeechML, you’d mark up sections of text with particular tags to make them audible.
The alphaWorks Web site includes a downloadable conversational browser that is written
in Java and builds on Java’s Speech API and IBM’s XML parser.
This is exciting stuff. The realm of possibilities is virtually endless here. Imagine Web
sites that give you their content without your having to look at the screen. You could
multitask like never before: listen to information from one page while you type a
report about it.
In and of itself, text-to-speech capability built into a browser is pretty cool; however, other
companies have tried the same. SpeechML adds the extra component of accepting voice
input from users, which makes things really interesting. Imagine being able to talk to your
browser, to tell it how to navigate for information, and to have it speak back to you as you
surf the net. Things can get pretty wild.
The SpeechML implementation, as it currently exists, works only through an aural
browser (called the conversational browser), which provides the real text-to-speech
functionality. For it to be a successful technology, SpeechML needs to be approved by
the W3C. IBM intends to formally propose SpeechML to the W3C.

The spin-offs of voice markup languages are really cool. Once one is approved as a standard
(we don’t see any reason for it not to be; if not SpeechML, some similar standard will be
approved), implementations of its interpreters will extend far beyond the PC
browser per se. Imagine browsing the Web through your telephone,
having driving instructions read to you as you drive, or having your
refrigerator and other home appliances talk to you. Granted, now we’re bordering on
being really geeky, but it certainly is a possibility.
To be fair, we should mention that at least two other similar voice applications of XML
that we know about are in the pipeline: Motorola’s VoXML and AT&T’s VXML. All of these
have similar intentions and all are XML-based. Our one fear is that a standards war may
emerge. But even if there is one, after the dust settles, one standard will emerge and
developers will be dreaming of hundreds of thousands of applications to use it with.
Figure 2.16 shows what voice markup languages look like from a systems-level
perspective.
