How To Use Speech Recognition in Windows XP


This article describes how to use speech recognition in Windows XP. If you installed speech recognition with Microsoft Office XP, or if you purchased a new computer that has Office XP installed, you can use speech recognition in all Office programs as well as other programs for which it is enabled, such as Microsoft Internet Explorer.

Speech recognition enables the operating system to convert spoken words to written text. An internal driver, called a speech recognition engine, recognizes words and converts them to text. The speech recognition engine may be installed with the operating system or at a later time with other software. During the installation process, speech-enabled packages, such as word processors and web browsers, may install their own engines or use existing engines. Additional engines are also available from third-party manufacturers. These engines often use a specialized vocabulary; for example, a vocabulary of medical or legal terminology. They can also use different voices to allow for regional accents such as British English, or use a different language altogether, such as German, French, or Russian.

You need a microphone or some other sound input device to receive the sound. In general, the microphone should be a high-quality device with built-in noise filters. The speech recognition rate is directly related to the quality of the input: with a poor microphone, the recognition rate is significantly lower and may be unacceptable. The Microsoft Speech Recognition Training Wizard (Voice Training Wizard) guides you through setting up the microphone, recommends the best position for it, and allows you to test it for optimal results.

After the system is installed and working, you must train the engine for your environment and speaking style. To do so, open Speech in Control Panel, click the Speech Recognition tab, click Train Profile, and then follow the instructions in the Voice Training Wizard. The wizard trains the system to recognize background noises such as a fan, air conditioning, or other office sounds. The engine also adapts to your speaking style, including accents, pronunciations, and even idiomatic phrases.

For more detailed information about using Microsoft speech recognition, click the Help button on the Language bar.

For the most up-to-date information about speech recognition developments at Microsoft, refer to the following Microsoft Web site:

How to Use the Microsoft Speech Recognition Engines

The Microsoft speech recognition engine enables you to insert text into a document using specific programs. You can dictate text in any Office XP program, in Internet Explorer, and in Microsoft Outlook Express (versions 5.0 or later). Other software programs may eventually support the Microsoft speech recognition engine. You cannot dictate text in Microsoft Notepad at this time.

NOTE: Speech recognition engines are language-specific. The first three Microsoft speech engines that are available are Simplified Chinese, U.S. English, and Japanese. Engines for other languages will become available.

In addition to being language-specific, some speech engines may be region-specific. For example, the Microsoft English ASR Version 5 engine is intended for speakers of U.S. English. British, Australian, and other non-U.S. English speakers may have difficulty using this engine because of variations in accent.

For additional information about speech recognition engines, click the article number below to view the article in the Microsoft Knowledge Base:

306537 How to Install and Configure Speech Recognition in Windows XP

How to Train the Speech Recognition Engine

When you train the speech recognition engine, the speech recognizer uses the Voice Training Wizard to adapt to the sound of your voice, word pronunciation, accent, speaking manner, and even new or idiomatic words. Training for as little as ten minutes can improve speech recognition. The system also adapts to your speech on an ongoing basis, so recognition accuracy increases over time.

To train the speech recognition engine, follow these steps:
  1. Click Start, click Control Panel, and then double-click Speech.
  2. Click the Speech Recognition tab, and then click the speech recognition engine that you want to use in the Language box.
  3. Click the profile that you want to use in the Recognition Profile group. Training is specific to an engine and profile so that training one engine or profile set has no effect on any other engine or profile set.
  4. Click Train Profile, and then follow the directions in the Voice Training Wizard. Not all engines support training. If your engine does not, Train Profile is unavailable.
NOTE: It is recommended that you spend at least 15 minutes training the computer. The more training you do, the higher your recognition accuracy will be.
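
The point in step 3, that training belongs to one engine and profile pair, can be pictured with a small bookkeeping sketch. This is only an illustration and not how Windows XP stores profiles: the profile names, the `train` and `meets_recommendation` helpers, and the dictionary layout are all hypothetical.

```python
RECOMMENDED_MINUTES = 15  # the recommendation given in the note above

# Hypothetical bookkeeping: minutes of training per (engine, profile) pair.
training_minutes = {}


def train(engine, profile, minutes):
    """Record training time for exactly one engine/profile pair."""
    key = (engine, profile)
    training_minutes[key] = training_minutes.get(key, 0) + minutes


def meets_recommendation(engine, profile):
    """Return True if this pair has reached the recommended training time."""
    return training_minutes.get((engine, profile), 0) >= RECOMMENDED_MINUTES


# Profile names below are made up for the example.
ENGINE = "Microsoft English ASR Version 5 Engine"
train(ENGINE, "Office desk", 10)
train(ENGINE, "Office desk", 5)
train(ENGINE, "Laptop headset", 5)

print(meets_recommendation(ENGINE, "Office desk"))
print(meets_recommendation(ENGINE, "Laptop headset"))
```

Because the key is the pair, time spent on the first profile does nothing for the second, which is why each profile that you use has to be trained separately.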

How to Use the Speech Recognition Engine

NOTE: The steps in this procedure may vary depending on the program in which you are using speech recognition.

  1. Position the microphone so that it is about an inch or a thumb's width to the side of your mouth. Make sure that it is not directly in front of your mouth and that you are not breathing directly into it.

    NOTE: If you inadvertently move the microphone as you speak, remember to bring it back to the correct position.
  2. Start the program in which you want to use speech recognition, and then click the document to place the insertion point into your document. If you open a Help topic while you are working or if a message is displayed on the screen, click the document again to continue using speech recognition.
  3. On the Language bar, click Microphone (if the microphone is not already turned on).

    NOTE: The Language bar displays labels beside each button on the bar by default. To hide or show the text labels, right-click the Language bar, and then click Text Labels.
  4. Switch between Dictation and Voice Command modes as you work.

    NOTE: You can save time if you complete dictation first, review your file, and then format text or make corrections. When you do so, you have to switch between Dictation mode and Voice Command mode less often.

    To change modes:
    • Using Dictation mode: To turn the words you speak into text, click Dictation on the Language bar.

      As you speak, a blue bar is displayed; this means that the computer is processing your voice. As your words are recognized, text is displayed on the screen. You can continue to speak while the computer processes your voice; you do not have to wait until the blue bar disappears to speak again.

      NOTE: While the blue bar is displayed, avoid using your mouse or keyboard to type or take other actions. If you do so, speech recognition is interrupted, and your words are not processed.
    • Using Voice Command mode: To select menu, toolbar, dialog box (U.S. English only), and task pane (U.S. English only) items by voice, click Voice Command on the Language bar. For example, to change the font, say "font" or "font face" to open the Font box on the Formatting toolbar, and then say a font name. To format selected text, say "bold" or "underline."
  5. Click Microphone on the Language bar to turn the microphone off when you are finished speaking to the computer.

The following list describes some of the shortcuts that you can use:
  • You can also switch between Dictation and Voice Command modes by saying "dictation" or "voice command."
  • In Microsoft Word, you can delete the last thing you said in Dictation mode by saying "scratch that."
  • You can turn the microphone on and off by clicking Speech on the Tools menu (in Microsoft Excel, point to Speech on the Tools menu, and then click Speech Recognition).
  • You can also turn the microphone off by saying "microphone."
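
The interaction between the microphone, the two modes, and the spoken shortcuts above can be sketched as a small state machine. This is a toy model for illustration only, not Microsoft's implementation or any speech API: the `SpeechSession` class and its fields are hypothetical, and it simplifies by always treating the mode names as commands and by applying "scratch that" everywhere, although the article describes it only for Microsoft Word.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpeechSession:
    """Toy model of the Language bar state (hypothetical, for illustration)."""
    microphone_on: bool = True
    mode: str = "dictation"  # either "dictation" or "voice command"
    document: List[str] = field(default_factory=list)  # dictated phrases
    commands: List[str] = field(default_factory=list)  # commands sent to the program

    def hear(self, phrase: str) -> None:
        """Handle one recognized phrase according to the current state."""
        if not self.microphone_on:
            return  # nothing is processed while the microphone is off
        spoken = phrase.strip().lower()
        if spoken == "microphone":
            self.microphone_on = False  # spoken shortcut turns the microphone off
        elif spoken in ("dictation", "voice command"):
            self.mode = spoken  # spoken shortcut switches modes
        elif self.mode == "dictation":
            if spoken == "scratch that":
                if self.document:
                    self.document.pop()  # delete the last thing that was said
            else:
                self.document.append(phrase.strip())
        else:
            self.commands.append(spoken)  # for example "bold" or "underline"


session = SpeechSession()
for phrase in ["Dear team", "see you Monday", "scratch that",
               "voice command", "bold", "microphone", "not heard"]:
    session.hear(phrase)

print(session.document)  # ['Dear team']
print(session.commands)  # ['bold']
```

The last phrase is dropped because the microphone was turned off by the phrase before it. This mirrors step 5: the microphone must be turned on again before anything else is recognized.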

Speech Recognition Tips

Speech recognition is not designed for completely hands-free operation; you achieve the best results if you use a combination of your voice and the mouse or keyboard. In addition, use a consistent quality of speech to achieve the best results. When you speak to other people, they usually understand you from the context and environment, even if you whisper, shout, or talk quickly or slowly. However, speech recognition understands words better when you speak in a predictable manner.

  • Speak in a consistent, level tone. If you speak too loudly or too softly, the computer may not recognize what you said.
  • Use a consistent rate without speeding up and slowing down.
  • Speak without pausing between words; a phrase is easier for the computer to interpret than just one word. For example, the computer has a hard time understanding phrases such as "This (pause) is (pause) another (pause) example (pause) sentence."
  • Start by working in a quiet environment so that the computer hears you instead of the sounds around you, and use a good quality microphone. Keep the microphone in the same position; try not to move it around after it is adjusted.
  • Train your computer to recognize your voice by reading aloud the prepared training text in the Voice Training Wizard. Additional training increases speech recognition accuracy.
  • As you dictate, do not be concerned if you do not immediately see your words on the screen. Continue speaking and pause at the end of your thought. The computer displays the recognized text after it finishes processing your voice.
  • Pronounce words clearly, but do not separate each syllable in a word. For example, if you sound out each syllable in "e-nun-ci-ate," the computer may not recognize what you said.


For additional information about using speech in Windows XP, click the article numbers below to view the articles in the Microsoft Knowledge Base:

306537 How to Install and Configure Speech Recognition in Windows XP
306899 How to Use Speech Recognition Profiles in Windows XP
306902 How to Use Text-to-Speech in Windows XP
306993 How To Use the Language Bar in Windows XP
278927 WD2002: Part 1: Speech and Handwriting Recognition Frequently Asked Questions


Article ID: 306901 - Last Review: Sep 23, 2011 - Revision: 1