Speech recognition in the classroom.
What do teachers need to know?

Effective use of speech recognition technology shifts the focus from the physical act of writing to that of expression of thoughts and knowledge. While students require explicit writing instruction and practice, they may also benefit from supports to help circumvent the barriers of writing while they are improving their skills. To maximize student success, technologies must be combined and aligned with effective instruction such as the teaching of explicit strategies for planning for different types of writing.

Over the past few years, speech recognition has improved in ease of use and accuracy. It now offers a viable alternative to a keyboard and gives students another method for demonstrating their understanding through written text.

Benefits to support student learning

Speech recognition can:

  • shift the focus from the physical act of writing to that of expression and organization of thoughts and knowledge
  • enable students to generate written output that better represents their true oral language skills
  • increase written output and legibility
  • allow students to alternate between typing and speaking as needed
  • support students in working independently within grade-level expectations, enabling them to experience and participate in the writing process (planning, composing, revising and editing written work)
  • improve writing products across all subject areas
  • improve endurance and reduce writing fatigue by eliminating the physical act of composing on paper or at a keyboard, and by decreasing the memory demands of spelling
  • decrease anxiety associated with mechanics, organization and editing, and increase learner engagement
  • allow for increased independence in writing
  • provide pronunciation practice in a safe, low-stress environment for students learning English as a second language.

Types of learning tasks supported

Speech recognition supports any task where students are asked to communicate their thinking in written format such as:

  • planning/pre-writing activities
  • composing writing assignments (e.g., essays, research papers, position papers)
  • revising/editing by having the text read back and using that feedback to revise written output
  • collaborative writing using Google Docs, email or social media
  • creating study notes.

Learning contexts

Speech recognition can be used:

  • in any digital writing environment that allows students to input text using a keyboard
  • with most word processing programs or any program where a cursor can be inserted (e.g., email, digital visual organizers, Twitter)
  • individually to support independent writing
  • in learning centres or workstations
  • for writing at home
  • in conjunction with word prediction software.

Students who would particularly benefit

Speech recognition would benefit students:

  • who may not understand the relationship between letters, words, and sounds
  • who are unable to demonstrate competence as writers due to an inability to translate thoughts and ideas to text using tools such as a pencil, keyboard, or word prediction
  • with moderate to severe spelling difficulties
  • with poor or limited fine motor skills, including students with physical disabilities
  • with repetitive strain injuries limiting use of conventional writing tools
  • with experience using voice recorders or other modes of oral transcription to complete writing assignments
  • learning English as another language
  • who previously required scribes
  • who have a preference for this method of writing.

In addition, some students with vision loss may benefit from speech recognition.

Factors for consideration

Factors to consider when using speech recognition:

  • To maximize student success, speech recognition must be combined and aligned with effective instruction to support the writing process.
  • Speech recognition often requires training by the student. Depending on the software, training might involve reading text from the screen or listening to ‘sound chunks’ and repeating the phrases or sentences back to the computer.
  • Some students may find the dictation process frustrating since it places specific cognitive demands on the student’s ability to self-monitor, self-correct, and memorize the dictation commands necessary for effective use.
  • While speech recognition allows students to freely express their thoughts and ideas without the constraints of spelling and handwriting, they still need to plan, organize, and structure their thoughts into coherent phrases and sentences. Students need to be able to move from the informality of typical conversational speech (e.g., use of “um” and other pauses, incomplete phrases and sentences) to dictating grammatically intact sentences.
  • Specific language disorders that may make using speech recognition more challenging include:
    • severe articulation difficulties
    • disfluent speaking (stuttering, mispronunciations)
    • expressive language difficulties such as word finding/word retrieval problems.

Benefits of making speech recognition available to all students

Having speech recognition available to all students:

  • allows students to experiment with speech recognition and determine for themselves if it is useful
  • provides a tool for students who may not have an identified disability but may benefit from additional support
  • removes the stigma for individual students who might otherwise deny themselves the support, especially at the junior and senior high school level.

Personalizing for individual student needs

Personalizing options vary from software to software but typically there are options for:

  • training supports for students with reading difficulties (e.g., offers text as audio in phrases for the student to speak back or an adult might read the training script to the student while the student dictates to complete the voice training requirements)
  • additional voice training for students to improve their dictation accuracy
  • availability in more than one language
  • ‘re-reading’ or listening to what has been written for editing purposes as well as to identify dictation errors through text-to-speech
  • USB Bluetooth wireless headsets to allow for increased mobility.

Choosing the right tool

Choosing an appropriate technology solution involves gathering information about your students, identifying needs and potential technologies, and investigating the effectiveness of different technologies with different students.

Free versions of speech recognition available on the web may offer some support for students, but they may limit how much a student can transcribe at one time (e.g., up to 30 seconds) or restrict where they can be used (e.g., only on the web), which makes them incompatible with other software or multiple writing environments. Without the ability to create a user profile, free versions of speech recognition cannot ‘learn’ about the student’s voice and word-choice habits, and it is this learning that increases the accuracy of written output.

Simple speech recognition capability is built into both the Windows and Mac operating systems, providing an easily accessible way to try out a version of speech recognition. These built-in tools may not be robust enough for some students: they may lack the ability to listen back to written text, the required level of accuracy, or training for specialized vocabulary. For these students, speech recognition software that can be personalized would be a better solution.

If speech recognition is identified as necessary for student success, it is important that the speech recognition program chosen allows for the creation and saving of individual student profile information and any other features necessary for an individual student’s success.

Conditions for success

Conditions for success include:

  • A noise-cancelling USB microphone headset is critical for achieving the highest levels of word accuracy. Recommendations for microphone headsets are often available on the software manufacturer’s website.
  • Voice training must occur in the environment where the student will be using speech recognition. For example, if the student will be using speech recognition in the classroom, then he or she must complete voice profile training in that same classroom to allow the program to recognize the noise levels typical in that learning environment. Note that ‘white noise’ (e.g., from a photocopier, printer or noisy air intake) or a noisy classroom environment can affect dictation accuracy.
  • Dictating may be taxing to some students and they may benefit from having water at their workstation.
  • Ensure that the computer hardware is capable of running the speech recognition software. System requirements will be available on the manufacturer’s website.
  • Wi-Fi is required to access speech recognition on mobile devices and on certain operating systems.

Instructional planning considerations

Instructional planning considerations include:

  • combining and aligning the use of technology with ongoing writing instruction including strategies for planning for different types of writing
  • ensuring that students have access to speech recognition in all learning environments where they are required to complete written work, including home
  • providing writing tasks in a digital format that will allow students to respond on that same document using speech recognition technology
  • making available personalized cue cards of possible speech commands for students to use as a quick reference.

Introducing to students

With most programs, the student will first need to create a user profile. In most cases this will include completing initial training exercises as outlined. This process can be completed individually or in a small group setting and generally takes 5 to 15 minutes. Learner profiles allow the software to ‘learn’ about individual users and increase the rate of dictation accuracy.

Some software requires that the student read the text to train the software. If the student reads the training text in a choppy or disfluent voice, it will affect the quality of his or her voice profile because the words won’t be pronounced properly. Other software provides speech prompting which presents the text in short sound bites for the student to listen to and repeat back. This is helpful for students who are unable to decode and read fluently.

Another alternative for training the software would be to have an adult ‘whisper’ read to students who do not have a high level of oral reading fluency. This is an especially helpful training strategy for students who stumble on words or hesitate in the middle of words, all of which will affect the overall user profile.

Some speech recognition software can be used ‘out of the box’ and does not require training. In this case, the software will ‘learn’ about the user as he or she dictates. If the student is articulate and takes time to make corrections, this will be similar to training the software. This would not apply to multi-user accounts or free online apps.

Software often includes training modules that introduce voice commands, dictation technique and, where the software supports it, editing.

Training sequence

    1. Introduce and teach commands
      Introduce students to the speech recognition software they will be using. Teach them the basic commands that they will need to know to tell the software what to do. Provide visual prompts that they can use for reference.
    2. Commands vs. dictation
      Students need to understand the difference between giving a command and dictating their ideas. Demonstrate that when they give a command they must pause, say the command in one breath, pause again, and then wait for the software to carry it out. This is how the software distinguishes a command from words it should write down (e.g., the command “wake up” versus the dictated sentence “The boy was told to wake up.”).
    3. “Say it like you mean it.”
      Encourage students to talk to the software in a strong, firm voice without hesitation. To dictate effectively, students will have to learn to use complete sentences and avoid typical conversational phrases such as “um” or “you know.”
    4. Practise the command technique
      Give students time to practise using the microphone commands: “wake up” and “go to sleep” or whatever commands may be important for the software being used.
    5. Developing dictation technique
      It is important to teach correct dictation technique when the student starts to use the software. For the first training sessions, the teacher may model correct dictation technique and have the student mimic this. In the first few sessions you are trying to get the student to dictate in a natural, comfortable voice and to start creating a dictation routine such as:

      • Collect your thoughts.
      • Speak in a natural manner.
      • Listen back to your dictation using text-to-speech.
      • Make any corrections or changes that you wish to make.

Work with students to create a visual they can use to cue themselves.
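
For teachers who want to show students why the pause matters in step 2, the following is a minimal sketch of the idea, not a description of how any particular product works. The command list, the half-second pause threshold and the names `Utterance` and `classify` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; real software defines its own
# command vocabulary and pause timing.
COMMANDS = {"wake up", "go to sleep"}
MIN_PAUSE_SECONDS = 0.5


@dataclass
class Utterance:
    text: str            # words recognized in one breath
    pause_before: float  # silence before the utterance, in seconds
    pause_after: float   # silence after the utterance, in seconds


def classify(utterance: Utterance) -> str:
    """Return "command" or "dictation" for a single utterance."""
    spoken = utterance.text.strip().lower().rstrip(".")
    # A command must stand alone: a pause on both sides of the phrase.
    isolated = (
        utterance.pause_before >= MIN_PAUSE_SECONDS
        and utterance.pause_after >= MIN_PAUSE_SECONDS
    )
    if isolated and spoken in COMMANDS:
        return "command"
    # Anything else, including a command phrase inside a sentence,
    # is treated as text to write down.
    return "dictation"


print(classify(Utterance("wake up", 0.8, 0.9)))
print(classify(Utterance("The boy was told to wake up.", 0.8, 0.9)))
print(classify(Utterance("wake up", 0.1, 0.9)))
```

In this sketch only the first utterance is treated as a command. The second contains the words “wake up” inside a longer sentence, and the third is missing the pause before the phrase, so both are written down as dictation.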

Guided practice

Introduce speech recognition with an engaging task (e.g., shorter writing activities in an area of interest) and allow the student to become comfortable and confident with using speech recognition before extending to more challenging tasks.

For example, use interesting images that the student could choose from as a prompt to start the writing process. Insert the chosen image into a document where the student can view it while describing or telling a story about the image.

Observe the student using speech recognition. Has the student learned the basic commands needed to navigate the program? Can the student organize ideas in his or her head and then orally compose those thoughts at the phrase/sentence level? Is the student able to identify errors in dictation and make corrections when needed? Is the student able to use the audio support to identify dictation errors and edit for word choice?

Identify a specific task the student would be expected to complete using speech recognition (e.g., completing a reader response activity) and indicators of success that correlate to increased learning.

Sample indicators of success might include:

  • improved quality of written work
  • increased demonstration of understanding of learning through written expression
  • increased speed and volume of written output
  • increased completion of writing tasks
  • independent completion of writing tasks
  • increased student engagement and participation in writing activities
  • improved vocabulary usage that better represents the student’s oral language skills
  • increased independence and feelings of self-efficacy
  • more self-initiated writing for own purposes.

Determine what explicit instruction students may require and any supports that may be needed to help them make progress prior to assigning a writing task. Speech recognition may be one of those supports.

Monitoring and assessing effectiveness

Collect baseline writing samples of students’ current written work (without speech recognition). Then compare baseline data to writing tasks completed using speech recognition and assess evidence of improvement.

Collect data on a regular basis to help determine the effectiveness of using speech recognition.

Note how often students use speech recognition over the course of a school day, how long they use it at one time, and what impact it has on their writing success. For example, are the students completing writing tasks? How long did it take? Are they able to independently use speech recognition or are they still learning how to use it effectively?
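
One way to organize this comparison is sketched below, assuming the teacher records word count, time on task, completion and independence for each writing sample. The field names, the three summary measures and every number in the example are hypothetical placeholders, not data from real students or a recommended set of measures.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class WritingSample:
    words: int         # words in the finished piece
    minutes: float     # time the student spent on the task
    completed: bool    # whether the task was finished
    independent: bool  # whether the student worked without adult help


def summarize(samples):
    """Average a few simple measures across a set of writing samples."""
    return {
        "words_per_minute": round(mean(s.words / s.minutes for s in samples), 1),
        "completion_rate": round(mean(float(s.completed) for s in samples), 2),
        "independence_rate": round(mean(float(s.independent) for s in samples), 2),
    }


def compare(baseline, with_speech_recognition):
    """Return the change in each measure (speech recognition minus baseline)."""
    before = summarize(baseline)
    after = summarize(with_speech_recognition)
    return {key: round(after[key] - before[key], 2) for key in before}


# Placeholder numbers, invented purely to show the calculation.
baseline = [
    WritingSample(words=40, minutes=20, completed=False, independent=False),
    WritingSample(words=60, minutes=30, completed=True, independent=False),
]
with_speech_recognition = [
    WritingSample(words=120, minutes=20, completed=True, independent=False),
    WritingSample(words=150, minutes=25, completed=True, independent=True),
]

print(summarize(baseline))
print(summarize(with_speech_recognition))
print(compare(baseline, with_speech_recognition))
```

Numbers like these describe only speed, completion and independence. Judgments about the quality of the writing and the student’s demonstrated understanding still need to come from reading the samples themselves.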

If expected success is not achieved, it might indicate a need for a change in how speech recognition is being implemented. The student may require more time to become comfortable with the software, or perhaps a different software or approach may need to be considered.

If results indicate that speech recognition software is not benefiting an individual student who is struggling with written output, consult with an assistive technology consultant to explore other options.

Differentiated instruction

The strategic use of technology can engage learners at varying levels of readiness and in multiple ways, supporting the differentiation of instruction to help meet the diverse learning needs of students.

Speech recognition can:

  • support and scaffold written output
  • allow students to work on the same curriculum expectations in various ways with common criteria for success
  • provide choice and access to a wider variety of expression options for all students when devices with speech recognition are available
  • help create a learning environment responsive to the learning preferences, interests, and readiness of the individual learner.

Universal Design for Learning (UDL)

A Universal Design for Learning (UDL) approach recognizes that barriers to learning can exist within the environment and aims to maximize learning for all students by creating flexible learning environments that reduce or eliminate potential barriers. For some students, written output using paper and pencil is a barrier and interferes with their ability to demonstrate understanding.

UDL also emphasizes the need to clearly identify the purpose of the learning activity in order to allow students multiple options for how they can successfully accomplish the goal or purpose of the activity.

For example, if students are required to compose a short story, the purpose of the learning task will determine the scaffolds and supports that might be appropriate. Is the purpose to:

  • develop spelling strategies
  • learn strategies to plan and organize ideas
  • understand purpose/voice when writing
  • learn strategies to develop sentence structure and vocabulary
  • learn strategies to revise and edit written work?

If the purpose of the writing assignment is to create an effective writing plan and compose accurate, fluent text with varied sentence structure and vocabulary, speech recognition could support the writing process for students who struggle with handwritten text.

However, if the goal of the learning activity is to provide students with practice in spelling or to produce legible handwritten text, then allowing those same supports may not be appropriate. In this instance, the use of speech recognition would interfere with the students’ ability to practise spelling and to generate handwritten text.

Response to Intervention

Typically, Response to Intervention (RtI) is the practice of providing evidence-based instruction and intervention matched to student need. The RtI model offers tiered interventions for students who are not benefiting from universal instructional strategies and therefore require additional or more intensive interventions.

Speech recognition could be considered a solution for students with disabilities related to writing, for students with severe fine motor and spelling difficulties, or for students learning English as another language.
