Speech recognition - Children's voices.

  • Thread starter: David

David

I decided to play with speech recognition to help out one of my customers who
often needs to use his computer while holding something with both hands.
Before tackling the "real" problem, I decided to write a sample application,
which was a math flash card program for my son, who is 9 years old.

On Windows XP and .NET Framework 3.5, I hooked up a
"System.Speech.Recognition.SpeechRecognitionEngine" to a program that
displays two numbers to add, waits for a spoken input, and compares the
spoken answer to the correct answer. It worked great...for me. The guys in
the office tried it out, because they thought it was cool that I could write
a speech recognition app in just 2 hours. It worked for them, too. It even
worked for Ayman, who has a pretty thick Arabic accent.

Then I tried it on my 9-year-old son. Terrible. It couldn't recognize
anything.

He has a child's voice, undoubtedly, but it's clear and he doesn't have a
speech defect. There's no difficulty understanding his speech, unless you
happen to be a computer.

Is there a way to train the speech recognition engine, or to select a
different voice? Would it work better in Vista?
 
I don't think pitch is the answer. As an experiment, I tried the program
myself, but with an artificially high, falsetto-type voice. Kind of like a
Mickey Mouse voice.

For my Mickey Mouse voice, everything worked well. The error rate was
slightly higher, maybe, but not much.
 
As a converse test, have you asked your child to try talking in a slower and
deeper voice?
 
Yes. It was even worse. In his normal voice, it got perhaps half correct.
When he tried to slow down or talk in an abnormal voice, it got worse.
 
Heh, I am also writing a simple Flashcard (math quiz) program using .NET 3.5
and speech recognition, similar to yours. Want to work together? :-)

My son is 10 years old and initially had the same poor results. But then he
trained the speech recognition using the Control Panel applet, reading the
displayed sentences aloud (it took about 15 minutes). While that
taxed his patience, he did get through it, and when we tried the Flashcard
program again, it understood him. :-)

Well, he still has problems. It works when he speaks deliberately, but he
is excitable and tends to yell. That's when the recognition does not work.
But if he makes an effort to be understandable, it does work.

I would encourage your son to tune the speech recognition engine (I assume
there is a training program on XP).

-- David
 
Sure.

The application that I have has a text box (tbanswer in the code below) for
the user’s answer. When using speech recognition, I put the recognized
text into that text box. I don't specifically use a "trained" voice,
although I will run the training applet from the "speech" entry in the
control panel, and see if that improves the recognition of my son's voice.

Here are most of the important parts of the code for the speech recognition.

// Field declarations (not shown in the original post; types inferred
// from how the fields are used below).
private System.Speech.Recognition.SpeechRecognitionEngine recognizer;
private System.Speech.Recognition.Grammar grammar;
private System.Threading.Thread voicethread;

private void Form1_Load(object sender, EventArgs e)
{
    SetRecognizerDelegate = new SetTextDelegate(SetRecognizerText);
    voicethread = new System.Threading.Thread(
        new System.Threading.ThreadStart(SRThread));
    voicethread.Start();
}


// Put the speech recognition in its own thread. Here’s the thread
// function that I used.

// Apparently, the recognition engine needs to run in a Single Threaded
// Apartment model, and won't work if invoked from a standard Windows Forms
// application unless it is in its own thread.
private void SRThread()
{
    recognizer = new System.Speech.Recognition.SpeechRecognitionEngine();
    recognizer.SetInputToDefaultAudioDevice();

    System.Speech.Recognition.GrammarBuilder gb =
        new System.Speech.Recognition.GrammarBuilder();
    System.Speech.Recognition.Choices ch =
        new System.Speech.Recognition.Choices();

    // I was recognizing numbers for the flash card program. Note that
    // this loop adds 0 through 99; use i <= 100 to include 100.
    for (int i = 0; i < 100; i++)
        ch.Add(i.ToString());
    // You would use quoted strings for what you would want to recognize.
    // E.g. "Add", "Open", "Close", "Delete"

    gb.Append(ch);
    grammar = new System.Speech.Recognition.Grammar(gb);
    recognizer.UnloadAllGrammars();
    recognizer.LoadGrammar(grammar);
    recognizer.SpeechRecognized += OnSpeechRecognized;

    // This turns on the recognition engine to keep going, and recognize
    // inputs continuously. It raises a "SpeechRecognized" event when
    // input is recognized.
    recognizer.RecognizeAsync(
        System.Speech.Recognition.RecognizeMode.Multiple);
}


private void OnSpeechRecognized(object sender,
    System.Speech.Recognition.SpeechRecognizedEventArgs e)
{
    // e.Result.Text is the text representation of the recognized string.
    // Because the speech recognition is not in the main thread, I can’t set
    // the contents of the text box, tbanswer, directly.
    tbanswer.Invoke(SetRecognizerDelegate, e.Result.Text);
}

public SetTextDelegate SetRecognizerDelegate;
public delegate void SetTextDelegate(string newtext);

// Runs on the UI thread (via Invoke) and shows the recognized text.
public void SetRecognizerText(string newtext)
{
    tbanswer.Text = newtext;
}
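
The code above only gets the recognized text into the text box. The step of comparing the spoken answer to the correct answer is not shown, so here is a minimal sketch of how that check might look. This is an assumption, not the actual program: the class name FlashCard and the method IsCorrect are hypothetical and exist only for illustration.

```csharp
using System;

// Hypothetical helper, not part of the original program.
public static class FlashCard
{
    // Returns true when the recognized text parses as an integer
    // equal to the sum of the two displayed numbers.
    public static bool IsCorrect(int a, int b, string recognizedText)
    {
        int spoken;
        // The number grammar only yields digit strings such as "42",
        // but TryParse guards against anything unexpected.
        if (!int.TryParse(recognizedText, out spoken))
            return false;
        return spoken == a + b;
    }
}
```

Something like FlashCard.IsCorrect(first, second, newtext) could then be called from SetRecognizerText, where first and second stand for the two numbers currently displayed (again, hypothetical names).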
 
Thanks. I'll give it a shot. I figured there had to be some way to train
for a particular voice.

Ideally, I would like to be able to train multiple users and load a profile.
It looks like I will have to train my son on my PC, which will take fifteen
minutes, and when I'm done, it probably won't recognize me very well. So, if
I want to use my voice on my PC, I'll have to retrain it, and then when he
wants to use it...well, you get the idea. It would be nice if there were a
way to tell the PC which user was using it, and load the voice profile for
that user.

I'm not exactly trying to go commercial with this application, so it's
pretty darned simple. I just had the "questions" in the form of two labels,
and a text box for input. Feel free to ask any questions you might have, if
there's anything giving you trouble.
 
David,
I believe voice training is available to software developers in SAPI 5.3, and
to users on XP by clicking Start->Control Panel->Speech
Recognition Option->Train Your Computer To Better Understand You. I am not
sure what your need is. I am trying to write a program in C++ to recognize a
trained voice, which uses SAPI 5.3 identifiers such as
SPDUI_AddRemoveWord, SPDUI_UserTraining, etc. If you want to work with me,
you can contact me directly at (e-mail address removed).
Thanks
Michael Nguyen
 
David said:
Ideally, I would like to be able to train multiple users and load a profile.
It looks like I will have to train my son on my PC, which will take fifteen
minutes, and when I'm done, it probably won't recognize me very well. So, if
I want to use my voice on my PC, I'll have to retrain it, and then when he
wants to use it...well, you get the idea. It would be nice if there were a
way to tell the PC which user was using it, and load the voice profile for
that user.

The idea is that each person has a different Windows logon account, and the
speech profile that is used is that of the currently logged on user. You
should create your son his own user account and make sure he is logged in
when he trains it. Then you can train it under your own account.


David said:
I'm not exactly trying to go commercial with this application, so it's
pretty darned simple. I just had the "questions" in the form of two labels,
and a text box for input. Feel free to ask any questions you might have, if
there's anything giving you trouble.

Yeah, I got that far. Now the more involved part is making the gaming
elements like keeping score, keeping time spent, adjusting the difficulty of
the problems based on progress, etc.

-- David
 
Thanks for the logon tips. That will work for the "flashcard" program on my
home PC, but it's not practical for my "real" application, where multiple
users will have access to a common PC. Maybe .NET 4.0 on Windows 2009 will
have a "load speech profile" function for the SpeechRecognitionEngine class.

I doubt that I will develop the flashcard program beyond the very basics. I
was doing it partly to help my son with his speed math drills, and partly as
a sample app for speech recognition. Scorekeeping and the like is just a bit
advanced for what I'm doing with it.
 