managing sound in a dotNet app

Guest · May 13, 2006

I am working on a foreign-language learning program (C#, .NET 2.0). I have
some written text and a recording of a native speaker reading that text. I
would like to be able to do the following:

1 - synchronize the voice with the text, so that the background color of
each word is highlighted as the voice reads that word.

2 - be able to jump around within the recorded text. For instance, if the
user double-clicks a certain sentence, just that sentence is read. Currently
the speech is stored as one file per paragraph.

3 - be able to slow down or speed up the pace of the spoken text (without
making it sound ridiculous). The synchronization in item 1 should not be
broken by this process.

I am a rank amateur at audio, so any ideas on how to do this, what
sound-editing tools would be good to use, etc. will be good for your karma
(and really appreciated by me).

Cowboy \(Gregory A. Beamer\) · May 13, 2006

To be right on sync, you will need some form of library that can actually
capture the audio. A cheaper option would be to get the timing of the words
and set up a metadata file for the audio. It can be stored in a database, on
the file system, or perhaps even in a resource file, although that is not
their primary use. When the clip hits a certain timing, highlight that word.

--
Gregory A. Beamer

*************************************************
Think Outside the Box!
*************************************************

Guest · May 13, 2006

Thanks. The audio is pre-recorded, so an application to capture it isn't
needed. I do need an application that lets me edit the timing (probably by
hitting a key at the beginning of every new sentence, then every word).
Ideally this application would also store the metadata (is there some sort
of standard for this?). It also needs to make mistakes easy to correct.

Any ideas on where I can get a sound editor like this?

Thanks again

Guest · May 15, 2006

Metadata is just the data you need, in whatever format you want, that tells
you what you want to know.

For example, for this application it could be an XML file that you
deserialize at startup and that lists the word to highlight at each point in
time: at 0.5 sec highlight the second word, at 1 sec highlight the third,
etc. You can then write a small program in which you click each word on the
screen as you listen to the audio. Start the audio and record the time, down
to the ms, at which you started it; at each click, record the elapsed time
in ms and save it to the XML file. When you later start the main program, it
opens the XML file, shows the words on the screen, and loads the word
timings into memory. Then just begin the audio and start your highlighting.

This is only a rough idea of what you're looking to do, but it gives you a
starting algorithm anyway.
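
The steps above can be sketched roughly as follows. This is a minimal
illustration in Python (the same logic ports directly to C#), not a finished
tool. The names TapRecorder, build_timing_xml, and load_timings, the
"timings"/"word" element names, and the "startMs" attribute are all made up
for this example; there is no standard format implied.

```python
import time
import xml.etree.ElementTree as ET


class TapRecorder:
    """Records how many ms after start() each tap (click or key press) occurs."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the recorder can be tested
        self._start = None
        self.offsets_ms = []

    def start(self):
        """Call this at the moment audio playback is started."""
        self._start = self._clock()
        self.offsets_ms = []

    def tap(self):
        """Call this when the user marks the beginning of the next word."""
        elapsed_s = self._clock() - self._start
        self.offsets_ms.append(round(elapsed_s * 1000))


def build_timing_xml(words, offsets_ms):
    """Pair each word with its start offset and return the XML as a string."""
    if len(words) != len(offsets_ms):
        raise ValueError("need exactly one offset per word")
    root = ET.Element("timings")
    for word, offset in zip(words, offsets_ms):
        ET.SubElement(root, "word", startMs=str(offset)).text = word
    return ET.tostring(root, encoding="unicode")


def load_timings(xml_text):
    """Return a list of (start_ms, word) tuples sorted by start time."""
    root = ET.fromstring(xml_text)
    return sorted((int(el.get("startMs")), el.text) for el in root.iter("word"))


# Example offsets, as they might come out of a tapping session.
xml_text = build_timing_xml(["Bonjour", "tout", "le", "monde"],
                            [0, 500, 1000, 1250])
timings = load_timings(xml_text)
```

Fixing a mistake then only means editing one startMs value in the file, and
sentence-level entries could be added the same way if jumping to a sentence
is needed.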

Guest · May 15, 2006

Thank you very much. That might work, but it has a couple of problems:

1 - depending on the system load, CPU speed, etc., the amount of time the
recording takes to start playing varies from system to system and between
runs on the same system. This makes syncing at startup difficult.

2 - I'd like to be able to pause and continue, which will exaggerate the
above problem.

3 - I'd like to be able to slow down or speed up the recording.

For these reasons I was thinking of triggers that would be embedded in the
actual sound file itself.
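
A related idea that may give much the same effect without editing the sound
file, sketched here under assumptions: instead of timing from the wall clock,
poll the player for its current position in the recording and look up the
word from that. Start-up delay, pausing, and speed changes then do not
matter, because the timings are expressed in the recording's own time. The
sketch below is in Python (the logic ports directly to C#). FakePlayer is a
hypothetical stand-in written only for this example; whether a given audio
library reports its playback position, and how accurately, would need to be
checked.

```python
import bisect


def word_index_at(word_starts_ms, position_ms):
    """Index of the word being spoken at position_ms (-1 before the first word).

    word_starts_ms must be sorted in ascending order.
    """
    return bisect.bisect_right(word_starts_ms, position_ms) - 1


class FakePlayer:
    """Hypothetical stand-in for an audio player that reports its position."""

    def __init__(self):
        self.position_ms = 0.0   # position within the recording
        self.rate = 1.0          # 0.5 = half speed, 2.0 = double speed
        self.paused = False

    def advance(self, wall_clock_ms):
        """Simulate wall-clock time passing while the player runs."""
        if not self.paused:
            self.position_ms += wall_clock_ms * self.rate


word_starts_ms = [0, 500, 1000, 1250]   # start of each word in the recording
player = FakePlayer()

player.rate = 0.5
player.advance(1200)                     # 1200 ms real time = 600 ms of audio
slow_index = word_index_at(word_starts_ms, player.position_ms)

player.paused = True
player.advance(5000)                     # paused: position does not move
paused_index = word_index_at(word_starts_ms, player.position_ms)

player.paused = False
player.rate = 1.0
player.advance(450)                      # position is now 1050 ms
resumed_index = word_index_at(word_starts_ms, player.position_ms)
```

In a real program a UI timer would call word_index_at every few tens of
milliseconds and re-highlight only when the returned index changes.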
