RE: [CQ-Contest] Synthesizer Opinions....

To: <cq-contest@contesting.com>
Subject: RE: [CQ-Contest] Synthesizer Opinions....
From: "Gerry Hull" <windev@inetmarket.com>
Date: Mon, 24 Jan 2005 10:11:15 -0500
List-post: <mailto:cq-contest@contesting.com>
Thanks for all the great replies!!  It seems that there is a lot of interest
in the synthesized voice keyer concept.

Here are some technical details:

- I used Microsoft Speech, or, to be more exact, the Microsoft Speech API
5.0.   Speech 5.0 is installed on Windows XP by default, and can be
downloaded for other Windows versions at http://www.microsoft.com/speech.
It uses the standard sound card in your PC to generate the synthesized voice.

- The voice I used is called "NeoSpeech Paul16"; this synthesizer voice (or
voice font, as I call it) is made from pieces of speech taken from an actual
digitized voice.   You can purchase the voice for $35 on the web:
http://www.regnow.com/softsell/nph-softsell.cgi?item=3961-14&affiliate=31392
Beware -- the voice files are large -- typically 500-800 MB.  (When the voice
first speaks, it takes a second or two to load; after that, further text is
spoken immediately.)

Other great-sounding voices are available:
AT&T Natural Voices:
http://www.regnow.com/softsell/nph-softsell.cgi?item=3961-6&affiliate=31392
Cepstral Voices:
http://www.cepstral.com/cgi-bin/store/start?rkey=5LNs3fDvMJcbEgTQat7XkOxAM1eKRxkMjjfLyEgzOO9ge4Gq0vQLbZbDkFIcAluK

I recommend 16-kHz voices only -- others sound way too robotic and are hard
to understand.  There are MANY voices available, including British English,
other European languages, male/female, and even English with a VU accent!
By slightly adjusting the voice speed, you can get many variations on the
original voice.

- My "glue" to make this an actual contest keyer was a Visual Basic program.
I use Writelog -- with Writelog it is very easy to get the callsign entered,
as well as the current frequency of the radios.  My program "took over" the
F1-F12 keys, as well as INSERT/DELETE, and the + sign on the numeric keypad.
I simply mapped these keystrokes to the appropriate text for the synthesizer
to speak.   I decoded callsigns to speak the phonetics rather than the
individual letters.
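
As a rough illustration of the key mapping and callsign decoding just
described, here is a minimal sketch.  It is in Python rather than the Visual
Basic of the actual keyer, and every name in it (PHONETICS, spell_callsign,
KEY_MESSAGES, message_for_key), the key assignments, and the message texts
are hypothetical, not taken from the real program.  The table is the standard
ITU phonetic alphabet, with "stroke" for "/" as an assumption; a real table
would be adjusted to taste.

```python
# Hypothetical sketch: expand callsigns into phonetics and map keys to
# the text handed to the synthesizer. Names and messages are illustrative.

PHONETICS = {
    "A": "alpha", "B": "bravo", "C": "charlie", "D": "delta", "E": "echo",
    "F": "foxtrot", "G": "golf", "H": "hotel", "I": "india", "J": "juliet",
    "K": "kilo", "L": "lima", "M": "mike", "N": "november", "O": "oscar",
    "P": "papa", "Q": "quebec", "R": "romeo", "S": "sierra", "T": "tango",
    "U": "uniform", "V": "victor", "W": "whiskey", "X": "x-ray",
    "Y": "yankee", "Z": "zulu",
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
    "/": "stroke",  # assumption: how a portable indicator is spoken
}


def spell_callsign(callsign: str) -> str:
    """Return the callsign as space-separated phonetic words."""
    words = []
    for ch in callsign.upper():
        if ch not in PHONETICS:
            raise ValueError(f"cannot spell character {ch!r}")
        words.append(PHONETICS[ch])
    return " ".join(words)


MY_CALL = "W1VE"
EXCHANGE = "gerry, new hampshire"  # placeholder exchange text

# Placeholder key assignments; {my_call} and {his_call} are filled in
# with phonetics just before the text is spoken.
KEY_MESSAGES = {
    "F1": "CQ contest, {my_call}, {my_call}, contest",
    "INSERT": "{his_call}, " + EXCHANGE,
    "NUMPAD_PLUS": "thank you, {my_call}",
}


def message_for_key(key: str, his_call: str = "") -> str:
    """Build the text to speak for a key press."""
    template = KEY_MESSAGES[key]
    return template.format(
        my_call=spell_callsign(MY_CALL),
        his_call=spell_callsign(his_call),
    )
```

Here spell_callsign("K6VVA") gives "kilo six victor victor alpha", and
message_for_key("INSERT", "K6VVA") puts that in front of the exchange text.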

- The key to making the synthesizer sound more natural, and to put emphasis
on the correct syllables, is the ability to control speed, pitch and
pronunciation in real time from within the speech text.   SAPI 5.0 allows
this through the use of XML control tags.  For example, here was my exchange
line:

"<rate speed="-2">kilo six victor victor alpha</rate>,
<rate speed="-5">robo</rate>,radio oscar bravo oscar, new hampshire"

Here, I am using a relative speed tag to slow down while saying a callsign,
the name, etc.  My CQ speed was at a default, overall higher speed.  You
have to be careful about going too slow, as it then sounds like the speaker
has a traumatic brain injury!
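
Strings like the exchange line above are easy to mistype by hand, so a small
helper can generate the tags.  The following is only a sketch, in Python
rather than the Visual Basic of the actual keyer; the names rate,
build_exchange and speak_xml are hypothetical.  The speaking step assumes
Windows with the pywin32 package installed; the string-building part runs
anywhere.

```python
from xml.sax.saxutils import escape

# Value of SVSFIsXML in SAPI's SpeechVoiceSpeakFlags enumeration.
SVSF_IS_XML = 8


def rate(text: str, speed: int) -> str:
    """Wrap text in a SAPI 5 <rate> tag with a relative speed adjustment."""
    # Assumption: adjustments are kept within -10..10.
    if not -10 <= speed <= 10:
        raise ValueError("speed adjustment should be between -10 and 10")
    return f'<rate speed="{speed:d}">{escape(text)}</rate>'


def build_exchange(call_words: str, name: str, name_spelled: str,
                   state: str) -> str:
    """Slow down for the callsign and name, leave the rest at default."""
    return (f"{rate(call_words, -2)},{rate(name, -5)},"
            f"{escape(name_spelled)}, {escape(state)}")


def speak_xml(text: str) -> None:
    """Speak text containing SAPI XML tags (Windows with pywin32 only)."""
    import win32com.client  # imported here so the rest runs on any OS
    voice = win32com.client.Dispatch("SAPI.SpVoice")
    voice.Speak(text, SVSF_IS_XML)
```

Calling build_exchange("kilo six victor victor alpha", "robo",
"radio oscar bravo oscar", "new hampshire") reproduces the exchange line
shown above, which can then be passed to speak_xml.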



I actually built the keyer software about an hour before NAQP, so it's a
"work in progress".  It was very easy to interface to Writelog, and I've
done lots of Speech API stuff before.   Anyway, I am going to clean up the
UI, and add some code to allow it to interface to other software using a bit
of scripting.  Then, I'll post an installer on my web site.  That way,
anyone who wants to play with this will be able to.

73,

Gerry, W1VE





_______________________________________________
CQ-Contest mailing list
CQ-Contest@contesting.com
http://lists.contesting.com/mailman/listinfo/cq-contest

