Homonyms

daveh
Posts: 7
Joined: Wed Mar 20, 2013 4:31 am

Homonyms

Post by daveh »

Hi
I am testing the RC Sapi5 dictionary for homonyms and seem to have a problem with 'live'.

Using the Ivona Amy voice the following was not pronounced correctly.

Do you live in a house? Does it have live plants growing?
I have a live carrot which should live for eternity.

The first, third and fourth 'live' are silent. The second 'live' is pronounced as 'liv'.

Any suggestions?
Dave
Percy Henry
Site Admin
Posts: 69
Joined: Tue Jan 03, 2012 12:50 pm

Re: Homonyms

Post by Percy Henry »

Greetings Dave,

Ivona Amy is a UK English voice and is not SAPI 5 compatible; therefore it skips homonyms that RC SAPI 5 tries to respell correctly. This does not occur with Ivona Salli because it is SAPI 5 compatible. Some UK voices, such as Cepstral Lawrence, are SAPI 5 compatible. If you have Cepstral Lawrence, you should use it with RC SAPI 5 to see how well it works with a UK English voice that is SAPI 5 compatible.

P.S. Some serious standardization is needed between the different voice providers.
daveh
Posts: 7
Joined: Wed Mar 20, 2013 4:31 am

Re: Homonyms

Post by daveh »

Hi Percy

I checked the Ivona site, where the info for the Amy + Brian package clearly states that the voices are SAPI 5 standard compatible.

I'm rather confused!

Have you tested Salli and found it OK?

Dave
Percy Henry
Site Admin
Posts: 69
Joined: Tue Jan 03, 2012 12:50 pm

Re: Homonyms

Post by Percy Henry »

Hi Dave,

Ivona Salli and all of Ivona's U.S. voices are 100% SAPI 5 compatible. Voice providers can claim SAPI 5 compatibility even if they offer only a limited range of it. For a voice to be 100% SAPI 5 compatible, the voice provider must offer some kind of mapping between the U.S. English phoneme set and the U.K. phoneme set (the two phoneme sets are radically different).

This same limited range of SAPI 5 compatibility in Acapela voices is why you cannot use TextAloud's phoneme editor with them.
Percy Henry
Site Admin
Posts: 69
Joined: Tue Jan 03, 2012 12:50 pm

AT&T US and UK English NV are SAPI 5 compatible

Post by Percy Henry »

Greetings Dave,

I neglected to inform you that both AT&T US and UK English NV voices are SAPI 5 compatible. Therefore, if you don't have Cepstral Lawrence, you can use any of the AT&T English voices to test RC SAPI 5 with these sentences:

Do you live in a house? Does it have live plants growing?
I have a live carrot which should live for eternity.

However, if you are using RC SAPI 5 and AT&T English NV voices please be aware of the following:

1. There is a slight latency problem with AT&T NV when using the phoneme tag.

2. There is a bug in AT&T UK English NV 1.4 voices with the "t" phoneme when mapping it from UK English to US English using the phoneme tag. For example, if a UK voice encounters the respelled homonym "present" (which has a "t" phoneme) it will pronounce a garbled nonsensical phrase.

3. RC SAPI 5 has no bug fixes for AT&T NV.

P.S. I will have to update RC SAPI 5 to pronounce live plants correctly.
daveh
Posts: 7
Joined: Wed Mar 20, 2013 4:31 am

Re: Homonyms

Post by daveh »

Thanks for the info Percy.

I really like the Ivona British female voices, so I am disappointed by the phoneme problem. However, I did want a US voice as well, so I have bought Salli.

I also have NeoSpeech Kate and Paul, as well as some voices from AT&T, Acapela and ScanSoft.

I will be extremely impressed if you manage a foolproof handling of all homonym word patterns!

I often pre-process text for homonyms by hand. There are a few that occur so frequently that I find it worthwhile to use the TA editor to change the spellings in the text for correct pronunciation. My short list is:
row, read, lead, wound, wind, bow, live, tear, crooked, stead, learned
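
A rough illustration of how a short list like this could be used to flag words for manual review before reading. This is only a sketch in Python, not part of TextAloud or the RC dictionaries; the function name `find_homonyms` is hypothetical.

```python
import re

# Dave's short list, copied from the post above.
SHORT_LIST = ["row", "read", "lead", "wound", "wind", "bow", "live",
              "tear", "crooked", "stead", "learned"]

# One alternation wrapped in word boundaries, so "bow" is not found
# inside "elbow" and "stead" is not found inside "instead".
_PATTERN = re.compile(r"\b(" + "|".join(SHORT_LIST) + r")\b", re.IGNORECASE)


def find_homonyms(text):
    """Return (word, offset) pairs for every short-list word in text."""
    return [(m.group(1).lower(), m.start()) for m in _PATTERN.finditer(text)]
```

The offsets only point at candidates; deciding which pronunciation is right still needs a human or a context rule.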

In practice the RC dictionaries take too long to apply for convenient real-time reading, so I only use them when preparing MP3 audio books. They do seem to mop up most of the stuff missed by my 'short' homonym list and shorter dictionaries, so I am quite pleased .... thanks!

One point with the sound files in the SND Folder. The exclamations, especially Ahh, can be a bit startling and of course don't match the reading voice. An option to turn this off might be handy.

I think you are doing a brilliant job with this.
Keep up the good work!

Dave

P.S. Will you allow free downloads of updated dictionaries?
Percy Henry
Site Admin
Posts: 69
Joined: Tue Jan 03, 2012 12:50 pm

Re: Homonyms

Post by Percy Henry »

Greetings Dave,

I really feel your pain: I think the Ivona British female voice is one of the most pleasing voices I have heard. I believe you are right about the interjection Ahh, and I am going to turn it off in future releases. In the meantime, you can turn it off yourself by using the Search Filter to find occurrences of (?#Ah) in either the caption or Text Matching search fields.

You can get a complete list of sound file regexes by using the Search Filter and the Pronounce As field to search for {{Audio=C:\\SND\\

You can then turn off the sound regexes you don't like.



You can also update for live plants by adding the following regex:


(?#Qlive)(?m)(?:^|\s|['"‘“(]|\p{Pi}|\p{Ps}|\p{Pd})\Klive(?= +plants?[’”\p{Po}\p{Pe}\p{Pf}]{0,2}(?:\s|$))


<Pron sym="l ay v 1"/>

Use the Respell line to add <Pron sym="l ay v 1"/>. I recommend using the Search Filter to add the regex in the first (?#Qlive) section.
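
For anyone who wants to try the live plants entry outside TextAloud, here is a hedged Python approximation. Python's standard `re` module has neither `\K` nor `\p{...}`, so this sketch uses a capture group in place of `\K` and explicit punctuation sets in place of the Unicode classes; it is not the exact dictionary entry. The function name `respell_live` is hypothetical, and the respell string is copied from the post without being checked against any voice.

```python
import re

# Respell text copied verbatim from the post above.
RESPELL = '<Pron sym="l ay v 1"/>'

# Group 1 stands in for everything before \K in the original entry.
# The explicit punctuation sets only approximate \p{Pi}, \p{Ps}, \p{Pd},
# \p{Po}, \p{Pe} and \p{Pf}.
LIVE_PLANTS = re.compile(
    r"""(^|[\s'"‘“(\[{«–—-])live(?= +plants?[’”.,;:!?)\]}»]{0,2}(?:\s|$))""",
    re.MULTILINE,
)


def respell_live(text):
    """Replace 'live' with the respell tag only when 'plant(s)' follows."""
    return LIVE_PLANTS.sub(lambda m: m.group(1) + RESPELL, text)
```

With the test sentence from earlier in the thread, only the 'live' in front of 'plants' is rewritten; 'Do you live in a house?' is left alone.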




You can find a Complete List of Homographs (homonyms) here.
daveh
Posts: 7
Joined: Wed Mar 20, 2013 4:31 am

Re: Homonyms

Post by daveh »

I find that a particularly irksome problem occurs when authors become lazy in the use of hyphens:

When a long hyphen — is intended, requiring a pause in pronunciation, many are simply using the short hyphen -, and this can mess up the sense when reading with text to speech.

At the moment I use the editor in TextAloud to manually replace - with — when appropriate, which can be quite time consuming!

It would be great if some clever way of recognising and dealing with this problem could be devised — though the number of possible variations would appear to be very large.
Percy Henry
Site Admin
Posts: 69
Joined: Tue Jan 03, 2012 12:50 pm

Re: Homonyms

Post by Percy Henry »

Hi Dave,

While it is very easy to come up with a regex that would pause the text at appropriate places, the problem, as you pointed out, is the number of cases.

A corollary to this problem occurs with the em dash (—) in Acapela voices. Acapela annoyingly pronounces every — as "em dash". So far in RC Acapela Dictionary I have over 50 regex entries to pause and/or eliminate the pronunciation of "em dash".
Percy Henry
Site Admin
Posts: 69
Joined: Tue Jan 03, 2012 12:50 pm

Re: Homonyms

Post by Percy Henry »

Hi again Dave,

Below I have sketched out a general solution for pausing hyphens (when they are used in pairs). A sample sentence and two regexes follow. Both regexes should be placed in the order specified, and the second should immediately follow the first. Also, the regexes would have to be near the top of your dictionary entries to avoid interference from other dictionary entries:


Part of the reasoning is alarm at the speed and efficiency with which ISIS - a militant group President Obama described as “barbaric” - has made gains in northern Iraq and has been able to wash back and forth across the Syrian border.

(?#Pause - 1_of2)(?m)[a-z]['"’”]?\K -(?= [^.;:?\p{Zp}\r\n]{3,} - )

,


(?#Pause - 2_of2)(?m)[a-z]['"’”]?, [^.;:?\p{Zp}\r\n]{3,}\K -(?= )

,



For longer pauses, you could replace the comma (,) in the respell line with a period (.) or the pause tag.
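
The two-step idea can be tried outside TextAloud with a short Python sketch. It is an approximation under stated assumptions: Python's standard `re` has no `\K` or `\p{Zp}`, so a capture group replaces `\K` and `\u2029` stands in for `\p{Zp}`. Case-insensitive matching is also an assumption, made here because the sample sentence has the capitals 'ISIS' directly before the first hyphen. The function name `pause_paired_hyphens` is hypothetical.

```python
import re

# Step 1: the first hyphen of a pair. The lookahead insists on a second
# " - " later in the same sentence (no . ; : ? or line break in between).
FIRST = re.compile(
    r"""([a-z]['"’”]?) -(?= [^.;:?\u2029\r\n]{3,} - )""",
    re.IGNORECASE,
)

# Step 2: the second hyphen, found only after step 1 has left a comma.
SECOND = re.compile(
    r"""([a-z]['"’”]?, [^.;:?\u2029\r\n]{3,}) -(?= )""",
    re.IGNORECASE,
)


def pause_paired_hyphens(text, pause=","):
    """Apply the two passes in order, as the dictionary would."""
    text = FIRST.sub(lambda m: m.group(1) + pause, text)
    return SECOND.sub(lambda m: m.group(1) + pause, text)
```

The order matters because the second pattern keys on the comma written by the first. A lone hyphen is left untouched, since step 1 never fires without a partner.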