Features of the phonetics of the Yakut language for speech synthesis

Girls yell at a cat in Yakut

We recently completed a project to synthesize the Yakut language. Although our agreements do not allow us to make the models publicly available, we can share some ideas on how to synthesize the Yakut language.

Below the cut you will find out:

  • What synthesis in Yakut sounds like;

  • How the Yakut alphabet differs from the Russian one and what “additional” sounds it has;

  • How to work with stress in Yakut, given the complete absence of any corpora or dictionaries;

  • And, as a bonus, how Yakut speech synthesis speaks Russian with a Yakut accent.

Phonetics of the Yakut language

What does the Yakut language sound like?

You may never have heard it (or didn’t realize that what you heard was Yakut), but it sounds like this:

Examples of synthesis in the Yakut language

These are random phrases synthesized in Yakut. If it is your native language, please write in the comments.

Alphabet, diphthongs and long sounds

Let's start with something simple, namely the alphabet. We are all very lucky, actually. Neither Russian nor Yakut has anything like the English the / though / thought / Thomas / eighth / lighthouse, where the same letters are read differently from word to word.

In Russian, words are written as they are heard, once stress is taken into account (yes, I understand, this is a big simplification). Yakut is the same, except that stress has little effect on pronunciation. We also already know most of the sounds. And we are lucky that the alphabet is based on Cyrillic and the writing is phonetic (the modern alphabet has existed since 1939 and has not been officially revised since).

Yakut alphabet:

Cyrillic | IPA        | Notes
А а      | [a]        |
Б б      | [b]        |
В в      | [v]        | Only in borrowed words
Г г      | [g]        |
Ҕ ҕ      | [ɣ], [ʁ]   | “Г with a hook” (fricative “г”)
Д д      | [d]        |
Дь дь    | [ɟ]        | “Д with a soft sign” (super soft “д”)
Е е      | [e], [je]  | Only in borrowed words
Ё ё      | [jo]       | Only in borrowed words
Ж ж      | [ʒ]        | Only in borrowed words
З з      | [z]        | Only in borrowed words
И и      | [i]        |
Й й      | [j], [j̃]   |
К к      | [k], [q]   |
Л л      | [l]        |
М м      | [m]        |
Н н      | [n]        |
Ҥ ҥ      | [ŋ]        | Ligature “нг” (back-lingual “н”)
Нь нь    | [ɲ]        | “Н with a soft sign” (super soft “н”)
О о      | [o]        |
Ө ө      | [ø]        | “О with a crossbar”
П п      | [p]        |
Р р      | [r]        |
С с      | [s]        |
Һ һ      | [h]        |
Т т      | [t]        |
У у      | [u]        |
Ү ү      | [y]        | Straight “у”
Ф ф      | [f]        | Only in borrowed words
Х х      | [x]        |
Ц ц      | [ʦ]        | Only in borrowed words
Ч ч      | [ʧ]        |
Ш ш      | [ʃ]        | Only in borrowed words
Щ щ      | [ɕː]       | Only in borrowed words
Ъ ъ      | -          | Only in borrowed words
Ы ы      | [ɯ]        | Back-lingual, unrounded “у”
Ь ь      | [ʲ]        | Only in borrowed words
Э э      | [e]        |
Ю ю      | [ju]       | Only in borrowed words
Я я      | [ja]       | Only in borrowed words

As we can see, a significant part of the alphabet is used only in borrowed words (primarily from Russian), but there are also four groups of “sounds” that have no analogues in Russian (or are very rare there, or occur only in borrowed words):

  • дь and нь, although not formally letters, represent the sounds ɟ and ɲ, which have no analogues in Russian;

  • The letters ҕ, ҥ, ө, һ, ү denote the sounds ɣ, ŋ, ø, h, y and have no analogues in Russian;

  • The diphthongs ыа, уо, иэ and үө;

  • The long sounds аа, оо, ыы, уу, ии, ээ, үү, өө.

Next, we will go through each of these sounds in detail, give audio examples, and describe in plain words what each one sounds like.

Yakut also sounds a bit “singsong” to the ear. This is because the Yakut language has five long vowels (аа, ыы, уу, ии, үү), which are found only in the root. Diphthongs, which result from combining sounds, are found in any syllable.

The use of vowels is subject to the rule of vowel harmony: the vowels in a word follow each other in a strictly defined order. For example, if the previous syllable contains the sound ы, then the next one can only contain ы, а or ыа: ылыым, ылаар, ылыа.
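The harmony rule can be turned into a simple check. Below is a minimal sketch, assuming that a long vowel behaves like its short counterpart for harmony purposes (as the examples ылыым and ылаар suggest). Only the single rule stated above is encoded; the names `ALLOWED_AFTER`, `vowel_nuclei` and `follows_known_rule` are ours for illustration, and the rest of the harmony table is deliberately left empty rather than guessed.

```python
# Vowel inventory taken from the text. Diphthongs and long vowels are matched
# before single vowels so that "ыа" or "ыы" count as one nucleus.
DIPHTHONGS = ("ыа", "уо", "иэ", "үө")
LONG_VOWELS = ("аа", "оо", "ыы", "уу", "ии", "ээ", "үү", "өө")
SHORT_VOWELS = ("а", "о", "ы", "у", "и", "э", "ү", "ө")

# Only the rule stated in the text is filled in; other rows are not guessed.
ALLOWED_AFTER = {"ы": {"ы", "а", "ыа"}}


def vowel_nuclei(word):
    """Return the vowel nuclei of a word, left to right."""
    word = word.lower()
    nuclei = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIPHTHONGS or pair in LONG_VOWELS:
            nuclei.append(pair)
            i += 2
        elif word[i] in SHORT_VOWELS:
            nuclei.append(word[i])
            i += 1
        else:
            i += 1  # consonant or any other character
    return nuclei


def normalize(nucleus):
    """Map a long vowel to its short counterpart (assumption); keep the rest."""
    return nucleus[0] if nucleus in LONG_VOWELS else nucleus


def follows_known_rule(word):
    """Check adjacent nuclei against ALLOWED_AFTER.

    Pairs whose first vowel has no entry in the table are skipped.
    """
    nuclei = [normalize(n) for n in vowel_nuclei(word)]
    for prev, nxt in zip(nuclei, nuclei[1:]):
        allowed = ALLOWED_AFTER.get(prev)
        if allowed is not None and nxt not in allowed:
            return False
    return True
```

All three example words from the text pass this check, while a made-up string such as "ылу" (ы followed by у) does not.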

It is also important to touch upon palatalization (“softening” of consonants before vowels). In Russian, it is found everywhere. In Yakut, palatalization before the vowels и, э, ө, ү, иэ, үө does not occur for all consonants. It is present for л, м, т, с. For н it is debatable, because there is a separate sound нь.

Additional letters and sounds

Let's describe the sounds that do not exist in Russian:

  • The letter combination дь represents the sound ɟ, which can be written approximately in Russian letters as дьй;

  • The letter combination нь represents the sound ɲ, which is similar to the Russian нь. However, the Russian нь is an alveolar sound (the front part of the tongue touches the alveolar ridge), whereas in the palatal nasal consonant ɲ the middle part of the tongue touches the hard palate. The simplest example is the English word onion or any Spanish word with the letter ñ, for example señor;

  • The sound ɣ, represented by the letter ҕ, is more familiar to the reader as the sound between г and х from southern Russian dialects, or the sound of the letter г in the words господи, ага, Бог;

  • The nasal sound ŋ, represented by the letter ҥ, is most likely familiar to the reader from English words like thing that end in -ing;

  • The letter ө represents the sound ø, which we know from “German words with dots”, for example schön, and which sounds like something in between о and у;

  • The letter һ represents the sound h, which the reader knows as the English “light” х made on the exhale, for example in the word high;

  • The letter ү represents the sound y, which you have probably all heard a million times in the word über in different contexts.
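The correspondences in this list fit in a small lookup table. The sketch below is only an illustration: it rewrites the Yakut-specific graphemes as the IPA symbols given above and leaves every other character untouched. The function name `mark_extra_sounds` is ours, and the code assumes Yakut input (in a Russian loanword the sequence нь would not stand for ɲ).

```python
# Grapheme-to-IPA pairs exactly as listed in the text.
EXTRA_SOUNDS = {
    "дь": "ɟ",
    "нь": "ɲ",
    "ҕ": "ɣ",
    "ҥ": "ŋ",
    "ө": "ø",
    "һ": "h",
    "ү": "y",
}


def mark_extra_sounds(word):
    """Replace Yakut-specific graphemes with IPA; keep everything else as is."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in EXTRA_SOUNDS:       # two-letter combinations first
            out.append(EXTRA_SOUNDS[pair])
            i += 2
        elif word[i] in EXTRA_SOUNDS:  # then single letters
            out.append(EXTRA_SOUNDS[word[i]])
            i += 1
        else:
            out.append(word[i])
            i += 1
    return "".join(out)
```

For example, `mark_extra_sounds("үө")` gives "yø", which makes it easy to spot in a word list where the non-Russian sounds occur.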

You can listen to examples of words with the specified sounds below:

Examples of new sounds in Yakut:

Diphthongs and long sounds

Phew, you can breathe out. Basically, diphthongs are just combinations of sounds, and long sounds are like two sounds in a row. This is related to stress, but more on that later.

Let's listen to these sounds.

Diphthongs and long sounds:

Stress

It is no secret that in English, for example, speech synthesis cannot do without converting words into phonemes, because there they write “Monday” but read “Thursday”.

Memes about English reading of words:
English meme 1

English meme 2

English meme 3

English meme 4

In Russian, things are better, but you need to know the stress. Apart from rare exceptions like “солнце” or “дожди”, once the stress is placed, the reading of a word is quite unambiguous.

But what about Yakut? It turns out that there is no stress as such. We asked various native speakers and linguists and did not get a consensus. Wikipedia says something strange on this topic.

Naturally, because the language has few native speakers and “simple” phonetics, we did not find any publicly available orthoepic dictionaries or dictionaries with stress marks.

But even if there is no stress, the system has to stay compatible with Russian, for example when a Russian word needs to be inserted into Yakut text without “adapting” it. And Russian words do need stress marks.

For this reason, our colleague who speaks Yakut sat down and listened to recordings in Yakut, and concluded that the stress is weak and falls either on all syllables at once or on the syllable with a diphthong or double vowel.

By parsing a corpus of words collected from Yakut-language media sites, we were able to place “stresses” using rules, in a way that stays backwards compatible with Russian.
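A rule of this kind could look roughly like the sketch below. It is not our production code, and several details are assumptions made for illustration: the "+" marker placed before the stressed vowel, picking the first diphthong or long vowel when a word has several, and falling back to the last vowel when it has none.

```python
DIPHTHONGS = ("ыа", "уо", "иэ", "үө")
LONG_VOWELS = ("аа", "оо", "ыы", "уу", "ии", "ээ", "үү", "өө")
VOWELS = "аоыуиэүө"
STRESS_MARK = "+"  # hypothetical marker, written before the stressed vowel


def place_stress(word):
    """Insert STRESS_MARK into a lowercase word using simple rules."""
    if STRESS_MARK in word:
        return word  # already marked, e.g. a Russian word stressed upstream
    # Rule from the text: stress goes where there is a diphthong or long vowel.
    # Assumption: if there are several, the first one wins.
    for i in range(len(word) - 1):
        pair = word[i:i + 2]
        if pair in DIPHTHONGS or pair in LONG_VOWELS:
            return word[:i] + STRESS_MARK + word[i:]
    # Assumption (not from the text): otherwise mark the last vowel.
    for i in range(len(word) - 1, -1, -1):
        if word[i] in VOWELS:
            return word[:i] + STRESS_MARK + word[i:]
    return word  # no vowels at all, leave unchanged
```

With this convention `place_stress("ылаар")` returns "ыл+аар", and a word that already carries a mark passes through unchanged, which is what keeps the scheme compatible with pre-stressed Russian words.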

Modeling

Okay, great. Palatalization is not very pronounced, almost everything is written as it is heard, and stress could be omitted, but we had better mark it.

In essence, after recording speech corpora in Russian and Yakut, all that remains is to answer a few questions (or test a few hypotheses) to settle the design of the speech synthesis system:

  • Graphemes or phonemes? Here the choice is definitely in favor of graphemes. Both languages have “phonetic” writing;

  • In our experiments with phonemes, models where palatalization is split out into a separate symbol worked better. In our experiments with graphemes, things worked better when we simply fed the graphemes “as is”. So here, too, there is not much to choose from;

  • How to feed in stress is not really an open question either: a separate symbol, in both languages. We tried NOT giving stress for Yakut, but the model started to “get confused”. And if you don't give stress for Russian… well, no need to continue;

  • In essence, it only remains to decide how to represent diphthongs and long vowels. We tried both giving them dedicated symbols and the naive approach, but did not see a big difference. So we chose not to multiply entities.
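The two options for diphthongs and long vowels can be sketched as a single tokenizer with a switch. This is an illustration under our own naming (`tokenize`, `merge_pairs`, `build_vocab`), not the actual front end of the model: with `merge_pairs=False` every character is its own token (the naive approach), and with `merge_pairs=True` each diphthong or long vowel becomes a dedicated token.

```python
DIPHTHONGS = ("ыа", "уо", "иэ", "үө")
LONG_VOWELS = ("аа", "оо", "ыы", "уу", "ии", "ээ", "үү", "өө")
MERGED = set(DIPHTHONGS) | set(LONG_VOWELS)


def tokenize(text, merge_pairs=False):
    """Split text into grapheme tokens.

    merge_pairs=False: naive scheme, one token per character.
    merge_pairs=True: diphthongs and long vowels get dedicated tokens.
    """
    text = text.lower()
    tokens = []
    i = 0
    while i < len(text):
        if merge_pairs and text[i:i + 2] in MERGED:
            tokens.append(text[i:i + 2])
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens


def build_vocab(tokens):
    """Map each distinct token to an integer id, in sorted order."""
    return {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}
```

The naive scheme needs no symbols beyond the alphabet itself plus the stress mark, while the merged scheme adds up to twelve extra entries to the vocabulary, which is the "multiplying entities" cost mentioned above.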

It turns out that if you handle stress correctly, you get an easy-to-use speech synthesis system that works in two languages at once and also lets you “speak Russian with a Yakut accent” and vice versa.

Russian speech with a Yakut accent

As a bonus, here are some audio samples of Russian synthesized speech with a “Yakut accent”.

Speech in Russian with a Yakut accent:

Results

There wasn't much of a thriller here: we simply explored the subject area carefully and built speech synthesis for Yakut that also works in Russian. Despite the apparent exoticism and complexity, we were in fact very “lucky” with the phonetic alphabet of the language and with the fact that one of us speaks Yakut.

It should also be noted that, contrary to the popular trend of “throwing all the data into a network with hundreds of billions of parameters”, our model runs locally even on 1-4 processor threads and is quite fast.

Model speed:

The table shows seconds of generated audio per second.

Model speed
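For readers who want to reproduce this kind of measurement on their own model, the metric is just the duration of the generated audio divided by the wall-clock time spent generating it. A minimal sketch follows; `synthesize` is a stand-in for any callable that returns a sequence of audio samples, and the function names are ours.

```python
import time


def speed_ratio(num_samples, sample_rate, elapsed_seconds):
    """Seconds of generated audio per second of wall-clock time."""
    audio_seconds = num_samples / sample_rate
    return audio_seconds / elapsed_seconds


def measure(synthesize, text, sample_rate):
    """Time one call to synthesize(text) and return its speed ratio.

    synthesize is a placeholder for a real model: any callable that
    takes text and returns a sequence of audio samples.
    """
    start = time.perf_counter()
    samples = synthesize(text)
    # Guard against a zero timer reading on very fast calls.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return speed_ratio(len(samples), sample_rate, elapsed)
```

For instance, 10 seconds of audio at 48 kHz (480000 samples) generated in 2 seconds gives `speed_ratio(480000, 48000, 2.0) == 5.0`. A value above 1 means faster than real time.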

There are 8 more languages ahead of us, but, unfortunately, it won’t be so “easy” anymore…

P.S.

I struggled with audio hosting for a long time; it does not accept some files at all. So if anyone is really interested in all the files, here is a link to Google Drive.
