Speech synthesis in the context of "Homograph"

⭐ Core Definition: Speech synthesis

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition.

Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output.
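
To make the concatenative approach concrete, the sketch below joins short recorded units from a hypothetical diphone database, applying a brief linear crossfade at each join to avoid audible clicks. The unit names and waveforms are illustrative stand-ins, not a real inventory.

```python
import numpy as np

SAMPLE_RATE = 16_000  # Hz; assumed for every unit in the toy database

def crossfade_concat(units, overlap=160):
    """Concatenate waveform units, blending `overlap` samples (10 ms at
    16 kHz) at each join with a linear crossfade to avoid clicks."""
    out = units[0].astype(np.float64)
    fade = np.linspace(0.0, 1.0, overlap)
    for unit in units[1:]:
        unit = unit.astype(np.float64)
        # Fade the tail of the accumulated output into the head of the next unit.
        out[-overlap:] = out[-overlap:] * (1.0 - fade) + unit[:overlap] * fade
        out = np.concatenate([out, unit[overlap:]])
    return out

# Hypothetical database: diphone name -> recorded waveform (noise stands in here).
database = {name: np.random.randn(1600) for name in ["h-e", "e-l", "l-ou"]}
wave = crossfade_concat([database[name] for name in ["h-e", "e-l", "l-ou"]])
print(wave.shape)  # three 1600-sample units joined with 160-sample overlaps
```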

👉 Speech synthesis in the context of Homograph

A homograph (from the Greek: ὁμός, homós 'same' and γράφω, gráphō 'write') is a word that shares the same written form as another word but has a different meaning. However, some dictionaries insist that the words must also be pronounced differently, while the Oxford English Dictionary says that the words should also be of "different origin". In this vein, The Oxford Guide to Practical Lexicography lists various types of homographs, including those in which the words are distinguished by belonging to different word classes, such as hit the verb ('to strike') and hit the noun ('a strike').

If, when spoken, the meanings can be distinguished by different pronunciations, the words are also heteronyms. Words with the same spelling and pronunciation (i.e. words that are both homographs and homophones) are considered homonyms, although in a broader sense the term "homonym" may be applied to words with the same spelling or pronunciation. Homograph disambiguation is critically important in speech synthesis, natural language processing, and other fields. Identically written different senses of what is judged to be fundamentally the same word are called polysemes; for example, wood (the substance) and wood (an area covered with trees).
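
To make the disambiguation step concrete, here is a toy sketch in which a TTS front end picks a pronunciation for a homograph from a part-of-speech tag. The tag set, the transcription table, and the fallback rule are illustrative assumptions; real systems use statistical taggers or neural models.

```python
# Toy homograph table: word -> {POS tag: ARPABET-style transcription}.
PRONUNCIATIONS = {
    "read": {"VB": "R IY D",    # present tense: /riːd/
             "VBD": "R EH D"},  # past tense: /rɛd/
    "lead": {"VB": "L IY D",    # verb: /liːd/
             "NN": "L EH D"},   # the metal: /lɛd/
}

def pronounce(word, pos_tag):
    """Return a transcription, falling back to the first listed variant."""
    variants = PRONUNCIATIONS.get(word.lower())
    if variants is None:
        raise KeyError(f"{word!r} is not in the homograph table")
    return variants.get(pos_tag, next(iter(variants.values())))

print(pronounce("read", "VBD"))  # R EH D, as in "she read the book"
print(pronounce("lead", "NN"))   # L EH D, as in "a lead pipe"
```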

Explore More Topics in this Dossier

Speech synthesis in the context of Speech processing

Speech processing is the study of speech signals and the methods used to process them. The signals are usually handled in a digital representation, so speech processing can be regarded as a special case of digital signal processing applied to speech signals. Aspects of speech processing include the acquisition, manipulation, storage, transfer and output of speech signals. Speech processing tasks include speech recognition, speech synthesis, speaker diarization, speech enhancement, and speaker recognition.
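
As a minimal example of speech processing as digital signal processing, the sketch below splits a signal into short overlapping frames and computes each frame's short-time energy, a basic building block of tasks such as recognition and enhancement. The synthetic sine wave stands in for a real speech recording.

```python
import numpy as np

fs = 16_000                              # sampling rate in Hz
t = np.arange(fs) / fs                   # one second of samples
signal = np.sin(2 * np.pi * 220 * t)     # synthetic stand-in for speech

frame_len, hop = 400, 160                # 25 ms frames with a 10 ms hop
n_frames = 1 + (len(signal) - frame_len) // hop
energy = np.array([
    np.sum(signal[i * hop : i * hop + frame_len] ** 2)
    for i in range(n_frames)
])
print(n_frames, energy[:3])              # one energy value per frame
```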

Speech synthesis in the context of Pitch contour


In linguistics, speech synthesis, and music, the pitch contour of a sound is a function or curve that tracks the perceived pitch of the sound over time. Pitch contour may include multiple sounds utilizing many pitches, and can relate the frequency function at one point in time to the frequency function at a later point.

It is fundamental to the linguistic concept of tone, where the pitch or change in pitch of a speech unit over time affects the semantic meaning of a sound. It also indicates intonation in pitch accent languages.
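
A simple way to make the idea concrete: the sketch below estimates a pitch contour by running a plain autocorrelation peak search on short frames. The voicing threshold and the synthetic rising-pitch test signal are illustrative assumptions; production trackers such as YIN or RAPT are considerably more robust.

```python
import numpy as np

def frame_pitch(frame, fs, fmin=50.0, fmax=500.0):
    """Return an F0 estimate in Hz for one frame, or 0.0 if judged unvoiced."""
    frame = frame - frame.mean()
    # Autocorrelation at non-negative lags.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)   # lag range for 50-500 Hz
    if hi >= len(ac):
        return 0.0
    lag = lo + int(np.argmax(ac[lo:hi]))
    # Crude voicing test: peak must be a sizeable fraction of the energy.
    return fs / lag if ac[lag] > 0.3 * ac[0] else 0.0

fs = 16_000
t = np.arange(2 * fs) / fs
tone = np.sin(2 * np.pi * (120 + 40 * t) * t)    # rising-pitch test signal
frames = tone.reshape(-1, 400)                    # 25 ms frames
contour = [frame_pitch(f, fs) for f in frames]    # the pitch contour
print(contour[:5], contour[-5:])                  # F0 rises over time
```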

Speech synthesis in the context of Werner Meyer-Eppler

Werner Meyer-Eppler (30 April 1913 – 8 July 1960) was a Belgian-born German physicist, experimental acoustician, phonetician and information theorist.

Meyer-Eppler was born in Antwerp. He studied mathematics, physics, and chemistry, first at the University of Cologne and then in Bonn, from 1936 until 1939, when he received a doctorate in physics. From 1942 to 1945 he was a scientific assistant at the Physics Institute of the University of Bonn, and from the time of his habilitation on 16 September 1942 he was also a lecturer in experimental physics. After the end of the war, Meyer-Eppler turned his attention increasingly to phonetics and speech synthesis. In 1947 he was recruited by Paul Menzerath to the faculty of the Phonetic Institute of the University of Bonn, where he became a scientific assistant on 1 April 1949. During this time, Meyer-Eppler published essays on synthetic language production and presented American inventions such as the Coder, the Vocoder, and the Visible Speech Machine. He contributed to the development of the electrolarynx, which is still used today by the speech-impaired.

Speech synthesis in the context of Concatenative synthesis

Concatenative synthesis is a technique for synthesising sounds by concatenating short samples of recorded sound (called units). The duration of the units is not strictly defined and may vary according to the implementation, roughly in the range of 10 milliseconds up to 1 second. It is used in speech synthesis and music sound synthesis to generate user-specified sequences of sound from a database (often called a corpus) built from recordings of other sequences.

In contrast to granular synthesis, concatenative synthesis is driven by an analysis of the source sound, in order to identify the units that best match the specified criterion.
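
The sketch below illustrates this selection step with a deliberately tiny example: for each target it greedily picks the corpus unit minimising a weighted sum of a target cost (distance to a requested pitch) and a concatenation cost (pitch jump from the previous unit). The corpus, features, and weights are illustrative assumptions; real systems use richer features and a Viterbi search over the whole sequence.

```python
corpus = [  # hypothetical units: (name, mean pitch in Hz)
    ("u1", 110.0), ("u2", 150.0), ("u3", 200.0), ("u4", 240.0),
]

def select_units(target_pitches, w_target=1.0, w_concat=0.5):
    """Greedy unit selection against a sequence of target pitches."""
    chosen, prev_pitch = [], None
    for target in target_pitches:
        def cost(unit):
            name, pitch = unit
            c = w_target * abs(pitch - target)        # target cost
            if prev_pitch is not None:
                c += w_concat * abs(pitch - prev_pitch)  # concatenation cost
            return c
        best = min(corpus, key=cost)
        chosen.append(best[0])
        prev_pitch = best[1]
    return chosen

print(select_units([120.0, 160.0, 235.0]))  # ['u1', 'u2', 'u4']
```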

Speech synthesis in the context of Linear predictive coding

Linear predictive coding (LPC) is a method used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital signal of speech in compressed form, using the information of a linear predictive model.

LPC is the most widely used method in speech coding and speech synthesis. It is a powerful speech analysis technique, and a useful method for encoding good quality speech at a low bit rate.
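
For a sense of how the linear predictive model is fitted in practice, the sketch below computes order-p prediction coefficients from a frame's autocorrelation with the Levinson-Durbin recursion. The frame here is a synthetic windowed sine, purely for illustration.

```python
import numpy as np

def lpc(frame, order):
    """Return coefficients a[1..order] of the predictor
    x[n] ~ a[1]*x[n-1] + ... + a[order]*x[n-order]."""
    # Autocorrelation at lags 0..order.
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][:order + 1]
    a = np.zeros(order)
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this step of the recursion.
        k = (r[i] - np.dot(a[:i - 1], r[i - 1:0:-1])) / err
        if i > 1:
            a[:i - 1] -= k * a[i - 2::-1]  # update earlier coefficients
        a[i - 1] = k
        err *= 1.0 - k * k                 # remaining prediction error
    return a

fs = 8_000
t = np.arange(200) / fs
frame = np.sin(2 * np.pi * 440 * t) * np.hamming(200)  # synthetic test frame
print(lpc(frame, order=8))  # compact description of the spectral envelope
```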

Speech synthesis in the context of Multimodal interaction

Multimodal interaction provides the user with multiple modes of interacting with a system. A multimodal interface provides several distinct tools for input and output of data.

Multimodal human-computer interaction involves natural communication with virtual and physical environments. It facilitates free and natural communication between users and automated systems, allowing flexible input (speech, handwriting, gestures) and output (speech synthesis, graphics). Multimodal fusion combines inputs from different modalities, addressing ambiguities.
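
As a toy illustration of multimodal fusion, the sketch below combines hypothesis scores from two stand-in recognisers (speech and pointing gesture) by a weighted sum, so one modality can resolve an ambiguity in the other. The hypotheses, scores, and weights are invented for the example.

```python
def fuse(scores_by_modality, weights):
    """Return the hypothesis with the highest weighted combined score."""
    combined = {}
    for modality, scores in scores_by_modality.items():
        w = weights[modality]
        for hypothesis, score in scores.items():
            combined[hypothesis] = combined.get(hypothesis, 0.0) + w * score
    return max(combined, key=combined.get)

# "Delete that" is ambiguous from speech alone; the pointing gesture
# disambiguates which object the user means.
print(fuse(
    {"speech": {"delete file": 0.5, "delete folder": 0.5},
     "gesture": {"delete file": 0.9, "delete folder": 0.1}},
    weights={"speech": 0.6, "gesture": 0.4},
))  # -> "delete file"
```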
