“For the formations of sound are the curtain behind which the secret of concepts lies hidden, a secret that awaits disclosure by the student of language.” (Pott)
The skeleton of language is formed by those phonetic utterances into which significancy must be breathed before they can become living speech. They are the outward vestment of the thought that lies within, the material in which the mind of man finds its expression. Thought, it is true, may be conveyed through gesture and picture-writing as well as through phonetic utterance, but in phonetic utterance alone does it find a vehicle sufficient and worthy of itself. Like the marble in the hands of the sculptor, however, sound not only embodies meaning; it also limits and defines the expression of that meaning, and confines it within barriers which it may not pass. The language of man is conditioned by his physical structure and organization.
What anatomy is to physiology, that phonology is to the science of language. Comparative philology is based upon phonetic laws; the relation of words, of forms, of dialects, and of languages is determined by the laws which govern their outward shape. Languages are grouped together because they have a common stock of roots and a common grammar; and the identity of roots and of grammar is on the outward side an identity of phonetic sound. The laws of scientific philology are for the most part the laws which regulate the change of sounds, and these are dependent on the physiological structure of the organs of speech. The priority of sounds, of words, and even of dialects, is frequently to be discovered by an appeal to the formation of the throat and lips. We may lay down the general rule that the harder sound passes into the easier, rather than the easier into the harder; but it lies with phonology and physiology to determine which is really the harder sound. It is phonology which has created the modern science of language, and phonology may therefore be forgiven if it has claimed more than rightfully belongs to it or forgotten that it is but one side and one branch of the master science itself.
The empirical laws of the interchange and equivalence of sounds in a special group of tongues are ascertained by comparative philology; the explanation of these laws, the assignment of their causes, the determination of the order followed by phonetic development or decay, belong to the province of phonology. Phonology touches on the one hand upon physics in so far as it is concerned with the analysis of the sounds of speech, and on the other upon physiology in so far as it studies the nature and operations of the vocal organs themselves. It is, in fact, as much a branch of physiology as it is of the science of language, dealing as it does with a special department of physiology; but it passes beyond the province of physiology when it investigates the nature of the sounds produced by the activity of those organs with which alone physiology is concerned. But whether it touches upon physiology or upon physics, phonology is equally one of the physical sciences, pursuing the same method and busied with the same material. So long as philological research is purely phonological, so long have we to do with a physical science; it is only when we turn to the other problems of glottology, only when we pass from the outward vesture of speech to the meaning which it clothes, that the science of language becomes a historical one. The inner meaning of speech is the reflection of the human mind, and the development of the human mind must be studied historically. Those, therefore, who refuse to regard glottology as other than a physical science, take as it were but a half-view of it; they are forced to confine themselves to its outward texture, to be content with a mere description of the different families of speech and their characteristics, like the botanist or the zoologist, and to leave untouched the many questions and problems which a broader view of the science would present to them. 
It is true that even upon the broader view, the method of the science is as much that of the physical sciences as the method of geology; it is also true that the doctrine of evolution has introduced what may be termed the historical treatment even into botany and zoology; but nevertheless linguistic science as a whole must be included among the historical ones, unless we are to narrow its province unduly and identify it with the subordinate science of phonology. The physical science will give us the skeleton of speech, the dry bones of the anatomist’s dissecting-room; for life and thought we must turn to history.
We must not forget, however, that we can understand the past only by the help of the present. An antiquarian study of philology will enable us to trace the history of words and forms, to group languages into families, and to discover the empirical laws of phonetic change; to interpret and verify these laws, to correct our classifications and conclusions, to learn what sounds really are, we must examine the living idioms of the modern world. The method of science is to work back from the known to the unknown, and if we are to study glottology to any purpose and to extend and confirm its generalizations, it must be by first observing and experimenting on actual speech. We must begin by disabusing our minds of the belief that words consist of letters and not of sounds; on the contrary, letters are at best but guides to the sounds they represent, and only the experienced student of actual sounds is in a position to determine their real value. Phonology stands at the threshold of linguistic science, and those alone who have honestly wooed and won her can enter into the shrine within. The physical science leads upward to the historical science; the key to the past is to be found in the present.
Now the first question we have to ask is, What is a sound? The most general answer we can give to this question is that a sound is the impression made upon the organs of hearing by the rapid swinging of an elastic body in an elastic medium, which is usually the air. The vibrations set on foot by this rapid swinging reach the ear under the form of waves, and these may succeed each other at either irregular or regular intervals. In the first case we have what is called a noise—a source of constant delight to the savage and the infant, but exceedingly painful to the sensitive ear. In the second case musical tones are produced, among which must be counted the utterances of articulate speech. Tones, or rather full tones (as opposed to partial ones), are distinguished from each other by their (1) strength or loudness, their (2) height or pitch, and their (3) quality or timbre. The strength depends upon the amplitude of the vibrations produced in the elastic medium, the pitch on the number of the vibrations in any given space of time, or, what amounts to the same thing, on the length of time occupied by each vibration, and the timbre (also called “tone”) on the form assumed by the vibrations or waves of sound, that is to say, on the relations of the vibrations one to the other.
There are but few musical instruments that produce a simple tone; in fact, among those usually employed the tuning-fork is almost the only one from which we can hear it. All other musical tones result from a combination of simple, or as they have sometimes been termed, “partial” tones, whose double vibrations or “swing-swangs,” as De Morgan named them, stand to one another in the relation of 1, 2, 3, 4, &c. The Pythagoreans of the fourth century B.C. were already acquainted with the fact that the respective lengths of the fundamental note with its octave, fifth and fourth, must be as one to two, as two to three, and as three to four.[136] This fundamental note, or deepest partial tone, is the starting-point from which we ascend upwards; it forms the standard by which the pitch or ascending scale of sounds is measured, while the remaining partial tones go by the name of the harmonics or upper tones. The partial tones coalesce so closely into a full tone as almost to escape the notice even of the trained ear, but their co-existence may be easily detected by the help of resonatory instruments. The full tones themselves, however, which we shall henceforth call tones or notes,[137] may not be able to make the impression upon the nerves of hearing needful for conveying a sense of sound to the brain within. The tone produced by any number of vibrations less than sixteen a second is wholly inaudible except by the help of the microphone, and even this number of vibrations brings out so deep a pitch as to be scarcely perceptible.[138] “For practical purposes,” says Professor Max Müller,[139] “the lowest tone we hear is produced by thirty double vibrations in one second, the highest by 4,000. Between these two lie the usual seven octaves of our musical instruments. 
It is said to be possible, however, to produce perceptible musical tones through eleven octaves, beginning with sixteen and ending with 38,000 double vibrations in one second, though here the lower notes are mere hums, the upper notes mere clinks.” The sense of sound is not stronger and more trustworthy than the other senses of sight, of touch, of taste, of smell. On all sides we are strictly limited by the conditions which surround us, and even science, though she may assist the senses by instruments which enlarge and extend their powers, reaches at last a boundary which she cannot pass. The world is a vast sounding-board, even if we know it not; the infinitesimally small and the infinitesimally great alike lie beyond our apprehension. Above and below there is infinity, and “the music of the spheres,” of which the old Greek thinkers dreamed, is not, after all, so very far removed from the truth that science has revealed to us. The notes or partial tones that we hear are the purely mechanical product of a definitely determined number of double vibrations, and the variations in pitch we notice between them are due to the length of time occupied by these vibrations. If, for instance, one note takes half the time another does, if the number of oscillations in the second is twice that required by the fundamental note, the interval between the two notes is what is called an octave. If, again, the proportion between the two notes is as three to two, three waves of the one occupying the same time as two waves of the other, the interval between them is a fifth; while a major sixth represents the interval between two notes, which stand to each other as five to three. Consequently, if we divide into two equal parts a tense cord, which, when made to vibrate throughout its whole length, yields its fundamental note, and vibrate either part, we shall hear the octave above that fundamental note. 
In other words, the number of the vibrations of any two cords having the same degree of tension is (other things being equal) inversely as their length. In the case of two elastic rods or rigid tongues, the number of vibrations is inversely as the square of the length; hence an elastic rod six inches long will vibrate four times more rapidly than a rod of the same material and equal thickness twelve inches long. The number of vibrations is also dependent on the thickness and tension of the cords or rods, being inversely as the thickness of the cords and directly as the thickness of the rods, and in both cases proportional to the square root of their tension. It must be remembered that membranous tongues like our own chordæ vocales, act in accordance with the same general law as tense cords and not as elastic rods.
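The proportions stated above lend themselves to a small worked check. The following is only a sketch under stated assumptions: the function names (`cord_ratio`, `rod_ratio`, `octaves_between`) are illustrative and not drawn from the text, and each law is taken exactly as given, namely that the vibrations of cords are inversely as length and thickness and directly as the square root of tension, and those of rods inversely as the square of the length and directly as the thickness.

```python
from fractions import Fraction
from math import log2, sqrt

# Interval ratios named in the text (vibrations of higher note : lower note).
INTERVALS = {
    "octave": Fraction(2, 1),
    "fifth": Fraction(3, 2),
    "fourth": Fraction(4, 3),
    "major sixth": Fraction(5, 3),
}


def cord_ratio(length_ratio, thickness_ratio=1.0, tension_ratio=1.0):
    """Ratio of vibration numbers of two tense cords (hypothetical helper).

    Each argument is the ratio of the second cord to the first.
    Inversely as length and thickness, directly as sqrt(tension).
    """
    return sqrt(tension_ratio) / (length_ratio * thickness_ratio)


def rod_ratio(length_ratio, thickness_ratio=1.0):
    """Ratio of vibration numbers of two elastic rods (hypothetical helper).

    Inversely as the square of the length, directly as the thickness.
    """
    return thickness_ratio / length_ratio ** 2


def octaves_between(low, high):
    """Number of octaves spanned by two vibration numbers per second."""
    return log2(high / low)


if __name__ == "__main__":
    print(cord_ratio(0.5))             # half the cord: the octave above
    print(rod_ratio(6 / 12))           # six-inch rod against twelve-inch rod
    print(octaves_between(30, 4000))   # the "usual seven octaves"
    print(octaves_between(16, 38000))  # the extreme range of "eleven octaves"
```

On these assumptions a fifth followed by a fourth makes up exactly an octave (3/2 × 4/3 = 2), and the two ranges quoted from Professor Max Müller come out at a little over seven and a little over eleven octaves respectively.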
Every body capable of producing sound has a tone peculiar to itself; a stringed instrument, for instance, and a trombone differ in the tones they give forth, and we may even divide the air into definitely circumscribed portions, or “chambers of resonance,” each of which will have its own peculiar tone. The form assumed by the double vibrations, the ultimate causes of sound, determines these differences in the quality of the tones we hear. Sometimes the vibrations will run in zigzag course through the elastic medium; sometimes their shape will be rounded; sometimes, again, it will be angular. The simplest wave of sound, that produced by a tuning-fork, flows in a succession of spiral lines, and the partial tones or harmonics of other instruments may also be assumed to be so many simple waves of sound of the same form. In fact, even if a harmonic may be resolved into a combination of other harmonics or partial tones, and these again into yet simpler and fainter harmonics, we must come at last to simple notes, corresponding with the note emitted by the tuning-fork and composed of vibrations that have the same spiral shape. It is the varying amalgamation of these simple spirals that occasions the varying forms of the full tones; each full tone (the simple tone alone excepted) being made up of harmonics and consequently of their spirals in different proportions, and in this difference of mixture lies the difference of quality in the tones we hear.
Ohm, Fourier, and others first proved that the simple pendulous oscillation is the only vibration unaccompanied by harmonics, and that all full tones can be decomposed into the simple vibrations of which they consist. Helmholtz has now ascertained the exact form of many of these compound tones, as well as the conditions under which the by-notes or harmonics are present or absent. In the violin, for example, as compared with the guitar or the pianoforte, he finds that the primary note is strong, the partial tones from two to six weak, and those from seven to ten clearer and more distinct.[140] He was first led to detect the variations of form they assume by applying a microscope to the vibrations of different musical instruments, and the fact was further confirmed by the discovery made by himself and Donders that the sounds articulated by the human voice are composed of vibrations which each assume their own special shape. The phonautographs since constructed by Scott and König actually delineate the forms of these waves of sound either on a plate of sand, or in the flickerings of a gas-flame, or in the movements of a writing pencil, and the microscopic examination of the impressions produced by articulate sounds in the tinfoil of the phonograph shows a series of indentations of various but determinate shapes.
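The decomposition of a full tone into its simple vibrations may be illustrated numerically. What follows is a minimal sketch, not a reconstruction of Helmholtz's procedure: it builds one period of a compound tone from a few partial tones and then recovers the strength of each by a direct Fourier sum. The function names and the amplitudes chosen are hypothetical and serve only for illustration; they are not measurements of any instrument.

```python
import math


def synthesize(partials, n_samples):
    """One period of a full tone.

    `partials` maps the number of a partial tone (1 = fundamental)
    to its amplitude; each partial is a simple pendulous oscillation.
    """
    return [
        sum(amp * math.sin(2 * math.pi * k * t / n_samples)
            for k, amp in partials.items())
        for t in range(n_samples)
    ]


def partial_amplitudes(samples, max_partial):
    """Recover the amplitude of partials 1..max_partial by a Fourier sum.

    Valid while max_partial is below half the number of samples.
    """
    n = len(samples)
    found = {}
    for k in range(1, max_partial + 1):
        re = sum(s * math.cos(2 * math.pi * k * t / n)
                 for t, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * t / n)
                 for t, s in enumerate(samples))
        found[k] = 2 * math.hypot(re, im) / n
    return found


if __name__ == "__main__":
    # Hypothetical compound tone: strong fundamental, weaker upper tones.
    full_tone = synthesize({1: 1.0, 2: 0.5, 3: 0.25}, 64)
    for k, amp in partial_amplitudes(full_tone, 5).items():
        print(k, round(amp, 6))

    # A simple tone, such as that of the tuning-fork, has no harmonics.
    simple_tone = synthesize({1: 1.0}, 64)
    print(partial_amplitudes(simple_tone, 3))
```

The analysis returns the very proportions that were put into the mixture, while for the simple tone every partial above the first comes out as practically nothing.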
The number of forms which can be assumed by the waves of sound is naturally limited in kind, while various bodies may emit sounds containing the same harmonic or partial tone. The quality or timbre which depends on the relation and strength of these partial tones, and of the composite form assumed by the sum of their vibrations, constitutes what we have called a peculiar tone. This, as we have seen, is a simple one in the case of the tuning-fork, but in other cases it forms part of a full or complex group. We may find an illustration in the characteristic lines of light which we learn from the spectrum analysis are projected by substances; where we are dealing with a simple elementary substance, the line thrown upon the spectrum is correspondingly simple; where, on the other hand, the substance is compound, its spectrum also is compound, reflecting the several chemical elements of which it is made up. The simple spectrum answers to the simple harmonic or partial tone with its varying pitch and invariable form, just as the compound spectrum answers to the full note or peculiar tone with its characteristic quality and diversified grouping of partial tones. Now, if a body which has a certain peculiar tone is struck by a sound which contains a partial tone in any way similar to this peculiar tone, the body in question vibrates in sympathy, and we hear what is known as a by-note or harmonic. This by-note reacts upon the partial tone which has caused it, strengthening the partial tone and so modifying the quality of the complex sound. If, for instance, we play a note such as C on a violin, the strings of a piano representing C as well as the harmonics allied to it will vibrate in sympathy. Of course the more elastic the body which is struck, the louder and clearer will be the by-note, and of all elastic bodies none are better than those chambers of resonance into which we can divide the air. 
Such chambers of resonance are afforded by wind instruments of all kinds, whose shape determines the peculiar tone they are to emit. If the instrument is so constructed as to change its shape at will, now round, now straight, now broad, now narrow, the number of different chambers of resonance, and consequently the number of different peculiar tones, may be almost indefinitely increased.
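The sympathetic vibration described above, where a body answers to a partial tone that agrees with its own peculiar tone, can be roughly imitated by treating each string as a lightly damped oscillator driven from without. This is a modern simplification assumed for the sake of illustration and is not taken from the text; the damping value, the threshold, the string pitches, and the names `gain` and `sympathetic_strings` are all hypothetical.

```python
import math


def gain(natural_hz, driving_hz, damping=0.01):
    """Steady-state response of a damped oscillator, relative to its
    response under a steady push (standard driven-oscillator formula).

    `damping` is a hypothetical dimensionless loss factor.
    """
    r = driving_hz / natural_hz
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (damping * r) ** 2)


def sympathetic_strings(fundamental_hz, string_hz, n_partials=4, threshold=10.0):
    """Strings that answer strongly to some partial of the sounded note.

    The sounded note is assumed to contain partials at 1, 2, 3, ...
    times its fundamental; `threshold` is an arbitrary cut-off.
    """
    partials = [k * fundamental_hz for k in range(1, n_partials + 1)]
    return [
        s for s in string_hz
        if max(gain(s, p) for p in partials) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical strings, in vibrations per second.
    strings = [100.0, 125.0, 200.0, 250.0, 300.0]
    print(sympathetic_strings(100.0, strings))  # -> [100.0, 200.0, 300.0]
```

Only the strings tuned to the sounded note and to its harmonics respond; those lying between them remain almost at rest.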
It is this variability of form which makes the human throat such a marvellous instrument for the production of manifold sounds. Like most chambers of resonance, it has the hollow reed-like shape which connects it most readily with the primary source of sound. In analyzing the material of language we must never forget that we have to do with the most perfect wind instrument that exists, a wind instrument, too, of infinite pliability and power of change, and thus in constant and ready sympathy with the harmonics that are struck by the other organs of speech.
We must now pass from the science of acoustics to the science of physiology. We have seen what are the conditions under which musical notes are produced, we have also seen that among these musical notes the utterances of articulate speech have to be classed; we have next to examine into the nature and conformation of the physical organs to which these utterances owe their origin. In the first place, the organs of speech may roughly be divided into three groups: the breathing apparatus, or lungs; the trachea or windpipe, with larynx and bronchial tubes; and the chamber of resonance, or mouth and nose. The lungs provide the material which is worked up into inarticulate noises and articulate sounds by the trachea and chamber of resonance. As long as the breath flows out of the throat and mouth quietly and without interruption language of any sort is out of the question. The organs of speech are at rest, and all that can be done is to propel the breath with greater or less violence. We may breathe hard through the mouth, we may even make noises like that of snorting through the nose, but as yet there is nothing which can constitute a starting-point for articulate speech.[141] Mere breath, as distinguished from voice, only supplies the material out of which words and sentences may afterwards be created. Voice is breath, acted upon and excited into waves of sound by the organs of the throat and mouth; a larger quantity of air than is needed for simple breathing is rapidly taken into the lungs, and immediately expelled in intermittent gusts, but with varying degrees of force. Almost all the sounds we utter are accompanied by expiration; only such sounds as an occasionally mispronounced ja in Germany or our own surprised Oh! are produced while the breath is being drawn in. Experiment will at once show how difficult it is to pronounce a sound at the same time that this is being done.
The breath, then, is the passive instrument through which language is formed by the trachea and chamber of resonance. This trachea is a long cartilaginous and elastic pipe ending in the bronchial tubes, through which the air is admitted to the lungs. Its upper part is termed the larynx, consisting of five cartilages and situated in the throat. The lowest of these cartilages is the cricoid, which resembles a ring with the broad flat surface turned downwards. Over this comes the cartilago thyroidea or Adam’s apple, with two wings which partly enclose the cartilago cricoidea, and form a link between it and the os hyoideum,[142] or bone of the tongue, which has somewhat of the shape of a horseshoe. The space surrounded by these two cartilages may be compared with a hollow reed, out of the back part of which a piece has been cut. From the base of the latter and the upper rim of the cartilago cricoidea spring two small pyramidal cartilages, the arytenoids, which resemble the horns of an ox and almost touch one another. Their roots are connected with one another and with the cricoid and thyroid cartilages by the so-called processus vocales, which in spite of their name have little to do with the formation of speech. The horns of the arytenoids serve to unite two elastic bands to the opposite surface of the thyroid cartilage. These bands are formed of muscle enveloped with mucous membrane, and are the famous chordæ vocales upon which as upon the strings of a piano the manifold modulations of human language are played. 
So long as they remain, the other vocal organs, not excluding the tongue, may be removed without depriving the patient of the faculty of articulate speech.[143] Their length differs in men and women, in children and adults; the average length in men being about one-third greater than in women, and occasioning the different pitch of male and female voices.[144] The two chordæ vocales run obliquely across the cavity enclosed between the thyroid cartilage and a small projection on the front part of the arytenoid cartilage, an aperture which is called the glottis, or glottis vera. They can be relaxed or contracted at will by the muscles of the cartilages to which they are attached, and a portion of them can even be deadened by pressure from a small protuberance on the under side of the epiglottis. The glottis itself is divided into two parts, one the space between the vocal chords and the lateral thyro-arytenoid and crico-arytenoid cartilages, the other the triangular space between the vocal chords themselves, the latter allowing a passage for breath, the former a passage for voice. Both spaces can of course be narrowed or enlarged by the contraction or relaxation of the vocal chords, and the junction of the latter will close one or both altogether. It is in this secret chamber that the phonetic substance of speech is moulded into shape; the vibrations of the chordæ vocales in the breath of the glottis are the ultimate cause of syllables and words.
Above this chamber of the voice the trachea or windpipe again widens, and a second chamber is formed by two cavities on either side, called the ventricles of the larynx (the ventriculi Morgagni). Each cavity leads, at the back, into a pouch of the mucous membrane called the laryngeal sac and covered with sixty or seventy mucous glands, the secretion from which acts like oil on a piece of machinery by keeping the vocal chords and the surrounding parts in a moist condition. Stretched across the cavities are two thick ligaments, the false vocal chords, like the true chordæ vocales below them. They differ from the vocal chords in having no muscle of their own, but like the latter can contract or enlarge at pleasure the false glottis (glottis spuria), the space, that is, which is enclosed between them. The false glottis, which, like the false vocal chords, takes no part in the creation of language, is shut by an elastic cartilage, called the epiglottis, the lower point of which is attached to the thyroid cartilage immediately above the chordæ vocales, while the upper end broadens out like a leaf and falls over the fissure of the false glottis. This corresponds with the entrance of the larynx. The upper surface of the epiglottis is concave, and in swallowing it is allowed to drop upon the larynx. At other times it may be depressed over the false and true vocal chords.
Such is the machinery whereby breath from the lungs is transformed into voice in its passage through the windpipe; and voice is next taken up by what we have termed the chamber of resonance and modified in various ways. If we may call the glottis the manufactory of voice, we may call the mouth and nose the manufactory of the articulate sounds into which voice is divided. At the back of the epiglottis lies the pharynx, leading into the œsophagus, and the pharynx is bounded on the side of the mouth by the posterior pillar or arcus pharyngo-palatinus, opposite to which is the anterior pillar or arcus glosso-palatinus. Between them are the tonsils, and above these again the uvula, a sort of pendent valve which hangs downwards from the top of the anterior pillar towards the posterior pillar behind. The uvula is attached to a piece of yielding muscle known as the soft palate or velum palati, which with the uvula separates the throat from the entrance to the nostrils. The soft palate can move either backwards or forwards; in pronouncing the guttural (ng) for instance, it is pressed forward against the tongue, shutting off the throat; in pronouncing the vowels, on the other hand, it is pressed backward, and so cuts off the flow of breath to the nose. Above the soft palate comes the arch of the hard palate or roof of the mouth, and below this the tongue with its two roots and pointed tip. The teeth that enclose the mouth, along with their alveolars that form the front wall of the hard palate, have much to do with the formation of specific sounds, while it is hardly necessary to refer to the phonological importance of both nose and lips. As is well known, a leading characteristic of cultivated English is the little use it makes of the latter.
It is now time to consider the precise parts played by these different organs of speech, in producing the various elements of spoken language. We must begin by putting out of sight all inarticulate sounds or noises, such as the clicks of the Bushman or the Hottentot, which have entered into the composition and framework of actual speech. Such inarticulate sounds are but the stepping-stones to real language, the first steps of the ladder, as it were, which were eventually to lead to articulate words. They are the natural cries of man like the natural cries of the animals from which they in no way differ; and just as on the one side the barking of the dog and the mewing of the cat are said to be attempts to imitate the human voice, so on the other hand the inarticulate cries of the infant or “non-speaker” are on the same level as the roar of the lion or the shriek of the cockatoo. We are told that the cynocephalic ape of the Upper Senegal, whose form is depicted on the monuments of ancient Egypt, utters clicks which sometimes contain a distinct d,[145] and the Bushmen themselves show a true instinct when they make the beasts in their fables talk not only with the clicks of the Bushman dialects, but even in the case of some animals with clicks that do not otherwise occur.[146] If we watch the first endeavours of children to speak, we may discover inarticulate noises gradually becoming articulate sounds with definite meanings, and we may even trace a recollection of the first efforts of man to create a language for himself in the guttural aspirates heard for instance in some of the Semitic dialects. Indeed, the name given to the hard breathing (h) by the Greeks, πνεῦμα δασύ or “rough aspirate,” reminds us of the guttural noises, not yet phonetic sounds, made by the child; in forming this sound we jerk out the breath at the same time that we narrow the glottis, adding if we like various degrees of hoarseness by further stopping its free flow. 
The glottal catch, which is heard in Danish after vowels, and according to Mr. Bell is substituted in the Glasgow pronunciation for “voiceless stops,” is really a mere cough. Even the spiritus lenis or soft breathing, heard before a vowel, partakes in some measure of the nature of a noise. It is true that the rough breathing cannot be sung while the soft breathing may be; but this is because in the case of the latter the breath is checked near the vocal chords and can therefore be intoned. Professor Max Müller is doubtless right in holding that all that the Greeks meant by πνεῦμα ψιλόν as opposed to πνεῦμα δασύ was “a negative definition of another breath which is free from roughness,”[147] just as the ĕ-psilon is negatively contrasted with the êta. Neither breathing was regarded as constituting as yet a true sound or “voice.”
The true sounds of language, however, were distinguished but roughly and imperfectly one from the other. Plato, in his Kratylus, divides them into φωνήεντα or “vowels,” and ἄφωνα or “mutes,” these last being further subdivided into semi-vowels which are neither vowels nor mutes (φωνήεντα μὲν οὔ, οὐ μέντοι γε ἄφθογγα) and ἄφθογγα or real mutes. The term ἄφωνα, mutes, afterwards came to be restricted in its sense as a simple equivalent of Plato’s ἄφθογγα, its place being taken by the term σύμφωνα or “consonants,” letters, that is to say, which must be sounded along with a vowel. These consonants were next classed as ἡμίφωνα or semi-vowels (l, m, n, r, and s), ὑγρά or “liquids” which covered all the semi-vowels with the exception of s, and ἄφωνα or “mutes.” The mutes fall into three classes, the ψιλά or “bare” (k, t, p), the δασέα or “aspirates” (kh, th, ph) and the μέσα which stood, as it were, “between” them. The Latin translation of the latter term has given us the mediæ of modern grammars.
Far more thorough-going and scientific were the phonological labours and classification of the Hindu prâtiśâkhyas. Instead of starting from written speech like the Greek grammarians, they had to do with an orally-delivered literature, and hence while the Greeks never got beyond the belief that the tongue, teeth, and lips were the sole instruments of pronunciation, the Hindus had carefully analyzed the organs of speech some centuries before the Christian era, and composed phonological treatises which may favourably compare with those of our own day. They knew, for example, that in sounding the tenues, or hard letters, the glottis is kept open, while in sounding the mediæ, or soft ones, it is closed; they knew also that e and o were diphthongs analyzable into a + i and a + u; and they explained k and g, p and b, as formed by complete contact of the vocal organs. They had noted the repha or “Newcastle burr,” and had divided the nasals into their several classes. The names they gave to the various sounds, and the groups into which they were classified, were descriptive of their mode of formation, like the names similarly applied by modern phonologists. Thus the guttural sibilant formed near the root of the tongue (χ) was called Jihvâmûlîya, “the tongue-root letter,” and the labial sibilant (φ) Upadhmânîya, “to be breathed upon.” The consonants were classed both according to the place where they were formed, and according to their prayatna, or “quality,” the mutes and nasals, for instance, being formed by “complete contact” of the vocal organs, the semi-vowels by “slight contact” (îshat sprishṭa), the sibilants by “slight opening” (îshad vivṛita), and the vowels by complete opening. A controversy even sprung up among the grammarians as to the extent of this opening of the organs. 
“Some ascribe to the semi-vowels duḥspṛishṭa, imperfect contact, or îshadaspṛishṭa, slight non-contact, or îshadvivṛita, slight opening; to the sibilants nemaspṛishṭa, half-contact; i.e., greater opening than is required for the semi-vowels, or vivṛita, complete opening; while they require for the vowels either vivṛita, complete opening, or aspṛishṭa, non-contact.”[148]
Leaving the speculations of the past, let us now pass on to the results which have been obtained by modern research. Thanks to the labours of men like Alexander Ellis, Melville Bell, Helmholtz, Czermak, Brücke, Sweet, and others, the mechanism of speech has been fairly settled; and though many points are still open to discussion, the main facts have been thoroughly ascertained and adequately explained. We have learnt the real nature and causes of those phonetic elements of speech which the old grammarians first tried to separate and classify; we have cleared away the confusion from which even the Vedic scholars of India could not wholly escape, and have discovered that in phonology as elsewhere, the convenient systems of practical life do not bear a close scientific investigation. Even the ordinary distinction of vowels and consonants is exposed to more than one objection. It rests not upon the essential character of the sounds themselves, but upon mere differences of function, and its advocates have to invent a series of semi-vowels or semi-consonants, a name which of itself indicates how incomplete and unsatisfactory the distinction must be. The distinction, indeed, has a basis of fact, but the fact is one which has been misapprehended or overlooked.
Apart from the respiratory organs which supply the fuel, the chief agents in the manufacture of speech are the throat and mouth. The breath, as it makes its way upward, passes the vocal chords and causes them to vibrate. The forms taken by the vibrations determine the quality or timbre of the sound to be uttered (the very essence of a vowel, for instance, consists in the quality of the voice), while the number of the vibrations determines its pitch.
In the pitch we have to distinguish between two things, the chest or true notes and the head or falsetto notes, respectively due to the position and action of the vocal chords. In the chest notes the vocal chords are stiffened and laid side by side, so that when the flow of breath comes from the lungs, they are forced aside for a moment, to spring back the next and cause a series of intermittent puffs of breath. In the falsetto notes, on the other hand, the muscles of the vocal chords are not contracted, nor is the glottis wholly closed; hence only the inner membrane of the chords is set in motion by the breath, and instead of actually meeting one another, the chords merely narrow or enlarge the aperture of the glottis.[149]
The forms assumed by the vibrations depend, of course, on the anatomical structure of the vocal chords, their greater or less elasticity, and the like. Besides quality and pitch, however, we must also take account of the intensity of the sound, this intensity or emphasis arising from the force with which the stream of breath is expelled from the lungs, and the corresponding strain of the muscles of the trachea and vocal chords.
In whispering, the amount of intensity is considerably diminished, though the pitch is quite as distinct as in loud voice. The glottis is not completely closed, but the upward flow of breath is not strong enough to do more than produce a sort of friction, or imperfect vibration in the vocal chords. The latter incline towards each other on the side furthest from the arytenoids, and so give the glottis a triangular shape; the larynx, however, may also assume other forms. Hence it is that we may distinguish three kinds of whispered voice. We may either have a soft whisper, where the whole glottis is narrowed, and the force with which the breath is emitted is very slight; or a medium whisper, where the force is greater, and only that part of the glottis left open which lies between the arytenoids; or a loud whisper, where the force is considerable, the false vocal chords are in close contact, and the epiglottis bent stiffly downwards, allowing but a very small opening for the escape of the breath. A loud whisper is rare; a medium whisper the most common. Sighing, it may be added, is produced above the larynx, which takes no part in its production; when the vocal chords are brought into action, the sigh becomes a groan.
It needs but a short experience to discover the numberless varieties of voice that may exist, and it is not uncommon for a blind man by this means not only to distinguish the age and sex of those he meets, but even to recognize his friends. In fact the human voice, from the deepest male to the highest female voice, has a range of nearly four octaves, the lowest note being E, produced by 80 vibrations per second, and the highest C, produced by 1,024 vibrations per second. But Vierordt has shown that in extreme cases its range is nearly 5½ octaves, from F (produced by 42 vibrations) to A (produced by 1,708 vibrations). In the same individual it is rare for the range of the voice to be more than two octaves, and in ordinary speech it is generally only half an octave. These different notes are due to changes in the length and tension of the vocal chords and their approximation or separation, the lower notes, for instance, requiring them to be longer, looser, and more widely separated than in the case of the higher notes, and consequently to admit a larger but less rapid current of air. It has been calculated that 240 different states of tension of the vocal chords must be accurately producible at will, in order to cause all the notes and intermediate tones heard in a perfect voice of ordinary range. Madame Mara could effect no fewer than 2,000 changes. The four chief varieties of the voice—the bass, the tenor, the contralto, and the soprano—are dependent on differences of pitch, that is ultimately on differences in the length of the vocal chords. The bass and the tenor with the intermediate baritone characterize the man, the contralto and soprano with the intermediate mezzo-soprano characterize the woman. The lowest note of the contralto is about an octave higher than the lowest note of the bass, the highest soprano about an octave higher than the highest tenor. 
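The ranges just quoted can be checked by a short calculation, since each octave doubles the number of vibrations and the compass in octaves is therefore the base-2 logarithm of the ratio between the highest and lowest notes. The function name `octaves` is a hypothetical one chosen for this sketch; the figures are those given in the text.

```python
import math


def octaves(low_vibrations: float, high_vibrations: float) -> float:
    """Compass in octaves between two notes, given vibrations per second."""
    # One octave corresponds to a doubling of the vibration number.
    return math.log2(high_vibrations / low_vibrations)


# Ordinary compass of the human voice: E (80) to C (1,024).
ordinary_range = octaves(80, 1024)

# Vierordt's extreme cases: F (42) to A (1,708).
extreme_range = octaves(42, 1708)

print(f"ordinary: {ordinary_range:.2f} octaves")
print(f"extreme:  {extreme_range:.2f} octaves")
```

Both results fall a little short of the round figures named, which agrees with the wording “nearly four” and “nearly 5½” octaves.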
Sometimes, however, we find a bass voice singing the higher notes of a tenor, and yet at the same time remaining bass. The reason of this is that the various kinds of voice differ not only in pitch, but also in timbre. This is caused by differences in the vocal organs. The larynx of women is smaller than that of men; the angle formed by it in front is less acute, and the cartilages are softer. The voice of boys is either contralto or soprano, like that of women, though generally different in tone. There is, however, no difference in the larynx of either boys or girls up to the age of puberty, when in the case of boys it rapidly increases in size, and the vocal chords become longer, thicker, and coarser.
The elevation or depression of the larynx exercises a certain modifying influence upon the voice. When the voice is raised from a low to a high pitch, the whole larynx, together with the trachea, is lifted towards the base of the skull. The exact way, however, in which the trachea and the parts above the glottis affect the voice is by no means clear. The thyro-arytenoid muscles, which extend from the arytenoids to the recessed angle of the thyroid cartilage, have much to do with the production of these higher tones. They narrow the diameter of the larynx just below the vocal chords, and the diminution of the calibre of the wind-tube nearest the chords thus occasioned heightens the pitch. On the other hand, the pitch is made to fall by semitones when the tube is lengthened. In short, the greater the strength of the current of air the higher is the pitch. The depression of the larynx produces the so-called veiled voice (vox clandestina), the larynx itself being then covered by the entire pharynx, the root of the tongue approximated to the palate, and the voice being thus made to resound in the upper part of the pharynx under the skull.
The precise nature of ventriloquism is not quite certain. J. Müller states that it may be produced by speaking through an extremely narrow glottis, during a very slow exspiration, performed only by the lateral walls of the chest, a deep inspiration having been first taken, so as to cause the protrusion of the abdominal viscera by the descent of the diaphragm. Magendie, however, considers it to be produced in the larynx by variously modifying the voice so as to imitate the changes otherwise effected in it by distance.
The character of the voice is necessarily modified by changes in the structure of the vocal organs, whether due to old age, to weather and climate, to exhaustion, or to disease. In old age the ossification of the cartilages, the diminution of muscular and nervous power, and the degeneration of the larynx, make the voice weak, tremulous, and “piping.” In damp chilly weather the voice is often lowered by as much as two or three notes: indeed, nothing affects it more rapidly than a damp and depressing atmosphere. Exhaustion, again, accounts for the dissonance sometimes perceived in the voice of singers, while inflammation of the lining membrane of the larynx, and other diseases, will impair or wholly destroy the power of utterance. Loss of voice during a bad cold is a familiar instance of the latter fact.
Lisping, stammering, and other kinds of imperfect speech, are mainly due to nervous disease, stammering being usually caused by temporary spasm of the glottis. Too high a palate is another cause of irregular utterance. Dumbness, when not occasioned by deafness, as is generally the case, must be ascribed either to malformation of the vocal organs, or, more commonly, to disease of the nervous centres. Whistling, it must be remembered, results from the vibration caused by the friction of the breath against the edges of the open lips, and is wholly formed in the mouth.
The mouth, or chamber of resonance, is especially important for the creation of articulate speech. On the one side there are a great many sounds which owe to it their origin, on the other side even the sounds which are formed in the throat are necessarily modified in passing through the mouth. While t, p, or k have no existence until the voiced breath has reached the region of the mouth, the vowels which are formed in the throat cannot be heard in their pure and original state, but must pass through a chamber of resonance and so become more or less transformed. The throat, again, may remain passive, but the mouth must always be active. Of course the mouth forms a chamber of resonance not only for the sounds produced by the throat, but also for those produced by itself; the larger part of the mouth, for instance, forms a chamber of resonance for the palatal ch. We must remember, moreover, that a sound can be more variously changed and modified, the larger and more variable is the part of the mouth which serves as a chamber of resonance, that is to say, the further back the place is in which it is manufactured. The vowels consequently come first in capability of modification, then the gutturals and dentals, and finally the labials. It has often been observed that children when learning to speak are apt to change a guttural into a dental, and say do instead of go, the guttural being formed further back than the dental, and so undergoing a greater amount of modification in its passage through the mouth.
A vowel is voice freely emitted through the throat and mouth without interruption, and modified only by the different positions assumed by the tongue. The essence of a vowel is the quality or timbre of the voiced breath, and this quality, as we have already seen, is due to the varying forms taken by the vibrating vocal chords when played upon by the breath. Necessarily, however, the quality of the voice as it leaves the throat must be always the same, since the throat is a musical instrument which possesses its own peculiar tone. What, then, is the cause of the differences we notice in the quality of the vowels? Simply the mobility of what we have called the chamber of resonance, the manifold shapes the organs of the mouth are able to assume being so many musical instruments, each with its peculiar tone. The partial tones or harmonics which go to make up the quality of the voiced breath are strengthened by the corresponding peculiar tones of the several shapes assumed by the mouth, while at the same time those harmonics which do not agree with the peculiar tones are dulled or deadened. Hence a vowel is the quality of voiced breath produced by a combination of the forms of the vibrations of the vocal chords with those of the vibrating air in the various shapes taken by the chamber of resonance. The pitch of the vowel depends of course on the number of vibrations during the time of utterance, and may be detected even when the vowel is whispered. Indeed, as Donders and Helmholtz have shown, every vowel has its characteristic pitch, whether it is voiced or whispered. The different vowels can be heard in cases of aphonia, where the vocal chords are more or less paralyzed, while the vox clandestina is able to rise or fall. This is explained by the fact that even in whispering a certain friction is exercised on the vocal chords. 
If, for instance, we whisper the sound of ü, and then let the whisper gradually pass into a whistle, we shall always get the same tone, and Professor Max Müller thinks that the indications of musical pitch in the whispered vowels must be treated as “imperfect tones; that is to say, as noises approaching to tones, or as irregular vibrations, nearly, yet not quite, changed into regular or isochronous vibrations.”[150]
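The strengthening of some partial tones and the deadening of others by the chamber of resonance may be illustrated by a rough numerical sketch. It rests on assumptions that are not in the text: the resonance is represented by a simple bell-shaped curve, the width of that curve (`bandwidth`) is arbitrary, and the function names are hypothetical. It is offered as an illustration of the principle and not as a measurement.

```python
def harmonic_gains(fundamental, resonance, bandwidth=100.0, count=12):
    """Return (vibration number, relative strength) for each partial tone.

    The strength is 1.0 for a partial coinciding with the peculiar tone of
    the mouth and falls away on either side (an assumed curve).
    """
    gains = []
    for k in range(1, count + 1):
        partial = k * fundamental
        strength = 1.0 / (1.0 + ((partial - resonance) / bandwidth) ** 2)
        gains.append((partial, strength))
    return gains


def strongest_partial(fundamental, resonance, bandwidth=100.0, count=12):
    """Vibration number of the partial most strengthened by the resonance."""
    gains = harmonic_gains(fundamental, resonance, bandwidth, count)
    return max(gains, key=lambda pair: pair[1])[0]
```

With the same assumed resonance, a voice of 100 vibrations and a voice of 200 vibrations have different partials reinforced, yet in both cases the partial chosen lies close to the peculiar tone of the mouth. This is one way of picturing how a vowel may keep its characteristic pitch while the pitch of the voice rises or falls.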
The number of possible vowel-sounds is almost infinite. The vocal chamber of resonance is almost infinitely variable in the forms it may assume, and it is in these forms, as we have seen, that we must find the origin of the vowels and their nuances of sound. In Prince L.-L. Bonaparte’s alphabet, as given in Mr. A. J. Ellis’s “Early English Pronunciation,” seventy-five vowel-sounds (exclusive of ḷ and ṛ) are distinguished from one another, ten of which occur in no actual language, and of the remaining sixty-five, fifty occur each in less than nine European dialects. For practical purposes, however, it is necessary to analyze the formation of those vowels only which are heard most usually in spoken language, always remembering that the nuances of which these are capable are nearly unlimited, and that the same speaker is constantly varying what he intends and believes to be the same vowel-sound. Speaking generally, we may say that in pronouncing the vowels we invariably raise the tongue towards the palate, but not so as to touch it—as in the case of the consonants—the lips being passive in some instances, and rounded in others. It is needless to note that in phonology, as in all other departments of the science of language, the Italian pronunciation of the vowels must be adopted. Our erroneous pronunciation of the vowel-symbols is not one of the least important reasons for urging a reform of English spelling.
The three fundamental vowels, round which all the others group themselves, are a, i, and u; and though it is not necessary to hold that these were the first vowel-sounds articulated by man, it is necessary to regard them, for analytical purposes, as the primary elements to which the rest may be ultimately referred. According to Winteler, these three vowels must be arranged in a straight line, of which i forms one end and u the other, a standing in the middle.
In forming a the tongue is in a more constrained position than in the case of any other vowel; it lies flat and retracted, while the lips are wide open. Helmholtz makes its inherent tone B″ flat. Owing to the constrained position of the tongue, this vowel is more liable to be modified than any other; the “neutral” a is scarcely ever heard, produced as it is by the gradual narrowing of the movement of the tongue from the back of the mouth, where the obscure a of father is heard, to the front of the mouth, where we get the broad ä of pair. This neutral a which may be heard in the Italian ămātă is not the “natural” sound it is sometimes called; different parts of the mouth must be modified to create it, occasioning the nasal sound we perceive in moaning if the mouth remains passive, or the shrill ä of the new-born child, if the nasal orifice is closed by the elevation of the soft palate.[151] The belief that language was once in a stage in which the neutral a was the only vowel known is contradicted by the facts of phonology.
A stronger effort of articulation is required for i and u. The lips must be slightly opened, the larynx raised, and the tongue pushed upward, so that its front approaches the hard palate, if we want to produce i, the natural pitch of which is said to be D⁗. The movement of the tongue from the back to the front of the mouth, with a gradual narrowing of the air passage, forms both the i of mill, and the i of meal.[152] As we shall see, the position of the tongue in forming i approaches that required for forming the palatals, and thus explains the relationship that exists between them. For u the tongue is raised towards the soft palate, the larynx lowered, and the lips rounded; hence the connection between this vowel and the labials. Its connection with the gutturals, as illustrated by the change of werra into guerre, or vespa into guêpe, is explained by the position of the tongue, which approaches the soft palate in forming u, and touches it in forming k or g. The rounded shape of the mouth needed by u, as compared with its narrow neck-like appearance needed by i, strengthens the deep partial tones, and dulls the sharp ones, thus occasioning the converse effect of i. In fact, u is essentially the vowel of the bass, i of the soprano. The inherent tone of u is F.
It is obvious that an almost endless series of modifications may be made in the primary vowels by slight changes in the position of the organs by which they are produced. Between a and i stands e; between a and u, o. In pronouncing e the tongue is less raised than in pronouncing i; for o, the back of the tongue is less raised and the lips more widely opened than for u. In o, however, as in u, the lips have to come into play; hence it is that these two sounds are so frequently weakened to e and i, whereas the converse change never takes place. In e and i we have a simple and not a double action. According to Helmholtz, the inherent pitch of o is B′ flat, of e, B‴ flat or F′.
But e and o may again undergo considerable change. If while pronouncing close e (as in the French été or German see) we round the lips, the sound is produced which is represented by ö in Middle and Southern German and eu in French, the short sound of which may be heard in the German böcke. It lies, it will be observed, between e and o, and its inherent pitch is C‴ sharp. Closely related to this ö is the German ü, French u. This sound is produced by rounding the lips when the organs of speech are in position for pronouncing i, which explains the use of ü and i as rhyming equivalents in German poetry. Ü consequently lies between i and u, though, from another point of view, it may be described as standing furthest from a in a series of which ö forms the centre. The inherent pitch of ü is G‴.
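The inherent pitches assigned by Helmholtz to the several vowels may be brought together and turned into vibration numbers. The sketch below depends on assumptions that should be stated plainly: the accents are read in Helmholtz's notation, so that one accent marks the octave beginning at middle C; the F given for u is taken as the F below middle C; only the first of the two alternatives given for e is used; and the tuning is equal temperament with A′ at 440 vibrations. All names in the code are hypothetical.

```python
# Semitones above C within one octave.
SEMITONE = {"C": 0, "C#": 1, "D": 2, "F": 5, "G": 7, "A": 9, "Bb": 10}

# Vowel -> (note, octave), where octave 4 begins at middle C (assumed reading).
INHERENT_PITCH = {
    "u": ("F", 3),
    "o": ("Bb", 4),
    "a": ("Bb", 5),
    "ö": ("C#", 6),
    "ü": ("G", 6),
    "e": ("Bb", 6),
    "i": ("D", 7),
}


def note_number(note: str, octave: int) -> int:
    """Number of semitones counted so that middle C is 60 and A above it is 69."""
    return 12 * (octave + 1) + SEMITONE[note]


def vibrations(note: str, octave: int, a_reference: float = 440.0) -> float:
    """Vibrations per second in equal temperament (assumed tuning)."""
    return a_reference * 2 ** ((note_number(note, octave) - 69) / 12)


# Vowels arranged from the lowest inherent pitch to the highest.
by_pitch = sorted(INHERENT_PITCH, key=lambda v: note_number(*INHERENT_PITCH[v]))

for vowel in by_pitch:
    print(vowel, round(vibrations(*INHERENT_PITCH[vowel])))
```

On these assumptions the series runs from u at the bottom to i at the top, with ö and ü falling between a and e, which accords with the description of u as the vowel of the bass and i as that of the soprano.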
Besides o, we have also the sound heard long in words like bought or aúgust, and short in words like not and augúst, formed by slightly depressing the tongue, widening the air-passage, and rounding the lips to a less extent than in the case of o.
Other vowel-sounds which may be noticed are the e of the French prêtre, German väter, whose natural pitch is made G″ or D‴, the closely related open e (ä) of the English pair, the short a of English closed syllables like hat or happy, the short e of the English men, and the short i of the English hit, pill. These short vowels are in great measure due to the little use made of the lips in articulation, and the compensatory exercise of the tongue, which characterize modern English. It is small wonder that we experience so much difficulty in pronouncing ö and ü, when even our u is uttered with lips scarcely at all rounded. On the other hand, whenever we find these sounds in a language, we may conclude that we have to do with a speech which gives the lips their full share in articulation. Sievers would call those vowels passive in which all the organs of speech needed for their clear pronunciation are not brought into play, fully pronounced vowels being termed active.[153]
The same lazy pronunciation of cultivated English which has almost dispensed with the service of the lips is the cause of the increasing preponderance of the so-called neutral vowel heard in such words as but, virtue, dove, bird, oven. Except in affected pronunciation we may detect it in most unaccented syllables, especially if they happen to be final; thus we have diligĕnce, muttŏn, ăgainst, finăl, evĭl, valuăblĕ. So, too, as Professor Max Müller remarks, “town sinks to Paddingtŏn, ford to Oxfŏrd.” He believes it to be pronounced with non-sonant or whispered breath.[154] Mr. A. J. Ellis would make it voice in its least modified form; and Mr. Sweet regards it as a mere voice-glide. The “indistinct” vowel heard in Arabic words by travellers seems to be identical with it. Its existence in a language is a sign of age and decay; meaning has become more important than outward form, and the educated intelligence no longer demands a clear pronunciation in order to understand what is said. The participation of all the organs of speech in the creation of vowel-sounds is, on the contrary, a mark of linguistic freshness and youth. When we find both tongue and lips equally active in the formation of u and i, we may feel pretty sure that we are in the presence of an uncultivated dialect. Vowels formed by combining the position of the tongue required for u with that of the lips required for i are extremely rare in Aryan speech; an exceptional instance is to be met with in the Russian jery (y).
But we must never forget the infinite capability of modification possessed by a vowel. The same vowel-sound of the same word is not only apt to be pronounced differently by two natives of the same country, but even by the speaker himself at different times, particularly if his attention has been directed to his pronunciation of the sound in question. It is true that the shades of difference between the sounds may be so fine as to escape all but the specially trained ear; but this does not prove them to be any the less real. Putting aside quantity, accent, emphasis, or accidental alteration in the vocal organs, it is difficult to pronounce the same word twice over in exactly the same way, so far, at least, as its vowels are concerned. It is not wonderful, therefore, that it is in their vowels that dialects soonest and most easily alter, and that the vowel-system is the best guide in mapping out the several stages in the history of a language. Of course the character of a vowel-sound is materially affected by its position in a word, or by the consonants with which it is associated; the pronunciation of the same vowel varies in a closed or an open syllable. Long and short vowels, too, differ not only quantitatively, but qualitatively also. Every vowel has both its own peculiar pitch and a pitch dependent on the length of the vocal chords. The peculiar pitch is the result of the resonance-chamber in which the vowel is formed. The high pitch of i is due to the narrow air-passage in the front of the mouth in which it is produced, while the lowered pitch of a and u is caused in the one case by the greater size of the resonance-chamber, and in the other by the narrow opening of the lips. The same pitch may be produced by different modifications of the same resonance-chamber. 
Thus the French eu in fleur, produced by slightly raising the front part of the tongue and rounding the lips, has the same pitch as the English e in err, produced without any rounding of the lips at all.
But we have not yet finished with the vowels. The mouth is not the only agent concerned with their production. Brücke[155] asserts that the bones of the skull itself participate in the vibration caused by the utterance of the high-pitched vowels. However this may be, the larynx, the posterior wall of the pharynx, and the velum pendulum, or soft palate, with the uvula attaching to it, have all to do with the creation of vowel-sounds. Czermak has proved by experiment that the velum pendulum changes its place with each vowel that is uttered, rising successively for the pronunciation of a, e, o, u, and i. The nasal orifice, too, is closed during the pronunciation of some vowels, and more or less open during that of others. A and e were the only two vowels which a young man named Leblanc, whose larynx was completely closed, was able to utter; while, on the other hand, experiment has shown that with i, o, and u the passage to the nose is shut, slightly open with e, and considerably open with a. From this it will be seen that the term “nasal vowel” is a misnomer. Nasal vowels, in fact, are produced by dropping the uvula, and so allowing the air to vibrate freely through the cavities which connect the nose with the pharynx. So far from a passage of the air through the nose being necessary, we may even increase the nasal twang by stopping the nostrils. The strength of the nasalization depends on the distance of the velum pendulum, or soft palate, from the tongue; and in languages like French, in which much use is made of nasalized vowels, the vowel is frequently followed by a true guttural nasal. It has often been noticed that French, in spite of its strong tendency to nasalize the vowels, has no nasalized i or u. The cause of this deficiency is very simple. 
A nasalized vowel requires a free passage for the air from the pharynx to the nose; but this is rendered almost impossible in the formation of i, where the tongue is raised so high as to send most of the air through the mouth however much depressed the velum may be, as well as in the formation of u, where the tongue is pushed backward towards the soft palate itself. A nasal i, however, occurs in Portuguese, and probably also in the Sanskrit simha, “lion.”
Every vowel-sound, then, demands three main conditions for its production—the exspiration of air from the lungs, the vibration of the vocal chords, and the formation of a chamber of resonance by the organs of speech. The three conditions must co-exist if we are to have a simple vowel of definite quality, though the exspiration of air need not last beyond the moment at which the vowel-sound is formed. But the position of the organs of articulation both before and after its formation occasions important differences in the manner in which it is introduced or ceases to be heard. In quick and lively utterance, the energy with which the stream of air is emitted makes it difficult for each exspiration to be exactly simultaneous with the corresponding vibration of the vocal chords, while if the exspiration is weak, the vocal chords are apt for a moment not to vibrate. In order to give the chords on the one side the resisting power requisite in energetic exspiration, and on the other side to make them vibrate without delay in weak exspiration, the windpipe must be contracted for a second, thus checking the outflow of breath and causing the chords to vibrate in unison. The sonant breath so produced is the spiritus lenis of our old school-grammars, the slight noise produced by the check given in the throat to the uprush of air from the lungs. The noise may easily be detected in whispering, or in the pronunciation of a word like ’ear, when a special effort is made to prevent it from degenerating into year, and the fact that it is a noise will explain the dislike felt by the sensitive Greek to what the grammarians term a hiatus. The spiritus lenis varies according as it is the result of a compression of the chordæ vocales alone, or of the false chordæ vocales as well; but it is doubtful whether we can treat it as a distinct consonant and not rather as the pure tone of the voice. Perhaps it should most strictly be called a glide. 
It readily passes into the non-sonant aspirate or spiritus asper, by allowing the breath to pass through the throat without check or hindrance. The glottis, indeed, is in the latter case slightly narrowed and the larynx stiffened, but the difference between the rough and soft aspirates is that the one is a continuous sound, the other a checked breath. The vocal chords are brought together while the breath is passing through the throat, and since their movement may be either quick or gradual the hard aspirate or h may correspondingly vary in character. As Czermak first pointed out, the more usual hard aspirate is that produced by the gradual compression of the vocal chords when they remain for a moment in a given contracted position.[156]
The same causes which produce the spiritus lenis or the spiritus asper at the beginning of the vowel-sound produce similar results at its end. It may terminate with a weak breathing, a firm breathing, or a non-sonant aspirate. In the case of a weak breathing the exspiration either ceases before the vocal chords have begun to vibrate, thus resulting in a long vowel, or at the very moment at which the windpipe is opened to admit the passage of air, the result being a short vowel. The weak breathing answers to what may be called the neutral vocalic utterance, so rarely heard in language, when the vowel-sound is introduced without either the soft or hard aspirate, the windpipe being merely narrowed sufficiently to set the vocal chords in motion at the same moment that the exspiration takes place. The firm breathing corresponds with the spiritus lenis, and is due to a sudden check given to the vibrating voice. Examples of it occur in words like no! bah! uttered abruptly, or where we wish to divide two similar vowels one from the other. The non-sonant aspirate is produced by continuing the exspiration for a while after the opening of the windpipe, and may be heard in final vowels which are at once short and strongly accented. The non-sonant aspirate is sometimes combined with the firm breathing, especially in Danish, where such words as ti, nei, are pronounced with a double exspiratory effort, the second consisting of a non-sonant breath of more or less strength, jerked up, as it were, after the vowel.
Now, let us stop for a moment to remind ourselves of the distinction between sonant and non-sonant. Non-sonant or surd sounds (also called “hard” and “breathed”) are breath as modified by the organs of speech; sonants, “soft” or “voiced” sounds, are voice similarly modified, voice being breath when played upon by the vibrating chordæ vocales in its passage through the partially closed glottis. Voice, therefore, continues to be heard without interruption as long as we have a succession of sonants following one upon the other; the transition or “glide” from one sonant to another consisting simply in the change of position assumed by the organs of speech. In pronouncing the sound al, all that happens in passing from a to l is a transference of the tongue from the position required for forming a to the position required for forming l; voice continues without interruption. Now it is clear that while voice is passing from a to l, neither pure a nor pure l can be sounded, though the time occupied by its passage (that is, by the change in the position of the tongue) is so infinitesimally small that the sound or sounds actually produced cannot be heard, and all we can be conscious of is a modification of a at its end or of l at its beginning. If we have two successive vowels, each belonging to a different syllable, a separate effort of exspiration is needed for both, and the transition-sounds are apt to escape notice from the weakening of the exspiration during the interval between the two efforts; but if the vowels do not belong to distinct syllables, the result is wholly different. Diphthongs, as we term them, consist in the combination of two simple vowels, usually short, into a single syllable pronounced, therefore, with a single exspiratory effort, and with the stronger accent on the first vowel. The sound we hear is produced while the organs of speech are being changed from the position required for the one vowel to the position required for the other. 
We have only to sing the diphthongs ai or au on a long note to hear a distinct i and u at the end of each, and the Sanskrit grammarians discovered more than two thousand years ago that the diphthongs ê and ô were really combinations of a + i and a + u. The primary condition of the existence of a diphthong is the rapid transition from one of the component vowels to the other, and this renders the true resolution of a diphthongal sound so extremely difficult except to the specially trained ear. Once acquainted with the two component vowels, we can easily determine the intermediate or transition sounds in which the diphthong really consists; but written documents rarely do acquaint us accurately with them. Diphthongs whose second element is e or o have sometimes been termed “imperfect” and considered of younger origin than those whose second element is i or u, because of their greater fulness of tone and consequent inappropriateness to the unaccented place in the compound; but such a view does not seem to be correct. It appears certain, however, that languages show a tendency to form diphthongs the longer they live and the greater the extent to which they have been affected by phonetic decay. English is a prominent example of this tendency; our vowels are all becoming diphthongs; even the first personal pronoun I (ai) has become one, and already we hear aither and naither more frequently than either (eether) and neither. The so-called long vowels which occur in such words as say, no, he, are all diphthongal, and some of the local dialects have carried the tendency even further than the literary language.
The existence of triphthongs has been disputed, and no doubt most of the alleged cases, such as iei or ieu in the Romance idioms, are either dissyllables or consist of a semi-vowel followed by a diphthong. But, as Sievers remarks:[157] “the transition from the first to the second component element of a diphthong may be so prolonged that even the transition sounds themselves may be distinctly heard.” As for semi-vowels, they differ from the first element of a diphthong only in having lost the accent and being followed by a strongly accented vowel. Hence they come to assume the function of sonant consonants. Hence, too, the necessity that the vowels in which they originate should possess less fulness of tone than the vowels by which they are immediately followed. We may have yá and wá, but hardly ᵃᵢ and ᵃᵤ. Naturally i and u most readily pass into semi-vowels, partly from their comparatively weak tone, partly from the compression of the air-passage needed to produce them, partly from the similar position of the organs of speech in forming the spirants y and w. These spirants, as we shall see, are not to be confounded with the semi-vowels y and w.
A vowel, then, is the quality or timbre of voice as modified by the tongue and lips, and consists of the forms assumed by the vibrating air as it passes through the windpipe and vocal chords. But the tongue and lips naturally tend towards the same position whatever be the vowel sounded. A man who has been accustomed to give his tongue a particular position in pronouncing i will give it much the same position in pronouncing e, for we must never forget that there is an almost infinite number of i’s or e’s varying with the slight changes of position of the tongue and lips when placed for enunciating those vowels. According to the greater or less use made of the lips in speaking will be the character of all the vowel-sounds of a language. The vowels, consequently, fall into systems, and in investigating the phonology of a dialect, we have to inquire not only what vowels it possesses, but more particularly what system these fall into. The basis of English vowel pronunciation is the passive position of the lips, just as in the Holstein dialect it is the withdrawal and flattening of the tongue. Sievers states that in speaking the dialect of Lower Hesse the tongue must be relaxed and in a position of the slightest possible tension; while, on the contrary, in the Saxon dialects the whole tongue must be tense, the throat stiffened and the exspiration energetic. “Hence the hard, somewhat screaming impression made by this dialect in contrast with the dull, almost heavy and negative character of the Hessian.”[158]
But it is time to turn from the vowels to the consonants, the skeleton, as it were, of articulate utterance. A language could consist wholly of vowels; indeed, a Polynesian dictionary contains numbers of words which have not a single consonant in them, and children frequently mark the differences between words rather by the vowels than by the consonants they contain. The earliest systems of writing other than ideographic are syllabaries and not alphabets, while alphabets like the Sanskrit ascribe an “inherent” vowel to each of their consonants. But though vowels are indispensable to an organized language, it by no means follows that they were equally indispensable to the first attempts at speech. As a matter of fact, a preponderance of vowels such as characterizes the Polynesian dialects is a sign of phonetic decay and linguistic old age. “Consonants,” says Professor Max Müller, “are much more apt to be dropped than to sprout up between two vowels.” If we had only the Greek μέρμερος or the Latin memor before us, we should have no idea that they have lost an initial sibilant; in fact, this only becomes apparent when we compare the Sanskrit smar, “to remember.” The endeavour sometimes made to reduce the Parent-Aryan alphabet to a small number of simple and easily pronounced consonants is founded on the fallacy that the results of a phonetic analysis of the words we utter, and of a reduction of the sounds they contain into their leading types, are identical with the primitive alphabet of the Aryan race.
On the contrary, the sounds of a language become more simplified and clearly marked the longer it continues to be spoken, and the primitive Aryan alphabet, instead of being a simple list of primary sounds, from which all that are harsh or indistinct have been carefully eliminated, must really have resembled the existing alphabets of barbarous or semi-barbarous tribes, and included a large variety of consonants, many of which we should find it extremely difficult to reproduce.