U+03F3 Greek Letter Yot [ϳ]

The phone [j] has been trouble for Greek since its inception. It used to be a phoneme in the language, as shown by the reconstruction of Proto-Greek (the form of Greek ancestral to its Ancient dialects), and by Linear B and the Cypriot syllabary. By the time the alphabet was introduced to Greek, it was no longer a distinct phoneme, so there was no pressing need to invent a character for it; and by the end of antiquity it had probably disappeared from the language altogether. In Middle Greek, [j] reemerged as a phone, and the question of whether it is in fact a phoneme in the modern language has dogged Greek linguistics for decades. What we have with [j], at any rate, is a sound which Greek has had for a long time, but never got around to having an alphabetic representation for. The representations we do have for the sound in Ancient and Modern Greek are after-the-event bolt-ons, and each has annoyances of its own.

1. The Ancient Yot

By the time the alphabet was used to write Greek, there was no appreciable distinction to be made between [i] and [j], and the Greeks had no latent racial memory about the distinction that they used to make. In most contexts, the location of [j] was predictable: it occurred in the canonically recognised diphthongs, such as /ai/, /ei/, /oi/. On occasion, you would get it turning up elsewhere as an allophone of /i/, conditioned by register or dialect rather than phonological context. The Ancients were aware of it, and had a term for the non-syllabic pronunciation of a vowel: synizesis. However, I am not aware of either the Ancients or the Byzantines feeling the need to coin a character to represent synizesis in text.

The need to notate [j] in Greek arose in the 19th century, as philologists got busy documenting Classical Greek literature and proto-Greek. It does not arise in normally written texts intended for reading: if the Greeks could get by without a character for [j], so can modern students of the texts. Rather, the need for it arises in linguistic commentary on texts—i.e. in order to explain succinctly why this word fits that particular meter; and more often, it arises in discussion of proto-Greek, where /j/ was in full force as a phoneme. Not that this means it is confined to specialist literature on linguistics: /j/ routinely turns up to this day in grammars of Classical Greek intended for students, because appealing to the proto-Greek form of a word is often the only way to make sense of its classical morphology.

The linguists among you are either shocked to hear that classicists are ignoring the Saussurean dichotomy—or are smiling knowingly about classicists' mindset not yet having caught up to late Saussure. Now that I've been doing Classical Greek computational morphology for a little while, all I can say is, the classicists know what they are doing.

To give an example of the kind of explanation involving proto-Greek /j/: the relation between the past tense of 'rejoice', ἐχάρην /ekháreːn/ and the present tense χαίρω /khaíroː/ seems arbitrary; it makes much more sense if we reconstruct the steps /khaíroː/ < [khájroː] < */khárjoː/, with a simple metathesis explaining the new diphthong.
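The metathesis step in that derivation can be sketched in a few lines. This is only an illustration of the single step */khárjoː/ > [khájroː]; the function name `metathesize_rj` is made up for the sketch, and it deliberately ignores the later step in which [j] ends up as the second element of the diphthong.

```python
import re

def metathesize_rj(form: str) -> str:
    """Swap every r + j sequence to j + r (hypothetical helper for this sketch)."""
    return re.sub(r"rj", "jr", form)

proto = "kh\u00e1rjo\u02d0"          # */khárjoː/, the reconstructed form
print(metathesize_rj(proto))         # the intermediate form with the glide moved
```

Forms without an r + j sequence pass through unchanged, so the helper can be run over a whole word list without special-casing.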

So the 19th century philologists needed a symbol for [j]. When they were doing their work, the IPA wasn't around, so they couldn't have used that. In fact, even when the IPA was invented, historical linguists studiously ignored it anyway: they have never been interested in consistency with other subdisciplines, with the exasperating result that each proto-language has its own transcription conventions. (That's what we can blame the Uralic Phonetic Alphabet on.) Moreover, it would be unthinkable for philologists to explicate Greek forms using the Latin script: Greeks were the foundation of Western civilisation—so any historical linguistics to be done with Greek would keep the Greek script: any [j]'s would just have to be tacked on to it. (Of course, the same philologists didn't have compunctions about transliterating Sanskrit; it was not European, after all. Not sure what the excuse was for transliterating Old Church Slavonic, but I'm sure they could come up with something.) To this day, IPA is unknown territory for Ancient Greek historical linguistics.

So if you wanted a character for [j], you searched close to home. As Haralambous documents [§1.2.2], if you were working in the German tradition, you used j, which happens to be the German grapheme for [j] (and which also ended up the IPA's choice, much to the chagrin of Americanists). If you were working in the French tradition, you used y, because it wasn't German. Haralambous observes that more recently French scholars have switched to j, because it's in the IPA; given the year I'm writing this in, I suspect it has the added advantage for the French that even if it is German, at least it isn't American.

Finally, if you went along with the prevalent Indo-Europeanist notion that [i] and [j] are equivalent, you had a third option, which was to write [j] as an iota, but with a diacritic indicating that it was non-syllabic: U+032F Combining Inverted Breve Below. This is modelled after the Indo-Europeanist tradition of writing [j] as [i̯]. So you could write the proto-form of χαίρω, depending on your training and predilections, as χάρjω, χάρyω, or χάρι̯ω.
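To make the difference between the three spellings concrete, here is a small sketch that builds each one and lists its codepoints. The dictionary keys and the helper `describe` are names invented for the illustration; the codepoints are the ones named in the text (U+006A, U+0079, and U+03B9 followed by U+032F).

```python
import unicodedata

STEM_BEFORE, STEM_AFTER = "\u03c7\u03ac\u03c1", "\u03c9"   # χάρ + ω

variants = {
    "German j": STEM_BEFORE + "\u006A" + STEM_AFTER,
    "French y": STEM_BEFORE + "\u0079" + STEM_AFTER,
    "inverted breve": STEM_BEFORE + "\u03B9\u032F" + STEM_AFTER,
}

def describe(text: str) -> list:
    """Return one 'U+XXXX NAME' string per codepoint in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

for label, form in variants.items():
    print(label, form, describe(form))
```

The first two variants are five codepoints long and mix Latin into a Greek word; the third is six codepoints long and stays entirely within Greek plus a combining mark.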

If you went along with [i] and [j] being equivalent, you would likely also treat [u] and [w] as equivalent, and write [w] as a non-syllabic version of [u]. Now, unlike [j], [w] did make it into the Greek alphabet (though not into Attic Greek), as digamma. And invoking proto-Greek /w/ is useful for explaining Greek morphology, the same way /j/ is: for instance, the word for 'breathe', /pnéoː/, did not contract its two adjacent vowels to †/pnôː/, as one would expect, because it used to be */pnéwoː/—and that is why the noun for 'breath' is /pneûma/, πνεῦμα. One would normally see /pnéwoː/ written in Greek grammars with a digamma, as πνέϝω; but an Indo-European inclined classicist might write it instead as πνέυ̯ω.

Smyth's student grammar of Greek, probably the main grammar used in the Anglo-Saxon world, uses the inverted breve for its representation of /j/ (and /w/); and this approach causes the least disruption to the Greek script, since only the diacritic is foreign to Greek, not the letter itself. But the dominant symbol is the German j. And crucially for Unicode (since it went with what ELOT told it), it is what has been in use in Greek high school grammars of Classical Greek for decades. So the reference glyph for the codepoint is the German glyph, and its name is likewise the German name for j, yot.

Well, almost. It's the German name Jot as Anglicised by ELOT to yot (presumably because they knew it only in its Hellenised form of γιοτ, and didn't know about its German origin). Moreover, Jot is the standard name in historical linguistics for [j], which was a German enterprise for a very long time. Ultimately of course the letter name originates in Phoenician/Hebrew yod [י], though presumably via Greek iota rather than directly.

The joke, given what we have been discussing here, is that Greek turned Phoenician [j] into [i], and German turned it right back into [j]. As always with sound change, the real villain of the piece was English, which also got hold of that letter name and transmogrified it into /dʒɔt/: this is jot as in "not one jot or tittle".

So we have a character j, with a glyph variant y (and you may want to count ι̯ as another glyph variant), which has its own codepoint in Unicode on ELOT advice. But there are already codepoints in Unicode for j and y: U+006A Latin Small Letter J and U+0079 Latin Small Letter Y. Granted, having a distinct codepoint for yot buys you the ability to treat y and j as variants of the same character, as Haralambous points out. But since no one text is likely to have both, this doesn't buy you all that much in terms of collation; and it also puts you in the position of requiring capital versions down the road—as indeed Haralambous proposes. The real reason why the yot is there is to do with Unicode's attitude to script mixing, which I discuss elsewhere.
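What "treating y and j as variants of the same character" could mean in practice is sketched below: Latin j and y are folded to U+03F3 when every other base letter in the word is Greek. This is an assumption-laden illustration, not anything Unicode or Haralambous prescribes; `fold_to_yot` and `is_greek` are hypothetical helpers, and the "is this a Greek word" test is a crude check on character names.

```python
import unicodedata

YOT = "\u03F3"  # GREEK LETTER YOT

def is_greek(ch: str) -> bool:
    """Crude script test: the Unicode character name mentions GREEK."""
    return "GREEK" in unicodedata.name(ch, "")

def fold_to_yot(word: str) -> str:
    """Replace Latin j / y with yot if all other base letters are Greek."""
    others = [ch for ch in word
              if ch not in "jy" and not unicodedata.combining(ch)]
    if others and all(is_greek(ch) for ch in others):
        return word.replace("j", YOT).replace("y", YOT)
    return word  # leave Latin-script words alone

print(fold_to_yot("\u03c7\u03ac\u03c1j\u03c9") == fold_to_yot("\u03c7\u03ac\u03c1y\u03c9"))
```

After folding, the German and French spellings of the same proto-form compare equal, which is the collation benefit under discussion; words in Latin script are left untouched.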

2. The Modern Yot

We know from metrical evidence that [j] turned up again in the late mediaeval Greek vernacular, this time in a different context: it was the allophone of /i/ (and /e/) before other vowels. This made its distribution completely predictable, so neither mediaeval scribes nor renaissance printers felt any particular need to write [j] any differently from [i].

The complications that have bedevilled Greek phonologists happen if you follow the Saussurean dichotomy, treat Greek phonology on its own merits, and bring into consideration the hijinks of palatalised gamma (ɣi > ɣj > ʝj > j) and the other havoc that [j] wreaks on consonants unfortunate enough to be stuck next to it. If you let the gammas take care of themselves in the spelling, however, and keep the vowels close to what they used to be, you can still treat [j] as a variant of /i/ rather than a new phoneme.

If you want to mess up a linguistic system that's running perfectly smoothly, all you need do is bring it in contact with another linguistic system, and have them disrupt each other's rules. This is why invoking "dialect mixing" is a historical linguist's favourite escape clause when the irregularities of their language get to be too much. Unfortunately for historical linguists, dialect mixing really does happen; and the diglossia of Modern Greece made it happen in spades. The intent was to have the corrupt vernacular (Demotic) displaced by a purified version of the language, more akin to the Classical language (Puristic: Katharevousa). This did not happen; what did happen, though, is that the form of Demotic that prevailed had so many words introduced from Ancient Greek (via Puristic spelling pronunciations), that its phonology has become deranged. (There is no other word for a language that takes clusters like /nðr/ or /mvl/ seriously.)

The rule /i/ > [j] / _V (/i/ goes to jot before another vowel) got tangled up in this mess. This is a rule that applies to the vernacular, but is alien to the Classical language. As a result, words imported from the Classical language into the vernacular didn't follow the rule; so you get in the Modern language doublets such as the vernacular adjective άδεια [ˈaðja] 'empty', and the learned noun for 'leave' (literally 'empty [day]'), άδεια [ˈaði.a].
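The rule and its lexical exceptions can be sketched as follows. The five-vowel inventory, the `learned` flag, and the function name `apply_glide_rule` are all assumptions made for the illustration; the sketch also ignores stress and the effects of [j] on neighbouring consonants.

```python
import re

VOWELS = "aeiou"  # simplified inventory, an assumption of this sketch

def apply_glide_rule(phonemic: str, learned: bool = False) -> str:
    """Apply /i/ > [j] / _V, unless the word is marked as a learned loan."""
    if learned:
        return phonemic  # Puristic imports are exempt from the vernacular rule
    return re.sub(f"i(?=[{VOWELS}])", "j", phonemic)

print(apply_glide_rule("\u02c8a\u00f0ia"))                # vernacular 'empty'
print(apply_glide_rule("\u02c8a\u00f0ia", learned=True))  # learned 'leave'
```

The point of the `learned` flag is that the same underlying string yields two surface forms, and nothing in the spelling says which one a given word takes.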

This means that there's a much better case for indicating [j] in Modern Greek than there used to be: by any reasonable standard, [ˈaðja] ~ [ˈaðia] is a minimal pair, and demonstrates that /j/ is a phoneme. (Of course, if you aren't educated enough to unlearn the rules of vernacular phonology, you'll produce [ˈaðja] in both cases, which shows that the question is even messier than I'm making it out to be.) But since Modern Greek speakers already know the language, and can get what is meant from context, they don't display any evidence of being concerned by this ambiguity in their orthography. Hard luck if you're a second language learner of Greek, of course; but second language learners are not the target audience of standardised orthographies.

Now if you're a dialectologist or a phonetician, you will be concerned to take down this kind of detail in phonetic realisation, which is why it is no surprise that linguists have worked out diacritics to differentiate between [i] and [j]. What is more surprising is that in the 19th century, these diacritics were used whenever vernacular Greek was printed. This was a result of diglossia: in the 19th century, the vernacular was only fit to speak in (and not even that in genteel company); writing was the domain of Puristic Greek. So if you ventured something as outlandish as printing a text in the vernacular, you would treat it as a linguistic transcription. By contrast, in the 20th century writing in the vernacular was done seriously, and the diacritics and paraphernalia of transcription were dropped; ironically, just as the contamination of the vernacular phonological system by Puristic meant that the diacritics could have done some good. This is the distinction Haralambous makes between "κΔ" ("dimotikí in a katharévousa context") and "πΔ" (standardised polytonic Demotic); he describes the former as follows:

First, at the late 19th and early 20th centuries, [Demotic] was still rarely printed. When this was the case, it was considered to be an "exotic", deviated form of [Katharevousa], and spelled accordingly. This is a very interesting period, because authors, in the absence of a standard [Demotic] grammar, didn't know how to write [Demotic] and invented weird spellings only to give their writings pseudo-grammatical foundations. In some sense, we can say that [Demotic] was not written, but transcribed. [§1.1.2]

So when scholars were faced with vernacular texts in the 19th century, which treated /i/ differently to how Classical Greek treated it, they took the norm to be Classical Greek (natch), and they notated the vernacular [j] as a variant of /i/, using a diacritic. The choice of diacritic was obvious: U+032E Combining Breve Below, or U+032F Combining Inverted Breve Below, of which we have already seen the latter at work in Indo-European.

The problem with treating vernacular texts as linguistic transcriptions is that these texts were not being typeset by scholarly specialists with a panoply of diacritics at their fingertips; they were being published in Athens, with whatever was at hand. These printers needed to produce something like [ι̮ υ̮] or [ι̯ υ̯], but without any breves or inverted breves in stock. The kludge they alighted on was to take U+1FD6 Small Letter Iota With Perispomeni [ῖ] and U+1FE6 Small Letter Upsilon With Perispomeni [ῦ], flip them upside down, and cross their fingers.
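In codepoint terms, the sort the printers reached for and the sequence they were trying to imitate look like this. The sketch takes the precomposed circumflexed letter, recovers its base letter by canonical decomposition, and pairs it with U+032E; `intended_sequence` is a hypothetical name, and choosing the breve rather than the inverted breve is an arbitrary pick between the two options mentioned above.

```python
import unicodedata

BREVE_BELOW = "\u032E"  # COMBINING BREVE BELOW

def intended_sequence(circumflexed: str) -> str:
    """Base letter of a precomposed letter + perispomeni, plus breve below."""
    base = unicodedata.normalize("NFD", circumflexed)[0]
    return base + BREVE_BELOW

for sort in ("\u1FD6", "\u1FE6"):
    target = intended_sequence(sort)
    print(unicodedata.name(sort), "->",
          " + ".join(unicodedata.name(ch) for ch in target))
```

Nothing in Unicode records that a given circumflexed letter was printed upside down, so this mapping can only be applied to text that an editor has already identified as using the kludge.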

Note that the norm inside Greece is for the perispomeni to be realised as a tilde rather than an inverted breve; so these upside down symbols don't have a diacritic that looks like a breve.

At around the same time Greek typographers were flipping iota circumflex to get their jot, Bertrand Russell was flipping plain iota to get the iota-operator, his unique quantifier U+2129 Turned Greek Small Letter Iota ℩—as a complement to the universal quantifier U+2200 For All ∀ and the existential quantifier U+2203 There Exists ∃. Though the timeframe was right, I have no reason to think Russell was influenced by the Greek practice: it was a kludge, not a newly defined and available sort, and probably unknown outside Greece.

The elegant glyphs Haralambous includes in his discussion of the upside down jots don't really do these devices justice, as his upside down circumflexes are actually descenders. For the full effect, you need the bottom of the iota and upsilon precariously positioned several points above their neighbours, because the tilde (perispomeni) is level with the baseline of the other characters; and the characters should preferably be tilted somewhat, because the lead type didn't really fit upside down. Something not a million miles from this:


3. Do We Admit the Kludges?

Kludges tend to stick around; the Unicode Standard is testament enough to that. Generations of linguists have gotten used to the upside down jots, and it may not even occur to them that it was begotten of a kludge: it is the form of the non-syllabic iota and upsilon that they expect to see in print. What may be starting to kill this kludge off is not the availability of the superior alternative in newer technology (combining breves below predate Unicode), but that the kludge isn't available there.

As you may well have guessed by now, my doctoral dissertation was in Modern Greek dialectology. I had a font with a combining inverted breve, and I did not have a font with upside down iota. So all my Modern jots ended up rendered with combining breves. I wasn't too cut up about that. I wasn't aware at the time that SIL Galatia Extras has the glyph; it's the only 8-bit font I know of that does.

It is within the power of Unicode to bring the kludge into the fold, and allocate it a codepoint or two. It is also within the power of smart fonts (those fonts just beyond the horizon we would be eagerly waiting for if we were even aware that they were coming) to admit the upside down jots as glyphs, but not as codepoints: that is, to have the upside down jots be the single glyph renderings of underlying letter + diacritic combinations, e.g. U+03B9 Greek Small Letter Iota U+032F Combining Inverted Breve Below.

Haralambous in his discussion thinks the former would be a good idea. I think it wouldn't, and that the way to go is with smart fonts treating the upside down jots as glyph variants. To support this, I need to establish that the upside down jots are understood by the likely users to be the same thing as the letter + diacritic combinations.

I also need to point out that this is a quite small user community in any case: it is limited to dialectologists, and scholars prepared to work on 19th century vernacular texts in their original orthography. The latter constituency has been hard to produce conclusive sightings of: the traditionalists who clamour to defend the grave and the rough breathing have not gone on record as supporting the upside down jot.

The following are arguments for treating the upside down jots as combined codepoints.

  1. As Haralambous himself admits, the capital version of the upside down jot involves the combining breve: [Ι̮ Υ̮]. If we were to make the upside down lowercase jot a codepoint, we would be faced with the unpleasant prospect of the case mapping U+1Fxx Greek Lowercase Kludge Based On Iota With Perispomeni → U+0399 Greek Capital Letter Iota U+032E Combining Breve Below (or U+032F Combining Inverted Breve Below, if you're that way inclined). Greek is already in enough trouble with its case mappings with the adscript schemozzle; compounding it with a character neither most Modern Greeks nor most Classicists have heard of would not be a productive maneuver. Having the lowercase version of U+0399 Greek Capital Letter Iota U+032E Combining Breve Below be U+03B9 Greek Small Letter Iota U+032E Combining Breve Below, on the other hand, is blindingly obvious.

  2. The "add circumflex and tilt character upside down" mechanism for obtaining jots does not extend past iota and upsilon. Due to an accident of Greek linguistic history (Ancient Greek didn't leave eta unmolested next to other vowels), it is rare to have eta turn into a jot; the few times it happens though, the result is not an upside down version of U+1FC6 Small Letter Eta With Perispomeni [ῆ], but a straightforward combination of eta and breve below. Thus, for example, an older spelling of βασιλιάς [vasiˈʎas] < [vasiˈljas] < [vasiˈle̯as] < [vasiˈleas] < Classical βασιλεύς /basileús/ is βασιλη̯άς.

    In particular, the technique does not work with non-syllabic epsilon [e̯], a phone that appears in some dialects though not in the standard language (and as the derivation of [vasiˈljas] shows, is how /e/ turned into [j]). The reason why no one has ever attempted an upside down epsilon circumflex, [ɜ̰], akin to the upside down upsilon and iota circumflex, is obvious: epsilon circumflex is not a canonical combination in Greek, so Greek printers didn't have any instances handy in their trays.

    The typography of Michailidis-Nouaros' 1928 collection of folksongs from Karpathos is an instructive example. Michailidis-Nouaros needed a non-syllabic iota and a non-syllabic epsilon. Chalkiopoulos Press, 11 Geranium St., Athens, could oblige him with an upside down iota circumflex; but for the non-syllabic epsilon, the best he could do was an underline. (In the following, I represent upside down iota circumflex as [ι̰].)

    γιατ' ἔχ' ὁ ασιλεὰς λαό, δούλους καὶ πι̰άννουσίσ σε
    "for the king has people, servants who will catch you" (p. 50)
    [ʝat eç o asilas lao, ðulus ke pjanːusis se]

    The upside down iota is no less a kludge than the underlined epsilon; both are attempts to represent what might be written more canonically as:

    γιατ' έχ' ο ασιλε̯άς λαό, δούλους κ̣αι πι̯άν-νουσίσ σε

    One might choose to render the non-syllabic epsilon and iota differently; but to encode them differently seems to me to be a misfeature.

  3. Though I don't have any quantitative data handy, my impression is that even the upside down upsilon circumflex (which doesn't look all that much like [υ̯]) wasn't more popular than the version with an upright character and diacritic.

  4. In handwritten field notes (a fair few of which I have inspected for my doctoral research), linguists do not attempt to emulate the look of the upside down artefacts of Greek ingenuity. The handwritten version of a non-syllabic iota is an iota with a breve or inverted breve beneath it—just as we would expect from its underlying structure. So linguists realise that is what the upside down jot really is.

  5. The upside down jot trick works even less for non-syllabic digraphs—which Greek, with the meltdown of its Classical diphthongs, has in abundance. Since these digraphs tend to contain iota, some linguists get their jots on the cheap, by flipping just the iota over. For example, Michailidis-Nouaros (p. 66) transcribes [panˈore̯a] 'most beautifully' < */panóːraia/ as πανώραι̰α, where αι̰ is the digraph corresponding to [e̯]. Most linguists however bite the bullet, and transcribe digraph jots with double inverted breves (or inverted breves below); thus, σκολε͡ιό [skoʎo ~ skoljo] 'school'. Once again, the breve is seen to be the proper representation of non-syllabicity in Modern Greek.
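The case-mapping point in argument 1 can be checked directly against the default Unicode case mappings as Python exposes them. This is a small sketch rather than a statement about any particular font or rendering engine; `case_round_trips` is a name invented here.

```python
import unicodedata

BREVE_BELOW = "\u032E"  # COMBINING BREVE BELOW

def case_round_trips(lower_base: str, mark: str = BREVE_BELOW) -> bool:
    """True if letter + mark uppercases and lowercases with no special rule."""
    lower = lower_base + mark
    upper = lower.upper()
    return (upper == lower_base.upper() + mark          # only the letter changes
            and upper.lower() == lower                  # and it maps straight back
            and unicodedata.normalize("NFC", lower) == lower)  # no precomposed form

print(case_round_trips("\u03B9"), case_round_trips("\u03C5"))  # iota, upsilon
```

With the letter + diacritic encoding, the default mappings for iota and upsilon already do the right thing and the combining mark simply rides along; a dedicated lowercase-only codepoint would need a new special-casing entry to reach the same capital.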
