The Future of Digital Ogam: Potential Updates to the Unicode Ogham Block to Facilitate Modern Usage

We are grateful to Adrian Doyle for contributing another insightful guest blog this month. He is the creator of the Würzburg Irish Glosses website and is currently completing a PhD researching Natural Language Processing techniques for Old Irish at the University of Galway. He is a research associate at the Insight Centre for Data Analytics at the University of Galway.

While the introduction of ogam to Unicode in 1999 made the script significantly more accessible to a wide community of potential users, adoption of the script has been slow among linguistic scholars. Those who do make use of digital ogam often take certain liberties with the script in order to make it fit their specific requirements. This may indicate that the twenty-nine digital ogam characters which are currently available are insufficient, either for representing historical writings or for supporting modern usage. Many issues could be resolved relatively easily, either by the introduction of new ogam characters to the standard, or by altering the current Unicode guidelines for the usage of ogam. The aim of this blog is to make the case for including certain historically attested ogam characters in a future update to the Unicode standard, and to demonstrate their utility to modern users of the script. What will be discussed here by no means amounts to a comprehensive list of all potentially useful ogam characters, but inclusion of any of them in the Unicode standard would likely be welcomed by the community of digital ogam users.

Proposals for the inclusion of new characters and scripts are often submitted to the Unicode Consortium, and the Unicode standard is regularly updated. As such, it should not be particularly difficult to introduce new ogam characters to the standard so long as there is a demand for them among a modern userbase, and the use case for such characters is easily demonstrable. The hasty inclusion of characters could lead to compatibility issues down the line, however. For this reason, it is vital that any application to include new ogam characters in Unicode should be well considered from the outset, and designed to ensure the most utility to all stakeholders of the digital script. The hope for this blog is that it may encourage some discussion regarding what various users of the script would like to see in a future iteration of the Unicode standard.

Three examples of manuscript ogham where six strokes are used instead of the usual five strokes for an R. All are from the St Gallen, Stiftsbibliothek MS 904, folios 193-196.
Figure 1: Ogams in St Gallen, Stiftsbibliothek MS 904 with Six Stroke ᚏ.

Ideally, any update to the Unicode ogam character set would take into account the challenges of recording historical ogams, and other requirements of linguistic experts. Historical examples can contain features which may not occur in later ogam writings. A good example of this can be found in the manuscript which contains the Old Irish St. Gall glosses, St. Gallen Stiftsbibliothek MS 904. At least three ogams contained in this manuscript (see Figure 1) present the letter ᚏ written with six strokes, instead of the usual five. There is no way to accurately represent this graphical variation using existing Unicode characters. While a graphically similar letter might be approximated by combining two ogam characters to generate the correct number of strokes (ᚏ + ᚋ = ᚏᚋ), such a combination will never look quite right because of the built-in separation between ogam characters in Unicode. More importantly, a combination like this requires the use of a letter character which doesn’t occur in the original text. As such, the digital text actually does not read as it should.
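The last point can be checked directly. The following is a minimal Python sketch, using only the standard `unicodedata` module, which inspects the two-character approximation. It illustrates that the workaround is stored as two distinct letters, so software reading the text sees an R followed by an M rather than a six-stroke R. The variable names are purely illustrative.

```python
import unicodedata

# The six-stroke letter approximated with two existing characters.
approximation = "\u168F\u168B"  # ᚏ followed by ᚋ

# Look up the official Unicode name of each stored character.
names = [unicodedata.name(ch) for ch in approximation]
for ch, name in zip(approximation, names):
    print(f"U+{ord(ch):04X} {ch} {name}")
```

Any search, sort, or transliteration run over such text would treat the extra stroke as a separate letter.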

A diagram demonstrating how each five-stroke aicme letter should look when combining with a single-stroke letter from the same aicme, with no space between.
Figure 2: Examples of Purely Graphical Combining Ogam Characters (in Red).

Properly representing graphical variation like this would require a combining ogam character, distinct from the currently available letters. Such a character could be used in combination with existing ogam letters to change their graphical appearance in the same way that combining diacritics in Unicode can currently be added to letter characters. As this issue could conceivably occur with letters from each of the four aicmí, at least four variants of this combining diacritic would likely be required (see examples in Figure 2). The case might also be made for semantically empty letter character variants to be included in Unicode. These would be similar in appearance to the currently available aicme letters, but would have a distinct use case. They could be employed to record historical ogams which clearly have a given number of visible strokes, but for which no sure interpretation can be agreed upon. Semantically meaningless character variants would allow these historical examples to be reproduced digitally without implying that any particular reading is correct. They could also be employed in niche cases where it may be desirable to represent individual, potentially meaningful, strokes without using a specific letter character to do so (as is the current practice with “transliterations” on Titus Ogamica¹).

Seven examples of punctuation used with ogam written in manuscripts.
Figure 3: Ogams with Punctuation Marks.

Moving away from letter characters, the currently available “Ogham block” in Unicode is particularly sparse in terms of punctuation. At present the only two characters which might be considered forms of ogam punctuation are the “feather marks”, ᚛ and ᚜. By contrast, a much wider variety of punctuation has been used with ogam over the centuries. Even the earliest manuscript sources demonstrate that forms of punctuation which were common in roman script at the time could also be employed with ogam. The “punctus and stroke (positura)”², for example, is commonly found in Irish manuscripts marking the end of a section of text. An example of it being used with ogam can be seen in Figure 3a, where it appears at the end of the text, below the stemline. In Figure 3b it appears that a dot is used directly on the stemline, at the end of the text, though this may be merely decorative. Modern forms of punctuation can be found in more recent manuscripts; for example, the use of commas in an 18th century manuscript can be seen in Figure 3c. Another form of punctuation, the use of dots to separate words and letters, seems to have enjoyed continued usage over the centuries. An early example of this has already been seen in Figure 1c, where a dot occurs above the stemline at the boundary between two words. This practice appears to have become more common in later manuscripts (see Figure 3d and e), perhaps because it provides a means of distinguishing between similar looking letters. Some regional variants exist also; for example, in Scottish inscriptions a colon-like mark (made up of two dots) can be found separating words. Several other forms of punctuation have also been recorded throughout the history of the script, and unfortunately it is not possible to describe every type here. The salient point, however, is that none of these forms of punctuation can currently be reproduced using Unicode characters without breaking the stemline. For this reason, there is good cause to introduce ogam-specific punctuation characters into the Unicode standard.

Figure 4: ᚛ᚋᚔᚅ ᚉᚆᚐᚄᚉ – MIN [CH]AS[C] – St Gallen, Stiftsbibliothek MS 904 f. 170 (© e-codices).

Aside from forms of punctuation which can be found in ogam writings, be they historical or recent, editors often introduce modern punctuation into ogams for editorial purposes. Brackets, for example, are regularly used to identify text which has been supplied by an editor, though the text may not occur in a manuscript, or may not be visible due to damage or age. An example of this can be seen in Figure 4, where trimming of the top margin has resulted in the loss of certain ogam letters. Brackets are used in the transliteration, MIN [CH]AS[C], to identify ogam letters which are no longer legible in the manuscript, but which can be worked out from context. It is not possible to employ brackets in a similar manner with digital ogam text without breaking the stemline: ᚛ᚋᚔᚅ [ᚉᚆ]ᚐᚄ[ᚉ]. Where ogam has been carved on stone monuments, not only can whole letters be rendered illegible by damage to the stone, but in some cases only a portion of a letter may be damaged while some part of it remains legible. In such cases it may be desirable to demonstrate that only part of the letter has been supplied by modern editors, though this would not be possible using brackets³. For both of these reasons, the introduction of ogam-specific brackets may not be the most elegant solution for the purpose of representing damaged or missing ogam letters. An alternative solution is readily available, however, and can be found by looking to another historical script supported by Unicode, Egyptian hieroglyphs.
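The effect of editorial brackets on the stemline can be approximated in code. The sketch below is a rough proxy only: it assumes that a continuous stemline is drawn only across consecutive characters from the Ogham block (U+1680 to U+169F), and it counts how many such runs a string contains. The helper names `in_ogham_block` and `ogham_runs` are hypothetical, and the word space is assumed here to be the Ogham space mark (U+1680).

```python
from itertools import groupby

def in_ogham_block(ch: str) -> bool:
    """True if ch lies in the Unicode Ogham block (U+1680 to U+169F)."""
    return 0x1680 <= ord(ch) <= 0x169F

def ogham_runs(text: str) -> list[str]:
    """Split text into maximal runs of consecutive Ogham-block characters."""
    return ["".join(group)
            for is_ogham, group in groupby(text, key=in_ogham_block)
            if is_ogham]

min_word = "\u168B\u1694\u1685"                          # ᚋᚔᚅ (MIN)
c, h, a, s = "\u1689", "\u1686", "\u1690", "\u1684"      # ᚉ ᚆ ᚐ ᚄ
feather, space = "\u169B", "\u1680"                      # ᚛ and the Ogham space mark

plain = feather + min_word + space + c + h + a + s + c
bracketed = feather + min_word + space + "[" + c + h + "]" + a + s + "[" + c + "]"

print(len(ogham_runs(plain)))      # runs without editorial brackets
print(len(ogham_runs(bracketed)))  # runs once brackets are inserted
```

Under these assumptions the unbracketed text forms a single run, while the bracketed version is broken into four separate pieces.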

Four hieroglyphs are depicted next to diagrams which indicate what portion of the hieroglyph has been damaged.
Figure 5: Examples of Damage Modifiers with Egyptian Hieroglyphs.

The most recent full version of Unicode⁴ introduced a mechanism for indicating that a hieroglyph has been damaged. One of fifteen “damage modifiers” can be added to any hieroglyph to demonstrate that it has been partly or completely damaged (see Figure 5). The character is divided into four portions, and damage can be indicated in any one of these portions individually, or in any two or three portions at once. Alternatively, damage can be indicated in all four portions, which would mean that the entire hieroglyph has been damaged and the whole character had to be supplied by a modern editor. This division into quarters produces sixteen (2⁴) possible damage combinations for any hieroglyph (fifteen damage modifiers plus the possibility of no damage at all). Because ogam characters are composed of up to five strokes on either side of the stemline, it may seem that ogam would require an unruly number of damage modifiers (2¹⁰ = 1,024). This would be too fine-grained to be useful in practice. In fact, all that ogam would require is five damage modifiers, one for each of the five positions in which a stroke can occur for any given aicme letter.
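The arithmetic behind these figures can be reproduced with a short Python sketch. It simply enumerates every non-empty set of damaged positions for a given number of positions; the function name `damage_patterns` is illustrative and does not correspond to anything in the Unicode standard.

```python
from itertools import combinations

def damage_patterns(positions: int) -> list[tuple[int, ...]]:
    """Every non-empty subset of damaged positions (the undamaged case is excluded)."""
    return [
        subset
        for size in range(1, positions + 1)
        for subset in combinations(range(positions), size)
    ]

hieroglyph_modifiers = damage_patterns(4)   # four quarters of a hieroglyph
naive_ogam_patterns = damage_patterns(10)   # five stroke positions on each side of the stemline

print(len(hieroglyph_modifiers))      # 15, i.e. 2**4 minus the undamaged case
print(len(naive_ogam_patterns) + 1)   # 1024, i.e. 2**10 including the undamaged case
```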

Figure 6: Examples of the Ponc Séimhithe and the Síneadh Fada with Ogam Characters.

Turning now to diacritics, a common issue faced by modern users of ogam is the lack of phonetic markers which are regularly used in roman script. Lenition, a phonetic change affecting a range of consonants in Gaelic languages, can be shown in roman script by writing the consonant followed by the letter h, though in Gaelic font, and in many manuscripts, it can also be shown by the placement of a dot, the ponc séimhithe, above the consonant in question. Long vowels are another common feature in Gaelic languages. Modern Irish, when written in roman script, uses the síneadh fada (an acute accent) to identify long vowels, while Scottish Gaelic uses the stràc (a grave accent). As Damian McManus has pointed out, however, the ogam “alphabet has no mechanism for distinguishing vowel quantity”⁵. This is certainly true regarding some of the earliest ogam inscriptions, in which it is not possible to tell if a vowel is long or short just by looking at the letters. More recent users of the script, however, have sought and found solutions to overcome this early limitation. By the 19th century both the ponc séimhithe and the síneadh fada had been adopted by scribes writing in ogam, and examples of both can be found in manuscripts from that time (see Figure 6 as well as 3c and 3e above).

Figure 7: Examples of Combining Diacritics (U+0301 and U+0307) with Ogam Characters.

Unfortunately, Unicode does not have any satisfactory way to represent such diacritics at present. The Unicode standard does provide a wide variety of combining diacritic characters, including U+0301 “Combining Acute Accent” and U+0307 “Combining Dot Above”, which can be used to approximate the síneadh fada and the ponc séimhithe in roman script. These, however, were simply not designed with ogam characters in mind. Attempting to combine them with ogam characters results in acute accents which are not correctly centred above letters, and which are often obscured by strokes which extend above the stemline (see Figure 7). Stranger still, the “Combining Dot Above” diacritic is actually rendered to the bottom-right hand side of ogam characters, making it resemble a mark of punctuation rather than one of lenition (see again Figure 7). As there is both a clear modern use case and a historical precedent for diacritic characters in ogam, their inclusion in Unicode would be a very welcome update. Ideally, discrete ogam diacritics would be introduced; however, it would also be beneficial if the currently available combining diacritics could be made to work with ogam characters.
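For readers who wish to experiment, the following Python sketch builds the two sequences discussed above and inspects them with the standard `unicodedata` module. It shows only how the text is stored: each sequence remains a base letter followed by a combining mark, with no precomposed form to fall back on. How the marks are positioned on screen depends on the font and rendering software, which the code cannot demonstrate. The choice of ᚁ and ᚐ as base letters is arbitrary.

```python
import unicodedata

beith = "\u1681"      # ᚁ OGHAM LETTER BEITH
ailm = "\u1690"       # ᚐ OGHAM LETTER AILM
dot_above = "\u0307"  # COMBINING DOT ABOVE, standing in for the ponc séimhithe
acute = "\u0301"      # COMBINING ACUTE ACCENT, standing in for the síneadh fada

lenited = beith + dot_above
long_vowel = ailm + acute

for sequence in (lenited, long_vowel):
    # NFC would merge base and mark if a precomposed character existed.
    composed = unicodedata.normalize("NFC", sequence)
    print([f"U+{ord(ch):04X}" for ch in composed],
          [unicodedata.combining(ch) for ch in composed])
```

Both marks have combining class 230 ("above"), so the character data already asks for placement above the base letter.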

Figure 8: ✝colman ᚛ᚈᚆᚉᚑᚁ – Colmán BOCHT – Inscription from Clonmacnoise, Co. Offaly, Ireland © Photographic Archive, National Monuments Service, Government of Ireland.

Having addressed the potential introduction of new characters into the standard, it remains to briefly discuss Unicode’s current guidelines for using ogam. As this blog is being written, the most recent version of Unicode makes the following statements⁶:

  1. “Ogham should … be rendered on computers from left to right or from bottom to top (never starting from top to bottom)”
  2. “In some cases, only the Ogham feather mark is used, which can indicate the direction of the text.”

As historical examples exist which contradict each of these statements, it would probably be best for them to be omitted from future versions of the standard. As can be seen in Figure 8, the ogam text, ᚛ᚈᚆᚉᚑᚁ, not only reads towards the single feather mark, but it is written sinistroverse, i.e. from right-to-left, or top-to-bottom, depending on the orientation of the stone. In a manuscript context, the practice of writing a single word outwards in two directions from the middle of the stemline, and other playful uses of ogam directionality and orientation, are actually discussed in Auraicept na n-Éces⁷. Currently, the Unicode guidelines do not allow for such ogams to be transcribed in a historically accurate manner. It would be preferable if future versions of the standard were to adopt a descriptive rather than prescriptive stance regarding the usage of digital ogam.
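The practical consequence for the Figure 8 inscription can be sketched in Python. Ogham letters carry the left-to-right bidirectional class in the Unicode character data, so one workaround for a sinistroverse text, seemingly the one used in the caption of Figure 8, is to store the letters in visual order rather than reading order. The small `ROMAN` table below is an illustrative mapping covering only the five letters needed here.

```python
import unicodedata

# Code points as they appear, left to right, in the caption of Figure 8.
caption_text = "\u169B\u1688\u1686\u1689\u1691\u1681"  # ᚛ᚈᚆᚉᚑᚁ

# Illustrative transliteration table covering only the letters needed here.
ROMAN = {"\u1681": "B", "\u1686": "H", "\u1689": "C", "\u1688": "T", "\u1691": "O"}

# Bidirectional class of each letter (the feather mark is skipped).
bidi_classes = {unicodedata.bidirectional(ch) for ch in caption_text if ch in ROMAN}
stored_order = "".join(ROMAN.get(ch, "") for ch in caption_text)
reading_order = stored_order[::-1]

print(bidi_classes)   # the letters' bidirectional class
print(stored_order)   # order in which the characters are stored
print(reading_order)  # order in which the inscription is read
```

The stored sequence spells THCOB, and only by reversing it does the reading BOCHT emerge, which means that search and text processing tools see the word backwards.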

Figure 9: ᚋᚐᚐᚔᚏᚔᚔᚅ – MAAIRIIN – Example of Character Doubling to Show Vowel Length in Digital Ogam.

On a final note, it would be remiss not to address how modern users of ogam have adapted to the digital medium. The discussion up to this point may have given the impression that limited character availability constitutes a serious impediment to using the digital script. Nevertheless, modern ogam users have overcome such restrictions in various ways. As no discrete diacritic marker akin to the ponc séimhithe is available, the ᚆ character is typically used following consonants in the same manner that h is used in roman script. This use of ᚆ is by no means new in ogam (see Figures 3b, 6b, 6c, and 8); however, it is the only option available to users of the digital script. Representing vowel length in digital ogam presents more of a challenge, though some modern users of the script have taken to doing so by employing a particularly archaic method. Some of the earliest Old Irish writings in roman script represent long vowels by doubling the letter in question. This practice seems to have ceased to be productive by the Middle Irish period, though, and only a handful of formulaic, monosyllabic examples survive from this time⁸. In a move which is doubly creative, digital ogam users like Máirín Ní Dhubhthaigh (see Figure 9) have not only resurrected this practice, but also applied it to ogam letters in lieu of discrete acute accent marks.
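Both conventions described above are simple enough to sketch in code. The following Python function is a minimal illustration, not a complete transliteration scheme: the `OGAM` table covers only a handful of letters, and the function name `to_digital_ogam` is hypothetical. It doubles any vowel written with a síneadh fada and writes ᚆ after any consonant carrying a ponc séimhithe.

```python
import unicodedata

# Illustrative mapping for a handful of letters; not a complete scheme.
OGAM = {
    "b": "\u1681", "l": "\u1682", "s": "\u1684", "n": "\u1685", "h": "\u1686",
    "d": "\u1687", "t": "\u1688", "c": "\u1689", "m": "\u168B", "g": "\u168C",
    "r": "\u168F", "a": "\u1690", "o": "\u1691", "u": "\u1692", "e": "\u1693",
    "i": "\u1694",
}
ACUTE = "\u0301"      # síneadh fada
DOT_ABOVE = "\u0307"  # ponc séimhithe

def to_digital_ogam(word: str) -> str:
    """Transliterate a roman-script word, doubling long vowels and writing lenition with ᚆ."""
    out = []
    # NFD separates each base letter from any accent or dot that follows it.
    for ch in unicodedata.normalize("NFD", word.lower()):
        if ch == ACUTE and out:
            out.append(out[-1])    # long vowel: repeat the preceding letter
        elif ch == DOT_ABOVE and out:
            out.append(OGAM["h"])  # lenition: add ᚆ after the consonant
        elif ch in OGAM:
            out.append(OGAM[ch])
    return "".join(out)

print(to_digital_ogam("Máirín"))
```

Applied to the name Máirín, the sketch yields the same sequence of letters as the example in Figure 9.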

Clearly undeterred by the current limitations of digital ogam, its community of users have adapted well to the new medium, demonstrating a level of creativity reminiscent of the scribes who first adapted the script for use in manuscripts, and who updated it over the centuries to meet the needs of successive generations. If Unicode were never to introduce another ogam character, there is little fear of the script falling out of use so long as it is enjoyed by such an enthusiastic userbase. All the same, it is difficult to deny there is a growing need to increase support for digital ogam. An update to the Unicode standard could greatly increase the script’s utility for current users, help it to reach an even wider audience, and allow its user community to exploit its yet untapped potential.

  1. Gippert, J. (Ed.). (2001). Titus Ogamica. [online at TITUS: Thesaurus Indogermanischer Text- und Sprachmaterialien, Johann Wolfgang Goethe-Universität Frankfurt am Main, 2002]. Retrieved from
  2.,_critical_signs_and_numerals
  3. Editors of digital editions can avail of a wider variety of encoding options. TEI tags, for example, allow for more detailed annotations to be supplied. However, none of these alternatives allow for sub-character-level annotation.
  4. (2022). The Unicode Standard Version 15.0 – Core Specification (Standard). Unicode Consortium. Mountain View. p. 454.
  5. McManus, D. (1997). A Guide to Ogam. An Sagart, Maynooth. p. 87.
  6. (2022). The Unicode Standard Version 15.0 – Core Specification (Standard). Unicode Consortium. Mountain View. p. 362.
  7. Calder, G. (Ed.). (1917). Auraicept na n-Éces: The scholars’ primer, being the texts of the Ogham tract from the Book of Ballymote and the Yellow book of Lecan, and the text of the Trefhocul from the Book of Leinster. John Grant. p. 299.
  8. Breatnach, L. (1994). An Mheán-Ghaeilge. In McCone, K., McManus, D., Ó Háinle, C., Williams, N., and Breatnach, L. (eds.) Stair na Gaeilge. Roinn na Sean-Ghaeilge, Coláiste Phádraig, Maynooth. p. 229.
