We are grateful to Adrian Doyle for contributing a guest blog this month. He is the creator of the Würzburg Irish Glosses website (wurzburg.ie) and is currently completing a PhD at NUIG, researching Natural Language Processing techniques for Old Irish. Adrian writes:
The ogam script has existed for over one and a half thousand years. Inscriptions in ogam occur on standing stones dating from as early as the fourth century, and it seems likely that it would have been used even some time before that on materials like wood and iron which, understandably, have not survived as long as stone examples. The use of ogam on stone monuments seems to have declined during the seventh century. Not long after this point, however, the script began cropping up in manuscripts. The adaptation of the script for use in this new medium was an innovative process which allowed it to meet the requirements of a changing language. The lack of a stone’s edge required that an artificial stem-line be drawn, but this also allowed for ogam to be written horizontally. New letters were introduced, and old ones repurposed. Feather marks were introduced, apparently to aid in reading the script (but the exact purpose of this feature deserves its own blog).
Ogam continued to be used in manuscripts, as well as on monuments and grave-slabs, right up until the nineteenth century. Throughout all these centuries of linguistic change in Irish, the format of ogam script remained relatively unchanged. For over a millennium ogam has either been inscribed on a three-dimensional surface or written on a page. Only in the last few decades, with rapid progress in computer technologies making digital media a fact of daily life, has ogam again taken on a new format: digital text. For the first time since it was transferred to the manuscript, about thirteen centuries ago, ogam writing has been adapted to meet the requirements and expectations of modern users.
The introduction of ogam characters to the Unicode standard in 1999 has made the script more accessible than ever. The Unicode block spans from U+1680 to U+169F and includes the twenty letter characters of the primary four aicmí, the five supplementary letters of the forfeda along with the additional letter peith, an ogam space character, and two feather marks. Since its inception Unicode has prioritised the inclusion of characters used in modern forms of communication over “preserving past antiquities” [1]. As such, ogam’s inclusion in Unicode is intended primarily for modern purposes rather than for transcribing historical texts.
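For readers who would like to inspect the block themselves, the following is a minimal Python sketch that lists whatever its standard `unicodedata` module knows about the range U+1680 to U+169F. The variable names are illustrative only. With a recent Python it should report 29 assigned characters, 26 of them letters, since Unicode encodes peith (U+169A) alongside the forfeda.

```python
import unicodedata
from collections import Counter

# The ogam block as described above: U+1680 to U+169F inclusive.
BLOCK = range(0x1680, 0x16A0)

# Keep only code points that have a name, i.e. that are actually assigned.
assigned = [chr(cp) for cp in BLOCK if unicodedata.name(chr(cp), None)]

# Tally the general categories: Lo = letter, Zs = space separator,
# Ps / Pe = opening / closing punctuation (the two feather marks).
by_category = Counter(unicodedata.category(ch) for ch in assigned)

for ch in assigned:
    print(f"U+{ord(ch):04X}  {unicodedata.category(ch)}  {unicodedata.name(ch)}")

print(len(assigned), dict(by_category))
```

The remaining code points at the end of the block are unassigned, which is why they are skipped here.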
While the adaptation of ogam script for digital media has clear benefits for anyone wishing to use it, the move comes with some interesting implications. It is important to understand the differences between digital text and print to be able to appreciate these implications. Handwritten or printed text is primarily a graphical system which requires human interpretation. We perceive it first visually, then our brains process the visual input and convert it into a linguistic representation. Digital text, by contrast, is better thought of in terms of data which is “machine-readable”. It is primarily a script which computers can process.
Machine readability and graphical legibility should not be confused. The graphical element of digital text exists purely so humans can read it, but it is meaningless to a machine. A computer recognises different letters using unique identifiers which act like ID numbers for different characters. For example, the ogam letter ᚐ is encoded in Unicode as U+1690, and no other Unicode character shares this encoding with it.
Humans and computers may disagree at times as to whether two letters are the same or not. While we humans can look at two or more letters, graphically analyse them, and determine that they represent roughly the same letter, to a computer no two characters are equal to one another unless they are the exact same character (a ≠ á ≠ A ≠ ᚐ). Conversely, while we may consider two instances of a letter to be distinct from one another for graphical reasons, like differing handwriting, italicisation or font, a computer makes no such distinction. As long as two characters share the same ID number, they represent the same thing to a machine (a = a = a = a, whether set in roman, italic, bold or another typeface).
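The comparison above can be tried out directly. This is a small sketch using only Python's standard library; the list of sample letters is simply the four characters from the example.

```python
import unicodedata

# The four characters from the example: a, á, A and the ogam letter ᚐ.
letters = ["a", "\u00e1", "A", "\u1690"]

# Each character has its own code point (its "ID number") and official name.
for ch in letters:
    print(f"{ch}  U+{ord(ch):04X}  {unicodedata.name(ch)}")

# To the machine, no two of them are equal.
all_distinct = len(set(letters)) == len(letters)
print(all_distinct)

# Italics, bold or a different font are applied when text is displayed.
# They are not stored in the string, so the underlying character is unchanged.
same_character = "a" == "\u0061"
print(same_character)
```

Styling lives in the document format or the font, not in the character itself, which is why the second comparison holds no matter how the letter is rendered on screen.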
These factors can make it very difficult to accurately capture the contents of historical ogams using Unicode characters if there is any doubt as to what is written. Because the script is made up primarily of repeating patterns of scores and notches, it can sometimes be difficult to tell one letter from another, particularly in the case of stone inscriptions where letters and even words can be written without any space between them. McManus has identified two instances where confusion between ᚋᚋ, “MM”, and ᚌ, “G”, caused him to question earlier transcriptions by Macalister [2, 3]. The graphical reality of ogam makes it particularly susceptible to this kind of misreading. A computer, however, cannot comprehend this visual difficulty.
Three glosses from the St. Gall Priscian manuscript exemplify a similar problem. The letter ᚏ is written with six strokes instead of the typical five. Ironically, each of these examples uses the term cocart, referring to a correction or emendation in the manuscript. There is no correct way in Unicode to represent a six-stroke letter. Some may try to approximate such a letter by combining characters to graphically represent six strokes, similar to how ogam transliterations are presented on Titus Ogamica [4]. Workarounds like this are purely graphical, however, much like writing VV instead of W, and they create messy text data.
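To make the problem concrete, here is a hedged sketch in Python. The stroke table covers only the five letters written with oblique strokes, and the list of six-stroke approximations is hypothetical, chosen for illustration rather than taken from any real transcription.

```python
# Oblique strokes per letter in the third aicme (M, G, NG, Z, R).
OBLIQUE_STROKES = {
    "\u168b": 1,  # ᚋ M
    "\u168c": 2,  # ᚌ G
    "\u168d": 3,  # ᚍ NG
    "\u168e": 4,  # ᚎ Z
    "\u168f": 5,  # ᚏ R
}

def stroke_count(text: str) -> int:
    """Total number of oblique strokes a reader would see in `text`."""
    return sum(OBLIQUE_STROKES[ch] for ch in text)

MM = "\u168b\u168b"  # ᚋᚋ
G = "\u168c"         # ᚌ

# A reader may see the same number of strokes, but the data differ.
same_strokes = stroke_count(MM) == stroke_count(G)
same_data = MM == G
print(same_strokes, same_data)

# Hypothetical ways of faking a six-stroke letter with existing characters.
six_stroke_candidates = [
    "\u168f\u168b",  # ᚏᚋ  5 + 1
    "\u168b\u168f",  # ᚋᚏ  1 + 5
    "\u168e\u168c",  # ᚎᚌ  4 + 2
    "\u168d\u168d",  # ᚍᚍ  3 + 3
]
for candidate in six_stroke_candidates:
    code_points = [f"U+{ord(ch):04X}" for ch in candidate]
    print(candidate, code_points, stroke_count(candidate))
```

Every candidate adds up to six strokes, yet each is a different sequence of two letters as far as a computer is concerned, and none of them is a single letter R. A search for ᚏ would treat them all differently.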
It is possible to be very creative with the direction and orientation of ogam text. Writing Irish words like seal and cead in ogam produces a kind of mirror image across the stem-line. Where ogam is written on a physical object, like a manuscript page, different readings may be possible depending on the orientation of the page. Early Irish scribes were very aware of the potential for reading ogam in both directions and even played with this notion in manuscripts, writing the same word from the middle of the stem-line outwards in two directions “so that it is the same thing that stands at the beginning and at the end of the stem” [5]. Words like ceall, written in this manner, can act as a kind of ogam-specific palindrome which reads the same no matter which way the text is oriented. This works because the number of strokes and notches on either side of the stem-line is the same in either orientation.
Unfortunately, very little of this type of creativity can be preserved in the transition to digital ogam. The Unicode standard specifies that ogam should be written from left to right, or bottom to top only [6]. It is unclear whether this specification is based on a misunderstanding of the historical usage of the script, or if it was simply a utilitarian decision to limit the number of ogam characters. While it is still possible to approximate such creative uses of the script, writing words like ceall with Unicode ogam characters (ᚉᚓᚐᚂᚂ) necessarily separates the individual letters in a way that obscures the graphical palindrome-like effect. The standard also specifies that where a single feather mark is used this “can indicate the direction of the text.” Again, this specification goes against historical examples in which text is written towards the mark, like the Colmán Bocht inscription from Clonmacnoise. As the text in this example reads from right to left, however, it would be impossible to capture in accordance with Unicode specifications at any rate.
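The gap between the graphical palindrome and its digital encoding can be sketched in Python. This is only an illustration under stated assumptions: the helper names (`rotate_180`, `signature`) are invented here, horizontal ogam is assumed with the first aicme below the stem-line and the second above it, and `signature` deliberately ignores any gaps between letters, as if the word were written without spacing.

```python
B_AICME = "\u1681\u1682\u1683\u1684\u1685"  # B L F S N: strokes below the stem-line
H_AICME = "\u1686\u1687\u1688\u1689\u168a"  # H D T C Q: strokes above the stem-line
M_AICME = "\u168b\u168c\u168d\u168e\u168f"  # M G NG Z R: oblique strokes across it
A_AICME = "\u1690\u1691\u1692\u1693\u1694"  # A O U E I: notches on it

AICMI = (("below", B_AICME), ("above", H_AICME),
         ("oblique", M_AICME), ("notch", A_AICME))

def strokes(letter):
    """Return (kind of stroke, number of strokes) for one ogam letter."""
    for kind, aicme in AICMI:
        if letter in aicme:
            return kind, aicme.index(letter) + 1
    raise ValueError(f"not one of the twenty primary letters: {letter!r}")

def rotate_180(text):
    """What the letters become if the page is turned upside down:
    the order reverses and strokes above and below swap sides."""
    swap = str.maketrans(B_AICME + H_AICME, H_AICME + B_AICME)
    return text[::-1].translate(swap)

def signature(text):
    """What the eye sees when letters run together: adjacent strokes
    of the same kind merge into one group."""
    merged = []
    for kind, n in map(strokes, text):
        if merged and merged[-1][0] == kind:
            merged[-1] = (kind, merged[-1][1] + n)
        else:
            merged.append((kind, n))
    return merged

ceall = "\u1689\u1693\u1690\u1682\u1682"  # ᚉᚓᚐᚂᚂ
turned = rotate_180(ceall)

print(ceall, signature(ceall))
print(turned, signature(turned))
print(ceall == turned)                        # the character data differ
print(signature(ceall) == signature(turned))  # the stroke pattern matches
```

Turned upside down, ceall becomes a different string of characters, yet both strings reduce to the same pattern of four strokes above, five notches, and four strokes below. The palindrome exists in the strokes, not in the encoded letters.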
A final point worth discussing relates to the stem-line in ogam script. Tom Scott, a popular YouTube personality and computational linguist in his own right, has reported on how unusual the ogam space character is in Unicode [7]. Graphically it contains the stem-line, yet it can also be replaced with a line break (it disappears at the end of a line of text), which makes it the only graphical character in Unicode which acts like whitespace. While this may seem trivial, a proposal was sent to the Unicode Technical Committee suggesting that the ogam space character should instead be processed the same way as symbol characters [8]. In this scenario the ogam space character would have acted more like punctuation, similar to an interpunct in Latin scriptio continua. More importantly, though, it would have been the equivalent of separating words using an emoji instead of a space. Instead, Unicode had to relent and update its definition of whitespace to account for this unique feature of ogam.
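The whitespace behaviour of the ogam space mark is easy to observe. The sketch below uses only Python's built-in string methods and its `unicodedata` module; the sample phrase is an arbitrary pair of ogam words joined by U+1680.

```python
import unicodedata

OGHAM_SPACE = "\u1680"  # the ogam space mark, which draws a piece of stem-line

# Unicode classifies it as a space separator (Zs), like an ordinary space.
category = unicodedata.category(OGHAM_SPACE)
print(unicodedata.name(OGHAM_SPACE), category)

# Python therefore treats it as whitespace, even though it is visible.
print(OGHAM_SPACE.isspace())

# Two ogam words separated by the space mark split just as they would
# if separated by an ordinary space.
phrase = "\u1684\u1693\u1690\u1682" + OGHAM_SPACE + "\u1689\u1693\u1690\u1687"
words = phrase.split()
print(words)

# Stripping whitespace removes it from the ends of a string too.
stripped = (OGHAM_SPACE + "\u1690" + OGHAM_SPACE).strip()
print(len(stripped))
```

Any software that trims or collapses whitespace will do the same to this visible mark, which is worth bearing in mind when storing or cleaning ogam text.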
The adaptation of ogam to digital media has made it more accessible than ever before. However, many of the unique graphical characteristics of the script are meaningful only to human readers and cannot be well supported by digital text. Ogam has been used in several creative ways throughout its history, particularly in manuscripts, but Unicode specifications and the limited character set can make it difficult to represent this creativity digitally. It is in some cases possible to work around these limitations, though this can create text data which is too messy for a computer to parse. Nevertheless, just as ogam was reinvented and experimented with by scribes once it made the transition to the manuscript, a similar digital reinvention may currently be in full swing. Many new creative uses are being made of digital ogam characters, and modern usage of the script has already forced Unicode to redefine the modern concept of a whitespace character.
1 Becker, J. D. (1988). Unicode 88 (Standard). Unicode Consortium. Palo Alto. p. 5.
2 McManus, D. (2004). The Ogam Stones at University College Cork. Cork University Press. p. 14.
3 ibid., pp. 18-19.
4 Gippert, J. (Ed.). (2001). Titus Ogamica. [online at TITUS: Thesaurus Indogermanischer Text- und Sprachmaterialien, Johann Wolfgang Goethe-Universität Frankfurt am Main, 2002]. Retrieved from https://titus.fkidg1.uni-frankfurt.de/database/ogam/ogquery.asp?ciic1=057
5 Calder, G. (Ed.). (1917). Auraicept na n-Éces: The scholars’ primer, being the texts of the Ogham tract from the Book of Ballymote and the Yellow book of Lecan, and the text of the Trefhocul from the Book of Leinster. John Grant. p. 299.
6 The Unicode Standard Version 13.0 – Core Specification (Standard). (2020). Unicode Consortium. Mountain View. Retrieved from https://www.unicode.org/versions/Unicode13.0.0/UnicodeStandard-13.0.pdf p. 354.
8 Davis, M. (2007). OGHAM SPACE MARK shouldn’t be whitespace (tech. rep. No. L2/07-340). Unicode Technical Committee Document Registry. Retrieved from https://www.unicode.org/L2/L2007/07340-ogham-space.txt