The orthography of machine-readable Neolatin texts: A plea for minimal intervention

Abstract:

The Latin of the early modern period (Neolatin) is an independent stage in the continuing development of Latin. It is not a failed attempt to write correct Latin that we can help succeed by improving its orthography. Orthographic standardization was developed with and for editions printed on paper. Publication on the web (hereafter also called 'editing') offers much more flexible editorial models.

'Normalizing' a text according to a presumed classical orthography should be avoided for two reasons: 1) The orthography of Neolatin texts is hardly ever uniform, but it is not therefore arbitrary; on the contrary, it often reflects deliberate or unconscious choices of the author. 2) There is no such thing as classical orthography (modern lexica diverge, and for new words there obviously cannot be a classical orthography).

Machine-readable texts allow completely new types of statistics-based research on post-medieval Latin - if we don't destroy the evidence by imposing our own version of normalcy. Scholars can easily change the original text to suit their purposes, but they cannot roll back undocumented changes to recover the original state.

For many tasks on the agenda of the Boston meeting we need orthographical standards. These should not be implemented in the original texts, where - once enacted - they block further evolution, but in an intermediate layer, where they can be adapted and refined to meet the needs of present tasks (searching, parsing, tagging) and future ones.
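
As a rough illustration of what such an intermediate layer might look like, the following Python sketch derives normalized search keys from the transmitted spellings without ever altering the text itself. The rule set (j to i, v to u, ligatures expanded, accents dropped) and the names RULES, search_key and build_index are assumptions chosen for the example, not a proposed standard.

```python
import unicodedata

# Hypothetical normalization rules, for illustration only. They live outside
# the text, so they can be revised or replaced without touching the source.
RULES = [("j", "i"), ("v", "u"), ("æ", "ae"), ("œ", "oe")]


def search_key(token: str) -> str:
    """Derive a normalized key from a token; the token itself is not changed."""
    key = token.lower()
    # Decompose accented letters and drop the combining marks.
    key = unicodedata.normalize("NFD", key)
    key = "".join(ch for ch in key if not unicodedata.combining(ch))
    for old, new in RULES:
        key = key.replace(old, new)
    return key


def build_index(tokens):
    """Map each normalized key to the positions of the original tokens."""
    index = {}
    for position, token in enumerate(tokens):
        index.setdefault(search_key(token), []).append(position)
    return index


# The original spellings stay exactly as transmitted.
original = ["Iuuenis", "juvenis", "cœlum", "caelum"]
index = build_index(original)

# A search for any variant retrieves all attested spellings.
hits = [original[i] for i in index[search_key("iuvenis")]]
print(hits)
```

Because the index points back to positions in the untouched text, the evidence of variant spellings remains available for statistical work, and a different rule set can be swapped in for a different task.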