The Neume Notation Project

Louis W. G. Barton




Research Proposal

[University of Oxford, 24 October 1999]

§ Summary
§ Extent of the Problem Domain
§ The Need for Computerization
§ Inadequacy of Existing Representations
§ A Lossless Data Representation
§ A Plan for Deferring Commitment
§ Interoperability and Permanence
§ The Neume Notation Editor
§ Optical Character Recognition
§ Schedule



§ Summary

The primary goal of this research is to devise a "lossless" data representation for medieval chant. The target material is Western-European liturgical documents containing music notation and dating prior to A.D. 1550. Medieval musicologists estimate that over one million such documents survive. In the Middle Ages, chant melodies were written in a system of symbols called neumes. Neumes differ drastically from modern music symbols in both morphology and semantics. A "lossless" data representation is one that captures the semantic content of the source material in a way that will support all anticipated uses. This data representation will be in the Private Use Area of the Unicode Standard ™ (ISO/IEC 10646).

The two main categories of use for this data representation have conflicting requirements:
  1. that it be possible to reconstruct from the data stream a diplomatic facsimile of a source document in its native notational style for purposes of computer display, editing, and printing; and
  2. that the data representation be sufficiently formal and abstract as to allow comparative analysis of melodies and notational styles across many documents from a data bank.
No data representation exists today that satisfies both of these requirements.

Encoding the data in the Private Use Area of the Unicode Standard ™ has four benefits:
  1. a sufficiently large code space as to allow an individual character code for each of the hundreds of semantically distinct neume forms;
  2. the ability to mix neume data with chant text in a single file without resorting to multiple (i.e., modal) meanings for characters;
  3. suitability for Internet transmission and use in Worldwide Web applications; and
  4. the possibility of a standardized representation that is usable by programmes written in any programming language for printing, analysis, pedagogy, audio rendition, etc.
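As an illustration of the second benefit, here is a minimal Java sketch of a single stream that mixes chant text with neume codes. The Private Use Area of the Basic Multilingual Plane spans U+E000 to U+F8FF. Everything else here is an assumption made for illustration: the class name, the two code-point assignments (U+E000 for a punctum, U+E001 for a virga), and the convention that neume codes follow the syllable they belong to. None of these is a specification of this project.

```java
// Sketch only: code-point assignments and stream layout are hypothetical.
final class NeumeStreamDemo {
    // Bounds of the Private Use Area in the Basic Multilingual Plane.
    static final int PUA_START = 0xE000;
    static final int PUA_END = 0xF8FF;

    // Hypothetical assignments, chosen for illustration only.
    static final char PUNCTUM = '\uE000';
    static final char VIRGA = '\uE001';

    /** True if the code point lies in the Private Use Area, i.e. is a neume code here. */
    static boolean isNeume(int codePoint) {
        return codePoint >= PUA_START && codePoint <= PUA_END;
    }

    /** Returns only the chant text, skipping every neume code. */
    static String textOnly(String stream) {
        StringBuilder sb = new StringBuilder();
        stream.codePoints().filter(cp -> !isNeume(cp)).forEach(sb::appendCodePoint);
        return sb.toString();
    }

    /** Counts the neume codes in a mixed stream. */
    static long neumeCount(String stream) {
        return stream.codePoints().filter(NeumeStreamDemo::isNeume).count();
    }

    public static void main(String[] args) {
        // Assumed layout: each syllable is followed by its neume code(s).
        String stream = "Ky" + PUNCTUM + "ri" + VIRGA + PUNCTUM + "e";
        System.out.println(textOnly(stream));   // prints "Kyrie"
        System.out.println(neumeCount(stream)); // prints 3
    }
}
```

Because text and neumes occupy disjoint code ranges, a reader of the stream can classify each character by its value alone, with no mode-switching escape sequences.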
My research includes application of the data representation in the following areas:
  1. an editing program to facilitate visual data entry, with data output to this data representation;
  2. a testbed of encoded documents from various historical periods, geographical regions, and notational styles;
  3. a Web-accessible database to permanently house the encoded documents and related information, which scholars can access or supplement;
  4. assisting my Oxford University collaborators in the implementation of their algorithms for musicological analysis of data in this representation; and
  5. investigation into optical character recognition (OCR) for automatic data entry from photographic images, with output to this data representation.
The lossless data representation, together with the toolkit of applications programmes, will give scholars a standardized means for sharing data and performing analysis. Standardization will help to reduce fragmentation of digital libraries and duplication of effort. I shall seek international criticism of the specifications for the data representation and work closely on it with a research group in the University of Oxford headed by Dr John Caldwell of the Faculty of Music.
As a byproduct of this research, I hope to formulate a theoretical framework for digital encoding of other types of archaic script, such as Byzantine ecphonic notation and Hebrew cheironomic notation.


§ Extent of the Problem Domain

The number of medieval documents containing neume notation may be more than one million. University of Toronto musicologist Andrew Hughes adds that, "A minute number of these are catalogued at all adequately, and many fewer [are] known in detail" [1]. Most of the earlier documents are unique. Two examples of neume notation are provided at the end of this paper.
I am particularly interested in documents dating from the first centuries after the appearance of music notation in the West (ca. A.D. 800). I have set a definite cutoff date at the invention of mechanical printing for music (ca. 1550). The corpus of early neumed documents poses the greatest trouble for musicologists with regard to dating and provenance, and is the area where computerization might have the greatest impact.
In the early documents, there is a very large set of notational variants. These variants are traditionally classified into families, or ‘styles’. A small sample is illustrated below [2].



Figure 1. Neume samples in eight notational styles; approximate modern equivalents are in column (b).




§ The Need for Computerization

Medieval musicologists wish to bring the power of computers to bear on this body of documents, because algorithmic comparison of many documents might shed light on questions that have resisted manual analysis.

Harvard University musicologist Thomas Kelly writes that, "The central question, perhaps the most difficult one in the study of [chant] is this: where did it come from?" Comparison of chant melodies may provide the best evidence. He continues, "... with Gregorian chant we seek distinctions within the repertory, but for other, ‘local’ liturgies we often look for kinship with Gregorian chant. ... When we deal with distinct bodies of chant, written a thousand years ago in places far apart, we especially have trouble equipping ourselves for the job. ... Clearly, our methods of comparison and description remain woefully inadequate, ... and we still lack the tools for comparison and description that we might wish in order to be ‘scientific’" [3].


§ Inadequacy of Existing Representations

If a data representation existed that satisfied the needs I have listed, then it should be used. There are about a dozen ‘standards’ for digital representation of music, including MIDI, NIFF, and SMDL [4]. All of them were designed for the semantics of post-medieval music, characterized by discrete time intervals, regular beat, and distinct pitches; these notions are anachronistic in medieval chant. Assumptions about time quantification are deeply embedded in these representations. Encoding chant with any of them forces an interpretation in the modern musical idiom. Such force-fitting destroys semantic content of chant that musicologists need for analysis.


§ A Lossless Data Representation

A data representation is a set of binary codes that encode the semantics of source materials. Computer images are not data representations in this sense. Once a data representation has been publicly defined, and a large amount of material has been encoded with it, it becomes impractical to redesign the data representation. If the data representation is "lossy," then it will be impossible to automatically convert files from the old representation to a new, richer representation. Fortunately, few neumed documents have been encoded in any representation to date.

I distinguish between "lossy" representations and "lossless" representations. A lossy representation selectively discards semantic detail to capture a particular perspective on the source materials. Lossy representations typically result from the anticipated uses of the data being narrowly defined, or from a poor understanding of the science of data representation, or from a rush to achieve quick results. A lossless representation is one that unambiguously captures all semantic content that is expected to have eventual use.
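The following minimal Java sketch illustrates why a lossy encoding cannot be converted automatically to a richer one. The "note count" encoding is an invented example of lossiness, not one of the existing standards, and the class and method names are hypothetical. It relies only on the fact that the punctum and virga are single-note neumes while the podatus and clivis are two-note neumes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: an invented lossy encoding that keeps nothing but the note count.
final class LossyEncodingDemo {
    static final String[] FORMS = { "punctum", "virga", "podatus", "clivis" };

    /** Lossy encoding: distinct neume forms collapse onto the same value. */
    static int noteCount(String neumeForm) {
        switch (neumeForm) {
            case "punctum":
            case "virga":
                return 1;
            case "podatus":
            case "clivis":
                return 2;
            default:
                throw new IllegalArgumentException("unknown form: " + neumeForm);
        }
    }

    /** Lists every source form that could have produced the encoded value. */
    static List<String> possibleSources(int encoded) {
        List<String> candidates = new ArrayList<>();
        for (String form : FORMS) {
            if (noteCount(form) == encoded) {
                candidates.add(form);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // More than one candidate means the original form is unrecoverable.
        System.out.println(possibleSources(1)); // prints [punctum, virga]
        System.out.println(possibleSources(2)); // prints [podatus, clivis]
    }
}
```

A lossless representation, by contrast, would map each semantically distinct form to its own code, so that every encoded value has exactly one possible source.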

Designing a lossless representation for neumed documents is difficult because of the extent of variants, uncertainties of interpretation, and conflicting use requirements. The basic needs are that it encode chant text, neume forms, and neume locations. The first priority is to compile an exhaustive taxonomy of neume forms across all notational styles. It is fortuitous (albeit only theorized by musicologists) that there is substantial taxonomic synonymy across notational styles.


§ A Plan for Deferring Commitment

Because of the crucial role of the data representation, it is highly desirable to delay commitment to its final specification. Deferment would allow time for adequate discussion in the international community and experimentation with the representation. To permit delay, I propose using "middle-layer" software to act as an abstraction barrier between the data representation and applications that use it. This will allow ongoing development of applications programmes while the representation is evolving. The middle layer should present interface signatures for data accession and mutation. When the data representation is changed, the internal details of the middle-layer methods must be modified, but their interfaces will remain unchanged. The applications programmes should not, therefore, be affected. The diagrams below contrast the data flow in the early stages of the research with the data flow in the late stages of the research.



Figure 2a. Data flow of the chant computerization system during development of the data representation.



Figure 2b. Data flow of the chant computerization system after finalization of the data representation.
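The abstraction barrier described in this section can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the project's actual middle layer: the interface name `ChantDocument`, its three methods, both implementing classes, and the code-point base U+E000 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Application code: written once against the interface, unaware of the storage format.
final class MiddleLayerDemo {
    static String listing(ChantDocument doc) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < doc.neumeCount(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(doc.neumeAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChantDocument[] docs = { new DraftChantDocument(), new CodedChantDocument() };
        for (ChantDocument doc : docs) {
            doc.appendNeume("virga");
            doc.appendNeume("clivis");
            System.out.println(listing(doc)); // same listing for both back ends
        }
    }
}

// Hypothetical middle-layer signatures for data mutation and accession.
interface ChantDocument {
    void appendNeume(String form);
    String neumeAt(int index);
    int neumeCount();
}

// Provisional storage, usable while the representation is still evolving.
final class DraftChantDocument implements ChantDocument {
    private final List<String> forms = new ArrayList<>();
    @Override public void appendNeume(String form) { forms.add(form); }
    @Override public String neumeAt(int index) { return forms.get(index); }
    @Override public int neumeCount() { return forms.size(); }
}

// Later storage: one (hypothetical) private-use character code per neume form.
final class CodedChantDocument implements ChantDocument {
    private static final List<String> FORMS =
            Arrays.asList("punctum", "virga", "podatus", "clivis");
    private static final int BASE = 0xE000; // assumed base code point
    private final StringBuilder codes = new StringBuilder();

    @Override public void appendNeume(String form) {
        int index = FORMS.indexOf(form);
        if (index < 0) throw new IllegalArgumentException("unknown form: " + form);
        codes.append((char) (BASE + index));
    }
    @Override public String neumeAt(int index) { return FORMS.get(codes.charAt(index) - BASE); }
    @Override public int neumeCount() { return codes.length(); }
}
```

The `listing` method stands in for an applications programme: replacing one implementation with the other changes only the internals behind the interface, which is the property the middle layer is meant to provide.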




§ Interoperability and Permanence

All applications programmes that I shall write will be in the Java programming language. Java is a nearly pure object-oriented language, and the only fully-programmable language deployable on the Worldwide Web. Java is platform independent, meaning that it can be run on many types of computers without customization or recompilation.

I shall establish an Internet-accessible repository for encoded documents, to be housed permanently at Harvard University, and with a mirror site at Oxford University. Subject to copyright restrictions (as yet unexplored), this repository will be available to scholars for access and addition of encoded documents. A Web-accessible index of stored documents and their descriptions will be included.


§ The Neume Notation Editor

Ease of data entry is of crucial importance. This is due to the great number of source documents and the labor-intensive nature of data entry. A graphical-user-interface programme is needed for convenience of data entry and visual validation.

A prototype of my Neume Notation Editor is shown below. This programme is written in Java. Its structure is complex; there are already over a thousand lines of programme code in forty-two classes. Later prototypes will accommodate chant text below the neumes, display of many notational styles, and advanced editing features needed by musicologists. In itself, this will be a substantial piece of software engineering.



Figure 3. Prototype of the Neume Notation Editor.




§ Optical Character Recognition

A difficult barrier to computerization of chant is the labor required for data entry. It would be of great benefit to have an optical character-recognition (OCR) programme for rough-draft transcription from photographic images.

I have written a proof-of-concept OCR programme that recognizes individual neume forms by means of neural networks optimized by a genetic algorithm. Instead of being manually programmed, neural networks ‘learn’ from repeated presentation of training data. A difficulty with neural networks is arriving at an optimal specification of the network configuration. Genetic algorithms ‘evolve’ programmes automatically and select phenotypes in each generation according to their fitness for a specified task. I am using genetic algorithms to optimize my network configurations. By this technique I expect progressive improvement in recognition accuracy, especially where the document has substantial background ‘noise’.
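The general technique of evolving a network configuration can be sketched in Java as below. This is a toy illustration, not the proof-of-concept programme itself. The genome (a hidden-unit count and a layer count), the population size, the mutation rate, and above all the `fitness` function are assumptions: here fitness is a made-up formula standing in for recognition accuracy, whereas a real run would train each candidate network and measure its accuracy on test images.

```java
import java.util.Random;

// Sketch only: a tiny genetic algorithm over hypothetical network configurations.
final class NetworkConfigEvolver {
    static final int MAX_HIDDEN = 64, MAX_LAYERS = 4, POPULATION = 20;

    /** Toy stand-in for measured recognition accuracy (higher is better). */
    static double fitness(int[] genome) {
        double dh = genome[0] - 24, dl = genome[1] - 2;
        return -(dh * dh + 10 * dl * dl);
    }

    /** A genome is {hidden units, layers}, each within its allowed range. */
    static int[] randomGenome(Random rnd) {
        return new int[] { 1 + rnd.nextInt(MAX_HIDDEN), 1 + rnd.nextInt(MAX_LAYERS) };
    }

    /** Tournament selection: the fitter of two randomly chosen individuals. */
    static int[] select(int[][] pop, Random rnd) {
        int[] a = pop[rnd.nextInt(pop.length)], b = pop[rnd.nextInt(pop.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    /** Uniform crossover of two parents, then occasional mutation of one gene. */
    static int[] breed(int[] p, int[] q, Random rnd) {
        int[] child = { rnd.nextBoolean() ? p[0] : q[0], rnd.nextBoolean() ? p[1] : q[1] };
        if (rnd.nextDouble() < 0.2) {
            if (rnd.nextBoolean()) {
                child[0] = clamp(child[0] + rnd.nextInt(9) - 4, 1, MAX_HIDDEN);
            } else {
                child[1] = clamp(child[1] + rnd.nextInt(3) - 1, 1, MAX_LAYERS);
            }
        }
        return child;
    }

    static int clamp(int v, int lo, int hi) { return Math.max(lo, Math.min(hi, v)); }

    static int[] best(int[][] pop) {
        int[] winner = pop[0];
        for (int[] g : pop) if (fitness(g) > fitness(winner)) winner = g;
        return winner;
    }

    /** Runs the given number of generations and returns the fittest genome found. */
    static int[] evolve(long seed, int generations) {
        Random rnd = new Random(seed);
        int[][] pop = new int[POPULATION][];
        for (int i = 0; i < POPULATION; i++) pop[i] = randomGenome(rnd);
        for (int gen = 0; gen < generations; gen++) {
            int[][] next = new int[POPULATION][];
            next[0] = best(pop); // elitism: the best individual always survives
            for (int i = 1; i < POPULATION; i++) {
                next[i] = breed(select(pop, rnd), select(pop, rnd), rnd);
            }
            pop = next;
        }
        return best(pop);
    }

    public static void main(String[] args) {
        int[] g = evolve(42L, 40);
        System.out.println("hidden units = " + g[0] + ", layers = " + g[1]);
    }
}
```

Because the best individual is carried over unchanged into each new generation, the best fitness in the population can never decrease from one generation to the next.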

I have decomposed the OCR problem into several sub-problems:
  • recognizing whole pages at varying scale;
  • dealing with indistinct handwriting and degraded parchment;
  • distinguishing neumes from text or decorations;
  • segmenting a line of neumes into individual glyphs;
  • parsing compound neume forms; and
  • assigning a ‘height’ to each neume, often without a staff line as a guide to the vertical displacement.
A human operator will specify the neume family to be recognized. Once neural template-matching has been done by OCR, further processing will be needed to validate the chant text, correlate neumes to text, and resolve uncertainties about neume identification or neume ‘height’. I plan to use a knowledge-based (KB) system for post-OCR processing. A KB system uses an inference engine that reasons by rules on a set of facts. The rules will capture the heuristics that a musicologist uses for resolving uncertainties of visual identification. The facts will consist of OCR output and a data bank of ‘control’ documents. The operator will specify which control document should be used. After KB processing, a draft transcription will be validated visually by the operator in the Neume Notation Editor.
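The idea of an inference engine reasoning by rules on a set of facts can be sketched in Java with simple forward chaining. This is a minimal illustration, not the planned KB system: the class names, the fact strings, and both rules are invented placeholders and do not represent real musicological heuristics.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Sketch only: facts are plain strings and all rule content is hypothetical.
final class PostOcrRules {
    /** A rule derives its conclusion once all of its premises are known facts. */
    static final class Rule {
        final Set<String> premises;
        final String conclusion;

        Rule(String conclusion, String... premises) {
            this.conclusion = conclusion;
            this.premises = new HashSet<>(Arrays.asList(premises));
        }
    }

    /** Forward chaining: apply the rules repeatedly until no new fact is derived. */
    static Set<String> infer(Set<String> facts, List<Rule> rules) {
        Set<String> known = new TreeSet<>(facts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                if (known.containsAll(rule.premises) && known.add(rule.conclusion)) {
                    changed = true;
                }
            }
        }
        return known;
    }

    /** Placeholder rules; the first depends on a fact derived by the second. */
    static List<Rule> sampleRules() {
        return Arrays.asList(
            new Rule("glyph7 height needs review",
                     "glyph7 is clivis", "page has no staff lines"),
            new Rule("glyph7 is clivis",
                     "ocr: glyph7 is clivis or podatus", "control: position7 is clivis"));
    }

    public static void main(String[] args) {
        Set<String> facts = new HashSet<>(Arrays.asList(
            "ocr: glyph7 is clivis or podatus",   // uncertain OCR output
            "control: position7 is clivis",       // from the chosen control document
            "page has no staff lines"));
        for (String fact : infer(facts, sampleRules())) {
            System.out.println(fact);
        }
    }
}
```

In this sketch the OCR ambiguity is resolved by the control document, and the derived identification then triggers a second rule that flags the neume ‘height’ for the operator.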

I expect that recognition of early documents will not be highly accurate. I consider my work in this field to be contributory rather than definitive.


§ Schedule

It is difficult to specify a timetable for my research, but I have these goals:
  • First year. Compile a complete taxonomy of neume forms and make Unicode assignments. Publish the taxonomy. Assist my University of Oxford collaborators with ‘middle-layer’ interface signatures. Get the Neume Notation Editor to a working state.
  • Second year. Revise the data representation in dialogue with interested parties. Resolve problems with encoding neume ‘heights’ and correlating neumes to chant text. Set up a Web-accessible database for encoded documents. Complete the Neume Notation Editor.
  • Third and fourth years. Publish the finalized data representation. Help modify the ‘middle-layer’ methods for direct access to the data representation. Proceed as far as possible with the OCR research.

Footnotes
[1] Andrew Hughes, Late Medieval Liturgical Offices, (Toronto: Pontifical Institute of Mediaeval Studies, 1996), p. 2.
[2] The New Harvard Dictionary of Music, Don Michael Randel, ed., (Cambridge, MA: Harvard University, 1986), s.v. "Neume."
[3] Thomas Forrest Kelly, The Beneventan Chant, (Cambridge, England: Cambridge University, 1998), pp. 161-2.
[4] For a detailed discussion of data representations for music, see Eleanor Selfridge-Field, Beyond MIDI: The Handbook of Musical Codes, (Cambridge, MA: MIT, 1997).




Revision: 23 March 2000
Copyright © 1999, 2000, Louis W. G. Barton