Monthly Archives: June 2014

Paul Caton: Six terms fundamental to modelling transcription

Department of Digital Humanities Lunchtime Seminar. This is a preview of a paper Paul will deliver at the Digital Humanities Conference in Lausanne later this month.

Paul uses the HSM model[1] to understand transcriptions. He defines six abstractions that help us understand the process of transcription, the objects involved in it, and its agents.

Surface

A physical manifestation of an object which contains marks.

Mark

An alteration on a surface performed by an agent, e.g. scratches, prints, etc. Marks are perceptible by an agent.

Reading

The process by which an agent attempts to discover and establish a type sequence of marks on a surface. A reading assigns token status to marks, and it can be entirely speculative. The reading agent must comprehend the concept of writing. A reading ends in one of three result states. A positive result occurs when the agent assigns token status to at least one mark with certainty greater than 0. A negative result occurs when the agent assigns token status to no mark with certainty greater than 0. A zero result occurs when the agent has no certainty either way as to whether a mark is or is not a token.
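The three result states can be sketched in code. This is a minimal illustration and not part of Paul's paper: the signed-certainty encoding, the name `classify_reading`, and the choice to reject an empty list of marks are all assumptions made here. In particular, the sketch assumes that a negative result requires some certainty that a mark is not a token, which is what separates it from a zero result.

```python
from enum import Enum
from typing import Sequence


class ReadingResult(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    ZERO = "zero"


def classify_reading(certainties: Sequence[float]) -> ReadingResult:
    """Classify a reading from one signed certainty per mark.

    Assumed encoding (hypothetical, not from the talk):
      > 0  the agent believes the mark is a token
      < 0  the agent believes the mark is not a token
      == 0 the agent has no certainty either way
    """
    if not certainties:
        # Assumption: a reading is only attempted where there are marks.
        raise ValueError("a reading needs at least one mark")
    if any(c > 0 for c in certainties):
        # At least one mark was given token status with certainty > 0.
        return ReadingResult.POSITIVE
    if all(c == 0 for c in certainties):
        # No certainty either way about any mark.
        return ReadingResult.ZERO
    # No mark was given token status, and at least one was judged a non-token.
    return ReadingResult.NEGATIVE
```

For example, `classify_reading([0.0, 0.3, -0.5])` is positive because a single mark with token status is enough, however uncertain the agent is about the rest.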

Token sequence

A token sequence must contain at least one token. A token sequence is neither right nor wrong; it simply exists. Transcription depends on the token sequence produced by the reading. A T token sequence is the result of a process of transcription.

Exemplar

An exemplar is a combination of a surface and marks on which an act of reading is attempted, and it is the basis for a transcription. The status of being an exemplar is relative: for example, if one person makes a transcription FOO of exemplar BAR, and another person then wants to make a transcription BAZ of FOO, then FOO has the status of exemplar with respect to BAZ.
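The relative nature of exemplar status can be shown with a small sketch. The class `Artefact` and the function `is_exemplar_of` are hypothetical names introduced for illustration only; the point is that "exemplar" is a relation between two things, not a property of one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Artefact:
    """A surface-mark combination (hypothetical model, not from the talk)."""
    name: str
    # The artefact this one was transcribed from, if any.
    exemplar: Optional["Artefact"] = None


def is_exemplar_of(candidate: Artefact, transcription: Artefact) -> bool:
    """True if `transcription` was made directly from `candidate`."""
    return transcription.exemplar is candidate


bar = Artefact("BAR")                # the original exemplar
foo = Artefact("FOO", exemplar=bar)  # a transcription of BAR
baz = Artefact("BAZ", exemplar=foo)  # a transcription of FOO
```

Here FOO is a transcription with respect to BAR and an exemplar with respect to BAZ, while BAR is not the exemplar of BAZ under this direct-source definition.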

Document

When a positive reading result occurs, a token sequence is identified, and that token sequence is recognised as a type, then, and only then, can a surface-mark combination be considered a document. An agent attempts a reading when it is believed that a surface-mark combination is a document, or at least that it might be one.

Paul observes that transcription must involve intention; this is what distinguishes it from reproduction or copying.

A document is not necessary for the act of transcription to occur; all that is required is the intention of recognising token sequences from marks.

References

Sperberg-McQueen, C. M., Claus Huitfeldt, and Allen Renear (2001). Meaning and interpretation of markup. Markup Languages: Theory & Practice 2.3: 215–234. On the Web at cmsmcq.com/2000/mim.html

Huitfeldt, Claus, and C. M. Sperberg-McQueen (2008). What is transcription? Literary & Linguistic Computing 23.3: 295–310.

Caton, Paul (2009). Lost in Transcription: Types, Tokens, and Modality in Document Representation. Paper given at Digital Humanities 2009, University of Maryland, College Park, June 2009.

Sperberg-McQueen, C. M., Claus Huitfeldt, and Yves Marcoux (2009). What is transcription? Part 2. Talk given at Digital Humanities 2009, College Park, Maryland. Slides on the Web at blackmesatech.com/2009/06/dh2009.

Huitfeldt, Claus, Yves Marcoux, and C. M. Sperberg-McQueen (2010). Extension of the type/token distinction to document structure. Paper presented at Balisage: The Markup Conference 2010, Montréal, Canada, August 3 – 6, 2010. In Proceedings of Balisage: The Markup Conference 2010. Balisage Series on Markup Technologies, vol. 5. doi:10.4242/BalisageVol5.Huitfeldt01. On the Web at www.balisage.net/Proceedings/vol5/html/Huitfeldt01/BalisageVol5-Huitfeldt01.html.

Caton, Paul (2012). On the Term ‘Text’ in Digital Humanities. Literary & Linguistic Computing 28.2: 209–220.

Caton, Paul (2013). Pure transcriptional encoding. Paper given at Digital Humanities 2013, Lincoln, Nebraska.

Sperberg-McQueen, C. M., Yves Marcoux, and Claus Huitfeldt (2014). Transcriptional implicature: a contribution to markup semantics. Paper to be given at Digital Humanities 2014, Lausanne, Switzerland.


  1. Transcription model based on work by Huitfeldt and Sperberg-McQueen (2008) and continued jointly with Marcoux (2009, 2010).