Transformative Works Discussion Group

The inaugural meeting of the Transformative Works Discussion Group will be at *5:30, November 6th* in the Department of Digital Humanities seminar room (second level of the KCL Drury Lane building). We hope that the discussion group will then continue on the first Thursday of each month.

In honour of Halloween, the meeting will include a presentation: “Non-humanity Never Looked So Good: Romance From the Long Tail to the Long Tentacle”.

Anyone (students, researchers, staff, etc.) with an interest in fandom and transformative works, whether text, image, film, audio or transmedia, or in the communities and technology that surround them, is invited to attend. We hope that this discussion group will give people working in this area a chance to exchange information and ideas in a friendly, interdisciplinary setting. To this end, we would also like to give people the opportunity to volunteer a short presentation of their own work or to nominate a paper for review and discussion. If you are interested in presenting or nominating a discussion topic then please email

Please circulate this information around your departments and to anyone else you think might be interested as these early meetings will be vital to gauging ongoing viability.

Digital Humanities Seminar

We invite all who are interested to join us for the Autumn Digital Humanities seminar at King’s College London. The seminars are on Tuesdays at 18:00, and are held in the Anatomy Museum (ATM) on the 6th floor of the King’s Building on the Strand campus (with exceptions clearly marked below).


Peter Stokes, Stewart Brookes, Giancarlo Buomprisco (KCL), Elaine
Treharne (Stanford), Donald Scragg (Manchester)
Digital Resource and Database for Palaeography, Manuscript Studies and Diplomatic (DigiPal) launch event (Room K2.29)

Weds 22-Oct-2014 *17:30 start*
Helma Dik (University of Chicago)
Philologia ex machina: Are we getting any closer? (Room K0.20)
*Note: this event is on a Wednesday at 17:30, and is a joint seminar with the Classics department*

Timo Honkela (National Library of Finland, Helsinki)
Text Mining for Digital Humanities (ATM)

Tobias Blanke (KCL) et al.
Book launch: Digital Asset Ecosystems: Rethinking crowds and clouds (ATM)

Gabriel Bodard (KCL), Daniel Pett (British Museum), Humphrey Southall (Portsmouth), Charlotte Tupman (KCL)
Round table: Linking ancient people, places, objects and texts (ATM)

DigiPal Project Launch and Party

Date: Tuesday 7th October 2014
Time: 5.45pm until the wine runs out
Venue: Council Room, King’s College London, Strand WC2R 2LS
Co-sponsor: Centre for Late Antique & Medieval studies, KCL
Register your place at 

After four years, the DigiPal project is finally coming to an end. To celebrate this, we are having a launch party at the Strand Campus of King’s on Tuesday, 7 October. The programme is as follows:

  • Welcome: Stewart Brookes and Peter Stokes
  • Giancarlo Buomprisco: “Shedding Some Light(box) on Medieval Manuscripts”
  • Elaine Treharne (via Skype)
  • Donald Scragg: “Beyond DigiPal”
  • Q & A with the DigiPal team

If you’re in the area then do register and come along for the talks and a free drink (or two) in celebration. Registration is free, but is required so that we can manage numbers and ensure that we have enough drink and nibbles to go around.

If you’re not already familiar with DigiPal, we have been developing new methods for the analysis of medieval handwriting. Regular readers of this blog will already have seen our poster for DH2014, but do visit the site if you haven’t yet seen it. There’s much more detail there about the project, including a post on the DigiPal project blog which summarises the website and its functionality. Quoting from that, you can:

 Do have a look at the site and let us know what you think. And – just as importantly – do come and have a drink on us on Tuesday!

DH2014: DigiPal Poster


The DigiPal project team presented this poster at DH2014 (for a full-size version, see the DigiPal website). I’m happy to say that it attracted a lot of positive interest. Discussions covered its current and (shortly) planned use on Latin, Hebrew and Greek alphabets, as well as on decoration, in manuscripts, inscriptions and coins; potential applications to Indic scripts (possible but challenging) and Cuneiform (certainly possible); and even a member of the Unicode Consortium expressed interest in our model for handwriting. There was also a lot of interest from people working in Computer Vision, especially now that our RESTful API allows anyone to harvest images of letters, complete with their annotations, for use as training material for machine learning. I look forward to seeing these uses after the project ends in six weeks or so, and my personal thanks go to the whole DigiPal team for all the outstanding work they have done to make it possible.
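To give a sense of what harvesting via a RESTful API looks like, here is a minimal sketch in Python. The base URL, query parameter names and response field names below are invented for illustration; consult the API documentation on the DigiPal website for the actual endpoints and schema.

```python
from urllib.parse import urlencode

# Hypothetical base URL -- check the DigiPal site for the real endpoint.
BASE_URL = "http://www.digipal.eu/digipal/api/graph/"

def build_query_url(base_url, **filters):
    """Build a query URL for harvesting annotated letter images."""
    return base_url + "?" + urlencode(sorted(filters.items()))

def extract_annotations(payload):
    """Pull (image URL, letter label) pairs out of a decoded JSON response.

    The 'results', 'image' and 'allograph' field names are illustrative;
    the real response schema may differ.
    """
    return [(r["image"], r["allograph"]) for r in payload.get("results", [])]

# Fetching would then look something like (requires network access):
#   import json
#   from urllib.request import urlopen
#   payload = json.load(urlopen(build_query_url(BASE_URL, allograph="a")))
#   training_data = extract_annotations(payload)

# A stand-in for a decoded response, to show the expected shape:
sample = {"results": [{"image": "http://example.org/letter1.png",
                       "allograph": "a"}]}
```

The point is simply that each harvested record pairs an image of a letter with its scholarly annotation, which is exactly the (input, label) shape that supervised machine-learning pipelines expect.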

DH2014: ontology for 3D visualisation poster


The poster was displayed at the first poster session of DH2014 on July 10th. It gave me the chance to meet many scholars interested in my research and to discuss the project with them. I received very good feedback, and started useful dialogues that might turn into future collaborations with the University of Mainz, UCLA and EPFL.

This project was presented as a paper at the last CAA conference in Paris.


DH2014: SNAP:DRGN poster

Standards for Networking Ancient Prosopographies: Data and Relations in Greco-Roman Names

(This poster was also presented in the Ontologies for Prosopography pre-conference workshop on Tuesday July 8.)

In the poster session on Thursday July 10, the poster was up for two hours and was photographed several times, and Sebastian Rahtz and I chatted with many of the people who came over and expressed an interest in it. At least two, possibly three, of these people will turn out to be new project partners whom we wouldn’t otherwise have known about, so I call this a win!

If you want to see the poster in full-size, you can find it on the wall in Drury Lane, in the corridor opposite room 220.

PHEME: Computing Veracity in Social Media

(Guest post from Dr Anna Kolliakou, who gave a guest seminar in DDH a few weeks ago. Anna and Robert would be very interested in collaborating with anyone in DH who has interests in their project.)

Computing Veracity in Social Media

From a business and government point of view there is an increasing need to interpret and act upon information from large-volume media, such as Twitter, Facebook, and newswire. However, knowledge gathered from online sources and social media comes with a major caveat: it cannot always be trusted. PHEME will investigate models and algorithms for automatic extraction and verification of four kinds of rumours (uncertain information or speculation, disputed information or controversy, misinformation and disinformation) and their textual expressions.
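The four rumour types can be roughly distinguished along two dimensions: whether the claim is known to be false, and whether any falsehood is deliberate. The sketch below is purely illustrative of that taxonomy; it is not PHEME’s actual classification logic, and the decision rules are my own simplification.

```python
from enum import Enum

class RumourType(Enum):
    SPECULATION = "uncertain information or speculation"
    CONTROVERSY = "disputed information or controversy"
    MISINFORMATION = "misinformation"    # false, spread without intent to deceive
    DISINFORMATION = "disinformation"    # false, spread with intent to deceive

def classify_rumour(known_false: bool, disputed: bool, deliberate: bool) -> RumourType:
    """Toy rule-of-thumb mapping, for illustration only:
    falsehood + intent -> disinformation; falsehood alone -> misinformation;
    open dispute -> controversy; mere uncertainty -> speculation."""
    if known_false:
        return RumourType.DISINFORMATION if deliberate else RumourType.MISINFORMATION
    if disputed:
        return RumourType.CONTROVERSY
    return RumourType.SPECULATION
```

In practice, of course, the hard part is precisely that none of these three inputs is directly observable in a tweet: estimating them from text is the research problem PHEME addresses.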

Veracity intelligence is an inherently multi-disciplinary problem, which can only be addressed successfully by bringing together currently disjoint research on language technologies, web science, social network analysis, and information visualisation. We are therefore seeking to develop cross-disciplinary social semantic methods for veracity intelligence, drawing on the strengths of these four disciplines. The Department of Digital Humanities, an international leader in the application of technology in the social sciences, was the appropriate platform for researchers from the SLAM Biomedical Research Centre at KCL, one of PHEME’s partners, to present their proposed work on veracity intelligence for mental healthcare, with the aim of developing collaborations with academics interested in social media analysis, NLP and text mining. For more information…

Seminar: June 2, 2014: Robert Stewart and Anna Kolliakou

Social media poses three major computational challenges, dubbed by Gartner the 3Vs of big data: volume, velocity, and variety. PHEME will focus on a fourth crucial, but hitherto largely unstudied, big data challenge: veracity. The relationship between clinicians and their patients has already been changed by the internet in three waves. First, the provision of pharmaceutical data, diagnostic information and advice from drug companies and health care providers created a new source for self-directed diagnosis. Secondly, co-creation sites like Wikipedia and patient support forums (e.g. PatientsLikeMe) have more recently added a discursive element to the didactic material of the first wave. Thirdly, the social media revolution has acted as an accelerant and magnifier to the second wave.

Prof Robert Stewart and Dr Anna Kolliakou, from the SLAM Biomedical Research Centre at King’s College London, have started the process of re-tooling medical information systems to compete with this new context. This will facilitate practical applications in the healthcare domain, to enable clinicians, public health professionals and health policy makers to analyse high-volume, high-variety, and high-velocity internet content for emerging medically-related patterns, rumours, and other health-related issues. This analysis may in turn be used (i) to develop educational materials for patients and the public, by addressing concerns and misconceptions and (ii) to link to analysis of the electronic health records.

In this seminar, they will be discussing the development of four main demonstration studies that aim to:

  1. Identify social media preferences and dislikes about certain medication and treatment options and how these present in clinical records
  2. Monitor the emergence of novel psychoactive substances in social media and identify if and how promptly they appear in clinical records
  3. Explore how mental health stigma arises in social media and presents in clinical records
  4. Ascertain the type of influence social media might have on young people at risk of self-harm or suicide

Paul Caton: Six terms fundamental to modelling transcription

Department of Digital Humanities lunchtime seminar. This is a preview of a paper Paul will deliver at the Digital Humanities conference in Lausanne later this month.

Paul uses the HSM model[1] to understand transcription. He defines a series of abstractions which help us understand the process of transcription, the objects involved in that process, and its agents.


Surface

A physical manifestation of an object which contains marks.


Mark

An alteration on a surface performed by an agent, e.g. scratches, prints, etc. Marks are perceptible by an agent.


Reading

The process by which an agent attempts to discover and establish a type sequence from the marks on a surface. Readings can be entirely speculative. Reading assigns token status to marks, and the reading agent must comprehend the concept of writing. A positive result state occurs when the agent assigns token status to at least one mark with certainty greater than zero. A negative result state occurs when no marks are assigned token status with certainty greater than zero. A zero result state occurs when the agent has no certainty either way as to whether a mark is or is not a token.

Token sequence

A token sequence must contain at least one token. A token sequence is not right or wrong; it simply exists. Transcription is dependent on the token sequence produced by the reading: a transcription’s token sequence is the result of a process of transcription.


Exemplar

An exemplar is a combination of a surface and marks on which an act of reading is attempted, and is the basis for a transcription. The status of being an exemplar is relative: if one person makes a transcription FOO of exemplar BAR, and another person then makes a transcription BAZ of FOO, then FOO has the status of exemplar with respect to BAZ.


Document

Only when a positive reading result occurs, a token sequence is identified, and the token sequence is recognised as a type can a surface-mark combination be considered a document. An agent attempts a reading when it believes that a surface-mark combination is a document, or at least that there is a possibility it might be one.

Paul observes that the process of transcription must involve intention to be different from reproduction or copying.

A document is not necessary for the act of transcription to occur; only the intention of recognising token sequences from marks is required.
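The abstractions above can be sketched as a small data model. This is my own illustrative rendering of the three reading result states, not Paul’s formalism; the class and attribute names are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class ReadingResult(Enum):
    POSITIVE = "some mark assigned token status with certainty > 0"
    NEGATIVE = "no mark assigned token status with certainty > 0"
    ZERO = "no certainty either way"

@dataclass
class Mark:
    """An alteration on a surface; the agent may or may not take it for a token."""
    description: str
    token_certainty: Optional[float] = None  # None = the agent has no view either way

@dataclass
class Surface:
    """A surface-plus-marks combination, read as an exemplar."""
    marks: List[Mark]

def read(surface: Surface) -> ReadingResult:
    """An agent's reading of a surface, yielding one of the model's result states."""
    certainties = [m.token_certainty for m in surface.marks]
    if any(c is not None and c > 0 for c in certainties):
        return ReadingResult.POSITIVE
    if certainties and all(c is None for c in certainties):
        return ReadingResult.ZERO
    return ReadingResult.NEGATIVE
```

For example, a surface bearing one clearly legible stroke yields a positive reading, a surface with a mark the agent is certain is accidental yields a negative one, and a mark about which the agent has no view at all leaves the reading in the zero state.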


Sperberg-McQueen, C. M., Claus Huitfeldt, and Allen Renear (2001). Meaning and interpretation of markup. Markup Languages: Theory & Practice 2.3: 215–234. On the Web at

Huitfeldt, Claus, and C. M. Sperberg-McQueen (2008). What is transcription? Literary & Linguistic Computing 23.3: 295–310.

Caton, Paul (2009). Lost in Transcription: Types, Tokens, and Modality in Document Representation. Paper given at Digital Humanities 2009, University of Maryland, College Park, June 2009.

Sperberg-McQueen, C. M., Claus Huitfeldt, and Yves Marcoux (2009). What is transcription? Part 2. Talk given at Digital Humanities, College Park, Maryland. Slides on the Web at

Huitfeldt, Claus, Yves Marcoux, and C. M. Sperberg-McQueen (2010). Extension of the type/token distinction to document structure. Paper presented at Balisage: The Markup Conference 2010, Montréal, Canada, August 3 – 6, 2010. In Proceedings of Balisage: The Markup Conference 2010. Balisage Series on Markup Technologies, vol. 5. doi:10.4242/BalisageVol5.Huitfeldt01. On the Web at

Caton, Paul (2012). On the Term ‘Text’ in Digital Humanities. Literary & Linguistic Computing. 28.2: 209–220.

Caton, Paul (2013). Pure transcriptional encoding. Paper given at Digital Humanities 2013, Lincoln, Nebraska.

Sperberg-McQueen, C. M., Yves Marcoux, and Claus Huitfeldt (2014).  Transcriptional implicature: a contribution to markup semantics. Paper to be given at Digital Humanities 2014, Lausanne, Switzerland.

  1. Transcription model based on work by Huitfeldt and Sperberg-McQueen (2008) and continued jointly with Marcoux (2009, 2010).  ↩

SNAP:DRGN consultation workshop

Last week we held the first workshop of the SNAP:DRGN (Standards for Networking Ancient Prosopographies: Data and Relations in Greco-Roman Names) project, here at King’s College London.

As announced in our press release, the SNAP:DRGN project aims to recommend standards for linking together basic identities and information about the entities in various person-databases relating to the ancient world, with a view to facilitating the production of a federated network including millions of ancient person-records, compatible with the Linked Ancient World Data graph. At this workshop (see Workshop slides and recap) we presented our preliminary proposals, data models and ontology for feedback to a representative group of scholars from both the classical prosopography/onomastics and Linked Open Data communities. We also spoke to several people with large datasets that might be contributed to the SNAP graph.

It was decided that SNAP:DRGN will attempt to address recommendations to five key use-cases of networked prosopographical data:

  1. Putting prosopographical data online, including stable URIs and openly-accessible data and metadata in standard formats (not defined by us).
  2. Contributing a summary of said data, including identifiers for all persons and a simplified subset of core identifying information about each entity, to the SNAP graph so that it can be built upon and referred to by other projects.
  3. Annotating SNAP entities to establish alignment and identify co-references between related datasets.
  4. Marking up online documents to link the personal names within them to persons identified in the SNAP graph and its constituent databases.
  5. Adding relationships between persons, both within and between databases: person X is the daughter of person Y; person A in one database was killed in battle by person B in another database.
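To make the co-reference and relationship use-cases (3 and 5) concrete, here is a toy linked-data sketch. All the URIs and the relationship predicate below are invented for illustration and are not SNAP:DRGN’s actual vocabulary; the graph is represented as bare triples rather than with an RDF library.

```python
# Person records in two hypothetical databases, aligned via a SNAP-style
# identifier. Every URI and predicate name here is made up for the example.
SAME_AS = "owl:sameAs"
DAUGHTER_OF = "ex:daughterOf"  # hypothetical cross-database relationship

triples = {
    # Use-case 3: two database-local records annotated as co-referent
    # with the same SNAP entity.
    ("db1:person/42", SAME_AS, "snap:person/1001"),
    ("db2:person/7",  SAME_AS, "snap:person/1001"),
    # Use-case 5: a relationship asserted between SNAP entities,
    # spanning the contributing databases.
    ("snap:person/1001", DAUGHTER_OF, "snap:person/1002"),
}

def coreferences(graph, snap_id):
    """All database-local records aligned with one SNAP entity."""
    return sorted(s for s, p, o in graph if p == SAME_AS and o == snap_id)
```

Once such alignments exist, a query against the shared graph can gather everything the constituent databases know about a single ancient person, which is the payoff of the federated approach.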

The SNAP:DRGN project will continue to work on the “Cookbook”, the summary of recommendations and examples for these five use-cases, over the coming months, in the run-up to adding several new datasets to the graph. We are also experimenting in a modest way with tools and implementations for working with the vast graph of ancient persons thus created: named entity recognition (NER) workflows for finding new personal names in texts; co-reference resolution for finding overlap and links between datasets; and search and browse tools and APIs. This work will be reported on the SNAP:DRGN blog, and in conferences and seminars throughout the year.