Category Archives: events

Seminar: the National Archives Online


Last Thursday (8th March), as part of the New Directions in the DH seminar series, we hosted a very interesting talk by Emma Bayne and Ruth Roberts about the most recent digital developments in the National Archives’ online presence.

Discovery and the Digital Challenge

Emma Bayne and Ruth Roberts talked about the changes to the National Archives’ online services, including the development of a new service, Discovery, which is based on a new architecture and allows improved access to the National Archives Catalogue and to digitised material. Features include a new taxonomy-based search and an API that allows bulk downloads of data.
They also discussed some of the challenges facing the National Archives in delivering large quantities of digital images of records online – moving from a gigabyte scale to a petabyte scale in a short period of time.
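For readers curious about what programmatic, bulk access might look like in practice, here is a minimal Python sketch of paging through a catalogue search API and collecting the results. The endpoint, parameter names and response fields below are assumptions made for illustration, not the documented Discovery interface, so please check the National Archives’ own API documentation before relying on them.

```python
# Minimal sketch of harvesting catalogue records page by page.
# NOTE: the endpoint, query parameters and response fields are illustrative
# assumptions, not the documented Discovery API.
import json
import urllib.parse
import urllib.request

BASE_URL = "https://discovery.nationalarchives.gov.uk/API/search/records"  # assumed

def fetch_page(query, page):
    params = urllib.parse.urlencode({"sps.searchQuery": query, "sps.page": page})
    with urllib.request.urlopen(f"{BASE_URL}?{params}") as response:
        return json.load(response)

def harvest(query, pages=5):
    records = []
    for page in range(1, pages + 1):
        records.extend(fetch_page(query, page).get("records", []))
    return records

if __name__ == "__main__":
    results = harvest("Domesday")
    print(f"Downloaded {len(results)} catalogue records")
```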

Recording of the seminar:

If you’re interested, you can listen to the seminar again (~1 hr) by clicking here.

Relevant Links:

National Archives main site
National Archives Labs
Discovery Service

An Electric Current of the Imagination

I’m very glad to be joining the Department of Digital Humanities and looking forward to contributing to this blog as and when I can. I’ll use this blog for posts relating closely to the Department of Digital Humanities; you’ll find more wide-ranging blogging on digitalriffs.blogspot.com. It was a great privilege to have the opportunity last night to give an inaugural lecture right at the beginning of my term of office as Head of Department, and to try to map out some of the issues I think we need to address both as a Department and as a discipline. I promised to make the text of my lecture available via this blog, so here it is:

‘An Electric Current of the Imagination’: What the Digital Humanities Are and What They Might Become

Lecture by Andrew Prescott, King’s College London, 25 January 2012

[Slide: Birdsong Compliance: http://itchaway.net/poetry_content/birdsong-compliance/Scene_1.html]

It is a great honour for me to become head of this academic department devoted to the study of the digital humanities. When I first saw experiments in the digital imaging of books and manuscripts in the British Library twenty years ago, it was impossible to imagine that they would develop into an intellectual activity on a scale warranting an academic department. The fact that King’s College London has led the way in this process is due to the work of many pioneers, and I cannot start this lecture without acknowledging their achievements and saying what a pleasure it is to join them now as a colleague. Above all, it is essential to honour the contribution of Professor Harold Short who is without doubt the father of the Department of Digital Humanities at King’s College London. Harold has been an outstanding international pioneer of the digital humanities, and I feel honoured and humbled to follow in his footsteps.

Decoding Digital Humanities London – 2012

Decoding Digital Humanities London (DDHL) is a series of informal monthly meetings for anyone interested in research at the intersection of computational technologies and the humanities. These gatherings provide an opportunity to discuss readings and raise questions, but also to mingle and share ideas with others in the field of digital humanities.

The series was founded at University College London and now aims to involve a larger number of institutions across London. PhD and MA students and staff at UCL, King’s College London and Goldsmiths, University of London are amongst this year’s organizers.

The first meeting will be on January 31st at 6.30pm at The Plough (upstairs), 27 Museum Street, WC1A 1LH. We will discuss the Digital Humanities Manifesto: http://tcp.hypotheses.org/411.

No registration is needed but an email would be appreciated. Please write to decodingdh@ucl.ac.uk.

DDH Internal Research Seminar: Tablet apps, or the future of Digital Scholarly Editions

At yesterday’s (23 November) Internal Research Seminar, Elena Pierazzo and Miguel Vieira presented Tablet apps, or the future of Digital Scholarly Editions, a preview of the paper that they will give tomorrow at the study-day “The Future of the Book“.

The paper discussed the opportunities that tablet devices could offer for the digital publication of scholarly editions. This work stemmed from the MA dissertation of Patricia Searl, who completed her Digital Humanities MA at DDH last year.

The main issue arises from the apparent lack of use of digital scholarly editions published on the web. The speakers found it particularly worrying that these editions are never part of undergraduate syllabi, even though they usually offer high-quality scholarly texts with free, open access.

Tablet devices are user-friendly and portable, and they create a stronger sense of ownership than websites do. This makes for an experience closer to reading a book, but would the same be true for digital scholarly editions? Would it work for editions that need sophisticated ways of presenting historical evidence and editorial work? The presenters believe that the eBook model would probably not be sufficient, but the “App” paradigm might.

 

Enhanced eBooks already exploit this idea by introducing a highly interactive, almost ludic component to the digital edition. Nonetheless, none of these apps has so far been connected to scholarly work. The speakers noted, for example, that it is impossible to find out who edited the enhanced eBook of T.S. Eliot’s The Waste Land.

Finally, the paper also discussed issues that will be familiar to any smartphone or tablet user, such as cross-device compatibility, keeping up to date with new operating systems, and heavily controlled app “markets”. These issues limit an edition’s actual reach, but above all they complicate development quite substantially (even more than, for example, dealing with cross-browser issues).

The paper was followed by a lively discussion. There was general agreement that scholarly editing should get involved in tablet computing; the best way of doing so, however, is yet to be fully understood and provides fertile ground for an exciting new research area.

The DDH Internal Research Seminar series aims to give DDH staff a space to present and discuss their research in an informal environment.

Tagore digital editions and Bengali textual computing

Professor Sukanta Chaudhuri yesterday gave a very interesting talk on the scope, methods and aims of ‘Bichitra’ (literally, ‘the various’), the ongoing project for an online variorum edition of the complete works of Rabindranath Tagore in English and Bengali. The talk (part of this year’s DDH research seminar series) highlighted a number of issues I personally wasn’t very familiar with, so in this post I summarise them briefly and then offer a couple of suggestions.

Sukanta Chaudhuri is Professor Emeritus at Jadavpur University, Kolkata (Calcutta), where he was formerly Professor of English and Director of the School of Cultural Texts and Records. His core specializations are in Renaissance literature and in textual studies: he published The Metaphysics of Text from Cambridge University Press in 2010. He has also translated widely from Bengali into English, and is General Editor of the Oxford Tagore Translations.

Rabindranath Tagore (1861 – 1941), the first Nobel laureate from Asia, was arguably the most important icon of the modern Indian Renaissance. This recent project on the electronic collation of Tagore’s texts, called ‘the Bichitra project’, is being developed as part of the national commemoration of the poet’s 150th birth anniversary (here’s the official page). This is how the School of Cultural Texts and Records summarizes the project’s scope:

The School is carrying out pioneer work in computer collation of Tagore texts and creation of electronic hypertexts incorporating all variant readings. The first software for this purpose in any Indian language, named “Pathantar” (based on the earlier version “Tafat”), has been developed by the School. Two pilot projects have been carried out using this software, for the play Bisarjan (Sacrifice) and the poetical collection Sonar Tari (The Golden Boat). The CD/DVDs contain all text files of all significant variant versions in manuscript and print, and their collation using the “Pathantar” software. The DVD of Sonar Tari also contains image files of all the variant versions. These productions are the first output of the series “Jadavpur Electronic Tagore”.
Progressing from these early endeavours, we have now undertaken a two-year project entitled “Bichitra” for a complete electronic variorum edition of all Tagore’s works in English and Bengali. The project is funded by the Ministry of Culture, Government of India, and is being conducted in collaboration with Rabindra-Bhavana, Santiniketan. The target is to create a website which will contain (a) images of all significant variant versions, in manuscript and print, of all Tagore’s works; (b) text files of the same; and (c) collation of all versions applying the “Pathantar” software. To this end, the software itself is being radically redesigned. Simultaneously, manuscript and print material is being obtained and processed from Rabindra-Bhavana, downloaded from various online databases, and acquired from other sources. Work on the project commenced in March 2011 and is expected to end in March 2013, by which time the entire output will be uploaded onto a freely accessible website.

 

A few interesting points

 

  • Tagore, as Sukanta noted, “wrote voluminously and revised extensively”. From a DH point of view this means that creating a comprehensive digital edition of his works would require a huge effort – far more than we could reasonably pay people for, if we wanted to mark up all of this text manually. For this reason it is essential to find semi-automatic methods for aligning and collating Tagore’s texts, such as the “Pathantar” software. A screenshot of the current collation interface follows.

    Tagore digital editions

  • The Bengali language, in which Tagore wrote, is very widely spoken (it is in fact one of the most spoken languages in the world, with nearly 300 million total speakers). However, it poses serious problems for a DH project. In particular, the writing system is extremely difficult to parse using traditional OCR technologies: its vowel graphemes are mainly realized not as independent letters but as diacritics attached to consonant letters. Furthermore, clusters of consonants are represented by different and sometimes quite irregular forms, so learning to read is complicated by the sheer size of the full set of letters and letter combinations, numbering about 350 (from Wikipedia). A short illustration of this follows the list below.
  • One of the critical points that emerged during the discussion had to do with the visual presentation of the results of the collation software. Given the large volume of text editions being dealt with, and the potentially vast amount of variation between one edition and the others, a powerful and interactive visualization mechanism seems to be strongly needed. However, it is not yet clear what the possible approaches on this front might be.
  • Textual computing, Sukanta pointed out, is not as developed in India as it is in the rest of the world. As a consequence, within the “Bichitra” project the widely used approaches based on TEI and XML technologies haven’t really been investigated enough. The collation software mentioned above obviously marks up the text in some way; however, this markup remains hidden from the user and is most likely not compatible with other standards. More work would thus be desirable in this area – in particular within the Indian subcontinent.
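To make the script problem a little more concrete, here is a small Python illustration of my own (not part of the Bichitra project): a single Bengali word containing consonant conjuncts decomposes into several Unicode code points, including combining vowel signs and the virama that glues consonants together, which is part of what makes naive character-level OCR and string processing so hard.

```python
# Illustration: one short Bengali word is stored as several Unicode code points,
# including combining vowel signs and the virama used to form consonant clusters.
import unicodedata

word = "ক্ষেত্র"  # a word containing two consonant conjuncts
for ch in word:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

The printout interleaves letters, vowel signs and viramas, even though a reader perceives far fewer visual units on the page.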
Food for thought

  • On the visualization of the results of a collation. Some inspiration could be found in the visualizations normally used in version-control systems, where multiple, alternative versions of the same file must be tracked and shown to users. For example, we could think of the visualizations available on GitHub (a popular code-sharing site), which are described in this blog post and demonstrated via an interactive tool on this webpage. Here’s a screenshot:

    GitHub code visualization

    The situation is strikingly similar – or is it? Would it be feasible to reuse one of these approaches with textual sources?
    Another relevant visualization is the one used by popular file-comparison tools (e.g. FileMerge on a Mac) for showing differences between two files; a toy sketch of this approach, applied to text, follows this list:

    File Merge code visualization

  • On using language technologies with Bengali. I did a quick tour of what’s available online and (quite unsurprisingly, given the strong reputation of Indian computer scientists) found several research papers that seem highly relevant. Here are a few of them:
    – Asian language processing: current state-of-the-art [text]
    – Research report on Bengali NLP engine for TTS [text]
    – The EMILLE corpus, containing fourteen monolingual corpora, including both written and (for some languages) spoken data for fourteen South Asian languages [homepage]
    – A complete OCR system for continuous Bengali characters [text]
    – Parsing Bengali for Database Interface [text]
    – Unsupervised Morphological Parsing of Bengali [text]
  • On open-source software that appears to be usable with Bengali text. There is not a lot out there, but more than enough to get started (the second project in particular seems quite serious):
    – Open Bangla OCR: a BDOSDN (Bangladesh Open Source Development Network) project to develop a Bangla OCR
    – Bangla OCR: a project mainly focused on the research and development of an optical character recognizer for Bangla / Bengali script
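As a very rough illustration of the file-comparison idea mentioned above (applied to two versions of a text rather than two source files), here is a short Python sketch using the standard library’s difflib module. The variant lines are invented for the example; this is only a toy, not a substitute for a dedicated collation tool such as Pathantar.

```python
# Toy example: compare two (invented) variant versions of a short passage,
# line by line, in the spirit of file-comparison tools such as FileMerge.
import difflib

version_a = [
    "The boat drifts slowly down the golden river,",
    "carrying its cargo of ripened grain.",
]
version_b = [
    "The boat drifts slowly along the golden river,",
    "carrying its cargo of ripe golden grain.",
]

for line in difflib.unified_diff(version_a, version_b,
                                 fromfile="edition_1", tofile="edition_2",
                                 lineterm=""):
    print(line)
```

Each differing line appears twice, prefixed with ‘-’ (first version) and ‘+’ (second version), which is essentially the information a side-by-side collation view would need to render.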
Any comments and/or ideas?

     

    Event: THATcamp Kansas and Digital Humanities Forum

The THATcamp Kansas and Digital Humanities Forum took place last week at the Institute for Digital Research in the Humanities, part of the University of Kansas in beautiful Lawrence. I had the opportunity to be there and give a talk about some recent work of mine on digital prosopography and computer ontologies, so in this blog post I’m summarising the things that caught my attention while at the conference.

The event took place on September 22-24 and consisted of three parts:

  • Bootcamp Workshops: a set of in-depth workshops on digital tools and other DH topics http://kansas2011.thatcamp.org/bootcamps/.
  • THATCamp: an “unconference” for technologists and humanists http://kansas2011.thatcamp.org/.
  • Representing Knowledge in the DH conference: a one-day program of panels and poster sessions (schedule | abstracts )
The workshop and THATcamp were both packed with interesting stuff, so I strongly suggest you take a look at the online documentation, which is very comprehensive. In what follows I’ll instead highlight some of the contributed papers which a) I liked and b) I was able to attend (needless to say, this list reflects only my individual preferences and interests). I hope you’ll find something of interest there too!

    A (quite subjective) list of interesting papers

     

  • The Graphic Visualization of XML Documents, by David Birnbaum (abstract): a quite inspiring example of how to employ visualizations to support philological research in the humanities. Mostly focused on Russian texts and XML-oriented technologies, but its principles are easily generalizable to other contexts and technologies.
  • Exploring Issues at the Intersection of Humanities and Computing with LADL, by Gregory Aist (abstract): the talk presented LADL, the Learning Activity Description Language, a fascinating software environment that provides a way to “describe both the information structure and the interaction structure of an interactive experience”, with the purpose of “constructing a single interactive Web page that allows for viewing and comparing of multiple source documents together with online tools”.
  • Making the most of free, unrestricted texts–a first look at the promise of the Text Creation Partnership, by Rebecca Welzenbach ( abstract ): an interesting report on the pros and cons of making available a large repository of SGML/XML encoded texts from the Eighteenth Century Collections Online (ECCO) corpus.
  • The hermeneutics of data representation, by Michael Sperberg-McQueen ( abstract ): a speculative and challenging investigation of the assumptions at the root of any machine-readable representation of knowledge – and their cultural implications.
  • Breaking the Historian’s Code: Finding Patterns of Historical Representation, by Ryan Shaw (abstract): an investigation into the use of natural language processing techniques to ‘break down’ the ‘code’ of historical narrative. In particular, the document sets used relate to the civil rights movement, and the specific NLP techniques employed are named entity recognition, event extraction, and event chain mining.
  • Employing Geospatial Genealogy to Reveal Residential and Kinship Patterns in a Pre-Holocaust Ukrainian Village, by Stephen Egbert (abstract): this paper showed how residential and kinship patterns in the mixed-ethnic settlements of pre-Holocaust Eastern Europe can be visualized using geographic information systems (GIS), and how the results can provide useful material for humanists to base their work on.
  • Prosopography and Computer Ontologies: towards a formal representation of the ‘factoid’ model by means of CIDOC-CRM, by me and John Bradley (abstract): this is the paper I presented (shameless self-plug, I know). It’s about the evolution of structured prosopography (i.e. the ‘study of people’ in history) from a mostly single-application, database-oriented scenario towards a more interoperable, linked-data one. In particular, I talked about recent efforts to represent the notion of ‘factoids’ (a conceptual model normally used in our prosopographies) using the ontological language provided by CIDOC-CRM (a computational ontology commonly used in the museum community). A toy sketch of the general idea follows this list.
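Since the factoid idea may be unfamiliar, here is a toy Python (rdflib) sketch of the general pattern: a source documents an assertion about a person, modelled as a CIDOC-CRM attribute assignment. It is purely my illustration of the idea, not the mapping presented in the paper, and all identifiers are invented.

```python
# Toy sketch of a prosopographical 'factoid': source S asserts something about
# person P, modelled as a CIDOC-CRM E13 Attribute Assignment documented by S.
# Illustration only; not the mapping presented in the paper.
from rdflib import Graph, Literal, Namespace, RDF

CRM = Namespace("http://www.cidoc-crm.org/cidoc-crm/")  # commonly used CRM namespace
EX = Namespace("http://example.org/prosopography/")     # invented data namespace

g = Graph()
g.bind("crm", CRM)

person = EX["person/1"]
source = EX["source/charter-42"]
factoid = EX["factoid/1"]

g.add((person, RDF.type, CRM.E21_Person))
g.add((source, RDF.type, CRM.E31_Document))
g.add((factoid, RDF.type, CRM.E13_Attribute_Assignment))
g.add((factoid, CRM.P140_assigned_attribute_to, person))   # who the assertion is about
g.add((factoid, CRM.P141_assigned, Literal("ealdorman")))  # what is asserted (simplified)
g.add((source, CRM.P70_documents, factoid))                # where the assertion is made

print(g.serialize(format="turtle"))
```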

    That’s all! Many thanks to Arienne Dwyer and Brian Rosenblum for organizing the event!

     

    P.S.
    A copy of this article has been posted here too.

    DH at Reading

This Friday, the University of Reading discusses Digital Humanities at the one-day event DH at Reading.
High-profile Digital Humanities speakers will introduce fundamental research topics in the field, followed by a round table led by PhD students whose work has a strong DH component.

Elena Pierazzo, from the Department of Digital Humanities at King’s College London, will introduce Medieval and Modern Manuscripts in the Digital Age, while the round table will be joined by the department’s PhD students Øyvind Eide, Tom Salyers and Raffaele Viglianti.

We’re excited to talk DH at Reading and to contribute to spreading the discipline to institutions interested in engaging more closely with the field.

    Hack4Europe! – Europeana hackathon roadshow, June 2011

Europeana is a multilingual digital collection containing more than 15 million resources that lets you explore Europe’s history from ancient times to the modern day. The Europeana API services are web services that allow you to search and display Europeana collections in your own websites and applications. The folks at Europeana have been actively promoting experimentation with their APIs by organizing ‘hackathons’ – workshops for cultural informatics hackers where new ideas are discussed and implemented.
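To give a flavour of what building on the API involves, here is a minimal Python sketch of a search call. The endpoint, parameter names and response fields follow the general shape of a REST search API and are written here purely as an illustration; the service requires a registered API key, and the exact URL and fields should be taken from the official Europeana documentation.

```python
# Minimal sketch: search Europeana and print the titles of the first few hits.
# The endpoint, parameters and response fields are illustrative assumptions;
# consult the official API documentation and use your own registered key.
import json
import urllib.parse
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder for a registered Europeana key
BASE_URL = "https://api.europeana.eu/record/v2/search.json"  # assumed endpoint

def search(query, rows=5):
    params = urllib.parse.urlencode({"wskey": API_KEY, "query": query, "rows": rows})
    with urllib.request.urlopen(f"{BASE_URL}?{params}") as response:
        return json.load(response)

if __name__ == "__main__":
    for item in search("Mona Lisa").get("items", []):
        print(item.get("title"))
```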

    Some examples of the outputs of the previous hackathon can be found here. Hack4Europe is the most recent of these dissemination activities:

    Hack4Europe! is a series of hack days organised by the Europeana Foundation and its partners Collections Trust, Museu Picasso, Poznan Supercomputing and Networking Center and Swedish National Heritage Board. The hackathon roadshow will be held simultaneously in 4 locations (London, Barcelona, Poznan and Stockholm) in the week 6 – 12 June and will provide an exciting environment to explore the potential of open cultural data for social and economic growth in Europe.

    Each hackathon will bring together up to 30 developers from the hosting country and the surrounding area. They will have access to the diverse and rich Europeana collections containing over 18 million records, Europeana Search API (incl. a test key and technical documentation) and Europeana Linked Open Data Pilot datasets which currently comprise about 3 million Europeana records available under a CC0 license.

    There are four hackathons coming up, so if you’re interested make sure you sign up quickly:

    Hack4Europe! UK

    9 June 2011, London, hosted by Collections Trust

    Hack4Europe! Spain

    8 – 9 June 2011, Barcelona, hosted by Museu Picasso

    Hack4Europe! Poland

    7 – 8 June 2011, Poznan, hosted by Poznan Supercomputing and Networking Center and Kórnik Library of the Polish Academy of Sciences

    Hack4Europe! Sweden

    10 – 11 June 2011, Stockholm, hosted by Swedish National Heritage Board

    Pelagios Linking Open Data Workshop

    Linking Open Data: the Pelagios Ontology Workshop

    (A reminder that all KCL colleagues are welcome to attend this event, but registration is essential. To register please visit http://pelagios.eventbrite.com.)

    The Pelagios workshop is an open forum for discussing the issues associated with and the infrastructure required for developing methods of linking open data (LOD), specifically geodata. There will be a specific emphasis on places in the ancient world, but the practices discussed should be equally applicable to contemporary named locations. The Pelagios project will also make available a proposal for a lightweight methodology prior to the event in order to focus discussion and elicit critique.
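To give a flavour of the kind of lightweight linking being discussed, here is a small, purely hypothetical Python (rdflib) sketch that marks an online text as referring to a place identified by a Pleiades URI, using an Open Annotation-style pattern. It is my own illustration of the general linked-geodata idea, not the methodology that the Pelagios project will propose; all identifiers are placeholders.

```python
# Hypothetical sketch: assert that an online text refers to an ancient place,
# using an Open Annotation-style pattern and a Pleiades place URI.
# All URIs below are placeholders; this is not the Pelagios specification.
from rdflib import Graph, Namespace, RDF, URIRef

OA = Namespace("http://www.w3.org/ns/oa#")

g = Graph()
g.bind("oa", OA)

annotation = URIRef("http://example.org/annotations/1")        # hypothetical annotation
target = URIRef("http://example.org/texts/some-ancient-text")  # hypothetical document
place = URIRef("https://pleiades.stoa.org/places/123456")      # illustrative Pleiades-style URI

g.add((annotation, RDF.type, OA.Annotation))
g.add((annotation, OA.hasTarget, target))   # the document being annotated
g.add((annotation, OA.hasBody, place))      # the place it refers to

print(g.serialize(format="turtle"))
```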

    The one-day event will have 3 sessions dedicated to:
    1) Issues of referencing ancient and contemporary places online
    2) Lightweight ontology approaches
    3) Methods for generating, publishing and consuming compliant data

    Each session will consist of several short (15 min) papers followed by half an hour of open discussion. The event is FREE to all but places are LIMITED so participants are advised to register early. This is likely to be of interest to anyone working with digital humanities resources with a geospatial component.

    Preliminary Timetable
    10:30-1:00 Session 1: Issues
    2:00-3:30 Session 2: Ontology
    4:00-5:30 Session 3: Methods

    Confirmed Speakers:

    Johan Ahlfeldt (University of Lund) Regnum Francorum Online
    Ceri Binding (University of Glamorgan) Semantic Technologies Enhancing Links and Linked Data for Archaeological Resources
    Gianluca Correndo (University of Southampton) EnAKTing
    Claire Grover (University of Edinburgh) Edinburgh Geoparser
    Eetu Mäkelä (University of Aalto) CultureSampo
    Adam Rabinowitz (University of Texas at Austin) GeoDia
    Sebastian Rahtz (University of Oxford) CLAROS
    Sven Schade (European Commission)
    Monika Solanki (University of Leicester) Tracing Networks
    Humphrey Southall (University of Portsmouth) Great Britain Historical Geographical Information System
    Jeni Tennison (Data.gov.uk)

    Pelagios Partners also attending are:

    Mathieu d’Aquin (KMi, The Open University) LUCERO
    Greg Crane (Tufts University) Perseus
    Reinhard Foertsch (University of Cologne) Arachne
    Sean Gillies (Institute for the Study of the Ancient World, NYU) Pleiades
    Mark Hedges, Gabriel Bodard (KCL) SPQR
    Rainer Simon (DME, Austrian Institute of Technology) EuropeanaConnect
    Elton Barker (The Open University) Google Ancient Places
    Leif Isaksen (The University of Southampton) Google Ancient Places