The Second Workshop on Annotation of Corpora for Research in the Humanities


Thursday, November 29

9:15-9:30 Opening

9:30-10:30 Invited talk: Martin Wynne. Do we need annotated corpora in the era of the data deluge?

11:00-12:30 Session A chair: António Branco

11:00-11:30 Marcel Bollmann. (Semi-) Automatic Normalization of Historical Texts using Distance Measures and the Norma tool.

11:30-12:00 Iwe Everhardus Christiaan Muiser, Mariët Theune and Theo Meder. Cleaning up and Standardizing a Folktale Corpus for Humanities Research.

12:00-12:30 Martin Reynaert, Iris Hendrickx and Rita Marquilhas. Historical spelling normalization. A comparison of two statistical methods: TICCL and VARD2.

14:00-15:30 Session B chair: Erhard Hinrichs

14:00-14:30 Folgert Karsdorp, Peter van Kranenburg, Theo Meder and Antal van den Bosch. Casting a Spell: Identification and Ranking of Actors in Folktales.

14:30-15:00 Peter Bouda, Vera Ferreira and António Lopes. Poio API - An annotation framework to bridge Language Documentation and Natural Language Processing.

15:00-15:30 Svetla Koeva, Borislav Rizov, Ekaterina Tarpomanova, Tsvetana Dimitrova, Rositsa Dekova, Ivelina Stoyanova, Svetlozara Leseva, Hristina Kukova and Angel Genov. Bulgarian-English Sentence- and Clause-Aligned Corpus.

16:00-17:00 Session C chair: Paul Meurer

16:00-16:30 Gosse Bouma and Ben Hermans. Syllabification of Middle Dutch.

16:30-17:00 Florian Petran. Studies for segmentation of historical text: sentences or chunks?

17:00-17:15 Closing remarks