Z2: Computer assisted methods for the creation and analysis of multilingual data
Principal Investigator: Thomas Schmidt
Research Assistants: Timm Lehmberg, Kai Wörner, Hanna Hedeland
Student Assistants: Secil Yusun
Project background
Starting from current research into text technology, the objective of this project is to develop a methodological and technological basis for using the computer in the study of multilingualism. On this basis, project Z2 develops software tools for the computer-assisted creation, analysis and archiving of multilingual data. The work is carried out in constant exchange with other projects of the Research Centre. Besides the design, implementation and maintenance of cross-platform software, project Z2 also takes care of converting legacy data to current standards of digital data processing.
Previous work
EXMARaLDA (Extensible Markup Language For Discourse Annotation), developed in a previous project phase, is a generalised data model that is suited to replace the older, project-specific data formats. It thereby provides the prerequisites for flexible data exchange between projects, for long-term archiving of data, and for the construction of software tools usable across individual projects. For further background on EXMARaLDA, see:
- Schmidt, Thomas (2005a): Computergestützte Transkription – Modellierung und Visualisierung gesprochener Sprache mit texttechnologischen Mitteln. (Reihe „Sprache, Sprechen und Computer“ 7). Frankfurt a. M.
- Schmidt, Thomas / Wörner, Kai (2005): Erstellen und Analysieren von Gesprächskorpora mit EXMARaLDA. In: Gesprächsforschung (Online-Zeitschrift zur verbalen Interaktion) 6, 171-195.
Project aims
In this project phase (2008-2011), project Z2 has two main goals. First, we will continue to develop methods with which other SFB projects can create, analyse, sustainably archive and publish their data. Second, we are going to prepare solutions for keeping these data usable for research and teaching beyond the lifetime of the SFB.
Concerning the first goal, we are working on the following tasks:
- Edit the data corpora of other SFB projects in a sustainable manner.
- Research the text-technological and methodological foundations of computer-assisted processing of multilingual data.
- (Further) develop and optimize data models, formats and software tools of the EXMARaLDA system.
- Develop methods for web-based distribution of data.
- Improve interoperability.