Abstract
Cross-linguistically consistent annotation is necessary for sound comparative evaluation and cross-lingual learning experiments. It is also useful for multilingual system development and comparative linguistic studies. Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. In this paper, we describe v1 of the universal guidelines, the underlying design principles, and the currently available treebanks for 33 languages.
Original language | English |
---|---|
Title of host publication | Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016 |
Editors | Nicoletta Calzolari, Khalid Choukri, Helene Mazo, Asuncion Moreno, Thierry Declerck, Sara Goggi, Marko Grobelnik, Jan Odijk, Stelios Piperidis, Bente Maegaard, Joseph Mariani |
Publisher | European Language Resources Association (ELRA) |
Pages | 1659-1666 |
Number of pages | 8 |
ISBN (Electronic) | 9782951740891 |
State | Published - 2016 |
Event | 10th International Conference on Language Resources and Evaluation, LREC 2016 - Portoroz, Slovenia. Duration: 23 May 2016 → 28 May 2016 |
Publication series
Name | Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016 |
---|
Conference
Conference | 10th International Conference on Language Resources and Evaluation, LREC 2016 |
---|---|
Country/Territory | Slovenia |
City | Portoroz |
Period | 23/05/16 → 28/05/16 |
Bibliographical note
Funding Information: We thank all contributors working on annotated corpora under the UD guidelines, to whom we owe a substantial part of the success and momentum achieved within the UD project so far: Željko Agić, Riyaz Ahmad, Maria Jesus Aranzabe, Masayuki Asahara, Aitziber Atutxa, Miguel Ballesteros, Cristina Bosco, Giuseppe G. A. Celano, Jinho Choi, Çağrı Çöltekin, Kaja Dobrovoljc, Timothy Dozat, Binyam Ephrem, Tomaž Erjavec, Richárd Farkas, Jennifer Foster, Iakes Goenaga, Koldo Gojenola, Bruno Guillaume, Nizar Habash, Dag Haug, Hiroshi Kanayama, Jenna Kanerva, Simon Krek, Juha Kuokkala, Veronika Laippala, Alessandro Lenci, Krister Lindén, Nikola Ljubešić, Olga Lyashevskaya, Teresa Lynn, Aibek Makazhanov, Catalina Mărănduc, Héctor Martínez Alonso, Anna Missilä, Simonetta Montemagni, Verginica Mititelu, Yusuke Miyao, Shinsuke Mori, Hanna Nurmi, Petya Osenova, Lilja Øvrelid, Petr Pajas, Elena Pascual, Marco Passarotti, Jussi Piitulainen, Barbara Plank, Prokopis Prokopidis, Loganathan Ramasamy, Sebastian Schuster, Wolfgang Seeker, Mojgan Seraji, Maria Simi, Kiril Simov, Aaron Smith, Jan Štěpánek, Alane Suhr, Takaaki Tanaka, Anders Trærup Johannsen, Francis Tyers, Sumire Uematsu, Veronika Vincze, Rob Voigt, and Jonathan Washington. The work has been partially funded by the Czech Science Foundation grant GA15-10472S, Czech MEYS grant LM2015071, and SWE-CLARIN.
Keywords
- Annotation
- Cross-linguistic
- Dependency
- Multilingual
- Treebanks
- Universal