Heinecke & Tyers (2019)

  • Heinecke, Johannes & Francis M. Tyers. 2019. 'Development of a Universal Dependencies treebank for Welsh', Proceedings of the Celtic Language Technology Workshop, European Association for Machine Translation, Dublin, Ireland, 21-31. text.


 Abstract:
 "This paper describes the development of the first syntactically-annotated corpus of Welsh within the Universal Dependencies (UD) project. We explain how the corpus was prepared, and some Welsh-specific constructions that require attention. The treebank currently contains 10 756 tokens. An 10-fold cross evaluation shows that results of both, tagging and dependency parsing, are similar to other treebanks of comparable size, notably the other Celtic language treebanks within the UD project."