Book chapter, year: 2019

A Surface-Syntactic UD Treebank for Naija

Abstract

This paper presents a syntactic treebank for spoken Naija, an English pidgincreole, which is rapidly spreading across Nigeria. The syntactic annotation is developed in the Surface-Syntactic Universal Dependency annotation scheme (SUD) (Gerdes et al., 2018) and automatically converted into UD. We present the workflow of the treebank development for this under-resourced language. A crucial step in the syntactic analysis of a spoken language consists in manually adding a markup onto the transcription, indicating the segmentation into major syntactic units and their internal structure. We show that this so-called "macrosyntactic" markup improves parsing results. We also study some iconic syntactic phenomena that clearly distinguish Naija from English.

Domains

Linguistics
Main file
Caron_et_al_2019_A Surface-Syntactic UD Treebank for Naija.pdf (438.99 KB)
Origin: Files produced by the author(s)
License: CC BY-NC - Attribution - NonCommercial

Dates and versions

halshs-03983518, version 1 (11-02-2023)

Identifiers

HAL Id: halshs-03983518
DOI: 10.18653/v1/W19-7803

Cite

Bernard Caron, Marine Courtin, Kim Gerdes, Sylvain Kahane. A Surface-Syntactic UD Treebank for Naija. Marie Candito; Kilian Evang; Stephan Oepen; Djamé Seddah. Proceedings of the 18th International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2019), Association for Computational Linguistics, pp. 13-24, 2019, ⟨10.18653/v1/W19-7803⟩. ⟨halshs-03983518⟩
441 Views
265 Downloads