Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Paper
Hierarchical Annotation and Automatic Error-Type Classification of Japanese Language Learners’ Writing
Hiromi Oyama, Mamoru Komachi, Yuji Matsumoto

2016 Volume 23 Issue 2 Pages 195-225

Abstract
Recently, various types of learner corpora have been compiled and used for linguistic and educational research. With the development of web-based applications for language learners, a large amount of learners’ output can now be collected on the web. These learner corpora include not only correct sentences but also incorrect ones, and we aim to take advantage of the latter for linguistic and educational research. To this end, this study aims to automatically classify incorrect sentences written by learners of Japanese according to error types (or classes) using a machine-learning method. First, we annotate a corpus of learners’ writing with error types defined in a tree-structured class set. Second, we implement a hierarchical error-type classification model using the tree-structured class set. The proposed method outperforms a flat-structured multiclass classification baseline model on the error-classification task by 13 points. Third, we explore features for the error-type classification task. We use contextual information and syntactic information, such as dependency relations, as the baseline features. In addition, because a corpus of language learners contains not only correct sentences but also incorrect ones, we propose two extended features: the edit distance between correct usages and incorrect ones, and the substitution probability with which characters in a sequence change to other characters. Although the performance varies according to error types, the proposed model with all features outperforms the model with only the baseline features by six points.
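As an illustration of the first extended feature, the following is a minimal sketch of a character-level edit distance between an incorrect usage and its correction. The abstract does not state which edit-distance variant the authors use; this sketch assumes plain Levenshtein distance with unit costs, and the function name `edit_distance` and the example sentence pair are hypothetical.

```python
def edit_distance(source: str, target: str) -> int:
    """Character-level Levenshtein distance (assumed variant, unit costs).

    Returns the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `source` into `target`.
    """
    # previous[j] holds the distance between the processed prefix of
    # `source` and the first j characters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # distance from source[:i] to the empty string
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


# Hypothetical learner sentence and its correction (particle error).
incorrect = "私は学校を行く"
correct = "私は学校に行く"
distance = edit_distance(incorrect, correct)
```

In a feature-based classifier, a value such as `distance` could be supplied alongside the contextual and syntactic features; how the authors actually encode it is not specified in the abstract.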
© 2016 The Association for Natural Language Processing