Boundary Recognition of Slight-Pause Marks via Grammar Testing Method
- 17 May 2018
- journal article
- computer science
- Published by EDP Sciences in Wuhan University Journal of Natural Sciences
- Vol. 23 (3), 230-236
- https://doi.org/10.1007/s11859-018-1315-0
Abstract
Boundary recognition is an important research topic in natural language processing, and it provides a basis for applications such as Chinese word segmentation, chunk analysis, and named entity recognition. Motivated by the ambiguity in boundary recognition of Chinese punctuation marks, this paper proposes grammar testing methods for boundary recognition of slight-pause marks and then calculates the annotation consistency of these methods. The statistical results show that grammar testing methods can greatly improve the annotation consistency of slight-pause mark boundary recognition. The consistency in the second round of annotation is 0.0303 higher than in the first, which helps guarantee the consistency of large-scale corpus annotation and improves the quality of corpus annotation.
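The abstract does not say which consistency measure was calculated. As an illustration only, the sketch below computes Cohen's kappa, one common chance-corrected agreement coefficient for two annotators. The function name `cohen_kappa`, the label values, and the sample annotations are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Assumption: this is one plausible consistency measure, not
    necessarily the one used in the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for what each slight-pause mark coordinates.
annotator_1 = ["word", "word", "phrase", "phrase"]
annotator_2 = ["word", "phrase", "phrase", "phrase"]
print(cohen_kappa(annotator_1, annotator_2))
```

Comparing the coefficient from a first annotation round with that from a second round, after annotators apply the grammar tests, would give a difference of the kind the abstract reports.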