Automated Readability Assessment for Spanish e-Government Information

Abstract
This paper automatically evaluates the readability of Spanish e-government websites. Specifically, the websites collected explain e-government administrative procedures. The evaluation is carried out through the analysis of different linguistic characteristics that are presumably associated with a better understanding of these resources. To this end, texts from websites outside the government websites have been collected. These texts clarify the procedures published on the Spanish Government’s websites. These websites constitute the part of the corpus considered as the set of easy documents. The rest of the corpus has been completed with counterpart documents from government websites. The text of the documents has been processed, and the difficulty is evaluated through different classic readability metrics. At a later stage, automatic learning methods are used to apply algorithms to predict the difficulty of the text. The results of the study show that government web pages show high values for comprehension difficulty. This work proposes a new Spanish-language corpus of official e-government websites. In addition, a large number of combined linguistic attributes are applied, which improve the identification of the level of comprehensibility of a text with respect to classic metrics.

This publication has 14 references indexed in Scilit: