The Glossaryfication Web Service: an automated glossary creation tool to support the One Health community

Abstract
In many interdisciplinary research domains, a shared understanding of relevant terms is considered the foundation for efficient cross-sector communication and for the interpretation of data and information. This is also true for the domain of One Health (OH), where One Health Surveillance (OHS) documents rarely contain a glossary that defines the specific meaning of terms in the context of the given document (Cornelia et al. 2018, Buschhardt et al. 2021). The absence of glossaries in these documents may lead to misinterpretation of surveillance results, particularly when term definitions differ across OH sectors. Under the One Health EJP project ORION, the OHEJP Glossary was recently created. The OHEJP Glossary is a tool to improve communication and collaboration amongst OH sectors by providing an easy-to-use online resource that lists relevant OH terms and sector-specific definitions. To improve the accessibility of content from the OHEJP Glossary and to support the creation of integrative glossaries in future OHS-related documents, the OHEJP Glossaryfication Web Service was created. This service supports the practical use of the OHEJP Glossary and other relevant online glossaries by OH professionals. The Glossaryfication Web Service (GWS) is an application that automatically identifies terms in any uploaded text-based document and creates a document-specific list of matching definitions from selected online glossaries. The user can easily adjust this auto-generated document-specific glossary, for example by selecting the desired definition when multiple definitions were found for a specific term. The document-specific glossary can then be downloaded, manually adjusted and finally included in the original document, where it supports the correct interpretation of the terminology used.
A document-specific glossary is especially beneficial in sector-specific reports, such as those from animal health or public health authorities, where it helps ensure correct interpretation by other OH sectors in the future. The GWS was developed with the open-source desktop software KNIME Analytics Platform and runs as a web service on a KNIME Web Server infrastructure. The core data processing functionality in the GWS is based on KNIME's Text Processing extension. KNIME's JavaScript nodes provided the basis for an interactive user interface where users can easily upload their files and select between different reference glossaries, such as the OHEJP Glossary, the CDC Glossary, the WHO Glossary or the EFSA Glossary. After retrieving the user's input settings, the GWS tags words within the provided document and maps these tagged words to matching entries in the selected glossaries. As the main output, the user receives a downloadable list of matching terms with their corresponding definitions, sectorial assignments and references, which the user can then add to the original document. The GWS and the underlying KNIME workflow are both freely accessible online.
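The tag-and-map step described above can be illustrated with a minimal Python sketch. It is not the KNIME workflow itself: the `GlossaryEntry` type, the `build_index` and `glossaryfy` functions, the glossary names and the placeholder definitions are all hypothetical, and the sketch assumes simple case-insensitive whole-word matching without stemming or any other linguistic preprocessing.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryEntry:
    """One glossary row (hypothetical structure, for illustration only)."""
    term: str
    definition: str
    sector: str
    reference: str
    glossary: str


def build_index(entries):
    """Group entries by lower-cased term so a term can carry several definitions."""
    index = {}
    for entry in entries:
        index.setdefault(entry.term.lower(), []).append(entry)
    return index


def glossaryfy(text, index):
    """Return all entries whose term occurs in the text as a whole word or phrase."""
    matches = []
    for term in sorted(index):
        # (?<!\w) and (?!\w) keep "outbreak" from matching inside "outbreaks".
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            matches.extend(index[term])
    return matches


# Placeholder data: the definitions and references are NOT taken from any real glossary.
entries = [
    GlossaryEntry("surveillance", "placeholder definition 1", "animal health",
                  "placeholder reference 1", "Glossary A"),
    GlossaryEntry("surveillance", "placeholder definition 2", "public health",
                  "placeholder reference 2", "Glossary B"),
    GlossaryEntry("outbreak", "placeholder definition 3", "public health",
                  "placeholder reference 3", "Glossary B"),
    GlossaryEntry("vector", "placeholder definition 4", "animal health",
                  "placeholder reference 4", "Glossary A"),
]

document = "Surveillance data were collected after the outbreak."
for entry in glossaryfy(document, build_index(entries)):
    print(entry.term, "|", entry.sector, "|", entry.definition, "|", entry.reference)
```

In this sketch a term with several definitions, here "surveillance", yields one row per definition, which mirrors the situation in which the user selects the desired definition before adding the glossary to the document.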