Abstract
Numerous valuable historic and cultural sources, a major part of our cultural heritage, are currently imperilled and scattered across various national archives. The Arts and Humanities are sciences based mainly on the interpretation of cultural objects such as texts, paintings and works of art, or historical/ethnological remains and monuments. Such objects are often unique, very valuable, fragile, irreplaceable and locally preserved in scientific collections at museums, in archives, or in urban and historic areas. Archives, museums and other cultural institutions do not simply conserve these objects. They also manage a large amount of documentation on them in the form of photo collections, expertise, records, scientific studies and analyses. Both the objects themselves and the supplementary documentation are often accessible to users only through physical contact. Duplicates on paper, such as text documents (e.g., critical editions) or image documents (facsimiles, photographs), are extremely expensive in terms of manpower, know-how and printing costs, and often these expenses cannot be justified for a small scientific audience. Electronic formats for object documentation might alleviate this access problem. Numerous initiatives have been started and supported to highlight and investigate a variety of challenges that museums and other culture-historical institutions are facing in an increasingly digital, media-saturated landscape. However, full knowledge and usage of this material are severely impeded by access problems: even when electronic and digitized copies are available, there is a lack of appropriate content-based search and retrieval aids that help users find what they really need. Preserving contents does not consist in simply storing them, but in actively transforming them to adapt them technically and keep them intelligible.
Moreover, many informal and non-institutional contacts between cultural archives constitute specific professional communities which, however, still lack effective and efficient technological support for cooperative and collaborative knowledge work. The creation of digital libraries, enhanced by collaboratory annotation facilities, is the technological response that bundles documents, interpretation knowledge, work processes and an expert network in a very flexible working environment. Object and document collections in the Arts and Humanities always represent work in progress. The inventory at cultural institutions is growing steadily due to donations, acquisitions, and the institutions' own daily scientific and conservation services. These additions must be incorporated into the existing collections, but often space difficulties, gaps in scientific know-how and lack of personnel have to be dealt with. Professionals and experts classify, analyze, assess and exhibit or edit these objects and documents. Highly qualified external specialists are frequently difficult to locate if they are not part of a scholarly network. Internal experts are often overburdened with routine work in times of small cultural budgets and can invest time only sporadically in integrating new inventories. Many scientific members of cultural institutions have temporary contracts and leave after a few years, taking with them a great part of the accumulated know-how. The intrinsic nature of the document processing procedures supporting the progressive work on historic material, as outlined in this introduction, poses several constraints that require solutions specifically tailored to the tasks mentioned above. Over the years, Intelligent Systems have become valuable working instruments for researchers in the humanities.
The new challenge is to provide these people with tools that facilitate the enjoyment and investigation of the cultural heritage, so that both non-experts and communities of researchers may use up-to-date tools for their personal work and for collaborative purposes. Technologically, the World Wide Web can serve both as a standard communication platform for such communities and as a gateway for document-centered digital library applications. Yet, while the Web may solve the problem of the diffusion of and access to this material in its digital form, new automated tools are needed to allow more intelligent processing and personalized utilization of this knowledge. Given the situation described above, such automatic tools must be not only effective and efficient, but also able to cope with the continuous growth of the available material and knowledge, which is a fundamental and unavoidable issue. Hence, there is a need for a system component that is able to build incrementally upon previously acquired knowledge through diverse reasoning mechanisms. Specifically, the availability of systems that can automatically identify and separate document classes and the meaningful parts inside them would relieve experts of the need to accomplish low-level tasks, thus allowing them to focus on more intellectual, interpretation-intensive tasks. For such systems to be successful in a real operating environment, however, their behavior and results must be comprehensible to human experts, which can happen only when symbolic representations are used. The choice of these symbolic mechanisms, which closely resemble the human way of reasoning, also allows a more direct comprehension and control of the knowledge synthesized at every step of the process.
In the talk, different experiences and projects in the cultural heritage application domain are briefly presented, together with the symbolic Machine Learning approaches developed by the LACAM Lab of the Department of Computer Science of the University of Bari. Since the 1990s, Document Engineering has been one of the elective application domains for the research group working in the field of Conceptual Learning,...
