ALCIDE: Extracting and visualising content from large document collections to support humanities studies

Abstract

The application of research practices and methodologies from Information and Communication Technologies to humanities studies is having a great impact on the way humanities research is conducted. However, although many applications have been developed to automatically analyse document collections from the historical or literary domain, they often fail to provide real support to scholars because of their inherent complexity: technical skills are often required to use them and to inspect their output. On the other hand, some systems are more user-friendly, but offer only basic analyses and are limited to the needs of a specific research community. To overcome these limitations, we developed ALCIDE (Analysis of Language and Content In a Digital Environment), a web-based platform designed to assist humanities scholars in navigating and analysing large quantities of textual data such as historical sources and literary works. This suite of tools combines advanced text processing techniques with intuitive visualisations of the output to serve a broad range of research questions, which no other comparable tool can address in a single platform. Textual corpora can be inspected and compared along five semantic dimensions: who, where, when, what and how. Used in different combinations, these dimensions make it possible to target many key questions in different humanities disciplines, as shown in the five use cases presented.

Keywords: Web-based platform, Corpus analysis, Digital humanities, Visualisation, Natural language processing

Article history: Received 4 February 2016, Revised 28 July 2016, Accepted 5 August 2016, Available online 5 August 2016, Version of Record 23 September 2016.

Official paper URL: https://doi.org/10.1016/j.knosys.2016.08.003