2021
Doctoral thesis  Open Access

Enhancing the computational representation of narrative and its extraction from text

Metilli D.

Semantic Web, Narrative, Ontology, Natural Language Processing, Knowledge extraction, Wikidata

Narratives are a fundamental part of human life. Every human being encounters countless stories during their life, and these stories contribute to forming a common understanding of reality. This is reflected in the current digital landscape, and especially on the Web, where narratives are published and shared every day. However, the current digital representation of narratives is limited by the fact that each narrative is generally expressed as natural language text or other media, in an unstructured way that is neither standardized nor machine-readable. These limitations make narratives hard for automated systems to manage. One way to solve this problem would be to create an ontology of narrative, i.e., a formal model of what a narrative is, then develop semi-automated methods to extract narratives from natural language text, and use the extracted data to populate the ontology. However, the feasibility of this approach remains an open question. This thesis investigates this research question, starting from the state of the art in the fields of Computational Narratology, Semantic Web, and Natural Language Processing. Based on this analysis, we have identified a set of requirements, and we have developed a methodology for our research work. Then, we have developed an informal conceptualization of narrative, and we have expressed it in a formal way using First-Order Logic. The result of this work is the Narrative Ontology (NOnt), a formal model of narrative that also includes a representation of its textual structure and textual semantics. To ensure interoperability, the ontology is based on the CIDOC CRM and FRBRoo standards, and it has been expressed using the OWL and SWRL languages of the Semantic Web. Based on the ontology, we have developed NarraNext, a semi-automatic tool that is able to extract the main elements of narrative from natural language text.
The tool allows the user to create a complete narrative based on a text, using the extracted knowledge to populate the ontology. NarraNext is based on recent advancements in the Natural Language Processing field, including deep neural networks, and is integrated with the Wikidata knowledge base. The validation of our work is being carried out in three different scenarios: (i) a case study on biographies of historical figures found in Wikipedia; (ii) the Mingei project, which applies NOnt to the representation and preservation of Heritage Crafts; (iii) the Hypermedia Dante Network project, where NOnt has been integrated with a citation ontology to represent the content of Dante's Comedy. All three applications have served to validate the representational adequacy of NOnt and the satisfaction of the requirements we defined. The case study on biographies has also evaluated the effectiveness of the NarraNext tool.



BibTeX entry
@phdthesis{oai:it.cnr:prodotti:459796,
	title = {Enhancing the computational representation of narrative and its extraction from text},
	author = {Metilli, D.},
	year = {2021}
}
CNR ExploRA

Bibliographic record

ISTI Repository

Deposited version Open Access

Also available from

etd.adm.unipi.it (Open Access)

Mingei
Representation and Preservation of Heritage Crafts

