Story embedding: Learning distributed representations of stories based on character networks

Highlights:

• This study proposes novel models and methods for representing stories of narrative works.

• The proposed methods focus on synchronizing the story that is latent across the various data sources of narrative multimedia.

• By synchronizing scenes in the text and video, the stories of a narrative work are discretized.

• Names and faces of characters are synchronized using their occurrence distributions in the text and video.

• Existing character networks are integrated for better analytics and understanding.

Abstract

This study aims to learn representations of stories in narrative works (i.e., creative works that contain stories) as fixed-length vectors. Vector representations of stories enable us to compare narrative works regardless of their media or formats. To represent stories computationally, we focus on social networks among characters (character networks). We assume that the structural features of character networks reflect the characteristics of stories. By extending substructure-based graph embedding models, we propose models for learning distributed representations of character networks in stories. The proposed models consist of three parts: (i) discovering substructures of character networks, (ii) embedding each substructure (Char2Vec), and (iii) learning a vector representation of each character network (Story2Vec). We find substructures around each character at multiple scales based on the proximity between characters. We suppose that a character's substructures signify its ‘social roles.’ A Char2Vec model is then designed to embed each social role based on the social roles that co-occur with it. Since character networks are dynamic social networks that evolve over time, we use temporal changes and adjacency of social roles to determine their co-occurrence. Finally, Story2Vec models embed each story by predicting the occurrences of social roles in it. To predict the occurrences, we apply two approaches: (i) considering temporal changes in social roles, as in the Char2Vec model, and (ii) focusing on the final social roles of each character. We call the embedding model with the first approach ‘flow-oriented Story2Vec.’ This approach can reflect the context and flow of stories if the dynamics of the character networks are well understood. The second approach, based on the final states of social roles, emphasizes the denouement of stories, which gives an overview of the static structure of the character networks. We call this model ‘denouement-oriented Story2Vec.’ In addition, we propose ‘unified Story2Vec’ as a combination of these two models. We evaluated the quality of the vector representations generated by the proposed embedding models using real-world movies.
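
To make step (i) more concrete, the following is a minimal sketch of one way multi-scale substructures around each character could be labeled. It assumes a Weisfeiler-Lehman-style relabeling on an unweighted, static snapshot of a character network; the abstract does not specify the actual discovery procedure, and the function name `social_roles`, the `max_scale` parameter, and the toy network are hypothetical.

```python
from hashlib import md5


def social_roles(adjacency, max_scale=2):
    """Return {character: [role label at scale 0, 1, ..., max_scale]}.

    Assumption: a Weisfeiler-Lehman-style relabeling stands in for the
    substructure discovery step; this is an illustration, not the
    procedure used in the study.
    """
    # Scale 0: a character's role is simply its number of neighbors.
    labels = {c: str(len(nbrs)) for c, nbrs in adjacency.items()}
    roles = {c: [labels[c]] for c in adjacency}

    for _ in range(max_scale):
        new_labels = {}
        for c, nbrs in adjacency.items():
            # Combine the character's own label with the sorted labels of
            # its neighbors, then compress the signature into a short hash.
            signature = labels[c] + "|" + ",".join(sorted(labels[n] for n in nbrs))
            new_labels[c] = md5(signature.encode()).hexdigest()[:8]
        labels = new_labels
        for c in adjacency:
            roles[c].append(labels[c])
    return roles


# Hypothetical toy network: a protagonist "A" linked to two minor characters.
network = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A"},
}
roles = social_roles(network)
```

In this sketch, characters with structurally identical neighborhoods (here "B" and "C") receive the same role labels at every scale, while the hub "A" receives different ones. Labels of this kind could then serve as the discrete tokens that a Char2Vec-like model embeds.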