Let’s Read A Story: a study on storytelling for children using machine learning tools

📚

Let’s Read a Story is a study of Aesop’s Fables and the possibility of exploring connections between different characters and ideas from the original fables in a new and fun way, using recently available machine learning tools.

You can find a working demo here (best performance on Chrome desktop).

Let’s Read A Story

In the following post I will try to describe the thought process and some of the technical aspects behind this project. 

———–

📜

Collecting the data

In this project I chose to focus on and analyze Aesop’s Fables to produce new and interesting adjacencies between sentences and create new stories. I was drawn to Aesop’s Fables because of their concise yet rich story lines, the use of animals as metaphors, and the strong morals embedded in each story.

Aesop’s Fables for kids, Project Gutenberg

Each original Aesop Fable contains:

  1. A short title, usually very descriptive of the story’s content and characters.
  2. The story itself, usually no more than 30 sentences.
  3. The moral of the story, which usually contains a metaphor built on the inherent nature or traits of the animals in the story.

✨

Cleaning the dataset

To analyze the content, I compiled a JSON file holding all the stories broken down into individual sentences, along with their titles, characters, and animals.

This file is key to generating the experiment’s new stories, as it holds all the sentences and acts as the ‘database’ for the experiment.

Furthermore, this file serves as the source for the seed sentence from which each story grows.
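
For illustration, an entry in this file might look roughly like the following (the field names and grouping are a simplified sketch, not the exact schema):

{
  "title": "The Wolf and the Kid",
  "characters": ["Kid", "Wolf"],
  "animals": ["goat", "wolf"],
  "moral": "...",
  "sentences": [
    "There was once a little Kid whose growing horns made him think he was a grown-up Billy Goat and able to take care of himself.",
    "..."
  ]
}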

⚙️

Analyzing the sentences

Using Google’s Universal Sentence Encoder, a machine learning model that encodes text into high-dimensional vectors for text classification, semantic similarity, clustering, and other natural language tasks, I analyzed all sentences derived from the fables (~1,500 sentences).

This yields a JSON file containing a 512-dimensional embedding for each sentence; this is the similarity map I use to compare sentences and generate new adjacencies.

Example line from the file:

{"message": "There was once a little Kid whose growing horns made him think he was a grown-up Billy Goat and able to take care of himself.", "message_embedding": [0.06475523114204407, -0.026618603616952896, -0.05429006740450859, 0.003563014790415764 ...........,0.06475523194004407]}

To process and retrieve similarities, averages, and distances between sentences, I used the ml5.js Word2Vec class, modified slightly to work with the Universal Sentence Encoder scheme.
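
Under the hood, the comparison boils down to cosine similarity between the 512-dimensional vectors. A minimal sketch of the retrieval step (the function names are illustrative, not the actual ml5 internals):

// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Find the sentence closest to `seed`, skipping sentences already used in the story.
function nearestSentence(seed, records, used = new Set()) {
  let best = null;
  let bestScore = -Infinity;
  for (const record of records) {
    if (used.has(record.message)) continue;
    const score = cosineSimilarity(seed.message_embedding, record.message_embedding);
    if (score > bestScore) {
      bestScore = score;
      best = record;
    }
  }
  return best;
}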

Starting by generating 10 sentences, the first result was pretty surprising: it made sense and was fairly convincing, though a bit dark for my taste:

First test in generating output text from the universal sentence encoder (10 lines from a random seed)

Another try yielded different results, still very grim:

Second test in generating output text from the universal sentence encoder (10 lines from a random seed)

💻

Building the web application

For the first version of this project I thought it would be best if it lived in the web browser, so that it would be accessible to almost everybody. I chose to build the first version with Node.js on the server side (similarity calculations, sentiment analysis, and serving content) and JavaScript on the frontend (everything else).
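
A rough sketch of that division of labor (Express, the route, and the helper functions below are illustrative assumptions, not the exact implementation):

// Node.js server side: serve the frontend and expose a story-generation endpoint.
// pickRandomSeed and generateStory are hypothetical helpers that wrap the
// similarity lookups described above.
const express = require('express');
const app = express();

app.use(express.static('public')); // frontend assets (p5.js, Tone.js, etc.)

app.get('/story', (req, res) => {
  const seed = pickRandomSeed();         // choose a seed sentence from the dataset
  const story = generateStory(seed, 10); // chain the nearest sentences into a story
  res.json(story);
});

app.listen(3000);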

🎨 + 🎵

Adding Illustrations & Musical phrases to the story 

To enrich the stories, I chose to use Google Magenta’s sketch-rnn model, a generative model for vector drawings, to reconstruct illustrations from a pretrained model to accompany the generated stories.

The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located.

The smart people at Google Magenta trained a publicly available recurrent neural network model called sketch-rnn. They taught this neural net to draw by training it on millions of doodles collected from the Quick, Draw! game. While I’m using it simply to reconstruct animal and other general illustrations for the story, there are many other creative applications for this enormous dataset and network.

For Let’s Read A Story, I chose to use this model together with a simple RegEx search on the resulting sentences. The JavaScript functionality determines which animal appears in the generated story and then reconstructs an illustration from the trained sketch-rnn model using p5.js. If a sentence contains an animal that does not exist in the model, another function ‘enriches’ the model’s keywords and matches a similar animal to the one specified in the sentence.
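
A minimal sketch of that flow with p5.js and ml5.js (the regular expression, the category list, and the fallback choice are illustrative):

// Find the animal mentioned in a generated sentence with a simple RegEx,
// then reconstruct its drawing with ml5.js SketchRNN on a p5.js canvas.
let model;
let x = 150;
let y = 150;
let previousPen = 'down';

const SKETCH_MODELS = ['lion', 'cat', 'dog', 'bird', 'owl', 'frog'];

function animalInSentence(sentence) {
  const match = sentence.match(/\b(lion|wolf|cat|dog|bird|owl|frog|goat|cricket)\b/i);
  if (!match) return null;
  const animal = match[1].toLowerCase();
  // Fall back to a 'similar' animal when sketch-rnn has no model for this one.
  return SKETCH_MODELS.includes(animal) ? animal : 'cat';
}

function setup() {
  createCanvas(300, 300);
  const animal = animalInSentence('The Lion went back to his den.');
  model = ml5.sketchRNN(animal, () => model.generate(gotStroke));
}

function gotStroke(error, stroke) {
  if (error || !stroke || stroke.pen === 'end') return; // drawing finished
  if (previousPen === 'down') {
    line(x, y, x + stroke.dx, y + stroke.dy); // draw the next pen segment
  }
  x += stroke.dx;
  y += stroke.dy;
  previousPen = stroke.pen;
  model.generate(gotStroke); // request the next stroke
}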

These illustrations then become musical phrases, based on some predetermined ‘musical’ rules:

Lion Illustration and Sound

  • With the help of an AFINN-based sentiment analysis library, I analyze each sentence and determine whether it has a positive or negative sentiment. Based on that sentiment (a score between -5 and 5), I map the illustration’s X and Y coordinates to musical notes on a B major or B minor scale – positive scores get a major scale 😊 and negative scores get a minor scale 😥.
  • Depending on the animal appearing in the sentence, I choose a different Tone.js synthesizer and a different number of musical notes. For example, a predatory animal that tends to be scary, like a wolf 🐺 or a lion 🦁, is played by a low-tone synth with a small number of notes. Conversely, a bird 🐦 or a cricket 🦗 is played by a high-pitched synth with more notes. (A minimal sketch of this mapping appears after this list.)
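
A minimal sketch of these rules, assuming Tone.js is loaded on the page and the AFINN score (roughly -5 to 5) arrives from the server; the scale tables, synth choices, and function names are illustrative:

// Map a stroke coordinate of the illustration to a note: positive sentences get
// the B major scale, negative ones the B minor scale.
const B_MAJOR = ['B3', 'C#4', 'D#4', 'E4', 'F#4', 'G#4', 'A#4', 'B4'];
const B_MINOR = ['B3', 'C#4', 'D4', 'E4', 'F#4', 'G4', 'A4', 'B4'];

function noteForPoint(x, canvasWidth, sentimentScore) {
  const scale = sentimentScore >= 0 ? B_MAJOR : B_MINOR;
  const index = Math.floor((x / canvasWidth) * scale.length);
  return scale[Math.min(index, scale.length - 1)];
}

// 'Scary' predators get a low, sparse synth; other animals a brighter one.
function synthForAnimal(animal) {
  const predators = ['wolf', 'lion'];
  return predators.includes(animal)
    ? new Tone.MembraneSynth().toDestination()
    : new Tone.Synth().toDestination();
}

// Example: play one note derived from a point of the lion illustration,
// for a sentence the server scored as negative (-2).
const synth = synthForAnimal('lion');
synth.triggerAttackRelease(noteForPoint(120, 300, -2), '8n');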

This method, of course, does not purport to represent the story reliably, and there are cases in which there will be no match between the musical sounds and the story, but it gives a certain touch to the story and enriches the characters and illustrations in a somewhat charming way. In future versions this method will need to be improved.

🔮

Into the future

This is an ongoing project that I intend to work on over the next couple of months; hopefully it will become my thesis project at ITP-NYU. I intend to develop more tools to help structure stronger narratives with the help of these and other machine learning algorithms, and to test and deploy on other platforms that give better focus to different aspects of the storytelling practice – actual books, but also smart devices, speakers, tablets, and other immersive media platforms.

Some questions left unanswered:

  • Retrieving a newly generated moral for a story by analyzing the original stories together with the newly generated one.
  • How would it feel if the computer-generated stories were read by a synthesized voice?
  • Digging deeper into structuring narratives and building stronger and more reliable stories.
  • Generating multi-character illustrations that have more …

Github repository: https://github.com/itayniv/aesop-fables-stories


This project was completed as part of Daniel Shiffman’s Programming A2Z and Gene Kogan’s The Neural Aesthetic courses at NYU ITP, fall 2018.
