Purpose/Inspiration:
My main inspiration for this project came from the game Semantris, which was developed by Google’s Research team. The game uses TensorFlow’s Word2Vec model to let the AI match user input. The goal of Semantris is to help AI understand the semantics of human language; namely, how certain words are related to each other. For example, why might we associate “moon” with “space” but also associate “space” with “room”?
Therefore, for this midterm project, I wanted to create something somewhat similar to Semantris, but with another layer. Though my app is not as developed as Semantris, I made a lot of progress along the way.
If you would like to play Semantris — here’s the link.
My Process:
To create this app, I chose to use two pre-trained ml5 models: imageClassifier() and word2vec(). For the layout, I chose to display an image and then ask users to guess what they saw. At the same time, the image would be classified via imageClassifier(), and a function would check the classifier's results for matches with the user’s input. Meanwhile, the user’s input would also run through word2vec to create an array of words closely related to the user’s guess. Once the user’s guess matched the results of the image classifier, the results from both models would be displayed at the bottom of the page, so that the user could get a better sense of the AI’s process.
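The way the two model outputs come together can be sketched as a small pure function. This is only an illustration under stated assumptions, not the app's actual code: the name `buildRoundSummary` is hypothetical, and both model outputs are assumed to have already been reduced to plain arrays of strings.

```javascript
// Hypothetical helper: combines one guess with the outputs of both models.
// `labels` stands in for the image classifier's labels and `relatedWords`
// for the word2vec neighbours of the guess; both are assumed to be string arrays.
function buildRoundSummary(guess, labels, relatedWords) {
  const cleaned = guess.trim().toLowerCase();
  const lowerLabels = labels.map((label) => label.toLowerCase());
  return {
    guess: cleaned,
    // True when the guess is exactly one of the classifier's labels.
    matched: lowerLabels.includes(cleaned),
    // Related words that also appear among the classifier's labels.
    overlap: relatedWords.filter((word) => lowerLabels.includes(word.toLowerCase())),
    labels,
    relatedWords,
  };
}

// Made-up example data, not real model output.
const summary = buildRoundSummary("Lemon", ["banana", "lemon"], ["lime", "citrus", "banana"]);
console.log(summary.matched, summary.overlap);
```

Keeping this step free of any ml5 or p5 calls means it can be checked on its own before the models are wired in.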
To begin building this web app, I started by learning how to implement both of the ml5 models. The classifier model ran perfectly, but word2vec gave many errors, mainly because of the data files the model requires, or because an input word was not in the dataset. Once I resolved this, I moved on to configuring user input within JavaScript and p5. Originally, I tried using an HTML input, but I found that the inputs were not easily integrated into my JS program. I then tried the prompt() method in JavaScript, but I scrapped it because it opens a new window, which created a sense of disconnect. After some research, I found a p5 input method which was easy to use and made the results easy to access. Link is included below.
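One way to avoid errors from words that are missing from the dataset is to check the vocabulary before looking for related words. The sketch below is not the ml5 API: it uses a tiny made-up vector table and the hypothetical helpers `cosineSimilarity` and `nearestWords` purely to show the guard.

```javascript
// Toy stand-in for a word-vector file; the values are made up for illustration.
const toyVectors = {
  moon: [0.9, 0.1, 0.0],
  space: [0.8, 0.3, 0.1],
  room: [0.2, 0.9, 0.1],
};

// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns up to `max` related words, most similar first.
// An unknown word yields an empty array instead of an error.
function nearestWords(word, vectors, max = 5) {
  const key = word.trim().toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(vectors, key)) return [];
  return Object.keys(vectors)
    .filter((other) => other !== key)
    .map((other) => ({
      word: other,
      similarity: cosineSimilarity(vectors[key], vectors[other]),
    }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, max);
}

console.log(nearestWords("Moon", toyVectors).map((entry) => entry.word));
console.log(nearestWords("zebra", toyVectors)); // not in the toy vocabulary
```

With a real model, the same idea applies: treat an empty or missing result as "word not found" and show the user a message rather than letting the error surface.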
Once the user input system was configured, I moved on to finding a way to match the input with the resulting array from the image classifier. In my first test, I kept running into an issue within the for loop: because the loop kept running after a match was found, later iterations overwrote the result, so a guess was only marked “correct” if it matched the last index of the array. With help from Mostafa, I added a ‘break’ statement so that the loop stops as soon as a match is found. After this statement was added, the matches could be found.
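The matching loop can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function name `findMatch` is hypothetical, the results are assumed to be objects with a `label` string, and the sample data is made up. It also assumes a label may list several comma-separated names.

```javascript
// Hypothetical helper: returns the index of the first classifier result
// whose label contains the user's guess, or -1 if nothing matches.
function findMatch(guess, results) {
  const cleaned = guess.trim().toLowerCase();
  let matchIndex = -1;
  if (cleaned === "") return matchIndex;
  for (let i = 0; i < results.length; i++) {
    // Assumption: a label may list several comma-separated names.
    const names = results[i].label
      .toLowerCase()
      .split(",")
      .map((name) => name.trim());
    if (names.includes(cleaned)) {
      matchIndex = i;
      break; // stop here so later, non-matching results cannot overwrite the match
    }
  }
  return matchIndex;
}

// Made-up sample results, not real classifier output.
const sampleResults = [
  { label: "tabby, tabby cat", confidence: 0.6 },
  { label: "tiger cat", confidence: 0.3 },
  { label: "Egyptian cat", confidence: 0.1 },
];
console.log(findMatch("Tiger Cat", sampleResults)); // 1
```

Without the `break`, a version of this loop that also sets "incorrect" in an else branch would let every later result overwrite an earlier match, which is the behavior described above.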
After this, I went on to connect the user input to the word2vec model. I considered several ways to do this. My first idea was to make two separate game modes, one for each model, but I felt that this did not align with my original idea and purpose. My second idea was to create two input fields, one for entering an image classification and another for entering words users would associate with the image. While I think this is an interesting idea, I felt that it would disrupt the flow of the program. In the end, I settled on one input field and processed the input through both models. I feel that this is the most streamlined way to combine the models with the user input.
After this was configured, I moved on to the design of the page. For this, I used p5.DOM to create div elements to house everything on the page, so that it could all be easily styled through a separate CSS stylesheet. Lastly, I created various callbacks within my code which would trigger certain results depending on whether an element was clicked or whether another function had been executed.
End Results:
Currently, I feel that my resulting project is fairly close to the idea I had in mind. If I were to develop it further, I would put more emphasis on human language semantics, using the distance values from word2vec either to make the game more challenging or to display something else. I would also like to create a scoring mechanism based on the word2vec model, but I ran out of time to implement this. Another thing I couldn’t figure out was how to select a random image every time the page loaded. I had stored the image names in an array and used Math.floor(Math.random() * the array length) to get a random index, but when I tried to pass the selected image into the ml5 classifier, the image would not be displayed on the webpage, and the console would return an error of “bad image data,” although the classifier would still run. I chose not to pursue this further due to time.
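For reference, the random selection itself can be sketched as below. The helper name `pickRandom` and the file names are hypothetical, and this sketch only covers choosing the image name; it does not address the “bad image data” error, whose cause is not clear from the behavior described above.

```javascript
// Hypothetical helper: picks one entry from a non-empty array.
// `random` defaults to Math.random and can be swapped out for testing.
function pickRandom(items, random = Math.random) {
  if (items.length === 0) {
    throw new Error("no images to choose from");
  }
  // Math.random() is in [0, 1), so the index is always in [0, items.length - 1].
  const index = Math.floor(random() * items.length);
  return items[index];
}

const imageNames = ["cat.jpg", "dog.jpg", "bird.jpg"]; // made-up file names
console.log(pickRandom(imageNames));
```

Scaling by the array length before calling Math.floor matters: Math.floor(Math.random()) on its own always evaluates to 0.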
Overall, I learned a lot while creating this webpage. Even though it appears very simple, it was rather complex to work with, and I ran into error after error along the way. Even so, I feel that I accomplished what I set out to do, and I now feel more prepared to work on larger projects.
In the future, I would like to train my own word2vec model to better understand how it works and perhaps to see how biases are created within these models. I have also found in my own research that word2vec models are now being used in language translation, which is something that I also find interesting.
Process Pictures:
Early Prototype:
Guess Error:
Final Stage:
Resources Used: