iML: Week 11 – Final Project Proposal (Thomas Tai)

Introduction

For my final project, I would like to explore music generation using machine learning. Modern-day music is heavily influenced by the tools available to artists: even when computer-generated effects go unnoticed by the listener, they play an important role in music creation. Machine learning is relatively new and has only recently been applied to audio. Originally, my idea was to use radio signal data from NASA to generate music of the universe, but that data is not readily available and the idea has been done before. As a lifelong violinist, I instead want to explore machine learning with violin music. I hope to train a model that can take simple, random notes and transcribe them into a more complex and pleasant melody.

Inspiration

After doing some research, I found that there are not many applications of machine learning to the violin. I think this is because the violin is a very complex instrument that has not been integrated with technology: electric pianos and guitars have become popular, while the electric violin has not caught on as much. Many people find the violin a difficult instrument to play, and there are fewer players every year. I want to create a version of the violin that can be played by people with varying musical backgrounds, from beginner to advanced. Below is an example of Google’s implementation of a musical neural network that can generate music from just a few buttons.

Project Plan

I plan to use Magenta.js because it has a web framework available and examples that relate to my project. I will need a way to transcribe real-time input from the violin into a frequency and an amplitude. This will be fed into Magenta, which will find a suitable note to output; I would then need a way to play the notes using a sound library. I intend to train a model on violin MIDI files from a wide variety of composers throughout history. If all goes to plan, a player with minimal violin experience should be able to create basic melodies that sound reasonably decent.
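The pitch-to-note step can be sketched independently of any library. Below is a minimal Python sketch (hypothetical helper names, not part of Magenta's API) of snapping a detected frequency to the nearest MIDI note number, which is the representation Magenta's note-based models work with:

```python
import math

def freq_to_midi(freq_hz):
    """Snap a detected frequency (Hz) to the nearest MIDI note number.
    MIDI note 69 is A4 = 440 Hz; each semitone is a factor of 2**(1/12)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

def midi_to_name(note):
    """Human-readable name for a MIDI note number (C-1 = note 0 convention)."""
    names = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
    return f"{names[note % 12]}{note // 12 - 1}"
```

For example, the violin's open G string (about 196 Hz) maps to MIDI note 55, i.e. G3. A pitch detector running on microphone input would feed its frequency estimates through a conversion like this before handing notes to the model.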

Possible Resources

MIDI for the browser – https://github.com/mudcube/MIDI.js

Tuner App for Chrome – https://github.com/googlearchive/guitar-tuner

Magenta.JS – https://magenta.tensorflow.org/get-started/

Week 11 – Final Topic Proposal

Final Project Proposal – GPT2 Model

Link to Proposal

For my final project, I wasn’t sure what I wanted to do because I was split among three ideas for the proposal. The first was to fine-tune the GPT-2 model released by OpenAI and build an interface where a user feeds in a prompt and the model continues it with a paragraph or two about that content. This project is the most feasible and the most achievable of the three by the due date, so I have decided to do it for the final. The second idea was GST-Tacotron, a voice-modeling system from Google that can be trained on a person’s voice and then make it say anything via text input. Its results are very impressive; however, I found that it is particularly difficult to train and requires around three hours of spoken audio from the target speaker. The third idea was to present my own senior thesis, which is very machine learning oriented; since it would be the culmination of a semester’s worth of work, I thought it was worth presenting. However, I could not get permission from Professor Moon and Aven, and I most likely could not have finished that project by the deadline, so I ultimately abandoned the idea in favor of GPT-2.

Tools

I will mostly be using Keras for training the model. I will most likely use the school’s cluster or the Intel server for compute power, and I will experiment with a variety of models.

Datasets

I haven’t decided which dataset I want to use yet, but I will mostly be trying to build my own corpus instead of using a common one, like the Shakespeare or movie-script corpora often used to train models. Comments from forums and social media are particularly interesting to me, so I want to see if I can generate a realistic comment or discussion.
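Building a corpus from scraped comments mostly comes down to cleaning. Here is a minimal sketch of that step (a hypothetical helper, assuming the comments are already collected as a list of strings) that strips links, normalizes whitespace, and drops near-empty comments:

```python
import re

def build_corpus(comments, min_words=3):
    """Clean raw scraped comments into one training-corpus string.
    Removes URLs, collapses whitespace, drops very short comments,
    and joins the survivors with newlines."""
    cleaned = []
    for text in comments:
        text = re.sub(r"https?://\S+", "", text)   # strip links
        text = re.sub(r"\s+", " ", text).strip()   # collapse whitespace
        if len(text.split()) >= min_words:
            cleaned.append(text)
    return "\n".join(cleaned)
```

A real pipeline would likely also deduplicate comments and filter out bot spam, but this is the basic shape of turning forum text into something a language model can train on.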

Goal

My main goal for this project is to give users a text generation model realistic enough that they can create convincing content from a prompt. I have always been interested in text generation, and after using earlier works like char-rnn, I was very excited to see GPT-2 work so well. Perhaps, by having the model feed itself prompts, it can generate enough novel content for an entire website. I hope to add more interaction in the spirit of IMA: for example, I can let it start its own dialogue, have it comment on that, and then have it continue off those comments. It will be very interesting to see the kind of output it can make. At this point I am not fully aware of how GPT-2 even works; I know it is a variant of the transformer family of models with attention mechanisms, which is rather complicated. I hope this project will give me further insight into text generation and machine learning models.
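The self-feeding idea above is easy to sketch as a loop: each round, the model's output becomes its next prompt. The `generate` function below is a stand-in for whatever wraps the trained GPT-2 model; the `echo` generator is purely for illustration.

```python
def self_feed(generate, seed, rounds=3):
    """Let a text generator comment on its own output: each round's
    output becomes the next round's prompt. `generate` is any
    prompt -> continuation function (e.g. a wrapper around GPT-2)."""
    transcript = [seed]
    prompt = seed
    for _ in range(rounds):
        reply = generate(prompt)
        transcript.append(reply)
        prompt = reply        # the model responds to itself
    return transcript

# stand-in generator for illustration; a real one would sample from GPT-2
echo = lambda p: p + "!"
```

In practice the loop would also need a length cap and some repetition filtering, since language models fed their own output tend to drift or loop.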

iML Final Project Proposal – Alison Frank

Link to Presentation

As I am still a beginner when it comes to machine learning, I am looking to broaden my knowledge and try something that would challenge me. Due to my past projects, I have developed more interest in text generation and word vectors and am building on this knowledge for my final. Therefore, I am going to train a text generation model and create a web application called “AI Thoughts.” Ideally, I would like to create an interface where users can chat with the model I’ve trained and ask it questions.

To do this, I am going to use the ML5 text generator model and train it on data which I have found (more information can be found here). Previously, I tested an example made by Keras (link here), but the results were not good. While I could use a model provided by TensorFlow, I am unsure how to connect such a model to JavaScript. I could figure that out, but I would rather test the ML5 model first, as it is the most streamlined option. The ML5 model also utilizes LSTMs (long short-term memory layers), which are a crucial piece of RNNs when working with text. In terms of datasets, I found one based on New York Times articles that has been shown to give interesting results when paired with text generation. Along with this, I have found others based on user reviews and books. If the dataset I use turns out not to be large enough, I will try to find one that gives a more accurate result.
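One detail worth understanding before training: char-RNN-style generators like the ML5 model don't just take the most likely next character; they sample from the model's output distribution, usually re-weighted by a "temperature" setting. A minimal Python sketch of that sampling step (standalone, not ML5's actual internals):

```python
import math, random

def sample_with_temperature(probs, temperature=1.0, rng=random):
    """Re-weight a model's next-character distribution by temperature
    and sample an index. Low temperature -> conservative, repetitive
    text; high temperature -> more surprising (and more error-prone)."""
    logits = [math.log(p + 1e-12) / temperature for p in probs]
    m = max(logits)                       # subtract max for numeric stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    r = rng.random()
    cum = 0.0
    for i, w in enumerate(weights):
        cum += w
        if r < cum:
            return i
    return len(weights) - 1
```

Tuning this one knob is often the difference between output that parrots the training text and output that reads as creative, so it is worth experimenting with once the model is trained.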

The purpose of this project is mainly to give me a better understanding of machine learning and its implementations. I feel that this project will be very exploratory and leave lots of room for new knowledge. My interest in word vectors and text generation comes from the idea that language is essential to humans, and I find it interesting to see how it is processed by a machine. Word vectors are already used for translation, and text generation is used in many different areas, so both have many practical uses. As I become more acquainted with machine learning techniques, I hope to build upon my previous work and form something new and more advanced. But for now, I am still learning.

Week 11: Final Proposal

For my final project, I want to expand upon the chatbot I created for the midterm. The bot I introduced then was programmed to simulate a patient with a speech impediment; this time, I want to create a model that allows the chatbot to talk naturally. To take this a step further, I will create two chatbots with differing speech styles and have them converse with each other. In essence, my final project will be a conversation between two basic AIs that have been trained on different datasets.

Inspiration

As I have mentioned in previous posts, AI language processing is quite fascinating to me, and I really enjoy interacting with chatbots or watching chatbots interact with each other. I think the uncertainty of what the chatbot will produce is what draws me in the most: although generative language programs are made to simulate human speech as accurately as possible, the models often deviate from ‘normal’ conversational conventions, because language is such a complex thing for machines to learn.

Here is a video that really inspired me to go down this chatbot conversation path:

I thought that this interaction between the two bots was absolutely fascinating and entertaining to watch. In a way, their conversation reminds me of how toddlers talk with each other, with some existential questions thrown in for good measure. This is what I hope to achieve with my project.

Idea

To put this into simple terms, my final project idea is to create and train two chatbots with differing ‘personalities’ or speech styles, and have them converse with each other. Right now, I aim to have them interact in a purely text-based interface, but I may incorporate sound into this (depends on time and resources). 
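The conversation mechanic itself is simple to sketch. Here is a minimal, hypothetical Python turn-taking loop; the two `bot` arguments are stand-ins for whatever reply functions the trained models eventually expose:

```python
def converse(bot_a, bot_b, opener, turns=4):
    """Alternate two reply functions for a fixed number of turns.
    Each bot is any message -> reply function; a real version would
    wrap two separately trained chatbot models."""
    log = [("A", opener)]            # bot A speaks first
    message = opener
    speakers = [("B", bot_b), ("A", bot_a)]
    for turn in range(turns):
        name, bot = speakers[turn % 2]
        message = bot(message)       # reply to the last utterance
        log.append((name, message))
    return log
```

Whatever frameworks end up producing the replies, the project reduces to this loop plus a way to display (or later, speak) the transcript.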

Tools

I will be incorporating RASA NLU and RASA Core into my project to aid with natural language processing and grammar/syntax. Of course, I will need to delve deeper into RASA, but from my current understanding it is a powerful, open-source tool for structuring conversation in machine learning. Both RASA NLU and RASA Core are frameworks for creating bots such as those in online customer-service roles, where the user inputs a simple question or request and the bot attempts to satisfy it. The most important thing about RASA for me, however, is that the framework should let me give the chatbots natural, fluid speech patterns similar to a human’s. This is a tremendous step up from my midterm project, because fixing grammar and syntax was one of the largest hurdles I had to overcome.

Dataset

My choice of datasets is probably the most important aspect of my project (besides the structure of the model), because this is what will give my bots a ‘personality’. I plan on training each bot on one long novel, which will be turned into a word list and eventually a dictionary that acts as the chatbot’s vocabulary pool. Each bot will be trained on a different novel, so they will end up with contrasting speech styles. I have already picked out the novels (although this is subject to change).
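The novel-to-vocabulary step described above can be sketched in a few lines of Python (hypothetical helper names; a real pipeline would also handle punctuation and casing more carefully):

```python
import re
from collections import Counter

def build_vocab(novel_text, max_size=5000):
    """Turn a novel into the bot's vocabulary pool: a word list plus a
    dictionary mapping each word to an integer id, most frequent first."""
    words = re.findall(r"[a-z']+", novel_text.lower())
    counts = Counter(words)
    word_list = [w for w, _ in counts.most_common(max_size)]
    vocab = {w: i for i, w in enumerate(word_list)}
    return word_list, vocab
```

Running this once per novel gives each bot its own word list and id dictionary, which is exactly the contrast in vocabulary that should produce the two distinct ‘personalities’.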

Bot 1 Dataset:

I really liked how my midterm chatbot turned out (trained on War and Peace); it possessed a very old-fashioned, refined style of talking, which I want for my final as well. This is because the novel itself contains rather ‘outdated’ language due to its setting and time of publication.

Bot 2 Dataset:

I think that this will be especially interesting, because American Psycho is written in a first-person perspective of a psychopath. In a sense, I will be making Bot 1 into a more ‘posh’ or refined speaker, while Bot 2 will hopefully possess some qualities akin to that of a psycho killer. I feel it would be quite interesting to see these two personalities interact with each other. 

Goal

The ultimate goal for this project is to have a working, comprehensible conversation between the two chatbots. Currently, I foresee this happening within the terminal, but if everything goes according to plan, I also hope to add a few extra tweaks, such as a refined interface or different language options.