Week 11 – Final Topic Proposal

Final Project Proposal – GPT-2 Model

Link to Proposal

For my final project, I wasn’t sure what I wanted to do because I was split among three ideas for the proposal. The first was to train GPT-2, a language model released by OpenAI, and then build an interface where a user feeds in a prompt and the model continues it for a paragraph or two. This is the most feasible of the three and the most achievable by the due date, so I have decided to do it for the final project. The second idea was GST-Tacotron, a voice-synthesis model from Google that can be trained on a person’s voice and then make them say anything you type. Its results are very impressive, but I found out that it is particularly difficult to train and requires about three hours of recorded speech from the target speaker. The third idea was to present my own senior thesis, which is very machine-learning oriented; since it would be the culmination of a semester’s worth of work, I thought it was worth presenting. However, I couldn’t get permission from Professor Moon and Aven, and I most likely could not finish the thesis by this class’s deadline, so I ultimately abandoned that idea in favor of GPT-2.

Tools

I will mostly be using Keras for training the model. For compute power, I will most likely use the school’s cluster or the Intel server, and I may also experiment with a variety of models.
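To make this concrete, here is a rough sketch of what the fine-tuning step might look like using the gpt-2-simple wrapper around OpenAI’s released weights. This library choice is just my assumption for now (I may end up with a pure Keras setup instead), and corpus.txt is a placeholder for whatever dataset I end up building:

```python
# Rough fine-tuning sketch using the gpt-2-simple wrapper.
# Assumptions: gpt-2-simple is installed (pip install gpt-2-simple),
# and corpus.txt is a hypothetical placeholder for my own corpus.
import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="124M")  # fetch the small released GPT-2 weights

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="corpus.txt",    # plain-text training corpus
              model_name="124M",
              steps=1000)              # number of fine-tuning steps

# Continue a paragraph or two from a user-supplied prompt.
gpt2.generate(sess, prefix="The forum went quiet until", length=200)
```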

Datasets

I haven’t decided which dataset to use yet, but I will most likely try to build my own corpus instead of using one of the common ones – like the Shakespeare or movie-script corpora often used to train these models. Comments from forums or social media are particularly interesting to me, so I want to see if I can generate realistic comments or a discussion.
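As a first pass at corpus building, here is a small sketch of how I might flatten a dump of scraped comments into a single plain-text training file. The file comments.csv and its “text” column are hypothetical placeholders; the real source and cleaning steps are still undecided:

```python
# Sketch: turn a hypothetical dump of scraped comments (comments.csv,
# one comment per row in a "text" column) into a plain-text corpus.
import csv
import re

def clean(comment: str) -> str:
    comment = re.sub(r"https?://\S+", "", comment)  # strip URLs
    return re.sub(r"\s+", " ", comment).strip()     # collapse whitespace

with open("comments.csv", newline="", encoding="utf-8") as src, \
     open("corpus.txt", "w", encoding="utf-8") as dst:
    for row in csv.DictReader(src):
        text = clean(row["text"])
        if len(text) > 20:                # drop near-empty comments
            dst.write(text + "\n\n")      # blank line between comments
```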

Goal

My main goal for this project is to give users a realistic text-generation model so that they can produce convincing text from a prompt. I have always been interested in text generation, and after working with earlier models like char-rnn, I was very excited to see GPT-2 work so well. Perhaps by having the model feed itself prompts, it could generate enough novel content for an entire website. I hope to add more interaction in the spirit of IMA: for example, I can let it start its own dialogue, have it comment on that, and then have it continue off those comments. It will be very interesting to see the kind of output it can make. At this point, I am not fully aware of how GPT-2 even works; I know it is a variant of the transformer family of models built around attention mechanisms, which is rather complicated. I hope this project will give me further insight into text generation and machine learning models.
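To illustrate the self-feeding idea, here is a minimal sketch using the Hugging Face transformers pipeline with the stock “gpt2” weights (an assumption on my part; the final project may use a different interface). Each generated continuation becomes the seed for the next round:

```python
# Sketch: let the model feed itself prompts, so each continuation
# becomes the seed for the next round of generation.
# Assumes the Hugging Face transformers library and stock "gpt2" weights.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Has anyone else noticed how quiet the forum has been lately?"
for round_num in range(3):  # three rounds of self-dialogue
    result = generator(prompt, max_length=120, num_return_sequences=1)
    text = result[0]["generated_text"]
    print(f"--- round {round_num} ---\n{text}\n")
    prompt = text[-200:]    # feed the tail back in as the next prompt
```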
