Week 12: An.i.me – Final Project Proposal – Abdullah Zameek

For the final project, I wanted to experiment with generative art, since I felt there was no better time and place to try out a generative model firsthand. Throughout the semester, I kept using a recurring Pokémon theme in most of my projects because of how fond I am of the series, and I came across this article about machine-generated Pokémon.
This time, however, I wanted to do something a tad different, but along the same lines. So, I decided to bring in another of my all-time favorite interests – Anime.

Idea

We’ve all heard of Snapchat and its filters, which apply various kinds of special effects to your face. But I think we could go one step further than that with the help of generative models such as GANs.
I came across this paper describing a GAN model that generates anime characters, and it proved to be a great source of inspiration for my project. What if a given human face could be translated across domains, from reality into an animated face? After a bit of reading, it turned out that this exact application is doable with GAN models.

The project presentation is here.

Implementation

As I described in the presentation, I investigated two different models – pix2pix and CycleGAN. CycleGAN is the clear winner because it allows for unpaired image-to-image translation. This is highly desirable because a given anime-character dataset is not going to come with corresponding “human” face pairs; it allows a great deal of flexibility in building the model, since the anime character images and the human faces can be treated as independent collections.
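To make the unpaired-translation idea concrete, here is a minimal sketch of CycleGAN’s cycle-consistency loss in PyTorch. The 1x1 convolutions are toy stand-ins for the real generator networks, so this illustrates the loss term rather than a full training loop:

```python
# Sketch of CycleGAN's cycle-consistency loss (PyTorch).
# G maps human faces -> anime faces; F maps anime faces -> human faces.
# The 1x1 convs are toy stand-ins for the real generator networks.
import torch
import torch.nn as nn

G = nn.Conv2d(3, 3, kernel_size=1)  # human -> anime (stand-in)
F = nn.Conv2d(3, 3, kernel_size=1)  # anime -> human (stand-in)
l1 = nn.L1Loss()

human = torch.rand(8, 3, 64, 64)  # unpaired batch of human faces
anime = torch.rand(8, 3, 64, 64)  # unpaired batch of anime faces

# Cycle consistency: translating into the other domain and back
# should reconstruct the original image. This constraint is what
# removes the need for paired training examples.
cycle_loss = l1(F(G(human)), human) + l1(G(F(anime)), anime)

# The full CycleGAN objective adds adversarial losses from two
# discriminators and weights this term (lambda = 10 in the paper).
print(cycle_loss.item())
```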
One of the key papers in cross-domain translation is this paper published by Facebook AI, which tackles unsupervised cross-domain image generation.
Going forward, I haven’t homed in on a specific model yet, but there are some promising GAN variants out there, such as DRAGAN, PGGAN and, most notably, TwinGAN, which builds on PGGAN.
With regard to the dataset, there are once again multiple options out there, and while I will make a decision within the next few days, there are some strong contenders, such as TwinGAN’s Getchu dataset and the popular Danbooru dataset.

I’m very much inclined to go with the Getchu dataset and the TwinGAN model because of the ease of access. However, the resulting model is not directly compatible with ml5.js or p5.js, so there will be a bit of interfacing to tackle there; one possible bridge is sketched below.
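I haven’t settled on how that interfacing will work yet, but one plausible approach is to run the model in Python behind a small HTTP endpoint that a p5.js sketch can POST webcam frames to. Below is a rough Flask sketch of that bridge; `translate_face()` is a hypothetical placeholder for running the exported model, not TwinGAN’s actual API:

```python
# Rough sketch of a Python bridge between a GAN model and p5.js.
# translate_face() is a hypothetical placeholder for running the
# exported TwinGAN graph; it is NOT TwinGAN's actual API.
import io

from flask import Flask, request, send_file
from PIL import Image

app = Flask(__name__)

def translate_face(img: Image.Image) -> Image.Image:
    # Placeholder: load the frozen model once at startup and run
    # inference here. Identity transform keeps the sketch runnable.
    return img

@app.route("/anime", methods=["POST"])
def anime():
    # A p5.js sketch would POST a webcam frame as multipart form data.
    img = Image.open(request.files["face"].stream).convert("RGB")
    out = translate_face(img.resize((256, 256)))
    buf = io.BytesIO()
    out.save(buf, format="PNG")
    buf.seek(0)
    return send_file(buf, mimetype="image/png")

if __name__ == "__main__":
    app.run(port=5000)
```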

Goal

The final outcome can be thought of as a sort of “inference” engine: I’d input the image of a new human face and generate its corresponding animated face. Ultimately, by the end of this project, I want to get a better understanding of working with generative models, and to make something that’s amusing.
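If the Flask bridge sketched in the Implementation section pans out, that inference loop could be smoke-tested end-to-end with a few lines of Python (`face.jpg` here is just any test photo, and `/anime` is the hypothetical endpoint from that sketch):

```python
# Quick end-to-end test of the hypothetical /anime endpoint,
# assuming the Flask bridge above is running on localhost:5000.
import requests

with open("face.jpg", "rb") as f:  # any test photo of a human face
    resp = requests.post("http://localhost:5000/anime", files={"face": f})
resp.raise_for_status()

with open("anime_face.png", "wb") as out:
    out.write(resp.content)  # the generated anime-style face
```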
