For this final project, we have two ideas in mind. I will introduce them one by one:
Refine the CartoonGAN project
This idea continues our midterm project by adding two features: GIF conversion and foreground/background-only conversion.
In detail, the first feature requires unpacking a GIF into frames and piping the frames to the model in batches. The model then produces stylized frames, and our app packs them into a new GIF. Throughout this process, we need to pay extra attention to avoid running out of memory.
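The batching-to-avoid-OOM idea can be sketched as follows. This is a minimal outline, not the actual app code: `stylize_batch` is a hypothetical stand-in for the CartoonGAN forward pass, and frames are represented as plain strings; a real pipeline would decode the GIF with an image library and run the model on tensors.

```python
def chunked(seq, size):
    """Yield successive fixed-size chunks so that only one batch of
    frames (and its intermediate outputs) is in memory at a time."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def stylize_batch(frames):
    # Hypothetical stand-in for the CartoonGAN model call;
    # here it just tags each frame as stylized.
    return [("stylized", f) for f in frames]

def convert_gif(frames, batch_size=8):
    """Unpack -> stylize in small batches -> collect for repacking."""
    out = []
    for batch in chunked(frames, batch_size):
        # Only one batch is in flight at any moment, which is the
        # point of the memory precaution described above.
        out.extend(stylize_batch(batch))
    return out

frames = [f"frame{i}" for i in range(20)]
result = convert_gif(frames, batch_size=8)
```

The batch size would be tuned to the available GPU/browser memory; smaller batches trade throughput for a lower peak footprint.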
As for the second feature, I plan to combine the BodyPix model with our CartoonGAN model. To maximize computational efficiency, I will apply some WebGL tricks to composite the BodyPix and CartoonGAN outputs, so that we can deliver the best user experience.
Generative Art with Sentiment Analysis on Both Audio and Text
The idea is that text can carry certain emotions, and so can audio/speech. What's more interesting is that even for the same content, the text and the audio can convey different emotions. We can use this kind of conflict to make generative art.
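One simple way to turn that conflict into something drawable is to score each modality's sentiment on a common scale and measure their disagreement. The sketch below is purely illustrative: the valence scale ([-1, 1]) and the mapping to visual parameters (`jitter`, `hue_degrees`) are assumptions, and real sentiment scores would come from actual text/audio models.

```python
def conflict_score(text_valence, audio_valence):
    """Disagreement between the two modalities, in [0, 2].
    Valences are assumed to lie in [-1, 1] (negative .. positive)."""
    return abs(text_valence - audio_valence)

def art_params(text_valence, audio_valence):
    # Map the conflict onto hypothetical visual controls:
    # more conflict -> more jitter and a larger hue rotation.
    c = conflict_score(text_valence, audio_valence)
    return {"jitter": c / 2, "hue_degrees": 360 * (c / 2)}

# A sarcasm-like case: positive words delivered in a negative tone.
params = art_params(text_valence=0.8, audio_valence=-0.6)
```

When the two sentiments agree the piece stays calm; the more they diverge, the more agitated the visuals become, which makes the conflict itself the subject of the artwork.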
However, some problems remain. First, there is no mature real-time sentiment analysis model available yet; as an alternative, we could use facial-expression recognition.
The other question is how to visualize, or rather generate, the emotions in a way that attracts the audience. This seems to be a harder and more critical question than the first.