Website: https://cartoon.steins.live/
Recap
In the midterm, we implemented a CartoonGAN that runs in the browser, allowing users to upload or take a photo and transform it into a cartoon-like style. We trained two models: one in the style of Miyazaki and the other in the style of Aku no Hana.
Current Solution
Having done some experiments with generative art, we decided to continue the CartoonGAN project by adding three features: GIF transformation, foreground/background-only transformation, and exporting CartoonGAN as an ml5 function/model.
Gif transformation
To transform a gif into a cartooned gif in the browser, we first decode the gif into binary frame data, convert that data into tensors, and pipe the tensors through the model. We then reverse the process to re-encode the transformed frames into an animated gif.
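The per-frame conversion step can be sketched as below. This is not the project's actual code, and the normalization range is an assumption: CartoonGAN-style models typically expect RGB input scaled to [-1, 1], but the exact range depends on how the model was trained.

```javascript
// RGBA pixel data (e.g. a decoded gif frame) -> RGB floats in [-1, 1],
// the assumed input range of the model.
function rgbaToNormalized(rgba) {
  const pixels = rgba.length / 4;
  const out = new Float32Array(pixels * 3);
  for (let i = 0; i < pixels; i++) {
    out[i * 3] = rgba[i * 4] / 127.5 - 1;         // R
    out[i * 3 + 1] = rgba[i * 4 + 1] / 127.5 - 1; // G
    out[i * 3 + 2] = rgba[i * 4 + 2] / 127.5 - 1; // B
  }
  return out;
}

// Model output in [-1, 1] -> RGBA pixel data ready for gif re-encoding.
function normalizedToRgba(rgb) {
  const pixels = rgb.length / 3;
  const out = new Uint8ClampedArray(pixels * 4);
  for (let i = 0; i < pixels; i++) {
    out[i * 4] = Math.round((rgb[i * 3] + 1) * 127.5);
    out[i * 4 + 1] = Math.round((rgb[i * 3 + 1] + 1) * 127.5);
    out[i * 4 + 2] = Math.round((rgb[i * 3 + 2] + 1) * 127.5);
    out[i * 4 + 3] = 255; // gif frames are fully opaque
  }
  return out;
}
```

The decoding and encoding on either side of this step are handled by the gif libraries mentioned above.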
The main obstacle is the encoding and decoding itself. Since there is no modern gif encoding/decoding library for the browser, it took me a while to find suitable libraries, and they were still hard to work with.
Moreover, a gif is essentially a set of frames, and the transformation pipes each frame through the model. The model itself is large, however, and a gif with many frames can exhaust the browser's GPU memory. I also ran into a surprising performance result: with 16 frames in total, transforming them with a batch size of 4 takes much longer than with a batch size of 1, which was the opposite of what I expected.
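To keep memory bounded regardless of frame count, the frames can be fed to the model in fixed-size batches. A minimal sketch of that batching helper (the name `chunkFrames` is ours, not part of the project):

```javascript
// Split decoded gif frames into batches of at most `batchSize` frames,
// so only one batch is resident on the GPU at a time.
function chunkFrames(frames, batchSize) {
  if (batchSize < 1) throw new RangeError('batchSize must be >= 1');
  const batches = [];
  for (let i = 0; i < frames.length; i += batchSize) {
    batches.push(frames.slice(i, i + batchSize));
  }
  return batches;
}
```

With 16 frames, a batch size of 4 yields 4 model invocations and a batch size of 1 yields 16; the surprise noted above is that the 4 larger invocations ended up slower in practice.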
Below is a demo of the resulting gif:
Foreground/Background Transformation
This feature is built on the BodyPix model: we use BodyPix to obtain masks for the foreground and background, then apply those masks to the transformed images.
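The compositing step can be sketched as follows. It assumes we already have a per-pixel segmentation mask from BodyPix (1 for a person/foreground pixel, 0 for background) plus the original and cartoonized images at the same resolution; the function name and `cartoonForeground` flag are ours, for illustration only.

```javascript
// Blend the original and cartoonized RGBA images using a binary mask.
// When cartoonForeground is true, foreground pixels come from the cartoon
// image and background pixels stay original; false swaps the two.
function composite(mask, original, cartoon, cartoonForeground = true) {
  const out = new Uint8ClampedArray(original.length);
  for (let p = 0; p < mask.length; p++) {
    const useCartoon = cartoonForeground ? mask[p] === 1 : mask[p] === 0;
    const src = useCartoon ? cartoon : original;
    for (let c = 0; c < 4; c++) {
      out[p * 4 + c] = src[p * 4 + c]; // copy RGBA for this pixel
    }
  }
  return out;
}
```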
Here I ran into one major pitfall with TensorFlow.js: the most recent BodyPix model requires the correspondingly recent TensorFlow.js dependency. Otherwise, the results will not match at all.
Below is a set of demos of the model.
I believe this will create an immersive experience that places users into a cartoon scene seamlessly.
ml5 wrapper
This work wraps the model into ml5 and aligns its output with the ml5 standard. We also need to provide tests, examples, and documentation so that others can use the model easily.
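The API shape we are targeting can be sketched as below. The names here are placeholders, not the final ml5 API; the one convention we rely on is that ml5 methods conventionally accept an optional callback and also return a Promise, so the wrapper supports both styles. The actual CartoonGAN inference is stood in for by an injected `transformFn`.

```javascript
// Hypothetical ml5-style wrapper: transformFn does the real inference;
// transform() supports both Node-style callbacks and Promises.
function cartoonGAN(transformFn) {
  return {
    transform(input, callback) {
      const result = Promise.resolve().then(() => transformFn(input));
      if (typeof callback === 'function') {
        result.then(
          (out) => callback(null, out),
          (err) => callback(err)
        );
      }
      return result;
    },
  };
}
```

A caller could then write either `await model.transform(img)` or `model.transform(img, (err, out) => { ... })`, matching the dual style of other ml5 models.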