CartoonGAN on the Web | Final Documentation – Casey & Eric

Logistics

Group Members  –  Casey, Eric

Github Repo  –  CartoonGAN-Application

Previous Report  –   Midterm Documentation

Proposal  –  Final Proposal

GIF samples: Original – Chihiro – Paprika – Hayao

Background

Motivation

For this stage of the project, we want to further refine our work so far, including 1) the web interface/API for the CartoonGAN models and their functionality; and 2) the web application built on CartoonGAN, which gains more layers of interaction and possibility through the new features we planned for it.

We sincerely hope that through this refining work, CartoonGAN can finally become a powerful and playful tool for learners, educators, artists, and technicians, so that our contribution to the ml5 library truly helps others and sparks more creativity in this fascinating realm.

Methodology & Experiments

Gif Transformation

Developing GIF transformation for a web application turned out to be more demanding than we imagined. Because there aren't many efficient, modern GIF encoding/decoding libraries, my partner, who worked on this functionality, went through quite some effort to find usable libraries for handling GIFs in our application.

*This could be a potential direction for future contributions.

On the front end, we implemented a simple but effective piping mechanism that recognizes the type of input the user uploaded and triggers the corresponding processing strategy.
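A minimal sketch of what such a piping step can look like in plain JavaScript is shown below; the handler names (handleGif, handleStaticImage) are illustrative placeholders, not our actual code.

```javascript
// Minimal sketch of the input-piping idea (handler names are hypothetical):
// inspect the uploaded file's MIME type and route it to the matching pipeline.
const fileInput = document.querySelector('#upload');

fileInput.addEventListener('change', (event) => {
  const file = event.target.files[0];
  if (!file) return;

  if (file.type === 'image/gif') {
    // animated input: decode frames, transform each, re-encode
    handleGif(file);
  } else if (file.type.startsWith('image/')) {
    // static input: run CartoonGAN on the single image
    handleStaticImage(file);
  } else {
    console.warn('Unsupported file type:', file.type);
  }
});
```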

Demo gif outputs:

GIF samples (Trump), one per style:

Styles: Original – Chihiro – Shinkai – Paprika – Hosoda – Hayao

Some experiments:

This cyberpunk kitty was recorded during one of our experiments with GIF transformation. As shown in the video, the transformed output (original style to Miyazaki's Chihiro style) is glitchy because of a single lost frame. This likely results from issues with GIF encoding and decoding in our web application, since we currently process GIFs in the following way:

GIF  ➡️  binary data ➡️  tensor ➡️  MODEL ➡️  tensor ➡️  binary data  ➡️  GIF
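Roughly, that pipeline could be expressed as below with TensorFlow.js; decodeGif, encodeGif, and cartoonModel are hypothetical stand-ins for whichever GIF library and model wrapper are used, and the [-1, 1] normalization is an assumption, not our exact preprocessing.

```javascript
import * as tf from '@tensorflow/tfjs';

// Sketch of the GIF pipeline above (decodeGif/encodeGif and cartoonModel are
// hypothetical stand-ins for the GIF library and model wrapper actually used).
async function cartoonizeGif(gifBinary, cartoonModel) {
  const frames = await decodeGif(gifBinary);        // GIF -> array of ImageData frames
  const outFrames = [];

  for (const frame of frames) {
    const out = tf.tidy(() => {
      const input = tf.browser.fromPixels(frame)    // frame -> tensor
        .toFloat()
        .div(127.5).sub(1)                          // assumed normalization to [-1, 1]
        .expandDims(0);
      return cartoonModel.predict(input)            // MODEL
        .squeeze()
        .add(1).mul(127.5)                          // back to [0, 255]
        .clipByValue(0, 255)
        .toInt();
    });
    outFrames.push(await tf.browser.toPixels(out)); // tensor -> pixel data
    out.dispose();
  }

  return encodeGif(outFrames);                      // pixel data -> GIF
}
```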

Therefore, encoding issues can largely affect the final outcome. This is a problem that needs to be looked into in the future.

Foreground/Background Transformation

Foreground/background transformation is one of our biggest feature updates to the CartoonGAN web application.

The main methodology behind this feature is to use BodyPix to separate humans from their background and use the result as a mask for the input image. The mask is then used to manipulate the image's pixel data, so that cartoonization can be applied to the foreground, the background, or both, depending on the user's choice.
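A simplified sketch of this masking idea using the BodyPix API is shown below; the compositing logic is illustrative only and assumes the original and cartoonized images are same-sized ImageData objects.

```javascript
import * as bodyPix from '@tensorflow-models/body-pix';

// Sketch of the masking idea: BodyPix marks each pixel as person (1) or
// background (0), and that mask decides which pixels take the cartoonized colors.
async function composite(original, cartoonized, mode /* 'foreground' | 'background' */) {
  const net = await bodyPix.load();
  const segmentation = await net.segmentPerson(original);

  const out = new ImageData(original.width, original.height);
  for (let i = 0; i < segmentation.data.length; i++) {
    const isPerson = segmentation.data[i] === 1;
    const useCartoon = (mode === 'foreground') ? isPerson : !isPerson;
    const src = useCartoon ? cartoonized : original;
    for (let c = 0; c < 4; c++) {
      out.data[i * 4 + c] = src.data[i * 4 + c]; // copy RGBA channel
    }
  }
  return out;
}
```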

We hope this takes the user experience to another level: users can see themselves in the cartoon world of their choice, either by turning themselves into a cartoonized character or by turning their surroundings into a fusion of reality and fantasy.

Demo foreground/background outputs:

Foreground –

Sample A – input / output

Sample C – input / output

Background –

Sample B – input / output 1

Sample B – output 2

Social Impact 

ml5 library

We wrapped CartoonGAN into an ml5 module and submitted a pull request to merge our work into the ml5 library.

Pull Request

Ml5 Pull Request Screenshot

The reason we included this as part of our project goal is that we hope our work becomes a real contribution to the creative world out there. Machine learning in the browser is still a relatively new and emerging field; the more work and attention it receives, the faster it will grow. Though I am a newbie myself, I really hope that my efforts and contributions can help ml5 grow into an amazing collection of tools for the brilliant, innovative minds in this realm.

Further Development

There is still work to be done and room for improvement before this project fully meets our expectations.

On the web application side, GIF transformation is still relatively slow and buggy due to the lack of adequate tools for working with GIFs in the browser. We did our best to work around these issues, but we still want to look into potential improvements, and maybe even new issues we could contribute to.

The CartoonGAN ml5 module is still a work in progress. Although we have the bare bones ready, more work is needed. We are currently building tests, examples, guides, and documentation for the library, and design-wise we still need to improve aspects like error and corner-case handling, image encoding, and support for other input formats. These are all necessary for CartoonGAN to become the easy-to-use, practical library we ultimately hope for.

AI Arts Final Project: A-Imitation — Crystal Liu

Inspiration

The inspiration for this project came about in a rather silly way. At first, I didn't have any ideas for my final project, so I just browsed random things on the Internet. Then I found this picture:


After clicking through to learn more about it, I found that there were many interesting and weird spoof paintings on the Internet. These paintings suggested that I could build a project that lets people add their creative ideas and their own characteristics to well-known paintings. I also noticed that people tend to imitate the signature poses of the figures in paintings such as the Mona Lisa and The Scream. The connection between motion or poses and visual communication reminded me of a project called Move Mirror. Here is a demonstration of it.

I really liked that connection and wanted to apply the idea to my project. In my case, if the user imitates the pose of a figure in a painting, a corresponding painting appears beside the canvas to show which painting the machine thinks they are imitating.

The last inspiration is the artistic filters in Beauty Camera, which transform the original camera image into an oil-painting style. I wanted to use a style transfer model to achieve a similar effect.


My Project

First, there are six paintings at the top of the web page that serve as references. The user clicks the “Load Dataset” button and then the “Start Predicting” button to start the project. If the user imitates a pose correctly, the matching painting appears on the right side of the canvas. Also, pressing the spacebar transfers the video into that painting's style. You can see a brief demo video through this link:

https://drive.google.com/file/d/120leTwPVZvnTRMPI0Y-0_3AK9wFFSrUC/view?usp=sharing

Methodology & Difficulties

I used KNN, PoseNet, and style transfer to build my final project. I had already used KNN and PoseNet in my midterm project, so the logic is similar: I just needed to define the classification results and the corresponding outputs. In this case, the output is the picture of the painting and its style. For the style transfer, I referred to Aven's example on multi-style transfer and used an array to decide which style to display.
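A simplified sketch of this flow with ml5.js is shown below; the painting labels, file paths, and the showReference() helper are hypothetical placeholders, not my actual code.

```javascript
// Simplified sketch of the KNN + PoseNet + style-transfer flow in ml5.js.
// Painting names, file paths, and showReference() are hypothetical placeholders.
let video, poseNet, knn, currentPose;
const paintings = [
  { label: 'MonaLisa',  img: 'images/monalisa.jpg', style: ml5.styleTransfer('models/monalisa') },
  { label: 'TheScream', img: 'images/scream.jpg',   style: ml5.styleTransfer('models/scream') },
];

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.hide();
  knn = ml5.KNNClassifier();
  // knn.load('myKNNDataset.json');  // the "Load Dataset" button would call something like this
  poseNet = ml5.poseNet(video, () => console.log('PoseNet ready'));
  poseNet.on('pose', (poses) => {
    if (poses.length > 0) {
      // flatten the keypoint positions into one feature vector for KNN
      currentPose = poses[0].pose.keypoints.map((k) => [k.position.x, k.position.y]).flat();
    }
  });
}

// Called repeatedly while "Start Predicting" is active.
function classifyPose() {
  if (!currentPose) return;
  knn.classify(currentPose, (err, result) => {
    if (err) return console.error(err);
    const match = paintings.find((p) => p.label === result.label);
    if (match) showReference(match); // show the painting card and switch to its style model
  });
}
```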

One of the difficulties was the logic of my project. At first, my plan was to let the user take a screenshot, with the style transfer result shown on that image. However, I failed to take a screenshot directly; the only thing I could do was use saveCanvas() to download the screenshot, and the outcome was not very good. Thus, I gave that up and instead showed the change through the live video rather than photos, which worked well.

The other problem is the outcome of style transfer models. I chose a lot of famous paintings as follows:

Meisje met de parel (Girl with a Pearl Earring)

   

However, the results of the trained models fell short of my expectations. Some of them were quite similar to each other, so I abandoned those and found different ones. These are some results of the models.

   

According to Professor Aven, this was because some of my images didn't have vivid shapes and colors, so the model couldn't produce a style with strong features. I followed his suggestion to choose a part of the image with bright colors and a strong sense of geometry instead of the whole image, and the result was better than before. For example, I used only a part of the Van Gogh portrait to train the model.

What's more, I learned how to train several models at the same time. Before, I only changed the input file but didn't create new checkpoint and model folders, so I could only get one model, because the latest one overwrote the previous ones. Now I know how to create separate checkpoint and model folders and download multiple models.

The last step was to beautify the UI. I chose a simple but classical image as the background to fit the feel of a gallery. Then I kept only the “Start Predicting” and “Load Dataset” buttons and changed their default style to a more artistic one. Next, I made some cards with the paintings and their basic information, and placed them at the top of the web page as a reference for the users.

Significance & Further development 

I have noticed that building this project helped me learn more about the paintings, especially their names and creators. Therefore, the significance of my project could be teaching users some basic information about famous paintings. For further development, I want to enhance the educational function of my project. If the reference cards contain more information, such as the background or style of the paintings, users will learn more about the paintings in a relatively interesting way. However, it's essential to figure out how to draw the user's attention to the references. My idea is to add some text bubbles to the paintings so that it seems like the figure in the painting is telling the information to the user. Also, I can add some sound files to my project to enrich the forms of interaction. In addition, I plan to increase the types and the number of paintings to enrich both the input and the output.

Final Project Documentation

Link to final presentation slides: https://drive.google.com/file/d/1vmSHhVKNERSHDP3kohUbkKPy6XJEMXsV/view?usp=sharing

Link to final project posters: https://drive.google.com/file/d/15ZaGm7wcuEwJYs4QItOdXVyZF-wcR_OH/view

Link to final project video: https://www.youtube.com/watch?v=dNwHWeazyxk

Link to concept presentation slides: https://drive.google.com/file/d/1jgxeo-knGx7nLrnWBmPZYLOIwz4mdJ8k/view?usp=sharing

Background

For my final project, I’ll be exploring abstract art with Deepdream in two different mediums, video and print. I plan on creating a series of images (printed HD onto posters to allow viewers to focus on the detail) and a zooming Deepdream video. I’ll create the original images with digital tools, creating abstract designs, and then push them one step (or several) further into the abstract with the Deepdream algorithm.

I love creating digital art with tools such as Photoshop, Illustrator, datamoshing, and code such as p5.js; however, I've realized that there are limitations to these tools in terms of how abstract and deconstructed an image can get. I've also been interested in Deepdream styles and their artistic possibilities for a while, and I love the infinite ways Deepdream can transform images and video. In the first week, I presented Deepdream as my case study, using different styles of the Deep Dream Generator tool to transform my photos.

Case study slides: https://drive.google.com/file/d/1hXeGpJuCXjlElFr1kn5yZVW63Qcd8V5x/view

I would love to take this exploration to the next level by combining my interest in abstract, digital art with the tools we’ve learned in this course.

Motivation

I'm very interested in playing with the amount of control digital tools give me to create: Photoshop and Illustrator give me the most control over the output of the image, coding lets me randomize certain aspects and generate new designs on its own, and datamoshing simply takes in certain controls to "destroy" files on its own, generating glitchy images.

However, Deepdream takes away almost all of this predictability and control. While you can set certain guidelines, such as the "layer" or style, the octaves (the number of times Deepdream goes over the image), the iterations, and the strength, it is impossible to predict what the algorithm will "see" and produce in the image, creating completely unexpected results that would be nearly impossible to achieve in digital editing tools.

References

I’m very inspired by this Deepdream exploration by artist Memo Akten, capturing the eeriness and infinity of Deepdream: https://vimeo.com/132462576

His article (https://medium.com/@memoakten/deepdream-is-blowing-my-mind-6a2c8669c698) explains in depth his fascination with Deepdream, something I share. As Akten writes, while the psychedelic aesthetic itself is mesmerizing, “the poetry behind the scenes is blowing my mind.” Akten details the process of Deepdream: the neural network recognizes various aspects of the reference image based on its previous training and confirms them by choosing a group of neurons and modifying “the input image such that it amplifies the activity in that neuron group,” allowing it to “see” more of what it recognizes. Akten's own interest comes from how we perceive these images: as we recognize more in these Deepdream images, seeing dogs, birds, swirls, etc., we viewers are doing the same thing as the neural network, reading deeper into the images. This allows us to work with the neural network to recognize and confirm the images it has found, creating a cycle that requires both AI technology and human interference: humans set the guidelines, direct the network to amplify its activity, and perceive the modified images. Essentially, Deepdream is a collaboration between AI and humans interfering with the system to see deeper into images and produce something together.
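In rough terms (my own notation, not from Akten's article), that amplification step is gradient ascent on the chosen layer's activations:

$$x \;\leftarrow\; x + \eta \, \nabla_x \,\big\| a_\ell(x) \big\|^2$$

where $a_\ell(x)$ is the activation of the chosen neuron group for the input image $x$ and $\eta$ is a step size; repeating the update makes the image show more of whatever that layer responds to.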

Experiments

I started by playing around with different images I had already created in code and with Illustrator and Photoshop using Alex Mordvintsev’s Deepdream code to find the settings I wanted to use. While at first I went all out with the strength and iteration settings, I soon realized that the images became too obscured and I wanted the final result to be an abstraction of an existing image, not a completely indiscernible abstraction.

Here’s an image created with the 5a layer on high strength with many iterations:

After researching what each setting related to and refining my process, I settled on my favorite settings in Deepdream, such as these settings that I used for the video:

Then I created my own images for the posters and video, using what I learned from my experiments. I realized that images with a lot of white space often had less interesting results and use of color, such as this image of circles created with P5:

And these results, which didn’t achieve the style I was going for:

Therefore, I focused on creating images with a full canvas of color and/or detail to get the most interesting abstract results. I settled on these 5 images to create the Deepdream posters.

This image of “Juju's,” created in p5, presented an interesting pattern and enough detail across the canvas for Deepdream abstraction:

Here it is after iteration on the 5a layer:

After editing the image to turn it black and white for the final result:

I really liked this result because the black and white filter applied in Photoshop combined with Deepdream added texture and depth, even though the original image was 2D. 

The next image I created was this one, beginning from a photograph and using Illustrator. Using the trace image and merge functions, I separated details of one photograph into the top and bottom of the canvas, applying shapes and letters to add detail.

Using the layer 3b I achieved this result:

I like this one because the abstraction still preserves the “dual reality” I created in the first image in Illustrator, adding fascinating detail and depth.

For the next one, I started with this photo of my old laptop’s lid with stickers on it:

I then used Photoshop to distort the image:

In Illustrator I traced, expanded, ungrouped and moved around various parts to abstract it further:

Then, using the 3b layer in Deepdream, I created this image, with some connection to reality but only if the previous progressions are viewed together:

For this image, I took a photo of a pattern on my pants and used Photoshop to edit the colors and distort the “pattern”:

I then used layer 3a to create this stunning image with new colors and seemingly infinite detail:

Finally, I created this image in p5 using beginShape and Perlin noise to randomize the drawing of the shapes and the colors:
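A rough p5 sketch of that technique (not my original code) might look like this, with Perlin noise driving both the vertex positions and the fill colors of each shape:

```javascript
// Rough sketch of the technique described above (not the original code):
// noise() drives both the vertex positions and the fill color of each shape.
function setup() {
  createCanvas(800, 800);
  noLoop();
  background(20);
}

function draw() {
  for (let i = 0; i < 40; i++) {
    const t = i * 0.1;
    // noise-based color, sampled at offset positions so channels differ
    fill(noise(t) * 255, noise(t + 100) * 255, noise(t + 200) * 255, 180);
    noStroke();
    beginShape();
    for (let j = 0; j < 10; j++) {
      const x = noise(t, j * 0.3) * width;
      const y = noise(t + 50, j * 0.3) * height;
      vertex(x, y);
    }
    endShape(CLOSE);
  }
}
```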

Using layer 3a with a lower strength to highlight the rectangle lines in the original, I created this final image:

A link to all my final images is at the top of this post. To create the posters in A3, I formatted the images in Illustrator, adding quotes to highlight my view of my relationship with Deepdream and creation throughout the process. While I originally viewed this project as giving up control to Deepdream to abstract images further, I ended up regaining that control through editing photos and adjusting Deepdream settings to achieve the level of abstraction I wanted. This made me view the process more as a collaboration, using Deepdream for my specific artistic purposes instead of giving it control over my images.

Social Impact

I think exploring Deepdream is a great case study in the societal impact of AI technology, especially when comparing Deepdream's original purpose with its artistic applications. Deepdream was originally developed to probe how Google's image-recognition networks perceive the content of images, but its developers realized they could create their own psychedelic imagery by increasing the strength, iterations, and octaves and by pulling images from different layers of the neural network, revealing the algorithm's process of enhancing and "seeing deeper" inside a given image. As AI is developed for the practical purposes of improving companies' performance and governments' efficiency, I foresee artists continuing to explore the artistic applications of various AI developments by tweaking the software for their own uses. This shows that as capitalism monetizes technological functions, humans will continue to see the beauty and aesthetics in those functions. Deepdream alone has much more potential in artistic applications as artists continue to explore its uses.

Further Development

One way I plan to develop this project further is through VJing with Deepdream videos. After creating my zooming Deepdream video set to music, I realized that it had great potential to be used as a video DJing tool accompanying techno music. I also received this feedback from friends I showed the video to who thought the pairing with music added a new element to the music. This winter I plan on throwing some parties with my friends who are DJs, and I’m planning on exploring how to import Deepdream style videos into VJing software to allow manipulation of the videos in real time with music. I liked using video editing techniques and manipulating the speed, colors (such as inverting the colors in time with the rhythm) and zoom to accompany the music and I see a lot more potential in this area.

I also plan on continuing my Deepdream poster series as a personal project since I love the results and I have many more images I’ve created with code and digital editing tools I would love to explore more. I plan on creating a new page on my portfolio website showcasing my AI art and potentially collaborating with other artists to use Deepdream in further projects.  

Week 14: Final Project Documentation by Jonghyun Jee

Slides for the finals can be viewed here.

Project Title: Object-Oriented Art

Background

I have recently become interested in the nascent philosophical movement known as Object-Oriented Ontology (usually called by the acronym OOO), which has attracted a lot of attention from the arts and humanities scene. In short, OOO rejects the idea of human specialness: we should not place the privilege of human existence over the existence of nonhuman objects. Working on my AI Arts final project, I convinced myself that my works, in part, reflect the idea of OOO, so I named this series “Object-Oriented Art.” In contrast to mainstream phenomenology, which presumes things are only real insofar as they are sensible to human conception, OOO claims that things do exist beyond the realm of human cognition. In his article, Dylan Kerr lists some examples of questions posed by OOO artists: “what does your toaster want? How about your dog? Or the bacteria in your gut? What about the pixels on the screen you’re reading off now—how is their day going? In other words, do things, animals, and other non-human entities experience their existence in a way that lies outside our own species-centric definition of consciousness?”

One of the main criticisms against OOO is that it is simply impossible for us to withdraw from human perception. Wondering about how the day of the pixels on my screen might be going is fascinatingly imaginative; and yet, that idea itself is still too human-centered—it is nothing more than applying human-only notions to other nonhuman objects. An interesting parallel can be found in the emerging field of AI arts: the controversy over whether the credit for AI-generated artworks goes to the AI. As of now, I think, all the algorithmic artworks in the world are still the brainchildren of humans. Humans programmed the code, collected the data, and fed the algorithms these data to create a piece of art. Unless AI does the same process by itself, without any human involvement, I think AI is no more than an art tool like a brush. For my project, I put the emphasis on the possibilities of AI as an art tool—algorithms as my brush, data as my paint.

Motivation

To put it in a nutshell, my project feeds a number of algorithms my own sketches in order to visualize my ideas and impressions. Below are my original drawings:

The chameleon is a living example of “style transfer,” which I used primarily to color and morph my drawings.

This is my self-portrait. 

And this is a frog skeleton. When I was very young, probably 3 or 4 years old, I was wandering around my house and found this tiny frog skeleton. I stared at it for quite a while and knew right away that it was not alive anymore. It was my first encounter with the idea of death.

Let’s see how AI spices these drawings up!

Methodology

If I fed the algorithms my raw sketches, the results would be disappointing: since the backgrounds are just plain white, the AI would fill the blank space with dull, redundant patterns. So I had to do a sort of “biscuit firing” by adding some colors and patterns first. I used a tool called “Painnt” to apply thin styles to my drawings.

The next step was to choose the data. For the chameleon, I wanted to visualize a future chameleon surrounded by human-caused environmental pollution.

I combined my drawing with an image of plastic waste using “DeepStyle,” powered by Google. DeepStyle lets the user easily apply a style transfer effect to an image; it usually takes about five minutes to train and yield a result. The generated result is already pretty interesting, but the distinction between the object and the background is not very evident.

So I repeated the same process with a different image of plastic waste. You can see how the sky and shadow of the right image are partially shown in the generated result. 

Using Photoshop, I combined the two results together and got this final image. However, I needed AI’s help once more.

The resolution of this image is 775×775, which is not ideal for printing. When I zoomed into the chameleon's arm, the image was visibly pixelated. I used an AI image upscaler to enhance the resolution to 3100×3100.

I repeated the same process for my other sketches as well.

The other chameleon, combined with images of a forest fire.

The frog skeleton, combined with a picture of frog eggs. In so doing, I tried to blur the line between life and death, drawing and photography.

My self-portrait, combined with an image of Dancheong, the Korean decorative patterning used on traditional wooden buildings, especially temples. I chose Dancheong as my style input because it is a symbolic representation of my cultural background (Korea & Buddhism).

Conclusion

I intended to focus on the effectiveness of AI as an art tool, especially for creating a piece of fine art. Using traditional art mediums such as paint and ink is not only time-consuming but mostly irreversible; we cannot simply press CTRL+Z on a canvas. When I create an artwork, the biggest obstacle has always been my lack of technique; my enthusiasm cooled off whenever I could not visualize my thoughts, ideas, and impressions the way I had envisioned. The AI tools I learned during the class, in this sense, could fill in the technical gaps of my art experiments. After using AI to color and morph my drawings, I printed out the generated results and juxtaposed my original sketches with the AI-modified versions to show the process of how AI spiced up my raw ideas.

One remarkable thing I noticed during this project is that AI art also requires a sort of “technique.” I had to choose tools and data appropriate for visualizing my ideas, adjust the parameters, and manipulate my data (sketches and photos) to yield more satisfying results. Some may think AI artwork is just a click away, but I think it requires as much inspiration and consideration as traditional art mediums do. I would like to continue my art experiments with the tools I learned in this course, and explore more of the possibilities of artificial intelligence and computer vision for creating artworks. Huge thanks to Aven, who spared no pains to help us learn the inside scoop of AI Arts, and to all the classmates who gave me extremely valuable feedback.

Week 14: AI Arts Final Project

Link to video: https://youtu.be/X-HujM0LWVg

Background: 

Without a doubt, the entertainment industry is a big part of everyone's lives. As we become more and more connected to the larger digital landscape, forms of media such as movies and music will stimulate our imagination and inspire us even more. Many people reference pop culture in their everyday lives, and I'm no different. As a fan of the sci-fi genre, I have always imagined nighttime Shanghai to be incredibly breathtaking. As I listen to synthwave and look outside my window, I imagine Shanghai as a city from a movie such as TRON or Blade Runner.

One of the reasons for this is that gentrification has become a topic among the citizens of the city. The city has been gentrified to create space for shopping malls and high-rise buildings, not to mention that the Chinese government has started to experiment with facial recognition technology and social credit scores. Although the development of technology is extremely important for society as a whole, it is also reminiscent of dystopian science fiction such as 1984 by George Orwell and Blade Runner. Synthwave and the cyberpunk genre depict what the future of society could be if technological development continues without regard for its impact on humanity. So in a way, the scenery and aesthetics of the genre are extremely beautiful, but only on the surface: if one looks past the eerie beauty, one finds that it is not as perfect as it seems.

Motivation:

I wanted to create a video project using the style transfer model we were given, training it with a series of different images. The images themselves were cityscapes reimagined as futuristic. An example is the image at the top, a frame pulled from one of the videos I incorporated into my project; the style was transferred from a TRON landscape I pulled from the internet. For a portion of the class we focused on style transfer, and it really interested me. Is it truly art if you are just imitating the style of something and applying it to something else? If not, then what could be the purpose?

My motivation was to explore our newfound ability to reproduce the styles of popular media through machine learning, and to produce a video that displays, sonically and visually, what I see in Shanghai's nighttime. Furthermore, I wanted to explore how style transfer can interact with media and help shape it. In this case: “Can style transfer be used for a music video, to invoke the same feelings or perceptions as the style being transferred?”

Methodology:

The methodology for the project centers on style transfer. The important part is training the model with pictures that are most representative of the cyberpunk genre. The images look something like this:

These images were all pulled off the internet by searching “cyberpunk cityscape.” But just transferring images was not enough; I wanted to create videos indicative of my vision for Shanghai. So Adobe Premiere, VLC, Logic Pro for the music, and Adobe After Effects were all involved in the process of making the video. Devcloud also played an important role in allowing me to complete most of my training and conversion in a manageable time.

Experiments:

First Iteration:

For my very first iteration of the project, I ended up creating what was essentially a lo-fi video. I trained the models with the original three photos that I added to this post. The results were decent, as you can see from the examples below:

For the first time, I was converting an image into multiple different styles that I had trained myself. I thought it was amazing! However, it was now time to find a way to convert my videos into individual frames that would ideally result in a high-fps output video. After consulting with the IMA faculty present at the time, I ended up going with saveCanvas().

This allowed me to save the frames being played from the style-transferred video. The issue with this technique was that it resulted in a very low-fps video that felt extremely glitchy. While the frames were saving, it also saved frames that were stuck mid-transfer, so for some videos I ended up with more than 500 frames of just static imagery. The quality of the style transfer itself didn't help much either. I later added the frames into Adobe Premiere and turned them into a whole video. I then wanted the video to invoke the feelings of the cyberpunk genre, so I downloaded a song from the internet called “Low Earth Orbit” by Mike Noise. The end result was a video that fundamentally did what I wanted, but at a much lower quality than I needed.
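For reference, the saveCanvas approach looks roughly like this in p5.js with ml5's style transfer; the model path and frame naming here are illustrative, not my original code.

```javascript
// Illustrative p5.js sketch of the frame-dumping approach: draw the latest
// style-transferred frame and save the canvas on every draw() call.
// The model path and frame naming are assumptions, not the original code.
let video, style, resultImg, frameIndex = 0;

function setup() {
  createCanvas(480, 360);
  video = createCapture(VIDEO);
  video.hide();
  style = ml5.styleTransfer('models/cyberpunk', () => transferFrame());
}

function transferFrame() {
  style.transfer(video, (err, result) => {
    if (!err) resultImg = createImg(result.src).hide();
    transferFrame(); // keep transferring the next frame
  });
}

function draw() {
  if (resultImg) {
    image(resultImg, 0, 0, width, height);
    saveCanvas('frame-' + nf(frameIndex++, 4), 'png'); // one PNG per drawn frame
  }
}
```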

Second Iteration

For the second iteration, I took into account the advice the critics gave me and incorporated it into my project. But more than that, I really wanted to increase the quality of the videos I pulled together.

This time I started by experimenting with the inference.py file found in our devcloud folder for style transfer.

The first time I tried, I didn't quite understand the code that needed to be written. I understood that I had to point it somewhere to get it to work, but I didn't realize exactly what I needed to input. So I failed quite a few times and got quite disheartened. But after a couple more rounds of failure I cracked it, and ended up with ABSOLUTELY STUNNING results.

Results:

The quality exceeded my expectations and provided me with a way to produce breathtaking results. More importantly, I was able to keep the resolution of the original file without the model squashing it into something less than ideal.
Later on, I decided I needed to transfer the frames efficiently so that my transferred video could reproduce the high fps of the original video.

I used VLC and played around with the Scene Filter option, which let me automatically save frames to a designated folder. I ended up saving over 500 frames per video, some even reaching 1200 frames. The important part of the process was to collect these frames, upload them to devcloud, and run the style transfer inference.py on them. This way, I was able to reproduce 3+ styles for every single video I had. Safe to say this process took most of my time, as I uploaded, transferred, and downloaded everything back while devcloud's connection was only pulling a low 20 kbps at times.

After downloading all the style-transferred frames, I put them into Adobe After Effects and added all the frames from one style and video into an image sequence. This automates the process of detecting the fps of the original video and creating a high-res video with the images in the correct order. I did this for all 28 folders. I ended up choosing only certain styles and videos for the final product, as some of the styles didn't look different enough while others strayed too far from my intended cyberpunk theme.

I also used an additional video from YouTube with a drone's perspective of the Pearl Tower, an iconic piece of architecture indicative of Shanghai. I chose the frames from that part of the video (800 frames) and added them to my video. This clip had a neat detail: the style transfer picked up text from the original video that said “The greatest city of the Far East,” which I thought was extremely cool, so I left it in the final cut.

I wanted to add music, so I opened a program I own called Reason and tried different ideas.

I tried different sound presets and got close to what I wanted, but not exactly. I later used Logic Pro on the school's computers and its presets, which resulted in the music you hear in the final piece. I added reverb and an EQ to cut some of the lower frequencies that clashed with the dominant bass.

After that, it was just a matter of adding everything into Adobe Premiere to turn it into a whole video. I added multiple transitions and effects to allow smoother transitions from video to video.

I also layered copies of the same video on top of each other and changed the opacity so that it would seamlessly change styles, from the all-blue TRON look to a more colorful, neon-esque hue from a different style.

I also extended the music a bit at the end, so that the instruments stop playing one by one until only one is left, and then it fades out.

The resulting video was what I had hoped for and left me very proud of the work that I did. 

Social Impact:

The social impact of this video is not that significant, in my opinion. The video was created as a way for me to show others how I imagine Shanghai. Therefore, this video is a bit more personal than my midterm project.

Nevertheless, the project fundamentally challenges what it means to be artistic. I think the usage of style transfer is extremely important. What is real art when style transfer is involved? Is my video considered art when I incorporated someone else's style into my project? I think the machine learning model can produce some amazing results artistically; however, I believe it also raises some very important questions about the use of style transfer, such as questions of originality and plagiarism.

This model shows a lot of promise if used in the right way, such as providing an infinite source of inspiration for an artist who trains it on their own work; an example is Roman Lipski's Unfinished. I can also see someone curating their own set of images to train the model on, and building their vision of the world around them through style-transferred videos. Artistically, I also think it can help with the creation of movies and music videos.

Further development:

Further development of this project could include refining the model: we could try to make it produce something as close to the original image as possible, with minimal loss. It could also include incorporating acting and more shots of humans and how they interact with the style transfer model. More aerial shots would also help the video and produce something more focused on the cityscape.

I mostly used night shots of the city in my video. Trying to style transfer daytime images of the city produced very lackluster results; it mostly turned the screen green or blue, which wasn't very satisfying. I believe this has something to do with the style image the model was trained on: the original I used was very dark in overall tonality, which probably helped when I used night pictures of the city. Looking further into daytime shots, and into transferring them into nighttime shots, could also help in the future.

References:

Drone video of Pearl Tower https://www.youtube.com/watch?v=NOO8ba58Fps

Drone video of Shanghai https://www.youtube.com/watch?v=4nIxR_k1l30