Week 11: Deepdream exploration

For my DeepDream exploration, I created a zoom video that is a compilation of two DeepDream sessions: one run on the mixed4c layer, and a second that starts from the last frame of the first video and zooms in further, also on mixed4c.

[video: deepdream zoom (low quality)]

I had a lot of challenges with this. At first I didn’t realize I needed to change the video file name to save the video (leading to some confusion), and I also hit errors where img0.jpg wasn’t defined because I didn’t realize I had to re-run all the cells together each time for everything to work. I also had trouble with the original photos I tried to use: they were very high resolution, and I didn’t realize the session would shut down after 30 minutes, so the videos kept failing to generate and getting stuck at one or more of the cells.

After getting some help from Aven I was able to identify these errors and generate the videos. I plan on using this technique for my final video project, so I’m excited to explore it more!

I also put together the class video for the bigGAN collaboration, available here:

https://www.youtube.com/watch?v=xxrblvC-VOM 

Final Project Concept – Eric & Casey

For this final project, we had two ideas in mind. I will introduce them one by one:

Refine the CartoonGAN project

This idea is a continuation of our midterm project, adding two features: GIF conversion and foreground/background-only conversion.

In detail, the first feature requires unpacking a GIF into frames and piping all the frames as a batch to the model. The model then returns its outputs, and our app packs those frames back into a new GIF. Throughout this process we need to pay extra attention to avoid running out of memory; a sketch of the pipeline is below.
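A minimal sketch of that pipeline in Python (using Pillow, with cartoonize() standing in for whatever call our CartoonGAN model ends up exposing):

import numpy as np
from PIL import Image, ImageSequence

def gif_to_frames(path, max_frames=64):
    # Unpack the GIF into a list of RGB numpy arrays; cap the count so a
    # very long GIF does not exhaust memory.
    gif = Image.open(path)
    frames = [np.array(f.convert("RGB")) for f in ImageSequence.Iterator(gif)]
    return frames[:max_frames]

def frames_to_gif(frames, path, duration=80):
    # Repack the model outputs into a new GIF.
    images = [Image.fromarray(np.uint8(f)) for f in frames]
    images[0].save(path, save_all=True, append_images=images[1:],
                   duration=duration, loop=0)

frames = gif_to_frames("input.gif")
outputs = []
for i in range(0, len(frames), 8):
    # Process in small batches instead of one giant tensor to avoid OOM.
    batch = np.stack(frames[i:i + 8])
    outputs.extend(cartoonize(batch))   # hypothetical CartoonGAN call
frames_to_gif(outputs, "output.gif")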

As for the second feature, I am planning to combine the BodyPix model with our CartoonGAN model. However, to maximize computational efficiency, I will apply some WebGL tricks to the BodyPix and CartoonGAN outputs so that we can deliver the best user experience.
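The real implementation will live in the browser with TensorFlow.js and WebGL, but the compositing idea itself is simple; here is a NumPy sketch of it, where the mask stands in for a BodyPix-style segmentation map:

import numpy as np

def composite(original, cartoonized, person_mask):
    # original, cartoonized: HxWx3 uint8 arrays; person_mask: HxW floats in [0, 1].
    # Foreground gets the cartoon look, background stays untouched;
    # flip the mask (1 - mask) to convert the background only instead.
    mask = person_mask[..., None]          # broadcast to HxWx1
    out = mask * cartoonized + (1.0 - mask) * original
    return np.uint8(out)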

Generative Art with Sentiment Analysis on Both Audio and Text

The idea is that text can carry certain emotions, and so can audio/speech. What’s more interesting is that even for the same content, the text and the audio can convey different emotions. We can use this kind of conflict to make generative art.
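As a rough sketch of how we might measure that conflict (using NLTK’s VADER as a stand-in text scorer, and a placeholder value for the audio side since we haven’t chosen a speech-emotion model yet):

# requires: nltk.download('vader_lexicon') once
from nltk.sentiment.vader import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

def conflict_score(text, audio_sentiment):
    # audio_sentiment: a value in [-1, 1] from some speech-emotion model
    # (placeholder; we have not picked one yet).
    text_sentiment = sia.polarity_scores(text)["compound"]   # also in [-1, 1]
    # The larger the gap, the more the words and the voice disagree.
    return abs(text_sentiment - audio_sentiment)

# A high conflict_score could then drive parameters of the generative visuals.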

However, some problems remain. First, we have not found a solid real-time sentiment analysis model; an alternative would be to use facial expression instead.

The other question is how to visualize, or rather generate, the emotions in a way that will attract the audience. This seems to be a harder and more critical question than the previous one.

Week 11 – DeepDream Experiment by Jonghyun Jee

DeepDream, created by Google engineer Alexander Mordvintsev, is a computer vision program that chews up reality and renders it into trippy, somewhat nightmarish imagery. Produced with the help of a CNN (Convolutional Neural Network), the DeepDream effect is a result of how the algorithm views images; that’s why this pattern over-recognition is called algorithmic pareidolia. For this week’s assignment, I ran a number of experiments with varying parameters to see what sort of results they would yield.

Instead of photographs, I drew a self-portrait and took a picture of it. I colored my drawing with Photoshop and Painnt:

Then I uploaded my drawing to this site, which allows users to easily apply DeepDream effects to their images without knowing much about how DeepDream actually works.

We can see from the generated image above that it warped the original image with mostly animal-related features. We can spot dog-like and parrot-like visuals, but the original portrait still reads as a human face. To control more parameters of this effect, I used the notebook “DeepDreaming with TensorFlow” provided by Alex Mordvintsev. I tried different layers to see which one yields the most interesting output.

These layers are characterized by edges (layer conv2d0), textures (layer mixed3a), patterns (layer mixed4a), parts (layers mixed4b & mixed4c), and objects (layers mixed4d & mixed4e).
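Under the hood, picking a layer only changes which activations the gradient ascent tries to maximize. A minimal sketch of that core step (written here against Keras InceptionV3, whose mixed-layer names roughly parallel the notebook’s GoogLeNet graph, rather than the notebook’s own code):

import tensorflow as tf

base = tf.keras.applications.InceptionV3(include_top=False, weights="imagenet")
dream_model = tf.keras.Model(inputs=base.input,
                             outputs=base.get_layer("mixed4").output)  # try mixed3, mixed5, ...

def dream_step(img, step_size=0.01):
    # img: a float32 batch tensor, preprocessed for InceptionV3 (values in [-1, 1]).
    img = tf.convert_to_tensor(img)
    with tf.GradientTape() as tape:
        tape.watch(img)
        loss = tf.reduce_mean(dream_model(img))      # how strongly the chosen layer fires
    grad = tape.gradient(loss, img)
    grad /= tf.math.reduce_std(grad) + 1e-8           # normalize the step
    return img + step_size * grad                     # nudge pixels to excite the layer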

Mixed 4b created spirals in the background.

And Mixed 4c showed floral patterns. The way it transformed the background elements was pretty cool; and yet, my face didn’t change much. I could see there was something interesting going on in terms of computer vision. I moved on to the next step: video!

This notebook, powered by Google Colaboratory, provides a simple yet powerful environment for generating a DeepDream video. To break it down into steps: the first thing I had to do was mount my Google Drive, which lets users upload an input image and download the output (the generated video, to be specific). The next step is to load the model graph–the pre-trained Inception network–into the Colab kernel. After loading the starting image, we can customize our own neural style by adjusting the sliders (the strength of the DeepDream and the number of scales it is applied over). Then we can finally begin generating the video by iteratively zooming into the picture.
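The zoom loop itself boils down to: dream on the current frame, crop its center, resize back up, save, repeat. A rough sketch of that loop, where deepdream() is a stand-in for the notebook’s rendering function and the paths and zoom factor are just illustrative:

from google.colab import drive
import numpy as np
import PIL.Image

drive.mount('/content/drive')                              # step 1: mount Google Drive

frame = np.float32(PIL.Image.open('/content/drive/My Drive/input.jpg'))
zoom = 1.05                                                 # how far to push in per frame

for i in range(60):                                         # zooming steps
    frame = deepdream(frame)                                # stand-in: one dream render
    h, w = frame.shape[:2]
    dh, dw = int(h * (1 - 1 / zoom) / 2), int(w * (1 - 1 / zoom) / 2)
    crop = frame[dh:h - dh, dw:w - dw]                      # crop the center...
    frame = np.float32(PIL.Image.fromarray(np.uint8(crop)).resize((w, h)))  # ...and zoom back in
    PIL.Image.fromarray(np.uint8(frame)).save('/content/drive/My Drive/img%d.jpg' % i)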

Layer: mixed4d_3x3_bottleneck_pre_relu, dreaming steps: 12, zooming steps: 20. From its thumbnail, we can see some interesting architectural images and dogs. And yet, 32 frames were too few to enjoy a full DeepDream experience.

Layer: mixed4c, dreaming steps: 60, zooming steps: 20. The dreaming steps were a bit too high compared with the zooming steps. By the point where it begins to zoom, the image doesn’t even look like the original portrait anymore; it seems way too deep-fried.

Layer: mixed4c, dreaming steps: 16, zooming steps: 80. When I added more zooming steps, it went much deeper, but the images look a bit too repetitive. It would have been better if I had tried different layers.

Overall, it was a very exciting tool to play around with. The whole rendering process didn’t take long thanks to the pre-trained model. I still don’t have a clear idea for my upcoming finals, but DeepDream will definitely be an interesting option.

Week 11: Training Deepdream – Jinzhong

The assignment for this week is to play around with DeepDream, a CNN-based technique that amplifies the patterns a trained network sees in an image and transfers that style onto it, for example, this one:

WORK

There are 5 customizable parameters in the generation step. These are:


octave_n = 2
octave_scale = 1.4
iter_n = 15
strength = 688
layer = "mixed4a"
 
And today I am going to talk about my research and understanding of these parameters, as well as my tests and experiments.
 

octave_n

– Test Range: [1, 2, 3, 4, 5, 6]

– Test Outcome:

From the test we can see that this parameter determines the depth of the DeepDream. The larger octave_n becomes, the deeper the render/transfer process goes. When it is set to 1, the picture is only slightly changed and the color of the sheep remains almost the same as in the original source. However, as the parameter grows, the contrasting colors become heavier and the picture loses more of its original features.

octave_scale

– Test Range: [0.5, 1, 1.5, 2, 2.5, 3]

– Test Outcome:

This parameter controls the scale of the DeepDream. Although the contrasting colors are not as heavy as with the first parameter, octave_n, each transfer point scales up and affects a larger area. So, as we can see in the last picture, the intersections of several transfers are highlighted.

iter_n

– Test Range: [10, 15, 20, 25, 30, 35]

– Test Outcome:

This parameter controls the number of iterations of the DeepDream; in other words, it determines how many times the image is processed. When the number is smaller, the output is more similar to the original input; when the number becomes larger, the output is more “deepdreamed”.

strength

– Test Range: [300, 400, 500, 600, 700, 800]

– Test Outcome:

The strength determines how strongly each DeepDream step is applied. As we can see from the pictures above, the 6 transforms of the original picture are almost the same and differ only in the intensity of the colors (patterns). A higher strength produces a sharper result.

layer

– Test Range: [“mixed3a”, “mixed3b”, “mixed4a”, “mixed4c”, “mixed5a”]

– Test Outcome:

The layer determines the patterns of the DeepDream. It selects which layer of the pre-trained network’s features are amplified, so each choice renders a different kind of pattern.
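Putting the five parameters together, this is my reading of how they typically drive the standard octave loop (a sketch, not the exact notebook source; layer_gradient() stands in for the gradient of the chosen layer’s activations with respect to the pixels):

import numpy as np
import PIL.Image

def render_deepdream(img, octave_n=2, octave_scale=1.4, iter_n=15,
                     strength=688, layer="mixed4a"):
    # Build a pyramid: octave_n progressively smaller copies of the image,
    # each octave_scale times smaller than the last.
    octaves = [np.float32(img)]
    for _ in range(octave_n - 1):
        h, w = octaves[-1].shape[:2]
        small = PIL.Image.fromarray(np.uint8(octaves[-1])).resize(
            (int(w / octave_scale), int(h / octave_scale)))
        octaves.append(np.float32(small))

    # Dream from the smallest octave up to full size, iter_n gradient-ascent
    # steps per octave; strength scales how hard each step pushes the pixels.
    out = octaves[-1]
    for octave in reversed(octaves):
        h, w = octave.shape[:2]
        out = np.float32(PIL.Image.fromarray(np.uint8(out)).resize((w, h)))
        for _ in range(iter_n):
            g = layer_gradient(out, layer)   # hypothetical: d(layer activation)/d(pixels)
            out += g * (strength / (np.abs(g).mean() + 1e-7)) / 255.0
        out = np.clip(out, 0, 255)
    return np.uint8(out)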

Week 11 – Deepdream – Katie

I wanted to try out DeepDream for this assignment to get more familiar with it. I’m thinking more about the concept of perception for my final project, specifically where human and computer perception don’t intersect. These cases are usually considered failures in terms of training, but are they actually failures, or do they simply not align with human perception?

Anyway, I tried changing different parameters but ultimately came up with this video, run through DeepDream, of Magritte’s Golconda.

https://drive.google.com/file/d/16UhPfu7oLG9we59W-rJ68ZURRs6yEOWO/view?usp=sharing

I still would like to see more from DeepDream in video output (this one has too many dogs), so I’d like to continue working on this and update later as I find new results.