
Week 6: Midterm Project —— Yunhao Ye (Edmund)

Basically, my midterm project is going to be a visual one, and I will explore the object detection technique. Currently, I want to use “YOLOv3”, but that is not decided yet. What I want to do is portray the world in a machine’s view.

The object detection model just tries to discover the objects it can recognize and give each one a label and a bounding-box position. It does not care about the details inside those boxes, or whether there is any difference between objects belonging to the same label. So, in my mind, the world in a machine’s view may be a world filled with repetitive rectangles in different colors. And I want to demonstrate that kind of view in “Processing” with the help of “runwayML”.

The first image is one of the famous example images of “YOLOv3”, and the second one is a simulation of the visual effect of my project. I will cut out the text of the labels to make it purely visual. And I may also change the opacity of each box based on the confidence of its prediction.
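To make the idea concrete, here is a rough Python sketch of the intended visual (the real project would live in Processing with RunwayML; the detection list, canvas size, and colors below are made-up example data):

```python
# A sketch of "the world in a machine's view": colored, label-free rectangles.
from PIL import Image, ImageDraw

# Hypothetical detections: (x0, y0, x1, y1, class_id, confidence)
detections = [(40, 60, 220, 300, 0, 0.92), (250, 80, 400, 310, 1, 0.55)]
palette = [(255, 99, 71), (65, 105, 225), (60, 179, 113)]  # one color per class

canvas = Image.new("RGBA", (480, 360), (255, 255, 255, 255))
overlay = Image.new("RGBA", canvas.size, (0, 0, 0, 0))
draw = ImageDraw.Draw(overlay)

for x0, y0, x1, y1, cls, conf in detections:
    r, g, b = palette[cls % len(palette)]
    # No label text; the opacity follows the model's confidence.
    draw.rectangle([x0, y0, x1, y1], fill=(r, g, b, int(conf * 255)))

Image.alpha_composite(canvas, overlay).save("machine_view.png")
```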

This idea is inspired by some pictures made only of simple shapes. I love that simple style and want to relate it to machine vision.

And the combination of simple shapes reminds me of still-life oil paintings. When artists want to evaluate a painting, they find the borders of each object in the painting in their mind, just like the object detection model does to its input images. The artists do this to check whether the objects in the painting achieve a beauty of balance. And my project may also help to discover the beauty behind realistic images.


Week 5: Experiment on Model Training —— Yunhao Ye (Edmund)

First, I change the imported dataset from “fashion_mnist” to “cifar10” and download it.
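A minimal sketch of this step, assuming the tf.keras datasets API we used in class:

```python
import tensorflow as tf

# load_data() downloads CIFAR-10 on first use and caches it locally.
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.cifar10.load_data()
```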

Then I check the shapes of the image and label arrays.
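Continuing from the snippet above, the CIFAR-10 arrays look like this:

```python
print(train_images.shape)  # (50000, 32, 32, 3) -- 32x32 color images
print(train_labels.shape)  # (50000, 1) -- labels come as a 2-D array
print(test_images.shape)   # (10000, 32, 32, 3)
```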

Then I get the class names from the internet and look through the training dataset by fetching random images with their labels. Unlike “fashion_mnist”, train_labels[index] returns an array instead of an integer, so I need to use the int() function to fetch the value inside the array.
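A sketch of this browsing step, assuming matplotlib for display; the class names below are the standard CIFAR-10 label order:

```python
import matplotlib.pyplot as plt
import numpy as np

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

index = np.random.randint(len(train_images))
plt.imshow(train_images[index])
# train_labels[index] is an array like [6], so int() unwraps it.
plt.title(class_names[int(train_labels[index])])
plt.show()
```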

I check that the values in the image arrays are RGB values, so I rescale both the training and test arrays by dividing by 255 to bring the values between 0 and 1.
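In code, this is simply:

```python
# Rescale pixel values from [0, 255] down to [0, 1].
train_images = train_images / 255.0
test_images = test_images / 255.0
```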

Then I begin to design the model. First, I use the same model we used during class, but I need to modify the input shape since the datasets are different.
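The post does not show the in-class model itself, so the following is only a guess at a typical Flatten-plus-Dense baseline, with the input shape changed for CIFAR-10’s 32x32x3 images:

```python
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 3)),  # was (28, 28) for Fashion-MNIST
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')    # one output per class
])
```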

It works well, but I do not think the accuracy is satisfactory, so I try to change the optimizer and the loss function. I have tried four more optimizers and discovered that the growth rate of the accuracy during training is very similar for all of them. “Adamax” gives the highest accuracy, and its evaluation result is the closest to the accuracy reached after 10 epochs, so I choose “Adamax” for the later experiments.
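A sketch of how the Adamax run might look; the loss here is sparse_categorical_crossentropy, which matches the integer labels described above (the exact set of optimizers compared is not named in the post):

```python
# Compile and train with Adamax; other optimizer strings such as
# 'sgd', 'rmsprop', 'adam', or 'nadam' can be compared the same way.
model.compile(optimizer='adamax',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=10)
model.evaluate(test_images, test_labels)  # evaluation on the test set
```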


Then I try to change the loss function, but every alternative raises the same error. I search briefly on the internet and find it may be because different loss functions expect differently shaped labels.
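For example (assuming the original setup used sparse_categorical_crossentropy), switching to plain categorical_crossentropy also requires one-hot labels, which would explain a shape error:

```python
# sparse_categorical_crossentropy expects integer labels of shape (N, 1),
# while categorical_crossentropy expects one-hot vectors of shape (N, 10).
# Converting the labels first lets the alternative loss run.
one_hot_train = tf.keras.utils.to_categorical(train_labels, num_classes=10)
model.compile(optimizer='adamax',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, one_hot_train, epochs=10)
```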

I want to increase the accuracy, so I try putting more “Dense” layers into the model. I add three more layers, and it turns out to increase the accuracy a little, but it also takes much more time.
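A sketch of the deeper version; the widths of the extra layers are assumptions, since the post does not list them:

```python
# The deeper variant: three extra Dense layers before the output layer.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 3)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
```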

Lastly, I want to use “Dropout” to see if it can make the evaluation better. The outcome is interesting: the accuracy during training decreases a little while the evaluation accuracy stays nearly the same, so I think it is useful for preventing overfitting.
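One possible placement of the Dropout layer (the rate of 0.2 is an assumption):

```python
# Dropout randomly zeroes a fraction of activations during training only,
# which discourages the network from memorizing the training set.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(32, 32, 3)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
```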


Week 4: The relationship between the neural network and the brain/neuron —— Yunhao Ye (Edmund)

In my opinion, the DNN does not have a close relation to the brain or the neuron. I would prefer to regard the resemblance between the two as a coincidence rather than a kind of inspiration.

Firstly, I want to consider the structure that makes them work. In my experience of coding, I have gradually discovered that all complicated algorithms repeat the same structure: first break a complicated problem into many easier subunits, then design a function for each unit, and finally combine them all together to get the expected algorithm. And across the gaps between the subunits, we need to give them a kind of connection to make the whole thing work; that connection is the input and output taken by each function. In the case of a DNN, those inputs and outputs are vectors or arrays filled with numbers. The brain and the neuron also work in a similar way: the neurons in our bodies communicate with each other through electrical signals, and with those neurons our brains can run very complicated algorithms and give us useful information. Both work in a structure built of multiple layers. But I do not think the DNN has this structure because it was inspired by the brain/neuron; rather, it has to use this structure, which happens to be similar to the brain/neuron, to make itself possible, and that is how it got its name.

Secondly, I want to compare how they respectively achieve ‘learning’ and ‘intelligence’. A DNN learns things in a very inflexible way. It evaluates itself with a fixed algorithm; it knows whether it has made progress by a fixed algorithm; it makes a change and tries to perform better according to a fixed algorithm. Everything comes down to algorithms, calculus, and numbers. When we code, we always try to take how we think in natural language and translate that natural language into a programming language. Here we do the same thing, and it is convenient and useful. However, the result is sometimes more complicated than the thought in our brain, and it is sometimes limited to one situation and lacks flexibility. For example, a DNN can only work in a fixed pattern and can only get its answer from calculation, just like the score on an exam. But in the real world we seldom judge someone or something only by a score, and when we do, it is because of some limitation in the real world. There is no such limitation in our brain when we are thinking. So I think there is still a remarkable difference between the DNN and the brain/neuron.

Week 3: Case Study 3 —— Yunhao Ye (Edmund)

Anna Ridler is an artist and researcher who is interested in working with collections of information and data, creating new and unusual narratives in a variety of mediums. In Myriad (also named Tulips), she looks at the process of using a large dataset to produce a piece of art. The project is inspired by “tulip mania”, a financial craze for tulip bulbs in the 1630s that has long been considered the first economic bubble. For this project, she took 10,000 photographs of tulips and categorized them, revealing human feelings about those pictures.

Then, Ridler used this large set of photographs as the training set for an AI-based project —— ‘Mosaic Virus’. The project is a video work generated by an AI that shows a tulip blooming, with the appearance of the tulip controlled by the price of bitcoin. ‘Mosaic’ is the name of the virus that causes stripes in a petal, which increased the tulips’ desirability and helped drive the speculative prices of the time. In this piece, the stripes depend on the value of bitcoin, changing over time to show how the market fluctuates.

Though this is not a complicated project (it is only a GAN trained on 10,000 photographs), I am much impressed by its insight and creativity. First, Ridler successfully uses AI to visualize data. Second, the way she achieves the visualization is quite creative: she chose a historic event similar to the circumstances of today, reflecting the speculative price of bitcoin in the changing shapes and stripes of the tulips. She ingeniously ties the economic bubble driven by bitcoin to the bubble driven by tulips. Lastly, the name of the project —— ‘Mosaic Virus’ —— adds to its insight. The mosaic virus is the cause of the desirable stripes of the petals and thus also the cause of the bubble 400 years ago. By connecting the speculative price of bitcoin to the ever-changing stripes of the tulips, she reminds us that bitcoin can be another virus causing another bubble.


Week 2: Experiment with RunwayML —— Yunhao Ye (Edmund)

After the installation, I first tried some models I am interested in and ran them remotely. But each run turned out to take about 5-10 minutes, while the inference time reported was only about 10 seconds. So I downloaded the models and ran them locally, which made them much faster.

The first model I tried is called Deep-Portrait-Image-Relighting. It is a very useful one: it helps you modify a photo by relighting it. For example, you can move the light to the left, right, bottom, top, top-right, bottom-left, and so on. When you are not satisfied with the lighting of a photo, I think this will help a lot.

The original image:

Right:

Bottom:

Top-right:

Bottom-left:

The second model I tried is called CartoonGAN. This model learns from the works of four famous Japanese animators —— Hosoda Mamoru, Hayao Miyazaki, Kon Satoshi (his Paprika), and Shinkai Makoto. It can modify the original image by changing its art style to resemble those artists’ paintings.

The original image:

Hosoda:

Hayao: 

Paprika:

Shinkai:

The last model I tried is called Unsupervised-Segmentation. Basically, it tries to detect the pixels belonging to the same block (cluster) and then changes them to the same color. I did some coding related to this model last semester in my ICS course.

The original image: 

After segmentation:


I have done some further research on this model. Its basic structure is to iterate, assigning the same label to similar pixels and then giving the same color to the pixels with the same label.

Iterate for T times (we can change T):
1. Use a CNN to get the features of the image.
2. According to the features we get, find the cluster of each pixel.
3. For every cluster of this image:
   - Find the most frequent type (color) in this cluster.
   - Assign that type (color) to every pixel in this cluster.
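To make step 3 concrete, here is a small NumPy sketch of that refinement over toy data (the CNN and per-pixel clustering of steps 1-2 are omitted; refine_clusters is a hypothetical helper name):

```python
import numpy as np

def refine_clusters(cluster_map, type_map):
    """For every cluster, find the most frequent type (color) and assign
    it to all pixels in that cluster -- step 3 described above."""
    refined = type_map.copy()
    for cluster_id in np.unique(cluster_map):
        mask = cluster_map == cluster_id
        types, counts = np.unique(type_map[mask], return_counts=True)
        refined[mask] = types[np.argmax(counts)]  # majority vote
    return refined

# Toy example: a 4x4 image with two clusters and noisy per-pixel types.
cluster_map = np.array([[0, 0, 1, 1]] * 4)
type_map = np.array([[2, 2, 5, 5],
                     [2, 3, 5, 5],
                     [2, 2, 5, 4],
                     [2, 2, 5, 5]])
print(refine_clusters(cluster_map, type_map))  # each cluster becomes uniform
```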

Here is an overall picture of this model:

And this is a YouTube video introducing another segmentation model.