iML Week 02: Case Study – Ivy Shi

Style Transfer – Real-Time Food Category Changer

One interesting project I came across on the internet involving machine learning and artificial intelligence is called “Real-time Food Category Change”. It is a food style transfer project led by Ryosuke Tanno and presented at the European Conference on Computer Vision in 2018. The idea is nothing grandiose: it simply allows users to change the “appearance of a given food photo according to the given category [among ten typical Japanese foods].” For example, you can transform a bowl of ramen noodles into curry rice by exchanging the texture while still preserving the shape.

Here are some sample image results: 

More at: https://negi111111.github.io/FoodTransferProjectHP/

The results are achieved by using a Conditional CycleGAN trained on a colossal food image database: 230,000 food images were collected from the Twitter stream and grouped into ten categories for image transformation. Conditional CycleGAN is an extension of CycleGAN with additional conditional inputs. This modification is necessary to overcome CycleGAN’s limitation of only learning image transformations between two fixed domains. More details on the algorithmic method and technical considerations can be found in this paper: Magical Rice Bowl: A Real-time Food Changer.
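To make the conditioning idea concrete, here is a minimal Keras sketch (my own hypothetical illustration, not the authors’ code) of how a target-category label can be fed to a CycleGAN-style generator: the one-hot category vector is tiled into extra channels and concatenated with the input image. The layer sizes and names are placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CATEGORIES = 10  # ten food categories, as in the project

def conditional_generator(img_size=128):
    """Toy conditional generator: the target-category one-hot label is
    tiled into extra channels and concatenated with the RGB input."""
    image = layers.Input(shape=(img_size, img_size, 3))
    label = layers.Input(shape=(NUM_CATEGORIES,))

    # Broadcast the label to an (H, W, NUM_CATEGORIES) map.
    label_map = layers.Reshape((1, 1, NUM_CATEGORIES))(label)
    label_map = layers.UpSampling2D(size=(img_size, img_size))(label_map)
    x = layers.Concatenate()([image, label_map])

    # A deliberately tiny encoder-decoder; the real model is much deeper.
    x = layers.Conv2D(64, 4, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2D(128, 4, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv2DTranspose(3, 4, strides=2, padding="same", activation="tanh")(x)
    return tf.keras.Model([image, label], out)
```

In training, such a generator would be paired with a discriminator and a cycle-consistency loss, as in the original CycleGAN, so that the shape of the dish is preserved while the texture changes with the requested category.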

As someone who enjoys taking photos and eating food, I personally found this implementation to be a lot of fun. Additionally, the fact that such food image transformation can be done in real time on both smartphones and PCs is quite impressive. There are even practical future applications of this idea, such as combining it with virtual reality to unlock new eating experiences. For example, people on a diet who are trying to restrict high-calorie food intake could eat low-calorie foods in reality while still enjoying the appearance of high-calorie foods through their VR glasses.

Week 2: Case Study Research – The BachBot

BachBot

“Can you tell the difference between Bach and a computer?” BachBot, created by Feynman Liang, is an AI that composes music in the style of Bach. Its goal is to generate and harmonize chorales in a way that is indistinguishable from Bach’s own work, a task that requires both music theory and creativity. Here is the link to BachBot; you can take the BachBot challenge to see whether you can tell the difference between an excerpt by Bach and a computer-generated melody.
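BachBot is built around a recurrent (LSTM) sequence model trained on Bach’s chorales encoded as token sequences, from which new chorales are sampled one token at a time. Below is a minimal Keras sketch of that general idea; the vocabulary size, layer sizes, and encoding are placeholders rather than the project’s actual configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 128  # placeholder: one token per encoded note/chord/rest event

# A next-token language model over encoded chorale events: given the
# previous tokens, predict the next one.  Sampling from it repeatedly
# generates a new "chorale" in the learned style.
model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, 64),
    layers.LSTM(256, return_sequences=True),
    layers.LSTM(256),
    layers.Dense(VOCAB_SIZE, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```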


Week 02 Assignment: Case Study Research – Style2Paints – Abdullah Zameek

Style2Paints – An AI-driven lineart colorization tool

One of the biggest “bottlenecks” in the comic/manga industry is the time it takes artists to find the perfect color schemes for their drawings and to actually color them in. This makes creating a single chapter or volume of a particular work a long, tedious process.
Developed as a collaboration between four students at The Chinese University of Hong Kong and Soochow University, Style2Paints is one of the first systems to colorize lineart in a “real-life human workflow”. What this essentially means is that it tries to follow the same process that a human goes through when coloring in a picture. As the authors of the project describe it, the human workflow can be summed up as follows:

sketching -> color filling/flattening -> gradients/adding details -> shading

The Style2Paints library mimics the same process and generates four different, independent layers in the form of PSD files. The layers are:

  • Lineart Layers
  • Flat Color Layers 
  • Gradient Layers
  • Shading Layers

Style2Paints was inspired by past projects such as PaintsChainer [TaiZan 2016] and Comicolorization [Furusawa et al.]. The outputs of these two models, however, often contained artifacts and coloring mistakes. This is solved to some extent by the separate-layer model that Style2Paints uses.
Having separate layers allows an artist to adjust and fine-tune each layer before merging them to form the final picture. But, as described by the authors, the Style2Paints model is able to do most of that fine-tuning for the user. The user inputs the lineart image and three optional parameters: hints (which color palette to focus on more, etc.), color style reference images, and light location and color.
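To illustrate that interface, here is a small, purely hypothetical Python sketch of the inputs and layered outputs described above; the class, function, and file names are invented for illustration and are not the actual Style2Paints API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class ColorizationRequest:
    lineart_path: str
    color_hints: List[Tuple[int, int, str]] = field(default_factory=list)  # (x, y, hex color) clicks
    style_reference_path: Optional[str] = None              # optional reference image
    light_direction: Optional[Tuple[float, float]] = None   # optional light location
    light_color: Optional[str] = None                       # optional light color

def colorize(request: ColorizationRequest) -> Dict[str, str]:
    """Pretend pipeline: returns one file per layer, mirroring the four
    PSD layers listed above (model inference would happen here)."""
    return {
        "lineart": "out/lineart.png",
        "flat_color": "out/flat_color.png",
        "gradient": "out/gradient.png",
        "shading": "out/shading.png",
    }

result = colorize(ColorizationRequest(
    lineart_path="sketch.png",
    color_hints=[(120, 80, "#ffcc00")],  # a single user click as a hint
))
```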

The results generated by the model are classified as follows: 

  • Fully Automatic Result – when there is absolutely no human intervention.
  • Semi-Automatic Result – when the result needs some color correction, the user can put in some color hints (clicks) to guide the model.
  • Almost Automatic Result – semi-automatic results with fewer than 10 human corrections.

The underlying technology behind this project is a two-stage convolutional neural network framework for colorization. The first stage (called the drafting stage) applies an aggressive splash of color across the canvas to create a vivid color composition. This stage may contain color mistakes and blurry textures, which are fixed in the next stage, where the blurry textures are refined to smooth the final output. This model splits the complicated task of coloring into two smaller tasks, which allows for more effective learning, and the refinement stage can even be used to clean up the output of other models such as PaintsChainer.
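Here is a minimal Keras sketch of that two-stage idea: a drafting network maps lineart to a rough colorization, and a refinement network takes the lineart together with the draft and produces the cleaned-up result. The network sizes and layer choices are placeholders of my own, not the architecture from the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def tiny_encoder_decoder(in_channels, name):
    """A deliberately small encoder-decoder standing in for each stage."""
    inp = layers.Input(shape=(256, 256, in_channels))
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv2DTranspose(3, 3, strides=2, padding="same", activation="sigmoid")(x)
    return tf.keras.Model(inp, out, name=name)

# Stage 1 ("drafting"): lineart (1 channel) -> rough, vivid but imperfect color.
draft_net = tiny_encoder_decoder(in_channels=1, name="drafting_stage")

# Stage 2 ("refinement"): lineart + draft (1 + 3 channels) -> cleaned-up colors.
refine_net = tiny_encoder_decoder(in_channels=4, name="refinement_stage")

lineart = layers.Input(shape=(256, 256, 1))
draft = draft_net(lineart)
refined = refine_net(layers.Concatenate()([lineart, draft]))
two_stage_model = tf.keras.Model(lineart, [draft, refined])
```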

A few images from the model have been attached below for reference. (It was reported that all the images below were achieved with fewer than 15 clicks)

I find this project to be interesting for multiple reasons. From a technical point of view, I like the fact that they approached the problem in the most “human” way possible, i.e. focusing more on how a human would do it rather than on what would be most efficient for a computer to handle. Secondly, this model gives artists the freedom to experiment with different colors at a faster pace. For example, they can try out different schemes using the model and pick the one that works best in a short amount of time, as opposed to manually filling in the colors. This would certainly help artists create more exciting content in a shorter amount of time, which would ultimately benefit the industry as a whole.

Sources :
The published paper can be found here
https://github.com/lllyasviel/style2paints

Reddit: “[P] Style2Paints V4 finally released: Help artists in standard human coloring workflow!” by u/paintstransfer in r/MachineLearning

IML | Week 02 Case Study Research – Quoey Wu

ET City Brain is an intelligent system developed by Alibaba Cloud that collects massive amounts of data and uses deep neural networks to help manage the city in a smarter and more efficient way. More specifically, ET City Brain can be applied to the management of mass transit systems, traffic congestion and signal control, community surveillance and safety, smart healthcare, urban natural resource management, and many other areas.

Here is a diagram that illustrates the structure of the City Brain: there are mainly three platforms operating together to combine data and serve the whole city. Here is the link for more information.

Also, AI capabilities are the core of ET City Brain. They require a great deal of training on logs, videos, and data streams from systems across the urban center. Some of the capabilities are as below:

  • Speech Recognition
  • Face Recognition
  • Image Identification
  • Text Recognition
  • Natural Language Processing 

Using these capabilities, it can do things like expediting ticketing and customer services, identifying suspects in a crowd, creating driving-assistance apps, and helping public service departments.

This case interests me because it shows how powerful big data and AI can be and how closely our daily lives are tied to computer data. In the past, it was nearly impossible to put all these data together, but nowadays we are able to make use of them and analyze them to get what we want. I am also curious about the details of how the computer operates in each section and then generates the final result, which still needs more research.

Case Study – Magenta

Magenta is a research project developed by Google AI that aims to use machine learning as a tool in the creative industry.

It’s powered by TensorFlow and has a JavaScript API called Magenta.js. Here is its website: https://magenta.tensorflow.org, which shows how they used Magenta to generate music and create sketches. I found one project that was very interesting: Draw Together with a Neural Network.

It allows you to create sketches together with an AI. It has several models; once you choose one, you only need to draw part of the figure, and the AI will help you finish the rest.

If you are wondering how they achieve this, you can check out this website, which describes how they taught the machine to draw using a recurrent neural network called sketch-rnn. The dataset used to train this neural network came from Quick Draw; they collected this huge dataset by asking users to play their game and recording their sketches.
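For a sense of what the model works with: each Quick Draw sketch is stored as a sequence of pen strokes, and sketch-rnn learns how such a sequence continues. The snippet below is a heavily simplified Keras stand-in for that idea (the real sketch-rnn is a sequence-to-sequence variational autoencoder with a mixture-density output); the dimensions are placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Each timestep of a sketch can be encoded as
# (dx, dy, pen_down, pen_up, end_of_sketch) -- the 5-element stroke
# format used by sketch-rnn.  This toy model simply predicts the next
# stroke from the strokes drawn so far.
STROKE_DIM = 5

strokes = layers.Input(shape=(None, STROKE_DIM))   # a partial sketch
x = layers.LSTM(256, return_sequences=True)(strokes)
next_stroke = layers.Dense(STROKE_DIM)(x)          # predicted next point at each step
model = tf.keras.Model(strokes, next_stroke)
model.compile(optimizer="adam", loss="mse")
```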

 

By combining human intelligence and machine learning techniques, this project may provide a new way of creating artwork in the future.