Week 02: ml5js Experiment – Eric Li

Intro

ml5 is a user-friendly machine learning framework aimed at users who are new to the ML field. It offers easy-to-use models and examples. Among these models, I want to point out the BodyPix model.

What is it?

BodyPix performs segmentation on images of human bodies. Instead of simply cutting the figure out of the overall picture, it can also detect every individual part of the body, such as the arms, legs, and head (it can even tell right from left!).

Thus, this model can be a good replacement for the Kinect. We can use it to track the user’s hand or head positions.

In addition, this model can track multiple people in one scene, which means it can perform better on complicated, multi-user gesture detection.

BodyPix demo

Above is a demo of the BodyPix model, where you can see that it detects the different parts of the human body and colors them separately.
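A minimal p5.js sketch of how a webcam version of this could be set up, based on my reading of the ml5 BodyPix examples (the segmentWithParts method and the partMask property are my assumptions about the API and may differ between ml5 versions):

```javascript
let bodypix;
let video;

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  // Load the BodyPix model and start segmenting once it is ready
  bodypix = ml5.bodyPix(video, modelReady);
}

function modelReady() {
  // Ask for per-part segmentation (left/right arms, legs, face, ...)
  bodypix.segmentWithParts(gotResults);
}

function gotResults(error, result) {
  if (error) {
    console.error(error);
    return;
  }
  image(video, 0, 0, width, height);           // webcam frame
  image(result.partMask, 0, 0, width, height); // overlay with each body part in its own color
  // Keep segmenting frame by frame, which is what would let us track hands or heads over time
  bodypix.segmentWithParts(gotResults);
}
```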

Potential Use Cases

One thing that comes to my mind is smart danmaku (bullet comments), as seen on Bilibili and Netflix, where the comments or subtitles do not block the characters in the scene.

Also, this could serve as a replacement for the Kinect, as I have mentioned before.

Week 2 AI Arts Assignment – Cassie Ulvick

For this week’s assignment, I played around with an ml5.js project called BodyPix_Webcam. I was drawn to this particular project because it reminded me a lot of a green screen or the Photo Booth app on MacBooks.

Basically, it detects the presence of a body through your laptop’s webcam. The output is a real-time video of everything covered in black except for the detected bodies. When I was testing it, it worked pretty well when I was by myself with no one else to detect.
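For reference, the core of the example looks roughly like this (paraphrased from memory of the ml5 BodyPix examples, so the maskBackground property and the option names are assumptions on my part): it draws the webcam frame and then paints the background mask on top, leaving only the detected bodies visible.

```javascript
let bodypix;
let video;

// Options passed through to BodyPix; a lower segmentationThreshold makes detection more permissive
// (these option names are my assumption about what the example exposes)
const options = { segmentationThreshold: 0.3 };

function setup() {
  createCanvas(320, 240);
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  bodypix = ml5.bodyPix(video, modelReady);
}

function modelReady() {
  bodypix.segment(gotResults, options);
}

function gotResults(error, result) {
  if (error) {
    console.error(error);
    return;
  }
  image(video, 0, 0, width, height);                 // live webcam frame
  image(result.maskBackground, 0, 0, width, height); // paints everything except bodies black
  bodypix.segment(gotResults, options);              // segment the next frame
}
```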

I was curious how well it would work with multiple people, so I asked my friend to test it out with me. This didn’t turn out as well, as parts of our faces were covered in black.

This project was interesting to me because of its potential applications. The fact that bodies are detected in real time would be very useful and could improve a lot of existing green screen systems. In Photo Booth, for example, there are effects that let you change the background of your photo. To use them, however, you have to step out of the camera frame first so that the app can compare the background with and without the person in it and work out where the body is. BodyPix_Webcam eliminates this step. If its methodology were applied to Photo Booth, I think it would create a better experience for users wanting to try the different background effects. It would just need to be further trained so that multiple people could be detected more accurately.

Week #2 Assignment: ml5.js Experiment — Lishan Qin

Playing with an ml5.js example

This was my first time using the ml5 library. I started with the imageClassifier() example. In my daily life, I’ve seen and used many applications of image classification/recognition technology, such as Taobao, Google Translate, Google image recognition, and so on, and I’ve been curious about how it works for some time. Luckily, this ml5 example has shed some light on this question.

Technical part

ml5.imageClassifier() wraps a pre-trained model that can be used to classify different objects, and the ml5 library loads this model from the cloud. In the example code in sketch.js, I can see that the first thing the program does is declare a variable for the image classifier that will be initialized with MobileNet, and then declare a variable to hold the image that needs to be classified.

After that, it initializes the image classifier with MobileNet and loads the image.

It then defines a callback function that receives the result of the classification or logs any errors to the console. (I’m not sure if this function counts as a promise, though.) The output of this function shows not only the classification results but also the confidence score for each classification. Finally, there is a function that displays the output using p5.js.
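Putting those steps together, the sketch looks roughly like this (reconstructed from memory of the ml5 imageClassifier example; the image path 'images/bird.jpg' is just a placeholder, and details may differ from the exact code I ran):

```javascript
let classifier; // holds the MobileNet-based image classifier
let img;        // holds the image to be classified

function preload() {
  // Initialize the Image Classifier method with MobileNet
  classifier = ml5.imageClassifier('MobileNet');
  // Placeholder path; any local or remote image works here
  img = loadImage('images/bird.jpg');
}

function setup() {
  createCanvas(400, 400);
  // Classify the image, then display it on the canvas
  classifier.classify(img, gotResult);
  image(img, 0, 0, width, height);
}

// Callback that receives either an error or the classification results
function gotResult(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  // Each result has a label and a confidence score
  console.log(results);
  createDiv(`Label: ${results[0].label}`);
  createDiv(`Confidence: ${nf(results[0].confidence, 0, 2)}`);
}
```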

I assume this example project combines many machine learning technologies, such as deep learning, image processing, object detection, and so on. I think the pre-trained model in this example must involve a great deal of machine learning work. I will try to look into it in the future.

My output:

Questions:

One of the problems I encountered when trying this example, which I still don’t know how to solve, is that the program always seems to crash when I use an image from a local file. It fails to load the image and often gives a wrong classification. I still don’t understand why this is happening. The output of the program often looks like this when I use an image from my computer instead of one online:

My code:    

Everything seems to be fine when I use an online image with a URL, though…

Thoughts on potential future applications:

Even though there are already many applications of image classification algorithms similar to imageClassifier(), I still believe the great potential of this example hasn’t been fully exploited. Most applications of this technology focus on shopping or search, but I believe it can also be used in both the medical field and the art field. For example, maybe this technology could be used to classify different symptoms in patients’ medical images, like CT or ultrasound reports in hospitals, to aid the development of a future AI doctor… Or it could be used to help fight plagiarism in the art field, or power other creative art projects like “which picture of Picasso you look like the most”… The imageClassifier() example, along with other outstanding ml5.js examples, has shown me the great potential of ml5 in all walks of life.

Week 02: ml5js Experiment – Jinzhong Yu (jy2122)

INTRO

For me, ml5js is not a completely new thing. Last year, I started to contribute to several open source projects, including ml5js. So, this is my second time ’embracing’ this elegant library.

ml5.js is a front-end machine learning library that helps beginners get in touch with machine learning quickly. It is based on TensorFlow.js but cuts out most of the mysterious parts, making it a simple but useful tool for everyone. With this library, you may not be able to customize your own models (nodes, layers, tensors, etc.), but it is very easy to load pre-trained models and serve the intelligence of machine learning simply and beautifully.

INTEREST

When it comes to my interest in ml5, I will talk about two of my favourite networks: SketchRNN and GAN (in this case, I mean DCGAN). The former uses the creativity of an RNN to let the machine create a sketch stroke by stroke, and the latter uses a deep convolutional GAN to generate a picture on its own.
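To give a sense of how little code the latter takes, here is a rough sketch written from memory of the ml5 DCGAN example (the 'face' model name and the result.image field are assumptions on my part and may differ between ml5 versions):

```javascript
let dcgan;

function preload() {
  // Load a pre-trained DCGAN; 'face' is the model I believe the ml5 example ships with
  dcgan = ml5.DCGAN('face');
}

function setup() {
  createCanvas(200, 200);
  // Ask the network to dream up one image
  dcgan.generate(gotImage);
}

function gotImage(error, result) {
  if (error) {
    console.error(error);
    return;
  }
  // result.image should be a p5 image generated entirely by the network
  image(result.image, 0, 0, width, height);
}
```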

Although, with the limited computational power of the browser, the output is not always perfect:

cat generated by SketchRNN
picture generated by DCGAN

The first is generated by SketchRNN (what a lovely cat!). As for the second one (emmmm), I do not know what it represents. Maybe it’s the deeper thoughts of machines?

CONCLUSION

Introducing machine learning to the web is a significant movement and still needs a lot of effort. The web is the largest entry point to the whole internet and the widest gate between humans and machines. So, it could be both fun and meaningful if the web can be equipped with high-quality machine learning features.

Week 02 Assignment: ml5.js Experiment – Ziying Wang

I played with an ml5.js model, used together with p5.js, called “sketch-rnn”. Its idea is drawing with artificial intelligence. The trained model in sketch-rnn lets the user pick a category and start drawing first. After the user stops drawing (releases the left mouse button), sketch-rnn automatically completes the rest of the drawing based on the strokes the user made and turns it into the shape of the chosen category. It then continuously generates similar artworks until the user presses the “clear” button. The model aims to show that humans can collaborate with artificial intelligence on art. Currently, due to the limited models in its database, the drawings are rather ragged, but as more drawings are collected, sketch-rnn can probably create aesthetically valuable works together with humans in the future.
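That interaction maps onto ml5's SketchRNN wrapper roughly as follows. This is a hedged sketch based on my reading of the ml5 SketchRNN examples: I use the 'cat' model because I am not sure of the identifier for the Mona Lisa model, and the stroke fields (dx, dy, pen) are my understanding of the API and may differ by version.

```javascript
let model;                 // the SketchRNN model
let seedStrokes = [];      // strokes the user draws before the AI takes over
let previousPen = 'down';  // last pen state reported by the model
let x, y;                  // current pen position

function setup() {
  createCanvas(640, 480);
  background(255);
  // Load a pre-trained category; 'cat' is the model used in the ml5 example
  model = ml5.sketchRNN('cat', () => console.log('model loaded'));
}

function mouseDragged() {
  // Record and draw the user's own stroke
  line(pmouseX, pmouseY, mouseX, mouseY);
  seedStrokes.push({ dx: mouseX - pmouseX, dy: mouseY - pmouseY, pen: 'down' });
}

function mouseReleased() {
  // Once the user lets go, ask the model to finish the drawing from the seed strokes
  x = mouseX;
  y = mouseY;
  model.generate(seedStrokes, gotStroke);
}

function gotStroke(error, stroke) {
  if (error) {
    console.error(error);
    return;
  }
  // Draw the segment if the pen was down before this move
  if (previousPen === 'down') {
    line(x, y, x + stroke.dx, y + stroke.dy);
  }
  x += stroke.dx;
  y += stroke.dy;
  previousPen = stroke.pen;
  // 'end' means the model considers the drawing finished
  if (stroke.pen !== 'end') {
    model.generate(gotStroke); // ask for the next stroke
  }
}
```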

The following is what happened after I chose Mona Lisa as the category and drew a circle to start.

I then tried to draw something completely unconnected to any element in the category I chose; I wanted to see if the AI could cleverly reshape the default pictures in its database. For example, I chose Mona Lisa as my category again and drew a triangle first.

The first drawing turned out well; the AI cleverly used my triangle as the body of Mona Lisa. But then I found out that the triangle was no more than a lucky match with original resources hidden in the database.

The following drawings didn’t go well; sketch-rnn simply covered my triangle with “Mona Lisa” resources from its database, which made me assume that if the AI can’t find any element in its database similar to the user’s drawing, it will just draw a completely new picture to cover the original stroke.

It turns out that my assumption is not entirely right. Even though in the Mona Lisa example it covers and redraws, in many other trials I conducted later, it still tries to recognize the basic outline of my drawing and complete the work with a few more strokes. Sometimes, however, it’s hard to recognize what it’s drawing when I draw a huge mess first and it finishes by adding a few simple strokes.

I’ve never had experience training models before, but I tried to read its sample code.

Link: https://github.com/tensorflow/magenta-js/tree/master/sketch

It first loads a model for each category; in the sample code, the cat model is used as an example. The user’s pen is tracked to see whether it has started or finished drawing, the previous and current coordinates of the stroke are stored, and the model is set to its initial state. After each pen state is stored, it is fed into the model’s state, and the parameters of the model state are updated. The model then samples new coordinates, moves the pen to the coordinates it collects, and finishes the drawing according to the cat model’s resources.
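In code, my understanding of that loop looks something like the sketch below. It is heavily paraphrased from the magenta-js sketch demos: the model URL, the method names (zeroState, zeroInput, update, getPDF, sample, setPixelFactor) and the stroke format are all my reading of the repo and may not match the current API exactly, and the seed strokes here are made up for illustration.

```javascript
import * as ms from '@magenta/sketch';

// Pre-trained cat model; this URL is my guess at where the Quick Draw models are hosted
const model = new ms.SketchRNN(
  'https://storage.googleapis.com/quickdraw-models/sketchRNN/large_models/cat.gen.json');

// Hand-made seed strokes in the model's [dx, dy, penDown, penUp, penEnd] format
const userStrokes = [
  [10, 0, 1, 0, 0],
  [0, 10, 1, 0, 0],
];

const temperature = 0.45; // controls how adventurous the sampled strokes are
let rnnState;
let x = 0;
let y = 0;

model.initialize().then(() => {
  model.setPixelFactor(3.0);
  // Start from the initial state, then fold the user's strokes into it
  rnnState = model.zeroState();
  rnnState = model.update(model.zeroInput(), rnnState);
  for (const stroke of userStrokes) {
    rnnState = model.update(stroke, rnnState);
  }
  finishDrawing();
});

function finishDrawing() {
  // Sample one stroke at a time until the model decides the sketch is complete
  for (let i = 0; i < 500; i++) {
    const pdf = model.getPDF(rnnState, temperature); // distribution over the next pen movement
    const stroke = model.sample(pdf);                // pick the next [dx, dy, penDown, penUp, penEnd]
    if (stroke[4] === 1) break;                      // penEnd = 1: the drawing is finished
    if (stroke[2] === 1) {
      console.log(`line from (${x}, ${y}) to (${x + stroke[0]}, ${y + stroke[1]})`);
    }
    x += stroke[0];
    y += stroke[1];
    rnnState = model.update(stroke, rnnState);       // feed the sampled stroke back into the state
  }
}
```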

There are certain terms I don’t understand in this code; for example, I don’t understand how the amount of certainty functions here.

Additionally, I can’t locate the code where the model compares the shape of the user’s drawing to the ones in the database. Or is it just using the ratios calculated from the coordinates to match against the models?