Week 03: Track Your Head – Jinzhong Yu (jy2122)

PROJECT

DEMO: https://nhibiki-nyu.github.io/AIArts/Week3/dist/

SOURCE: https://github.com/nhibiki-nyu/AIArts/tree/master/Week3/

DESCRIPTION

For the Week 3 assignment, I made use of BodyPix to detect the position of the player’s face through the webcam. BodyPix is a network that segments a person’s outline from a picture, which also makes it possible to detect where the person is in the image.

So, in this project, I start the web camera to get a real-time picture and feed each frame to the network’s input in order to detect the real-time position of the player’s face. Then I integrate the position data with a game, Pong: the real-time face position acts as a controller for the paddle. When the player moves in front of the camera, the paddle shifts along with them.
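To give a sense of the idea, here is a rough sketch of the face-to-paddle mapping (not the exact demo code): it assumes the @tensorflow-models/body-pix script is loaded (exposing a global `bodyPix`), that `video` is a playing webcam `<video>` element, and the `paddle` object is my own invention.

```js
// Rough sketch: steer a Pong paddle with BodyPix face pixels.
// Assumes bodyPix global from @tensorflow-models/body-pix and a webcam video element.
const FACE_PARTS = [0, 1]; // BodyPix part ids: 0 = left_face, 1 = right_face

async function trackFace(video, paddle) {
  const net = await bodyPix.load();
  async function frame() {
    const { data, width } = await net.segmentPersonParts(video);
    // Average the x coordinate of every pixel labeled as part of the face.
    let sum = 0, count = 0;
    for (let i = 0; i < data.length; i++) {
      if (FACE_PARTS.includes(data[i])) {
        sum += i % width;
        count++;
      }
    }
    if (count > 0) paddle.x = sum / count / width; // normalized 0..1
    requestAnimationFrame(frame);
  }
  frame();
}
```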

[GIF: pong die]

LIMITATIONS

When I was building this project, I ran into two main problems:

  • the weak computational power of my old late-2013 13” MacBook Pro;
  • other objects in front of the camera can interfere with the detection.

For the first problem, I read a blog post from Google published earlier this year, which mentioned that with tfjs a 2018 15” MacBook Pro reaches a frame rate of about 25 fps. Accordingly, when I run the website on a newer laptop, it performs noticeably better.

As for the second one, I would like to introduce FaceNet, another network that can distinguish different faces in one picture, so that we can track just one face in the camera.

Week 03 Assignment: Trying Save/Load Model with Image Classification

This week’s assignment turned out to be a bit less successful than I hoped. I wanted to use the assignment as an opportunity to build on an earlier project I did using image classification. For that project, I loosely trained the model with images of the alphabet in American Sign Language (ASL). *Note that I’m not very knowledgeable about ASL and even worse at signing (I had to reference an alphabet guide), but here’s a video of the last project for reference:

Despite the fact that I only trained it with 24 of the 26 letters (J and Z require motion, while image classification training requires still images), it was incredibly time-consuming to retrain it every time I opened the project. When I originally did that project, I didn’t know that a save/load model had recently been developed, so I figured I should try to implement it here to make things more efficient.

I referenced Daniel Shiffman’s video on the Save/Load model, and the first part does seem to work: after training, the model.json and model.weights.bin files download and, when opened, look like the demonstration in the video. It’s only when I try to load them back into the program that it stops working. On localhost, it hangs at “loading model,” and my terminal shows this:

[Screenshot: terminal output from the failed model load]

I think there’s probably a relatively simple explanation for this, or something I’m overlooking, and I plan to keep working on it until it runs. If I can get it working, the save/load model would be extremely helpful for developing a larger project, especially if I move forward with something similar to the image classification model.
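For reference, here is a minimal sketch of the save/load flow I’m attempting, assuming the ml5.js feature-extractor classification setup from the video; the file paths and callback names here are my own, not the actual project code.

```js
// --- Training & saving sketch (ml5.js feature extractor) ---
const featureExtractor = ml5.featureExtractor('MobileNet', () => console.log('MobileNet ready'));
const classifier = featureExtractor.classification(video, () => console.log('video ready'));
// ...add training examples with classifier.addImage('A'), etc., then:
classifier.train((loss) => {
  if (loss == null) {
    classifier.save(); // downloads model.json + model.weights.bin
  }
});

// --- Loading sketch ---
// Both files must sit in the same folder and be served over HTTP, because the
// browser fetches model.weights.bin relative to model.json.
const loader = ml5.featureExtractor('MobileNet', () => {
  const classifier2 = loader.classification(video, () => {
    classifier2.load('model/model.json', () => console.log('model loaded'));
  });
});
```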

Code:

Training & saving model code: https://github.com/katkrobock/aiarts/tree/master/train_saveModel

Loading model code: https://github.com/katkrobock/aiarts/tree/master/loadModel

Reference:

Daniel Shiffman / The Coding Train video: https://www.youtube.com/watch?v=eU7gIy3xV30

Week 3 Assignment: Artificial UNintelligence :/

I thought it was interesting how the imageClassification program from class often gets an image wrong if it’s even slightly obscure, so I wanted to play with the idea of artificial “un”intelligence by coding an unsure, apologetic AI who’s just trying their best.

In order to do this, I changed the “gotResult” function from the original code by adding a variable, “resulting_label,” and numerous replace functions on the label, as well as text for the label and probability results.

The replace functions swap several letters for other letters that produce “dumber” results, like replacing u with oo and s with th. The results are some pretty unintelligent and unsure answers from a poor AI who’s just making their best guess on some confusing images…
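A hypothetical reconstruction of the modified callback might look like this; the exact swaps and the apologetic phrasing are my own guesses, following the letter swaps described above (the ml5.js image classifier passes results to this callback).

```js
// Assumed globals drawn to the canvas elsewhere in draw().
let label = '...';
let prob = '';

// Callback for ml5's imageClassifier: "dumb down" the top guess with letter swaps.
function gotResult(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  const resulting_label = results[0].label
    .replace(/u/g, 'oo')   // u -> oo
    .replace(/s/g, 'th');  // s -> th
  // Note: older ml5 versions call the score `probability` instead of `confidence`.
  label = 'umm... it could be a ' + resulting_label + '?';
  prob = "i'm like " + results[0].confidence.toFixed(2) + ' thure... thorry!';
}
```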

This project was challenging, especially getting started and figuring out how to edit a preexisting program in the first place, but I enjoyed thinking about ways to add personality to AI and exploring how even wrong results can produce a funny, interactive program. The AI programs we worked with in class can seem very robotic and lack personality or voice, so I wanted to humanize this one a bit by making the wrong answers feel acceptable because the AI had tried their best! I had to simplify my original idea into one I could code, but I think there are many possibilities for using machine learning technologies outside of their intended functions while still producing entertaining programs, and I’d love to go further with this in future projects!

My code: https://gist.github.com/jujucodes/df555376b27a34734a279dcdefe8cb63