MLNI – Final Project (Cherry Cai and Wei Wang)

Project Name: Control

Project Partner: Wei Wang (ww1110)

Code: link 

  • Concept

Nowadays, people are constantly being controlled by the outside world and acting against their own wishes. Some who are tired of being cooped up struggle to free themselves from this control, but finally accept their fate and submit to the pressure. Inspired by the wooden marionette, we came up with the idea of simulating this situation in a playful way using a physical puppet and machine learning models (bodyPix and poseNet).

  • Development

1. Stage 1: Puppet Interaction

In the beginning, we drew a digital puppet using p5 and placed it in the middle of the screen as the interface. 

For interaction, users can control the movement of a physical puppet's arms and legs using two control bars connected to them.

Puppet with control bars
Original Puppet

Physical Puppet

Using the machine learning model poseNet, the position of each body segment is detected, and based on the user's control, protest sounds and movements are triggered on the virtual puppet shown on the screen. Specifically, the virtual puppet's arm is raised according to how the user moves the physical one, and the same applies to the other arm and, of course, both legs.

To simulate the protest, we created an effect where, if an arm is raised up to the level of the puppet's eyes, that segment is thrown away in an attempt to escape control, and then regenerated after a few seconds. Here is the demo of this stage.
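A minimal sketch of this regeneration timing, assuming a hypothetical thrownAt timestamp alongside the rightArmisthrown flag that appears in the Coding section below:

// sketch: once a segment is thrown, bring it back a few seconds later
const REGROW_DELAY = 3000;   // milliseconds before the segment regenerates
let thrownAt = null;

function updateRightArm() {
  if (rightArmisthrown && thrownAt === null) {
    thrownAt = millis();                      // remember when it was thrown
  }
  if (rightArmisthrown && millis() - thrownAt > REGROW_DELAY) {
    rightArmisthrown = false;                 // regenerate the segment
    thrownAt = null;
  }
}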

2. Stage 2: Human Interaction

After building the general framework of our project, we moved on to revising the interface. After getting feedback, we decided to take advantage of bodyPix and redesign the interface using not p5 elements but images of the users themselves. We first designed a stage that guides the user to take a webcam image of their full body, including head, torso, arms, and legs. We placed an outline of a human body in the middle of the screen, and when the user successfully matches the pose, a snapshot is taken automatically.

Using bodyPix, the different segments of the user's body are saved based on the maximum and minimum x and y values of each body part. We then grouped the segments together, fixing the position of the head and torso while the hands and legs move according to the user's real-time position detected by poseNet. Here is the demo of this stage.
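A rough sketch of the automatic snapshot trigger, assuming a poseNet pose object and a hypothetical takeSnapshot() helper (the outline regions and thresholds are illustrative, not our exact values):

let matchFrames = 0; // consecutive frames in which the pose matches the outline

function checkPoseMatch(pose) {
  const nose = pose.keypoints.find(k => k.part === 'nose');
  const leftAnkle = pose.keypoints.find(k => k.part === 'leftAnkle');
  // the pose "matches" when the head and feet fall inside the outline regions
  const inOutline =
    dist(nose.position.x, nose.position.y, width / 2, height * 0.15) < 50 &&
    leftAnkle.position.y > height * 0.85;
  matchFrames = inOutline ? matchFrames + 1 : 0;
  if (matchFrames > 30) {        // held for about a second at 30 fps
    takeSnapshot();              // hypothetical helper that freezes the frame
    matchFrames = 0;
  }
}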

3. Stage 3: Puppet and Human Interaction

After developing both interactions, we combined them. Once the user takes the snapshot, others can use the puppet to control the user's movements; if the movements are too intense, the user's virtual image gets angry, screams some dirty words, and throws that part of its body away. However, since poseNet doesn't work very well on the puppet, we modified the puppet, which greatly increased the confidence score.

Modified Puppet

4. Stage 4: Further Development

We also continued to revise the interface using bodyPix, improving the image segmentation with pixel iteration. In this stage, the body segments are no longer extracted based on the maximum and minimum x and y values, but as the pure segment images. Due to time constraints, this version was not fully completed with the protest sounds and movements, so we excluded it from the final presentation. Here is a demo of this stage.
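A rough sketch of this pixel-iteration idea, assuming data is the per-pixel part-id array returned by segmentWithParts and iList is the 0/1 mask described in the Coding section below:

// copy only the pixels whose part id is wanted, so we keep the raw segment
// shapes instead of rectangular bounding boxes
function extractSegments(data) {
  video.loadPixels();
  img.loadPixels();
  for (let i = 0; i < data.length; i++) {
    const colorIndex = i * 4;
    if (iList[data[i]] === 1) {                        // wanted segment
      img.pixels[colorIndex + 0] = video.pixels[colorIndex + 0];
      img.pixels[colorIndex + 1] = video.pixels[colorIndex + 1];
      img.pixels[colorIndex + 2] = video.pixels[colorIndex + 2];
      img.pixels[colorIndex + 3] = 255;
    } else {
      img.pixels[colorIndex + 3] = 0;                  // transparent elsewhere
    }
  }
  img.updatePixels();
}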

  • Coding

The machine learning model we used for the first stage is bodyPix, and for the following stages, we used both bodyPix and poseNet. We did not encounter any difficulties extracting positions (the x and y values of the different body segments); however, we had some difficulties using those values.

1. Getting the user's body segment images using bodyPix

BodyPix detects both the front and the back surfaces of the human body. As the id list below shows, for instance, part ids 3, 4, 7, 8, 13, 15, 16, 19, and 20 are all back surfaces of the human body.

/*
PartId  PartName
-1      (no body part)
 0      leftFace
 1      rightFace
 2      rightUpperLegFront
 3      rightLowerLegBack
 4      rightUpperLegBack
 5      leftLowerLegFront
 6      leftUpperLegFront
 7      leftUpperLegBack
 8      leftLowerLegBack
 9      rightFeet
10      rightLowerLegFront
11      leftFeet
12      torsoFront
13      torsoBack
14      rightUpperArmFront
15      rightUpperArmBack
16      rightLowerArmBack
17      leftLowerArmFront
18      leftUpperArmFront
19      leftUpperArmBack
20      leftLowerArmBack
21      rightHand
22      rightLowerArmFront
23      leftHand
*/

Since we want to keep the interface as clean as possible, we only take the segments we need. Based on this idea, we constructed an array of 0s and 1s with length 24 (one entry for each of the body-part ids 0–23 that bodyPix can detect). A 1 at a given index means we take that segment and copy its image, while a 0 means that segment can be ignored.

let iList = [1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1];

Once we had the body segments we wanted, we subtracted the person's image out into those segments. We saved the minimum and maximum x and y values to bound each segment as a rectangle, copied the image inside the bounded area into a predefined square, and then scaled and oriented each square to its respective position.
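A condensed sketch of this bounding step, with segXs, segYs, and copySegment as illustrative names (p5's min() and max() accept arrays):

// bound one segment with the min/max x and y we collected, then copy that
// rectangle out of the snapshot into a predefined square
function copySegment(snapshot, segXs, segYs, squareSize) {
  const minX = min(segXs), maxX = max(segXs);
  const minY = min(segYs), maxY = max(segYs);
  const seg = createImage(squareSize, squareSize);
  seg.copy(snapshot, minX, minY, maxX - minX, maxY - minY,
           0, 0, squareSize, squareSize);              // scale into the square
  return seg;
}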

2. Triggering sound and movement

To trigger the protesting sound and movement, we added a time count. 

// time count
let countLW = 0; // leftWrist
let countRW = 0; // rightWrist
let countLA = 0; // leftAnkle
let countRA = 0; // rightAnkle

In order to play a sound as an alert before the movement is triggered, we wrote two if statements with different ranges of the time count. For instance, here is the code that triggers the movement and sound for the right arm. Whenever the y value of the right wrist is smaller than the y value of the right eye, the time count increases by 1 per frame. When the count reaches 3, the sound is played, and when it reaches 5, the movement is triggered.

// poseNet - check status for rightArm
if (rightWristY < rightEyeY && rightArmisthrown == false) {
  countRW += 1;
  // console.log("rightWrist");
}
if (countRW == 3 && rightArmisthrown == false) {
  playSound("media/audio_01.m4a");
}
if (countRW == 5 && rightArmisthrown == false) {
  rightArmisthrown = true;
  rAisfirst = true;
}
if (rightWristY > rightEyeY) {
  countRW = 0;
}

3. Movement Matching
 
To match the virtual image's movement to the physical movement, we decided to calculate the angles between the arms/legs and the shoulders/hips. We used p5's atan2() function (which returns the angle, in radians, from the positive x-axis to a given point) to calculate those angles from the x and y values we got from poseNet.

// radians calculation
radRightArm = -atan2(rightWristY - rightShoulderY, rightShoulderX - rightWristX);
radLeftArm = atan2(leftWristY - leftShoulderY, leftWristX - leftShoulderX);
// set legs to move within a certain range
radRightLeg = PI / 2 - atan2(rightAnkleY - rightKneeY, rightKneeX - rightAnkleX);
radLeftLeg = atan2(leftAnkleY - leftKneeY, leftAnkleX - leftKneeX) - PI / 2;

We used the translate() function to make the limbs follow the user's movement and rotate around the shoulders and hips. However, this coordinate system doesn't work for the throwing process, since the thrown segments rotate around their respective center points. Therefore, we designed another coordinate system and recalculated the translation center, rotating the segment's midpoint by the corresponding angle to find the new center point.

translate(width/2 - 100 * sin(radLeftArm/2), height/2 - 200 * cos(radLeftArm/2));
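For the normal (non-thrown) case, drawing one limb with this approach looks roughly like the sketch below (the offsets are illustrative, and leftArmImg stands for the extracted arm segment):

// rotate the left arm segment around the shoulder joint
push();
translate(leftShoulderX, leftShoulderY);   // move the origin to the shoulder
rotate(radLeftArm);                        // angle computed from poseNet above
image(leftArmImg, 0, -20);                 // draw the segment hanging off the joint
pop();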
  • Further Developments

Apparently, our project has a lot of room for improvement, in both the interaction and the interface. During the Final IMA Show, we also found that the snapshot was taken too easily and the whole project needed to be refreshed constantly. We would definitely like to add a countdown so users can sense how the interaction works and have enough time to strike the pose in front of the webcam. Restarting the project is another problem that needs to be fixed, so another gesture recognition should be added for restarting the process.
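One possible way to add that countdown, sketched with p5's millis() timer and a hypothetical takeSnapshot() helper:

let countdownStart = null;   // when the user first matched the pose

function updateCountdown(poseMatched) {
  if (poseMatched && countdownStart === null) {
    countdownStart = millis();              // start the 3-second countdown
  } else if (!poseMatched) {
    countdownStart = null;                  // reset if the pose breaks
  }
  if (countdownStart !== null) {
    const remaining = 3 - floor((millis() - countdownStart) / 1000);
    text(remaining, width / 2, height / 2); // show the countdown on screen
    if (remaining <= 0) {
      takeSnapshot();                       // hypothetical helper
      countdownStart = null;
    }
  }
}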

  • Reflection

We learned a lot through this process and gained a better understanding of both bodyPix and poseNet. Using a physical puppet really improved the user experience, and during the IMA Show, our project's concept resonated with many visitors.

Week 10 MLNI – Final Project Concept Presentation (Cherry Cai)

Control (Project Presentation)

  • Teammate: Wei Wang (ww1110)
  • Inspiration

Nowadays, students and workers are under too much pressure brought on by society. Many of us act against our wishes, like a marionette on its threads, arms dangling, floating at the mercy of the breeze. Some who are tired of being cooped up struggle to free themselves from the control, but finally accept their fate and submit to the pressure.

  • Interface

A wooden marionette placed in the middle of the screen

  • Interaction
    1. Two control bars operate the orientation of the marionette's body segments.
    2. Based on the user's control, the marionette will give out a protest, and the corresponding effect will be triggered (e.g. voice, movements, etc.).
    3. Segments will be thrown away in an attempt to get out of control if the instruction is not followed.
    4. The segment will be regenerated with a string that can, again, be controlled by the bars.
  • Machine Learning

KNN classification to recognize the orientation of control bars

Week 9 MLNI – Innovative Interface with KNN (Cherry Cai)

Traffic Light Controller

  • Inspiration

I was inspired by an online traffic controlling game. 

(https://www.fanfreegames.com/game/traffic-light-control)

The mechanism of this game is intuitive: users control the traffic lights by clicking the mouse so that cars coming from different directions don't bump into each other when passing through the intersection.

  • My project

My expectation was to create a traffic light controller implementing the concept of KNN classification: by recognizing input gestures, the traffic light changes accordingly. Understanding the KNN classification mechanism is definitely not easy; I did some experiments before creating my own project but still had a hard time manipulating the code. I reached out to my friend Wei, and she helped me a lot.

Input: real-time camera image

Neurons: three pre-loaded snapshots representing the three traffic lights. The snapshots are recorded in an array after pressing three specific keys on the keyboard. 

Interface: the traffic light drawn using p5

I encountered some problems with using the result and triggering the expected effect. Since I have three neurons, I was a little confused about how to call each of them out. I will solve this problem as soon as possible and improve the interface as well.
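A minimal sketch of how the three snapshots could be recorded and classified with ml5's KNNClassifier on top of a MobileNet feature extractor (the key bindings and labels are illustrative, not my exact setup):

let video, features, knn;
let currentLight = 'red';

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  features = ml5.featureExtractor('MobileNet');  // turns frames into feature vectors
  knn = ml5.KNNClassifier();
}

function keyPressed() {
  // press r / y / g to store a snapshot for each traffic light
  if (key === 'r') knn.addExample(features.infer(video), 'red');
  if (key === 'y') knn.addExample(features.infer(video), 'yellow');
  if (key === 'g') knn.addExample(features.infer(video), 'green');
}

function draw() {
  image(video, 0, 0, width, height);
  if (knn.getNumLabels() > 0) {
    knn.classify(features.infer(video), (err, result) => {
      if (!err) currentLight = result.label;     // use the label to light the drawing
    });
  }
}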

MLNI – Midterm Project (Cherry Cai)

Balance Between Nature and Human

  • Project Demo

I want to explore the balance between human society and nature. This concept is inspired by Zhengbo, an artist committed to human and multispecies equality, who recently exhibited his project Goldenrod in our gallery. I want to use a machine learning model to simulate a project that can raise people's awareness of protecting the environment.

For the interface, I was inspired by the project Transcending Boundaries by Teamlab. By projecting flowers and other natural elements onto the human body, it is not only beautiful to look at but also interactive.

  • Machine Learning Model

The machine learning model I used is bodyPix. Since bodyPix performs real-time person segmentation in the browser, it allows me to separate the different segments of the body and apply my idea of revealing the importance of keeping a balance between humans and nature. The body segments I used include the leftFace (id: 0), the rightFace (id: 1), and the torsoFront (id: 12).

  • Storyboard

Before initiating this project, I drew a storyboard. To imitate the human side, I first decided to use the real-time camera image from the webcam. For the nature side, I decided to use different ellipses to imitate leaves.

The interaction embedded in this project is simple. The interface changes according to the position of the user, and different effects are triggered when the user reaches certain positions.

  • Coding
    1. Hiding the background

In order to track the different segments of the human body, I needed to use

bodypix.segmentWithParts(gotResults, options);

However, it was not possible to apply a background mask to this model. As an alternative, I added an if statement so that only the needed segments are displayed, while every other pixel is painted white with its alpha set to 0.

if (data[index] != 1 && data[index] != 0 && data[index] != 12) {
   img.pixels[colorIndex + 0] = 255;
   img.pixels[colorIndex + 1] = 255;
   img.pixels[colorIndex + 2] = 255;
   img.pixels[colorIndex + 3] = 0;
}

  2.  Speed

As more and more elements were added to the project, the speed of detecting movement slowed down. After discussing with Moon, I made the following changes to my code.

First, I was calling the machine learning model in the draw function, so the model re-ran every time the new interface was drawn. Instead of tying it to draw, I moved the call into the gotResults function, so a new segmentation is only requested after the previous result comes back, and the project sped up a lot.
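Roughly, the pattern looks like this (a sketch reusing the bodypix, gotResults, and options names from elsewhere in this post; segmentation is an illustrative global holding the latest result):

// call segmentWithParts once when the model is ready, then again from its own
// callback, instead of calling it on every draw() frame
function modelReady() {
  bodypix.segmentWithParts(gotResults, options);   // first request
}

function gotResults(err, result) {
  if (err) {
    console.error(err);
    return;
  }
  segmentation = result;                           // latest result, used in draw()
  bodypix.segmentWithParts(gotResults, options);   // request the next frame
}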

Second, I reduced the resolution of the image by adding a scaling factor.

video.size(width/scaling, height/scaling);
img = createImage(width/scaling, height/scaling);

After these manipulations, the project ran much more smoothly, which improved the user experience.

3. Separating the torso

Since the model cannot distinguish the left and right sides of the torso, I needed to separate the body by tracking the minimum x position of the left face and applying it to the torso.

I first created an empty array called posXs = [ ]. After running the machine learning model, I push all the x positions of the left face into this array, then find the minimum value in the array and apply it to the torso. To limit the length, I also wrote an if statement that splices out the first element of the list once the length reaches the limit of 80.

if (posXs.length < 80) {
  posXs.push(x);
} else {
  posXs.splice(0, 1);
  posXs.push(x);
}
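Finding the dividing line then takes a single call to p5's min(), which accepts an array (splitX and isLeftHalf are illustrative names):

const splitX = min(posXs);      // smallest left-face x collected recently
const isLeftHalf = x < splitX;  // which half of the torso this pixel falls on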

  4. Pixelization

After realizing the concept of this project, I reconsidered the interface, since using the raw webcam image seemed a little inconsistent and distracting.

Inspired by peers and Moon, I decided to capture the colors from the webcam and use them for some pixel manipulation. I created another class for the human side and used the colors from the video.

let face = new Face(mappedX, mappedY);
r = video.pixels[colorIndex + 0];
g = video.pixels[colorIndex + 1];
b = video.pixels[colorIndex + 2];
face.changeColor(r, g, b);
faces.push(face);

  • Reflection

Overall, I think the final deliverable meets my expectations, and the interaction is intuitive and easy to understand. For future improvement, I will probably add more elements to imitate nature and, at the same time, make the human side more realistic. More movements can also be added to trigger additional effects in the future.

Week 5 MLNI – Interactive Portraiture (Cherry Cai)

Explore the Brain Function

For this assignment, I created an interactive portraiture that utilizes bodyPix, building on the pixel iteration and manipulation concept.

  • Inspiration

I was inspired by this picture and wanted to visualize the different functions of the left brain and the right brain. By tracking the positions of the left face and the right face, I changed the colors of the rectangles and labeled them with the corresponding functions. Since the webcam mirrors the direction of movement, I matched the functions of the right brain by tracking the left face, and vice versa.
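A rough sketch of this tracking step, assuming data holds one bodyPix part id per pixel of the scaled image img (the colors and bounding-rectangle approach here are placeholders for the actual brain-function labels):

// track where the left and right face appear, then draw a colored rectangle
// over each side (part ids: 0 = leftFace, 1 = rightFace)
let leftXs = [], leftYs = [], rightXs = [], rightYs = [];
for (let i = 0; i < data.length; i++) {
  const x = i % img.width;
  const y = floor(i / img.width);
  if (data[i] === 0) { leftXs.push(x); leftYs.push(y); }
  if (data[i] === 1) { rightXs.push(x); rightYs.push(y); }
}
if (leftXs.length > 0) {
  fill(255, 120, 120);   // warm placeholder color for the mirrored right brain
  rect(min(leftXs), min(leftYs), max(leftXs) - min(leftXs), max(leftYs) - min(leftYs));
}
if (rightXs.length > 0) {
  fill(120, 120, 255);   // cool placeholder color for the mirrored left brain
  rect(min(rightXs), min(rightYs), max(rightXs) - min(rightXs), max(rightYs) - min(rightYs));
}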

The link to the code