MLNI week 4 assignment

Shenshen Lei

sl6899

For this week's project I created a program that combines a p5.js sketch with PoseNet. The code generates random bubbles at the top of the canvas, and the bubbles fall freely. The camera recognizes the position of the user's nose, and a sharp triangle follows the nose so that a bubble breaks when the triangle's tip touches it. The code is simple, but I ran into some problems while writing it. I had put the "nose" logic inside the Particle class, which meant that each new bubble also generated a new triangle. This slowed the program down. Another problem was that I first put

this.nosex = p.position.x;

this.nosey = p.position.y;

outside the if statement, so the bubbles never disappeared. A lab assistant helped me solve the problem, but I am still a little confused about the logic of the code. I hope to create more programs that achieve more complex functions.
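Since I was confused about the logic, here is a minimal sketch of how I understand the working structure, assuming ml5.js PoseNet (the Bubble class and all numbers are illustrative, not my exact program): the nose position is stored once, in the pose callback, instead of inside every particle.

let video, poseNet;
let noseX = 0, noseY = 0; // one shared nose position, updated in the pose callback
let bubbles = [];

class Bubble {
  constructor() {
    this.x = random(width);
    this.y = 0;
    this.r = random(15, 30);
  }
  fall() { this.y += 2; }
  show() { noFill(); circle(this.x, this.y, this.r * 2); }
  touches(px, py) { return dist(px, py, this.x, this.y) < this.r; }
}

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.hide();
  poseNet = ml5.poseNet(video);
  // PoseNet reports poses through an event; update the nose here, once
  poseNet.on('pose', (poses) => {
    if (poses.length > 0) {
      noseX = poses[0].pose.nose.x;
      noseY = poses[0].pose.nose.y;
    }
  });
}

function draw() {
  background(220);
  if (frameCount % 30 === 0) bubbles.push(new Bubble()); // drop a new bubble
  for (let i = bubbles.length - 1; i >= 0; i--) {
    bubbles[i].fall();
    bubbles[i].show();
    if (bubbles[i].touches(noseX, noseY)) bubbles.splice(i, 1); // pop it
  }
  // a single sharp triangle whose tip follows the nose
  triangle(noseX, noseY, noseX - 10, noseY + 30, noseX + 10, noseY + 30);
}

With this structure only one triangle exists, and each bubble just checks its distance to the shared nose position.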

The link to my complete program:

https://drive.google.com/open?id=1PV5iZwnuoTm2d25K-i6EZGgSP7Oi7i8J

MLNI week 3 sketch

Shenshen Lei

sl6899

I created a sketch that detects the position of the mouse. While the mouse is pressed, a brush in the shape of a rotating polygon follows the mouse's movement. When the mouse is released, an eraser appears that can cover the drawing.
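A minimal sketch of the idea (the polygon() helper follows the regular-polygon example in the p5.js documentation; the sizes and the number of sides are illustrative):

let angle = 0;

function setup() {
  createCanvas(600, 400);
  background(255);
}

function draw() {
  if (mouseIsPressed) {
    // rotating polygon brush follows the mouse while it is pressed
    push();
    translate(mouseX, mouseY);
    rotate(angle);
    angle += 0.1;
    stroke(0);
    noFill();
    polygon(0, 0, 15, 5);
    pop();
  } else {
    // eraser: a brush in the background color covers the drawing
    noStroke();
    fill(255);
    circle(mouseX, mouseY, 40);
  }
}

// draw a regular polygon with n sides centered at (x, y)
function polygon(x, y, radius, n) {
  beginShape();
  for (let a = 0; a < TWO_PI; a += TWO_PI / n) {
    vertex(x + cos(a) * radius, y + sin(a) * radius);
  }
  endShape(CLOSE);
}

Because draw() never repaints the background, the brush strokes persist and the eraser simply paints over them in white.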

This is the video recording of the sketch:

https://drive.google.com/open?id=16ym1DuB8unpB11Mn-ib2QaON2JDwXxLC

I hope to create a more complex sketch board in the future and combine some machine learning programs with it.

p5.js sketch

Shenshen Lei

Initially, I tried to sketch a bubble-generating program in which, each time the mouse was clicked, a free-moving bubble appeared on the canvas. However, when I put the bubble-drawing code inside the mouseClicked() function, the newly created bubble did not move: mouseClicked() runs only once per click, so anything drawn there is painted a single time, while continuous motion has to be drawn frame by frame in draw(). So I changed my program to a game version: when the program runs, a free-moving ball appears on the screen. Pressing the up arrow increases the radius, while the down arrow decreases it; pressing any other key resets the ball to its starting radius. If the radius becomes smaller than 0 or larger than the canvas, the program stops.
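A minimal sketch of the game logic, with illustrative sizes and speeds (the ball moves in draw(), while the radius changes in keyPressed()):

let x, y, dx = 3, dy = 2;
let r;
const START_R = 25;

function setup() {
  createCanvas(600, 400);
  x = width / 2;
  y = height / 2;
  r = START_R;
}

function draw() {
  background(220);
  // move freely and bounce off the walls
  x += dx;
  y += dy;
  if (x < r || x > width - r) dx = -dx;
  if (y < r || y > height - r) dy = -dy;
  circle(x, y, r * 2);
  // stop when the ball vanishes or outgrows the canvas
  if (r <= 0 || r * 2 >= min(width, height)) noLoop();
}

function keyPressed() {
  if (keyCode === UP_ARROW) r += 5;
  else if (keyCode === DOWN_ARROW) r -= 5;
  else r = START_R; // any other key resets the radius
}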

The video recording of the sketch:

https://drive.google.com/file/d/1YvoSu5ToCa4h3q_UUq-2bWhPN3V7__lx/view?usp=sharing

MLNI week 2 case study

Visual Recognition – OrCam

Shenshen Lei (sl6899)

Visual Recognition:

Visual recognition is one of the most widely employed AI technologies today. There are plenty of visual AI applications, such as facial recognition, image search, and color recognition. Visual recognition also promotes the development of Zero UI. Our group found a typical and very useful product based on visual recognition technology: OrCam.

OrCam is a company that develops a series of products to help people who are blind or visually impaired. OrCam MyEye can recognize text and faces at a certain distance in front of the user and then read the content aloud. It can also recognize some body language, such as looking at one's watch, and tell the user what time it is. It improves users' reading efficiency and makes their lives more convenient.

However, we believe the product could be improved in several respects. First, the way to trigger the text-recognition function is to point at a word; otherwise the camera starts reading the whole page. Blind users may not know where a word is, so pointing-to-start sometimes does not work. Second, if the camera could track the user's pupils as they look at the text, users could get immediate feedback without pointing with a finger. Also, the recognition range of the camera is too limited; if the distance were extended, the device could help in more situations, such as recognizing traffic lights or warning the user earlier about obstacles. Finally, OrCam might also reconsider the outward design of the device.

Week 1 Machine Learning for New Interfaces case study (Shenshen Lei, sl6899)

Introduction to Deepfake

Academic research related to Deepfake lies predominantly within the field of computer vision, a subfield of computer science often grounded in artificial intelligence that focuses on computer processing of digital images and videos. An early landmark project was the Video Rewrite program, published in 1997, which modified existing video footage of a person speaking to depict that person mouthing the words contained in a different audio track.[6] It was the first system to fully automate this kind of facial reanimation, and it did so using machine learning techniques to make connections between the sounds produced by a video’s subject and the shape of their face.

Modern research focuses on creating more realistic and more natural images. The term "deepfake" originated around the end of 2017 from a Reddit user of the same name. Deepfake content was quickly banned by Reddit and other websites because people maliciously used the technique to create pornographic videos, and many celebrities were threatened by it. Deepfake then faded from the entertainment area.

How does Deepfake work?

Principle: train a neural network to restore a person's distorted face to the original face, in the expectation that the network will then be able to "restore" any face into that person's face.
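As a rough conceptual sketch only (this is not the actual faceswap code; I am using TensorFlow.js, and the layer sizes are made up), the classic deepfake setup trains one shared encoder with a separate decoder per person, each autoencoder learning to restore that person's distorted faces. Swapping then means decoding person A's encoding with person B's decoder:

import * as tf from '@tensorflow/tfjs';

// shared encoder: compresses a flattened 64x64 grayscale face (4096 values)
const encoder = tf.sequential({
  layers: [
    tf.layers.dense({ inputShape: [4096], units: 512, activation: 'relu' }),
    tf.layers.dense({ units: 128, activation: 'relu' }),
  ],
});

// one decoder per identity, mapping the 128-value encoding back to a face
function makeDecoder() {
  return tf.sequential({
    layers: [
      tf.layers.dense({ inputShape: [128], units: 512, activation: 'relu' }),
      tf.layers.dense({ units: 4096, activation: 'sigmoid' }),
    ],
  });
}
const decoderA = makeDecoder();
const decoderB = makeDecoder();

// autoencoder for one person: shared encoder + that person's decoder
function makeAutoencoder(decoder) {
  const input = tf.input({ shape: [4096] });
  const output = decoder.apply(encoder.apply(input));
  const model = tf.model({ inputs: input, outputs: output });
  model.compile({ optimizer: 'adam', loss: 'meanSquaredError' });
  return model;
}
const autoA = makeAutoencoder(decoderA);
const autoB = makeAutoencoder(decoderB);

// training: each autoencoder learns to turn distorted faces back into originals
// await autoA.fit(distortedFacesA, originalFacesA, { epochs: 10 });
// await autoB.fit(distortedFacesB, originalFacesB, { epochs: 10 });

// the swap: encode a frame of person A, then decode it with B's decoder
// const swapped = decoderB.predict(encoder.predict(frameOfPersonA));

Because both decoders share the same encoder, the encoding becomes identity-independent enough that decoder B can render B's face in A's pose and expression.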

Problems:

- The sharpness of the picture declines under the algorithm.

- It cannot recognize the face at some unusual angles.

- The quality of the generated faces depends heavily on the training material.

- The generated face cannot fit the body in videos.

Improvement: Progressive GAN

Progressive GAN is a new training methodology for generative adversarial networks developed by Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen.

On their project page, they describe the method as follows: "We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality."

Drawbacks: GAN training introduces many unpredictable factors.

Latest debate: ZAO

ZAO is an application that uses Deepfake technology to swap faces. It maps user-submitted frontal selfies onto clips from movies and TV programs, replacing the characters' faces and generating realistic videos. The application went viral within a few days. However, the public cast doubt on the user agreement and worried about the safety of their personal information. One provision in the agreement said that the portrait-rights holder grants ZAO and its affiliates "completely free, irrevocable, permanent, transferable and re-licensable rights" worldwide, including but not limited to the portrait rights contained in portraits, pictures, video materials, etc. Users had to agree to these terms before using the app. In China, pornography is strictly restricted, but the bigger issue is that facial payment is widely used there. Though the company quickly changed the agreement and claimed that facial-payment systems cannot be fooled by a photo, more potential threats remain.

Presentation: https://drive.google.com/file/d/1CaJ6DsXWfC9OZxWvS1J8ed-MKRM6PZ70/view?usp=sharing

Sources

https://www.bbc.com/zhongwen/simp/chinese-news-49589980

https://en.wikipedia.org/wiki/Deepfake#Academic_research

https://github.com/deepfakes/faceswap

https://arxiv.org/abs/1710.10196