Final Project Documentation – Katie

Background

I knew going into this project that I wanted to create something related to human vs. computer vision. Instead of trying to contrast the two, though, I thought it could be interesting to try to create a visual dialogue between them using videos and images. What came out of these efforts was object[i].className=="person", an art/photography booklet that deals with human subject matter, or objects classified as "person."

The theme of human vs. computer perception is something I've tried to explore throughout the semester. Initially, I was interested in What I saw before the darkness by AI Told Me, which shows what a neural network sees as its neurons are shut off one by one. I was also greatly inspired by the duo Shinseungback Kimyonghun, specifically the works FADTCHA (2013) and Cloud Face (2012), which both involve finding human forms in nonhuman objects.

Methodology

I started by using the ml5.js bodyPix() and poseNet() models to see where points of the body could be detected in outdoor spaces.

[image: test1]

This was fine as a test, but didn't provide any interesting or clear data that I felt I could use moving forward. I decided to switch to the ml5.js YOLO() model to capture full, bounded images that could be used as content. The webcam feed also produced very low-quality images, so I used video taken on a DSLR as input instead. This fixed the image quality, but introduced new issues with the detection itself: because the model couldn't process frames at the same speed as the video, it ran at a delay, and the bounding boxes became inaccurate. Slowing the videos down to 10% speed made everything run much more smoothly.

[image: testwalkway]

I ran into some challenges with the code as well, mostly around how to grab and save the images. It was fitting that the solution ended up being a combination of the p5.js get() and save() functions. The final code was relatively simple.
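A minimal sketch along these lines, assuming a clip named walkway.mp4 (a placeholder filename) and the result fields my ml5 version returned (normalized x/y/w/h plus a className label; other versions use different names):

```javascript
let video;
let yolo;
let objects = [];

function setup() {
  createCanvas(640, 360);
  // DSLR footage instead of the webcam, for better image quality
  video = createVideo('walkway.mp4', videoReady); // placeholder filename
  video.size(width, height);
  video.hide();
  yolo = ml5.YOLO(video, startDetecting);
}

function videoReady() {
  video.speed(0.1); // slow the clip to 10% so detection can keep up
  video.loop();
}

function startDetecting() {
  yolo.detect(gotResults);
}

function gotResults(err, results) {
  if (err) console.error(err);
  objects = results || [];
  yolo.detect(gotResults); // keep detecting frame after frame
}

function draw() {
  image(video, 0, 0, width, height);
  for (const obj of objects) {
    if (obj.className !== 'person') continue; // only care about "person"
    noFill();
    stroke(0, 255, 0);
    strokeWeight(2);
    // this ml5 version returns normalized coordinates (0-1)
    rect(obj.x * width, obj.y * height, obj.w * width, obj.h * height);
  }
}

function keyPressed() {
  // press 's' to crop the first detected person out of the frame and save it
  if (key !== 's') return;
  const o = objects.find(obj => obj.className === 'person');
  if (!o) return;
  image(video, 0, 0, width, height); // redraw a clean frame so the box isn't in the crop
  const crop = get(o.x * width, o.y * height, o.w * width, o.h * height);
  save(crop, 'person_' + frameCount + '.png');
}
```

Pressing 's' while the slowed-down clip plays saves the cropped detection; at 10% speed the boxes actually line up with the frame on screen.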

What came out of this was a lot of very interesting images. A few of these:

I then used my own vision and decisions to pull these images together into various compositions to form a book.

Further Development

I was really interested in this project and would love to continue working on it. I think what would benefit it most is to keep this conversation between human and machine about perception going, in whatever form that takes. One way would be to gather other people's opinions or thoughts on the subject and incorporate their words. Since I don't want the piece to make an argument or prove a theory, but rather to be an experiment, I wouldn't be too strict about what content to include, as long as it relates to perception and vision. Beyond that, it's difficult to say what I would do, since each step relies heavily on the results of the one before it.

Final code with videos:

Github

Sources:

What I saw before the darkness

FADTCHA

Cloud Face

ml5.js bodyPix

ml5.js poseNet

ml5.js YOLO

Final Project Concept – Katie

For my final project, I am focusing on the theme of human vs. computer perception. This is something I've tried to explore through my midterm concept and my initial plan of reconstructing humans from image classifications of their parts. When I talked with Aven, I realized there were other, less convoluted ways of investigating this that would let the work of the computer stand out more. He showed me examples from the duo Shinseungback Kimyonghun that follow these same ideas; I was most inspired by the works FADTCHA (2013) and Cloud Face (2012), which both involve finding human forms in nonhuman objects.

[image: fadtcha]

Both works expose situations in which a face-detection algorithm can find human faces where humans cannot. Whether that's because the CAPTCHA images are very abstract or because the clouds are fleeting doesn't matter; the difference in perception is exposed either way.

[image: cloud-face]

I wanted to continue with this concept by using a body-detection algorithm to find human forms in spaces where we cannot see them. Because I'm most familiar and comfortable with the ml5.js example resources, I started by using BodyPix for some initial tests, which was interesting for seeing what parts of buildings register as body segments, but didn't produce a clear result. Then I tried PoseNet to see where points of the body could be detected.

[image: test1]

[image: test2]

This was a little more helpful, but still had a lot of flaws. These two images were the shots with the highest number of detected body points (other shots had anywhere from one to four points, and nothing resembling a human shape), but that still doesn't seem concrete enough to use as data. I plan on using a different method for body detection, as well as a better-quality camera, to continue working toward the final results.
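For reference, a minimal sketch of this kind of test, assuming a still frame saved as test1.jpg (placeholder filename) and that this ml5 version accepts a p5.Image here (the official example uses a hidden img element, which works the same way). It draws and counts only the keypoints whose confidence clears a threshold:

```javascript
let img;
let poseNet;

function preload() {
  img = loadImage('test1.jpg'); // placeholder filename
}

function setup() {
  createCanvas(img.width, img.height);
  image(img, 0, 0);
  poseNet = ml5.poseNet(modelReady);
  poseNet.on('pose', gotPoses);
}

function modelReady() {
  poseNet.singlePose(img); // run detection once on the still image
}

function gotPoses(poses) {
  if (poses.length === 0) return;
  // keep only the keypoints the model is reasonably confident about
  const confident = poses[0].pose.keypoints.filter(k => k.score > 0.2);
  console.log(confident.length + ' keypoints above threshold');
  fill(255, 0, 0);
  noStroke();
  for (const k of confident) {
    ellipse(k.position.x, k.position.y, 10, 10);
  }
}
```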

Week 11 – Deepdream – Katie

I wanted to try out Deepdream for this assignment to get more familiar with it. I'm thinking more about the concept of perception for my final project, specifically the places where human and computer perception don't intersect. These outputs are usually considered training failures, but are they actually failures, or do they simply not align with human perception?

Anyway, I tried changing different parameters and ultimately came up with this video, a Deepdream treatment of Magritte's Golconda.

https://drive.google.com/file/d/16UhPfu7oLG9we59W-rJ68ZURRs6yEOWO/view?usp=sharing

I would still like to see more from Deep Dream as video output (this one is too many dogs), so I'd like to continue working on it and update later as I find new results.

Week 07: Constructing Humans – Midterm Progress – Katie

Background

The direction of my project changed from the original concept I had in mind. Originally I wanted to do a project juxtaposing the lifespans of the user (a human) and surrounding objects. Upon going through the ImageNet labels, though, I realized that there was almost nothing to describe humans, and that the model had not been trained on human images. There are a few human-related labels (scuba diver, bridegroom/groom, baseball player/ball player, nipple, harvester/reaper), but these rarely show up through ml5.js image classification, even when it is given a human image. Because of this, it would be impossible to proceed with my original idea without drastically restructuring my plan.

I had seen another project called I Will Not Forget (https://aitold.me/portfolio/i-will-not-forget/) that first shows a neural network's imagining of a person, then what happens as its neurons are turned off one by one. I'm not sure exactly how it works, but I like the idea of making an art piece out of what is already happening inside the neural network, without manipulating it too heavily. Combined with my ImageNet issue, this made me wonder what a machine (specifically, the ImageNet-based ml5.js models) thinks a human is. If it could deconstruct and reconstruct a human body, how would it do it? What would that look like? For my new project, which I would like to continue working on for my final as well, I want to create images of humans based on how different body parts are classified with ImageNet.

New Steps
  1. Use BodyPix with Image Classifier live to isolate the entire body from the background, classify (done)
  2. Use BodyPix live to segment human body into different parts (done)
  3. Use BodyPix with Image Classifier live to then isolate those segmented parts, classify (in progress)
  4. Conduct testing, collect this from more people to get a larger pool of classified data for each body part. (to do)
  5. Use this data to create images of reconstructed “humans” (still vague, still looking into methods of doing this) (to do)
Research

I started by messing around to figure out what I, as a human, was being classified as, and with what certainty.

Here I used my phone as well to show that the regular webcam/live-feed image classifier is unfocused and uncertain. Not only was it classifying objects across the entire frame, but its confidence was also relatively low (19% or 24%).

On the ml5.js reference page I found BodyPix and decided to try it to isolate the human body from the image.

[image: bodypiximageclassifier]

This worked not only to isolate the body, but also more than doubled the certainty. To get more certain classifications for these body parts, I think it will be necessary to at least separate them from the background.
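A rough sketch of that chain, pairing BodyPix with the MobileNet image classifier; I'm assuming the segmentation result exposes a backgroundMask image that covers everything except the person (property names differ between ml5 versions, so treat backgroundMask as an assumption):

```javascript
let video;
let bodypix;
let classifier;
let segmentation = null;

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  bodypix = ml5.bodyPix(video, bodypixReady);
  classifier = ml5.imageClassifier('MobileNet', () => console.log('classifier ready'));
}

function bodypixReady() {
  bodypix.segment(gotSegmentation);
}

function gotSegmentation(error, result) {
  if (error) return console.error(error);
  segmentation = result;
  bodypix.segment(gotSegmentation); // keep segmenting the live feed
}

function draw() {
  background(0);
  image(video, 0, 0, width, height);
  if (segmentation) {
    // cover the background so only the person is left on the canvas
    // (assumed property name; check your ml5 version)
    image(segmentation.backgroundMask, 0, 0, width, height);
  }
}

function keyPressed() {
  // press 'c' to classify the masked canvas instead of the raw webcam frame
  if (key === 'c') classifier.classify(get(), gotClassification);
}

function gotClassification(error, results) {
  if (error) return console.error(error);
  console.log(results[0].label, results[0].confidence);
}
```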

With BodyPix, you can also segment the body into 24 parts. This also works with live feed, though there’s a bit of a lag.

[image: bodypix_partsegmentation]

Again, to get readings for specific parts while cutting out background noise, BodyPix part segmentation needs to be used. The next step is to show only one or two segments of the body at a time while blacking out the rest of the frame. This leads into my difficulties.

Difficulties

I've been stuck on the same problem for a few days now, trying to figure out the code in different ways. I got some help from Tristan last week, and since we have different kinds of knowledge (he understands the code at a lower level than I do) it was very helpful. Still, this issue of isolating one or two parts and blacking out the rest is the part we couldn't fully figure out. For now, we know that the image is broken down into an array of pixels, each assigned a number that corresponds to a specific body part (0-23):
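A hypothetical sketch of what we were going for, assuming the per-pixel part IDs are reachable on the segmentation result; result.segmentation.data below is a placeholder for wherever your ml5 version exposes that array (it should be the same size as the frame, with -1 for background and 0-23 for body parts). Every pixel whose ID isn't in the chosen set gets blacked out:

```javascript
let video;
let bodypix;
let partData = null;

// part IDs to keep visible, e.g. 0 and 1 are the two face segments in BodyPix's numbering
const PARTS_TO_KEEP = [0, 1];

function setup() {
  createCanvas(640, 480);
  pixelDensity(1); // keep canvas pixels aligned 1:1 with the part array
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  bodypix = ml5.bodyPix(video, modelReady);
}

function modelReady() {
  bodypix.segmentWithParts(gotParts);
}

function gotParts(error, result) {
  if (error) return console.error(error);
  partData = result.segmentation.data; // placeholder field name, see note above
  bodypix.segmentWithParts(gotParts);  // keep segmenting the live feed
}

function draw() {
  image(video, 0, 0, width, height);
  if (!partData) return;
  loadPixels();
  // walk the part array and black out every pixel that isn't in PARTS_TO_KEEP
  for (let i = 0; i < partData.length; i++) {
    if (!PARTS_TO_KEEP.includes(partData[i])) {
      pixels[i * 4] = 0;     // R
      pixels[i * 4 + 1] = 0; // G
      pixels[i * 4 + 2] = 0; // B
    }
  }
  updatePixels();
}
```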

Conclusion

I have a lot more work to do on this project, but I like this idea and am excited to see the results that come from it. I don’t have concrete expectations for what it will look like, but I think it will ultimately depend on what I use to create the final constructed images. 

Week 06 – Midterm Project Concept – Katie

Background

Something I think about a lot: most people make an enormous effort to avoid death, and even things that vaguely remind them of death. I don't think this is a new phenomenon, but it seems that some people believe continuing technological advancement will help distance them from it by prolonging life, maybe in hopes of eluding death altogether. Many people spend exorbitant amounts of money to look younger, or to live what they consider healthier lives. There's no right or wrong in these decisions, but I think it's really interesting to step back and assess them in order to understand how our thoughts about death impact the way we live. For me, it's been even more beneficial to think about death on a regular basis, and to not necessarily view it as good or bad, but to accept its inevitability and maybe grow comfortable with it.

Project

I'm partly inspired by an app called WeCroak that sends notifications five times a day with quotes, essentially little reminders about death. For my project, I want to use machine learning to make something that simply reminds people that they (and everything around them) will die at some point. What I'd like to do is create a sort of image classification project, but instead of labelling objects with names, it will label them with lifespans.

Initially I thought it could be really interesting for it to guess the time remaining in an object's lifespan, but for that to work it would have to know both sides of: remaining time = lifespan − current age. I think this could be next to impossible to achieve, especially for inanimate objects. On top of that, even though there are beta versions of age-guessing programs for humans, I think they come with a lot of ethical problems before even being implemented into another project, in addition to already being inaccurate. There's something a little too extreme about a death countdown when all I want is to provide a gentle reminder.

Methodology

Providing just lifespans as labels means the project will mostly be a matter of manipulating datasets. I don't have much experience with that, which is why I think it would be a good opportunity to learn. Because doing this manually seems really tedious, and maybe impossible to finish in the time given, I'm going to research whether there is an API for objects' lifespans that I could weave in with the image recognition instead.

Reference

WeCroak app: https://www.wecroak.com