Final for Machine Learning

  • Title: 
    • A drawing of a full body beautiful girl, young, hyper-realistic, very detailed, intricate, very sexy pose, unreal engine, damatic lighting, 8k, detailed, black and white.
  • Sketch link:
    • https://editor.p5js.org/jiwonyu/sketches/o4rM_9URi
  • One sentence description
    • A mixed media project that explores interaction with traditional media; the goal of the painting is to portray the suffocating gazes that women face in the digital world.
  • Project summary (250-500 words)
    • This project is a multimedia art project that aims to create a powerful yet simple interaction intertwining traditional and new media. The primary purpose of the project is to highlight one dark side of Artificial Intelligence (AI) generated art. AI art continues to reinforce the unequal gender dynamic in today’s culture, further creating a very specific image of what a ‘woman’ should look like. Similarly, AI art often imposes a strict definition on adjectives such as ‘beautiful’ and ‘pretty’ by associating very specific physical traits with those terms. With this idea, the project focuses heavily on two topics: women and Asian fetishization. The project consists of two parts: 1. Drawing 2. Laptop. The first part is made of three papers, which combine to form the full figure of a woman lying down, staring at the audience. The background is a collage of screenshots of Lexica.art, an inventory of AI art made by users. The screenshots show what kinds of images are associated with each keyword, such as pink, red, women, asian, figure, body, etc. Viewers can see what prompts users on the internet have typed to generate each image. The second part is the laptop, which functions as a sensor that detects a face with its webcam. Before anyone is in the frame, an audio of a woman breathing plays. However, when a face is detected (i.e. when someone “looks” at the woman), the audio stops, and thus the woman stops breathing. The project aims to highlight women’s suffocating feelings of unwanted gaze in the digital world.
  • Inspiration: How did you become interested in this idea? Quotes, photographs, products, projects, people, music, political events, social ills, etc.
    • my project last year
    • Lexica.art
    • art is supposed to be agitating
    • social ills
  • Process: How did you make this? What did you struggle with? What were you able to implement easily and what was difficult?
    • Materials:
      • three charcoal papers
      • eraser
      • black and white charcoal
      • glue 
      • speakers
      • laptop
        • p5js
        • webcam
    • the biggest struggle while making this project was the ideation part, and the fact that my materials got stolen the day before the Friday presentation (12/8/2022). I reached out to IMA/ITP people on the floor through Discord, and I am planning to send a mass email in hopes of reaching a wider audience.
    • video of p5.js sketch:

       

    • day 1:
    • day 2:
    • day 3:
    • day 4:
    • day 5: finished!
    • for the show:
  • Audience/Context: Who is this for? How will people experience it? Is it interactive? Is it practical? Is it for fun? Is it emotional? Is it to provoke something?
    • the intended audience is users of text-to-image generators. I want to raise awareness of the different uses of AI-generated images. Clearly, AI art has many uses: it can quickly produce a mock-up sketch, provide a background image, and so on in a fast, efficient manner. I personally used text-to-image generation to create most of the visuals for my performance. Yet, as seen on Lexica.art, there are many other uses of AI art as well. Although I hope that text-to-image tools are not used to create lewd, sexual images of women, I think it’s worth pointing this out to the users.
    • I hope to stir discomfort with my art so that the impact and the impression are strong.
    • my laptop functions as a sensor that detects people’s faces (specifically, a person’s nose). When a face is detected, the audio of a woman breathing stops. When there is no one in the camera’s view, the painting starts breathing again.
  • User testing: What was the result of user testing? How did you apply any feedback?
    • People liked the idea of a breathing painting that stops breathing depending on the presence of the audience. I also got feedback that there is a cohesive theme across most of my works, which I really appreciated as a comment.
    • More user testing, as I will mention again in the last section, is something I still need.
  • Source code
      // p5.js + ml5.js sketch: PoseNet watches the webcam, and the breathing
      // audio is paused or resumed based on whether a nose keypoint is detected
      let myPoseNet;
      let video;
      let poseResults;
      let catSong; // the breathing audio (variable name left over from the class example)
      let paused = false;

      function preload() {
        catSong = loadSound("breath.wav");
      }

      function setup() {
        // from class example
        video = createCapture(VIDEO);
        video.hide();
        createCanvas(640, 480);
        fill(255, 0, 0);
        myPoseNet = ml5.poseNet(video, gotModel);
        textSize(100);
      }

      // from class example
      function gotModel() {
        myPoseNet.on("pose", gotPose);
      }

      function gotPose(results) {
        poseResults = results;

        // loop the audio whenever it is not flagged as paused
        if (!catSong.isPlaying() && paused === false) {
          catSong.loop();
        }
        // a pose came in, so clear the paused flag
        if (results && results[0]) {
          paused = false;
          console.log("detecting");
        }
      }

      function draw() {
        image(video, 0, 0, width, height);
        if (poseResults) {
          const nose = poseResults[0]?.pose?.nose;
          if (nose) {
            ellipse(nose.x, nose.y, 20, 20);
          }
          // no nose keypoint in the latest pose: pause the audio
          if (!poseResults[0]?.pose?.nose) {
            catSong.pause();
            paused = true;
            console.log("pausing");
          }
        }
      }

  • Code references: What examples, tutorials, references did you use to create the project? (You must cite the source of any code you use in code comments. Please note the following additional expectations and guidelines at the bottom of this page.)
  • Next steps: If you had more time to work on this, what would you do next?
    • As of the presentation that I’m giving on December 9th, the next steps are to actually finish the drawing and put it into the show.
      • To specify, I would first finish the figure in the drawing. Then, I would glue on the background that I printed out. Then, I would set it up with my laptop and rent lights and speakers from the ER. UPDATE: DONE!
      • I might spray the painting with a finishing spray, since charcoal drawings smudge easily. UPDATE: DONE!
      • I would also try to get more user feedback to figure out the best setting for the show. UPDATE: DONE!

Winter Show Documentation: 

    • art description used for the winter show:
    • video documentation (ft. my good friend Faith!). The audio stops when the user is in front of the camera.
    • some different angles of the project:
    •  
    • Reflection: 
    • Originally, I wanted to use an external webcam instead of the laptop’s built-in camera. However, the ER did not have an available webcam that I could check out before the show. Next time, I want to hide all physical interfaces.
    • During the show, I got feedback that it might be better to black out the screen so that the user is not looking at the laptop, and to make it more intuitive for the users to interact with/look at the drawing instead (thank you David Rios!). So I turned the brightness down to 0.

Final Proposal

  • Project title
    • TBD 
  • One sentence description: Can you summarize your idea in one sentence? Stick to the facts — what are you planning to make?
    • Physical oil pastel painting in direct collaboration with machine learning (Runway)
  • Project abstract: ~250 word description of your project.
  • Inspiration: How did you become interested in this idea?
    • I have always been interested in ways to make interactive media arts emphasize the ‘art’ aspect more than the technologies involved in the major. I think it’s more valuable and interesting to use technology to enhance a traditional art form, not to completely replace it.
    • Similarly, I did a project last year where I painted an oil pastel painting in live interaction with the audience. It was almost like a performative piece.
    • I want to push myself to use my traditional media skills (like playing the violin and painting) and use technology to enhance my traditional-medium projects. 
  • Visual reference: Drawings, photos, artworks, texts, or other media that relate to your idea.
  • Audience: Who are you making the project for? How do you expect your audience to interact with your piece? What will their experience be like?
    •  
  • Challenges: What is your biggest technical and/or conceptual challenge you anticipate?
    • I think the ideation part will be the most challenging.
  • Code sketches: This is not required but if you have sketches in progress share them as additional links for feedback.
    • n/a

Assignment 8

For this week’s assignment, I decided to play with Runway. 

  • Describe the results of working with the tool, do they match your expectations?

The results were definitely stunning, but only after playing around with the words quite a bit. I had a difficult time articulating what I wanted in words, and I found myself using Lexica as a tool to guide me through the process. Typically, I would search a keyword on Lexica, browse through the images, click on the one I liked, and look at the prompt that was used for that specific image. Then, I would paste it into Runway and tweak the words until it fit what I was looking for.

  • Can you “break” the tool? In other words, use it in a way that it was not intended for and what kinds of results do you get?

I’m not sure if this would count as me ‘breaking’ the tool, but I did have instances of Runway not showing me results because the prompt was against the guidelines. This happened when I used the words ‘corpses laying on the ground.’ I think this is similar to how there is a lot of nudity when someone types words like “anime girl.”

Another issue that I’ve been reading a lot about is art being stolen through AI generative art. On Twitter, I’ve read tweets from digital artists asking people not to use their art styles. Regardless of this plea, a lot of artists’ styles have been copied with AI art, which I think is an interesting and heated discussion that needs to be worked out.

  • Can you find any pro tips in terms of prompt engineering? Can you change your prompt to make the generated results better?

As mentioned in the beginning, using Lexica as a starting point was a great help. The following are some keywords that I found helpful when I was trying Runway:

  • highly detailed 
  • sharp point 
  • painted by (wanted artist)
  • with colors like (colors that I want) 
  • dimly lit (or any kind of words to indicate the lighting)
  • any atmospheric words like: 
    • ghostly
    • gothic
    • empty 
    • scary 
    • victorian
    • haunted

These words were obviously suited to my specific search; I was looking for an interior picture of a haunted Victorian dollhouse.

The results are the following: 

[generated images]

The prompt that I used is this:

“Large gothic victorian house hall with large chandeliers under the ceiling, broken victorian cups on the ground, huge speakers on the ground, and a throne in the middleon each side of the throne, horror movie, cyber-punk, psychidelic moonlight, artstation, detailed, colorful, futuristic, with pink color”

or some sort of a variation from this prompt. 

I am still tweaking words so I can get the one picture that I like to use as a background for one of my performances. 

Assignment 7

For this week’s homework, I wanted to recreate some sort of a filter. I decided to take a traditional Korean mask, also known as a Tal in Korea, as my inspiration. Below is a picture:

Tal and Talchum: Traditional Masks and Dramas of Korea

I wanted to specifically do the woman mask, which is the middle one with the white skin.

I originally wanted to draw all of my assets in Procreate and upload them to the sketch. However, when I tried to show the PNG files with image(), none of them showed up, so I had to manually code the shapes, such as ellipse, arc, and line, which was a bit of work.
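A stripped-down version of the approach is below; the real sketch uses many more shapes, and the two landmark indices (10 for the forehead, 152 for the chin) are approximate MediaPipe keypoint numbers, not exact values from my sketch.

    // simplified sketch: draw mask shapes on top of FaceMesh keypoints
    let facemesh;
    let video;
    let predictions = [];

    function setup() {
      createCanvas(640, 480);
      video = createCapture(VIDEO);
      video.size(width, height);
      video.hide();
      facemesh = ml5.facemesh(video, () => console.log("facemesh ready"));
      facemesh.on("predict", (results) => {
        predictions = results;
      });
    }

    function draw() {
      image(video, 0, 0, width, height);
      if (predictions.length > 0) {
        const mesh = predictions[0].scaledMesh; // 468 [x, y, z] keypoints
        const forehead = mesh[10];
        const chin = mesh[152];
        const faceHeight = dist(forehead[0], forehead[1], chin[0], chin[1]);
        // white oval roughly covering the face, like the Tal's white skin
        fill(255);
        noStroke();
        ellipse(
          (forehead[0] + chin[0]) / 2,
          (forehead[1] + chin[1]) / 2,
          faceHeight * 0.9,
          faceHeight * 1.2
        );
        // the cheeks, eyebrows, and mouth are more ellipse()/arc()/line() calls
      }
    }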

For the keypoints,  I used the following link: 

Below is the video of my mask working: 

https://drive.google.com/file/d/1B9sxxkFW6M-5abQA8sxchxpIRwBcIofl/view?usp=sharing

Below is the p5js sketch for FaceMesh:

https://editor.p5js.org/jiwonyu/sketches/mSJaQafYu

Assignment 5B

I decided to build on the DoodleNet sketch that was demonstrated during class. I attempted to make DoodleNet classify the webcam, but to be completely honest, I am not sure if I did it right, since the confidence scores were all below 30% and the guesses were not accurate (but I guess a lot of DoodleNet guesses are inaccurate).

In addition to this, I kept running into the problem of my web browser freezing. Every time I opened the camera, the browser froze before it could classify anything (this happened on October 19th).

However, on October 20th, I tried again, and although it was slow, my web browser did not freeze on me. 

To make the classification more accurate, I thought a black and white filter might help, since all of the DoodleNet samples are black and white and drawn with a big stroke.

I thought that maybe I should just use filter(), but that did not have enough contrast. Then, I used loadPixels() to create a pixelated look on the webcam.
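Roughly, the black-and-white conversion looked like the sketch below (a simplified version; the block size and brightness threshold here are picked by eye, not the exact values from my sketch):

    // simplified sketch: grayscale + pixelate the webcam, then classify the
    // canvas with DoodleNet
    let classifier;
    let video;
    let cnv;

    function setup() {
      cnv = createCanvas(320, 240);
      video = createCapture(VIDEO);
      video.size(width, height);
      video.hide();
      classifier = ml5.imageClassifier("DoodleNet", modelReady);
    }

    function modelReady() {
      classifier.classify(cnv, gotResults);
    }

    function draw() {
      video.loadPixels();
      const blockSize = 8; // big blocks to mimic a thick doodle stroke
      noStroke();
      for (let y = 0; y < height; y += blockSize) {
        for (let x = 0; x < width; x += blockSize) {
          const i = (y * video.width + x) * 4;
          const bright = (video.pixels[i] + video.pixels[i + 1] + video.pixels[i + 2]) / 3;
          fill(bright > 120 ? 255 : 0); // hard threshold for extra contrast
          rect(x, y, blockSize, blockSize);
        }
      }
    }

    function gotResults(error, results) {
      if (error) return console.error(error);
      console.log(results[0].label, nf(results[0].confidence, 1, 2));
      classifier.classify(cnv, gotResults); // keep classifying
    }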

With the black and white, pixelated image, the top guesses from the classifier were the following: 

  1. lion 
  2. mona lisa 
  3. mustache

I still wasn’t sure if the classifications were correct because the confidence scores were still all below 30 percent. 

I think Mona Lisa, if anything, is the most accurate classification of myself, but I could see how all of these classifications were derived from my image: they’re all somewhat related to facial features. 

Lion has two eyes, a nose, and a mouth, and so does the Mona Lisa. A Mustache may not have these individual facial features, but it’s on a face, so people might’ve drawn the eyes and the nose to go along with the mustache. 

I commented out a lot of things that might slow down the sketch, since my laptop couldn’t handle too much code. Below is the link to the sketch:

https://editor.p5js.org/jiwonyu/sketches/eY7aPSRkM

Week 6 Homework

  • Something you find online. For example, take a look at Kaggle, awesome datasets, or this list of datasets.
    • https://www.registry.jockeyclub.com/registry.cfm?page=releasedNames&CFID=75624210&CFTOKEN=146476e202081137-69D31F18-5056-BE0C-977E746CB080EB2A 
    • Above is a dataset of horse names registered with The Jockey Club. I particularly enjoyed this dataset because, to be honest, it was funny data. The names are often comical, and they remind me of names that are generated by programs such as “random name generators.”
    • I think I was also drawn to this dataset because I am taking an ITP class called Alter Egos, and we have to pick an alter ego of ourselves and name it; the horse name dataset reminds me of the names that people could use. I also feel like a lot of these names sound like cartoon characters and/or superheroes, which I enjoy. 
    • I’m glad most of the names are not human-like (ex: John Smith, Sarah Meyers, James Brown, etc.). I think having such literal names would throw a lot of people off. 
  • Find a dataset that you collect yourself or is already being collected about you. For example, personal data like steps taken per day, browser history, minutes spent on your mobile device, sensor readings, and more.
  • Below are a few that I could think of off the top of my head:
  • screen time – collected automatically
  • browser history – collected automatically
  • heart rate monitor on the treadmill – collected automatically
    • miles walked 
    • calories burnt 
    • time stamp of how long the workout was
  • search history -collected automatically
  • youtube history – collected automatically 
  • what I eat – manually tracked by me
    • sometimes calorie intake too
  • what I like/save on social media – collected automatically (and manually in a way, since I am actively liking them)
  • my weekly hours for work – clocking in hours. – collected manually
  • duty logs (I need to write down all the interactions I have with the dorm residents in a log) – collected manually 
  • card usage – collected automatically

 

 

Week 5 Homework

how can machine learning support people’s existing creative practices? Expand people’s creative capabilities?

  • Watching Fiebrink really opened up my scope for machine learning design. Her creation, Wekinator, made the machine learning process seem so easy, especially compared to the complex, mathematical coding that would be required without a machine learning algorithm. I think machine learning can be much more accessible to users of all kinds of backgrounds; it is more hands-on and has a learn-as-you-play-around style (in my opinion). On the other hand, hard coding can have a learning block (also in my opinion, haha). I really resonated with what Fiebrink said when she was doing the demo of Wekinator. As she was creating the demo, she pointed out that what she was doing was cool, but already possible with other types of interfaces. Then, she added another layer with Blotar, turning hand gestures into an instrument that creates a fun, bizarre sound that no other instrument can make.
  • I also liked the example of the tree bark instrument; it reminded me of a performance piece by Spencer (ITP class of 2022) that was shown at the last spring show. I really love how machine learning can help humans forget about the STEM aspects (not entirely), and allows users to emotionally, physically, and mentally engage with their projects.
  • I also loved this talk because it made me want to use the software that Fiebrink mentioned for one of my class finals (maybe two!).

dream up and design the inputs and outputs of a real-time machine learning system for interaction and audio/visual performance. This could be an idea well beyond the scope of what you can do in a weekly exercise.

  • As I mentioned before, I think Fiebrink’s programs do a really nice job of making machine learning approachable, and it made me want to use her programs for one of my finals (the ITP class Alter Egos).
  • As an input, it would be cool to use hands as the manipulator. Although this seems super cliché, for the Alter Egos performance the staging will be pretty dark and I will have a costume on, which means I can’t use my face as a source of input. It might be interesting to have body posture/movement as an input as well, but realistically speaking, I’m not sure how accurately the camera would capture the movements in the dark. I also thought about doing eye tracking, but I want to wear white lenses that completely cover my irises, so I’m not sure if that would be a good idea. Maybe I could do open eyes vs. closed eyes?
  • As an output, I think audio would be cool and complementary to my class (because we are learning how to do audio/visual manipulation). However, if I could get this to work, I think having the lighting change as an output would be a really cool concept as well.
  • Just like a theremin, I also think it’d be fascinating to have each hand represent an output. For example, the left hand would control volume, while the right hand would control pitch; a rough sketch of this idea is below.
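  • A rough proof-of-concept sketch of the theremin idea (ml5’s handpose only tracks one hand, so here a single hand’s x position controls pitch and its y position controls volume; this is not my final setup):

    // rough proof-of-concept: one hand steering a p5 oscillator like a theremin
    let handpose;
    let video;
    let predictions = [];
    let osc;

    function setup() {
      createCanvas(640, 480);
      video = createCapture(VIDEO);
      video.size(width, height);
      video.hide();
      handpose = ml5.handpose(video, () => console.log("handpose ready"));
      handpose.on("predict", (results) => {
        predictions = results;
      });
      osc = new p5.Oscillator("sine");
      osc.amp(0);
      osc.start();
    }

    function mousePressed() {
      userStartAudio(); // browsers need a user gesture before audio can play
    }

    function draw() {
      image(video, 0, 0, width, height);
      if (predictions.length > 0) {
        // landmark 0 is the wrist; each landmark is [x, y, z]
        const [x, y] = predictions[0].landmarks[0];
        fill(255, 0, 0);
        ellipse(x, y, 20, 20);
        osc.freq(map(x, 0, width, 220, 880)); // left/right controls pitch
        osc.amp(map(y, 0, height, 1, 0), 0.1); // up is louder, down is quieter
      } else {
        osc.amp(0, 0.3); // fade out when no hand is visible
      }
    }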

Create your own p5+ml5 sketch that trains a model with real-time interactive data. This can be a prototype of the aforementioned idea or a simple exercise where you run this week’s code examples with your own data

‘automatic’ camera that captures the video when the user makes a ‘v’ sign: 

https://editor.p5js.org/jiwonyu/sketches/RB0AYLhUo

  • I thought it would be cute to have a camera that detects when a user makes a common hand gesture in pictures, the “v” sign, because it’s annoying to set up a self-timer. A rough outline of the trigger logic is sketched below.
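  • One way to structure the trigger logic, assuming the gesture model has already been trained and exported (for example with Teachable Machine); the model URL, the “v” label, and the confidence threshold are placeholders, not my actual model:

    // outline of the "v sign" auto-camera: classify the webcam continuously and
    // save a snapshot when the classifier is confident it sees a "v"
    // (the model URL and the "v" label are placeholders)
    let classifier;
    let video;
    let lastShot = 0;

    function preload() {
      classifier = ml5.imageClassifier("https://teachablemachine.withgoogle.com/models/XXXX/model.json");
    }

    function setup() {
      createCanvas(640, 480);
      video = createCapture(VIDEO);
      video.size(width, height);
      video.hide();
      classifyFrame();
    }

    function classifyFrame() {
      classifier.classify(video, gotResult);
    }

    function gotResult(error, results) {
      if (error) return console.error(error);
      const top = results[0];
      // debounce: take at most one picture every 3 seconds
      if (top.label === "v" && top.confidence > 0.9 && millis() - lastShot > 3000) {
        saveCanvas("photo", "png");
        lastShot = millis();
      }
      classifyFrame();
    }

    function draw() {
      image(video, 0, 0, width, height);
    }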

my attempt to make data collection more discreet, instead of pressing 1 and 2:

https://editor.p5js.org/jiwonyu/sketches/u6k0FIEP9

  • I really couldn’t think of an elegant way to collect data, so I thought that I could ‘start’ collecting data when there was noise (using p5 sound volume) vs. when there wasn’t. For example, if the noise level is below 10 (mapped between 0-100), the sketch would collect category 1 data (whatever that may be), while if the noise level is above 10, it would collect category 2 data (this can also be anything). In my imagination, category 1 could be my ‘others’ category, while category 2 could be my specific data group (which I hadn’t decided yet).
  • However, nothing was being read, and sometimes when I ran the sketch the browser did not even ask to use the mic, so I wasn’t sure how to troubleshoot that. Roughly, the gating idea is sketched below.
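  • A simplified version of that gating idea (the inputs being collected are placeholders, here just the normalized mouse position, and the ml5 neuralNetwork setup follows the class examples):

    // sketch of the idea: use mic volume instead of the 1/2 keys to decide
    // which label new data points get (the inputs themselves are placeholders)
    let mic;
    let nn;

    function setup() {
      createCanvas(400, 400);
      mic = new p5.AudioIn();
      mic.start(); // the browser may also require a click + userStartAudio() first
      nn = ml5.neuralNetwork({ inputs: 2, outputs: 2, task: "classification" });
    }

    function draw() {
      background(220);
      const level = map(mic.getLevel(), 0, 1, 0, 100); // volume mapped to 0-100
      const label = level < 10 ? "category1" : "category2";
      // placeholder inputs; in a real sketch these would be keypoints, colors, etc.
      nn.addData([mouseX / width, mouseY / height], [label]);
      fill(0);
      text("volume: " + nf(level, 1, 1) + "   label: " + label, 10, 20);
    }

    function keyPressed() {
      if (key === "t") {
        nn.normalizeData();
        nn.train({ epochs: 50 }, () => console.log("training done"));
      }
    }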

Improve the handPose example we built in class https://editor.p5js.org/yining/sketches/dX-aN-8E7

handpose p5 sketch with all five fingers: 

https://editor.p5js.org/jiwonyu/sketches/JapMciUty

 

 

Week 4 Homework

Pick one of the models above (PoseNet, HandPose, UNET, BodyPix, CoCoSSD) and, following the examples and ml5.js documentation, experiment with controlling elements of a p5.js sketch (color, geometry, sound, text) with the output of the model. (You may also choose a ml5.js model not covered here if you like!)

Considering the Model and Data Biography, reflect on the following questions:

      • What questions do you still have about the model and the associated data? Are there elements you would propose including in the biography?
        • I think the biography should include another ‘who’ section about who first labeled/annotated the image data. It would be helpful to know brief biographies of the people who make up the labelers. For example, the model and data biography could show a rough pie chart of the labelers’ age, gender, race, background, and financial background (but not limited to these).
        • I think having this understanding of the population makeup can help users see where a bias may lie in the model.
        • ex) Cable News On-Air Demographics. Exploring relative data in ...
        • How does understanding the provenance of the model and its data inform your creative process?
          • I think it challenges me to really think about what I put in my dataset. Oftentimes, I only use myself as the data (especially in Google Teachable Machine). This is obviously bad, because as the only creator and user (especially for small projects), I am inclined to make a model that satisfies my standards only. Understanding the origin of the model and its data shines a light on the importance of ‘unbiased’ data collection and labeling, though I am not sure how ‘unbiased’ a man-made product can actually be. Nevertheless, I think it’s important to collect data from diverse sources and get user testing/feedback to improve the current data collection.

Week 3 Homework

Reflect on the relationship between labels and images in a machine learning image classification dataset? Who has the power to label images and how do those labels and machine learning models trained on them impact society?

The labeling and categorization of images demonstrate many political, cultural, and historical biases in the human brain. First, we do not know exactly how the image inputs were put into the system. Were they picked by computers, or were they manually put in by the humans and coders who might’ve made the machine learning model? Did a company hire three random people off of Craigslist? Whichever the method, there’s clearly something wrong with every choice. For example, if the inputs were chosen by a computer, how were they chosen? Was there consent? If humans chose the inputs, the dataset is going to carry those people’s specific biases. None of the inputs can truly be ‘objective.’

One of the things I did not recognize as a problem until reading the passage was how binary the gender categorization is. I think machine categorization is flawed in that it is too black and white in all of its processes. Unlike humans, who can make case-by-case decisions (and therefore have the ability to make exceptions), machines cannot do that unless they have been programmed specifically to make that specific exception.

Machine learning and image classification are impacting today’s society more than ever before. With technologies like facial recognition, handwriting recognition, voice recognition, etc., classification models are being utilized at both the corporate and federal levels. This may speed up a lot of processes, but it definitely has many flaws. One of the most important flaws is that it actually reinforces bias and discrimination against many minority groups. For example, with skewed methods of categorizing who is a criminal versus who is not, machine learning algorithms make heavy racial, financial, and gender-based assumptions, further institutionalizing discrimination.

Train your own image classifer using transfer learning and ml5.js and apply the model to an interactive p5.js sketch. You can train the model with Teachable Machine or with your own ml5.js code. Feel free to try sound instead of or in addition to images. You may also choose to experiment with a “regression” rather than classification.

https://editor.p5js.org/jiwonyu/sketches/MAP33RLgI

My attempt was to make the gif go faster as the user says the word “hurry!” to the computer. I used the gif’s delay() function from the p5.js library.

The model that I trained was able to detect background noise vs. “hurry,” but it’s not very accurate, in that the model will categorize other random words as “hurry.”

For my sample, I recorded about 50 seconds of me saying “hurry” in low, high, slow, and fast fluctuations.

I also had a 50-second sample of background noise.

What didn’t work in my code was changing the gif’s speed with the number that was passed in:

gif.delay(x);

I console logged the x value (which was counting down from the initial value of 1000 with x--), and the x value was decreasing as I wanted. However, the gif itself was not going faster as the x value decreased, which I was not able to fix.

Lastly, I got the gif from giphy.com.
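For reference, the structure I was going for, in simplified form (the Teachable Machine model URL below is a placeholder, not my actual model):

    // simplified version of the idea: a sound model listens for "hurry" and
    // shortens the gif's frame delay each time it hears the word
    let classifier;
    let gif;
    let frameDelay = 1000; // starting frame delay in ms

    function preload() {
      classifier = ml5.soundClassifier("https://teachablemachine.withgoogle.com/models/XXXX/model.json");
      gif = loadImage("hurry.gif"); // the animated gif from giphy.com
    }

    function setup() {
      createCanvas(400, 400);
      classifier.classify(gotResult); // listens continuously
    }

    function gotResult(error, results) {
      if (error) return console.error(error);
      if (results[0].label === "hurry" && frameDelay > 100) {
        frameDelay -= 100;
        gif.delay(frameDelay); // sets the delay for every frame of the gif
        console.log("new delay:", frameDelay);
      }
    }

    function draw() {
      background(255);
      image(gif, 0, 0, width, height);
    }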

 

Week 2 Homework

Explore ImageNet, the ImageNet sample images, and the Kaggle ImageNet Mini 1000. What surprises you about this data set? What questions do you have? Thinking back to last week’s assignment, can you think of any ethical considerations around how this data was collected? Are there privacy considerations with the data?

I was surprised by the simplicity of the sample images. For example, in the ImageNet sample images on GitHub, there were often images with just a part of an animal (like the hen and the cock). Most also had a simplified or blurred background to focus on the object even more. Some even had animals in an unnatural environment; the jellyfish sample image was an artificial jellyfish in a fish tank. I thought this contradicted what most people would use as an input to these ML algorithms. Additionally, I thought it was noteworthy that there was only one sample image per object, which would greatly decrease the confidence level of the algorithm. Also, for the category of “groom,” there were multiple people in the image, making accurate learning difficult. I am also questioning to what extent the data will skew against or for a specific race, gender, age, etc. For example, the “groom” category shows a family of East Asian people. How would this affect the outputs when a user uses a Black family as an input? What about two grooms getting married? What if it’s interracial?

Using the ml5.js examples above, try running image classification on a variety of images. Pick at least 10 objects in your room. How many of these does it recognize? What other aspects of the image affect the classification, including but not limited to position, scale, lighting, etc.

I was surprised to see that the ML was able to classify cowboy boots correctly. Not only was I shocked by the fact that it recognized a shoe, but I was more shocked that the ML had a specific category for “cowboy boots.”

I also thought it was funny how some objects were identified as two different things depending on the distance and the angle of the object. For example, an orange was easily spotted when it was up close, but once it got far away, it was read as a computer mouse.

Doing this exercise made me realize how limited each database can be. I am thinking that it might be better to have more specific image identifiers, so that the data samples can be bigger. For example, there could be an image identifier for JUST bird eggs. With this identifier, the creator could have more image samples per egg type, instead of having 1-2 images that represent a specific type of egg in a broader image identifier.