For this assignment we were asked to create a digital, audio-visual, sample-based instrument. I found five different sounds on freesounds.org and the London Philharmonia Orchestra's website, and for each sound I designed a visual, animated representation, first sketching it on paper. The instrument and its animations were created using p5.js. According to the composer Edgard Varèse, mentioned in this week's reading, "music is organized sound." I wanted to experiment with this concept of music as organized sound by creating an instrument out of an object we use every day: the elevator. By taking sounds from an elevator and mixing them with other instruments, we can generate unique sound in an interactive way.
Process
First, I found the sound files I wanted to use for this project. I found elevator music on freesounds.org, which served as the inspiration for the theme. It loops very well, which makes it great as background music. The image below shows a preliminary sketch of my idea on paper.
I added if statements that detect when a given audio clip is playing and trigger the corresponding animation: the keyPressed function plays the audio file, and the draw loop tests that condition. To change the background color, I used the random function and an interval timer. To roll the ball across the floor, I mapped the audio clip's playback progress to horizontal movement. When the elevator ding plays, the inside of the elevator is drawn until the clip ends. Finally, I created a balloons object to keep track of the location, color, and speed of each balloon, recalculated on each draw loop.
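The draw-loop logic above can be sketched roughly like this (a minimal sketch with names of my own choosing, not the actual sketch code):

```javascript
// Rough sketch of the animation logic (illustrative names). The ball's
// x position follows the clip's playback progress, and each balloon
// tracks its own position, color, and speed, updated every frame.

// Map a clip's playback progress (0..duration) onto the canvas width.
function ballX(currentTime, duration, canvasWidth) {
  return (currentTime / duration) * canvasWidth;
}

// One balloon; update() is called once per draw() frame.
class Balloon {
  constructor(x, y, speed, color) {
    this.x = x;
    this.y = y;
    this.speed = speed;
    this.color = color;
  }
  update() {
    this.y -= this.speed; // float upward a little each frame
  }
}
```

In the actual p5.js sketch, `keyPressed()` calls the sound file's `play()`, and `draw()` checks `isPlaying()` before drawing the matching animation.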
Reflection
This first assignment was interesting because we got to create our own instruments using just audio clips. With programming, it is easier than ever to create digital instruments from sound files, and there are endless possibilities for musicians and artists to explore with the technology we have today. I hope to learn more about the intersection of music and programming in this class.
During this meeting, we spent about an hour talking in depth about her life and interests. Last time, we had asked her to pay attention to her daily life and take notes whenever she felt something was not accessible enough for her, so we discussed those notes this time.
Bikes
She lives very far from her school; her trip home by metro takes about two hours every day. Worse, the metro station is not conveniently close to her home, so she has to walk about 10-20 minutes to the station every day, which is a heavy burden for her: her legs often hurt after walking a comparatively long distance. Therefore, she and her father want her to learn how to ride a bike. However, because of cerebral palsy, her balance is somewhat impaired, so we anticipate that she might have trouble keeping a bike stable. So far, she has never tried to learn to ride. I suggested that she try learning with protective measures in place. It may be possible to design some mechanism, or a bike, that is accessible to her.
Hobbies
She is very fond of traditional Chinese painting. She takes part in a club at school and receives training there. Since one of her hands is fully functional, she draws very well and is praised for it at school! In addition, she is also trying other hobbies like photography and ceramic art in her spare time.
Major
She is now majoring in graphic design. Currently, she is learning to use Photoshop and will go on to learn Premiere to edit pictures and videos. She can do all the work with her fully functioning hand. She chose this major because it requires a higher level of learning and practice, where she is more capable than many of her classmates, and she loves learning it. However, she worries that she may not get as well-paid a job as other students who learn the same skills. She also thinks English is becoming a barrier for her to live and work in Shanghai.
For this stage of the project, we want to further refine the work done up to this point, including 1) our web interface/API to the CartoonGAN models and functionality; and 2) the web application utilizing CartoonGAN, which will gain more layers of interaction and possibility with the new features we have planned.
We sincerely hope that through this refinement, CartoonGAN can finally become a powerful and playful tool for learners, educators, artists, and technicians, so that our contribution to the ml5 library will truly help others and spark more creativity in this fascinating realm.
Methodology & Experiments
Gif Transformation
Developing GIF transformation in a web application turned out to be more demanding than we imagined. Because there are no efficient, modern GIF encoding/decoding libraries, my partner, who worked on this functionality, went through quite some effort to find usable libraries for working with GIFs in our application.
*This could be a potential direction for future contributions.
On the front end, we implemented a simple but effective piping algorithm that recognizes the type of input the user uploaded and triggers the corresponding strategy.
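A minimal sketch of that dispatch step might look like the following, assuming the uploaded `File` object's MIME type is the discriminator (illustrative function, not our exact code):

```javascript
// Pick a transformation strategy based on the uploaded file's MIME type.
// (Illustrative sketch; the real app wires these to actual pipelines.)
function pickStrategy(file) {
  if (file.type === "image/gif") return "gif";        // frame-by-frame path
  if (file.type.startsWith("image/")) return "image"; // single-image path
  return "unsupported";
}
```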
This cyberpunk kitty was recorded during one of our experiments with GIF transformation. As shown in the video, the transformation (original style to Miyazaki's Chihiro style) output is glitchy, the result of a single lost frame. This could stem from issues with GIF encoding and decoding in our web application, as we currently work with GIFs in the following way:
GIF ➡️ binary data ➡️ tensor ➡️ MODEL ➡️ tensor ➡️ binary data ➡️ GIF
Therefore, encoding issues can greatly affect our final outcome. This is a problem that needs to be looked into in the future.
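Conceptually, the pipeline can be expressed as one function with the codec and model injected (a sketch under the assumption that decoding yields one frame per array entry; the real code uses a GIF library and the TensorFlow.js model):

```javascript
// GIF -> frames -> model output per frame -> GIF again.
// decode, runModel, and encode are injected so the pipeline itself is
// agnostic to which GIF library and which model we end up using.
// A frame dropped at any stage produces the glitch described above.
function transformGif(binaryData, { decode, runModel, encode }) {
  const frames = decode(binaryData);   // GIF binary -> array of frames
  const styled = frames.map(runModel); // apply CartoonGAN to each frame
  return encode(styled);               // frames -> GIF binary
}
```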
Foreground/Background Transformation
Foreground/background transformation is one of our biggest feature updates to the CartoonGAN web application.
Our main approach to this feature is to use BodyPix to separate humans from their background and use the result as a mask for the input image. The mask is then used to manipulate the image's pixel data, so that cartoonization can be applied to the foreground, the background, or both, depending on the user's choice.
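The per-pixel masking step can be sketched as follows (assumed names; `mask` stands for a BodyPix-style per-pixel segmentation array where 1 marks a person):

```javascript
// Blend original and cartoonized RGBA pixel arrays using a BodyPix-style
// mask (1 = person, 0 = background). target chooses where the cartoon
// effect lands: "fg" cartoonizes the person, "bg" the surroundings.
function blendWithMask(originalPx, cartoonPx, mask, target) {
  const out = new Uint8ClampedArray(originalPx.length);
  for (let i = 0; i < mask.length; i++) {
    const useCartoon = target === "fg" ? mask[i] === 1 : mask[i] === 0;
    const src = useCartoon ? cartoonPx : originalPx;
    for (let c = 0; c < 4; c++) out[i * 4 + c] = src[i * 4 + c]; // copy RGBA
  }
  return out;
}
```

Cartoonizing both regions is just the degenerate case where every pixel takes the cartoonized value.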
We hope this brings the user experience to another level: users can see themselves in the cartoon world of their choice, either by turning themselves into a cartoonized character or by turning their surroundings into a fusion of reality and fantasy.
Demo foreground/background outputs:
Foreground –
Background –
Social Impact
ml5 library
We wrapped CartoonGAN into an ml5 library and submitted a pull request to merge our work into ml5.
We included this as part of our project goal because we hope our work will become a real contribution to the creative world out there. Machine learning in the browser is still a relatively new and emerging field; the more work and attention it receives, the faster it will grow. Though I am a newbie myself, I really hope my efforts can help ml5 grow into an amazing tool collection for the brilliant, innovative minds in this realm.
Further Development
There is still work to be done and room for improvement before this project fully meets our expectations.
On the web application side, GIF transformation is still relatively slow and buggy, due to the insufficiency of existing tools for working with GIFs in the browser. We did our best to accommodate these issues, but we still want to look into potential improvements, and perhaps even new issues to contribute to.
The CartoonGAN ml5 library is still a work in progress. Although the barebones are ready, work remains: we are currently building tests, examples, guides, and documentation for the library, and on the design side we still need to improve error and corner-case handling, image encoding, and support for other input formats. These are all necessary for CartoonGAN to become an easy-to-use and practical library, which is our ultimate hope.
As the name suggests, my project is about social media. Socializing patterns have changed a lot with the development of mobile phones and social media. In China specifically, WeChat has become the most frequently used chat app, and takeaway-ordering apps such as ELEME (饿了么) are more and more popular. These apps have changed the way we socialize: they offer convenience and access to all kinds of information, yet they also control us to some extent, as we rely on them so heavily that we neglect and lose important aspects of life. Therefore, I want to examine the relationship between people and the ways they socialize, in order to reflect critically on the impact of social media today.
Undoubtedly, the development of smartphones and social media has created many conveniences in our daily lives. In my project, I use real-life examples (WeChat and ELEME) to reveal the connection between social media and people in a concrete way. By the same token, I draw inspiration from topology to demonstrate the interaction between social media and people in an abstract way. As a geometric concept, topology describes relationships among things through lines and their ramifications (arcs, nodes, polygons, ...), which makes it a good way of abstracting the social network. Combining the concrete (real-life examples) and the abstract (topology) can reveal the complexity of the interaction between people and social media.
On the other hand, social media has created problems, which is the other aspect I want to include in my project. I got inspiration from a YouTube video called "Deleting Social Media for 30 Days Changed My Life" by Yes Theory. It raises a critical question: as we gain connectivity from social media, what have we lost? To explore people's reliance on social media, the host challenged himself to delete every social media app on his phone for 30 days and see how he and his lifestyle changed. The video aims to make people aware of things neglected through dependence on social media: activities such as reading, working out, and hanging out with friends without everyone browsing their own phones can easily be enjoyed without it. The host engaged more in his life instead of scrolling the screen all day and achieving nothing, which reflects the disadvantages social media has brought us. Therefore, I wanted to cover both the advantages and the disadvantages of social media, to show its impact in an integrated and critical way.
Perspective and Context
Considering that the performance would take place in an underground club, I wanted to let the audience feel involved in my project and to do a typical VJing performance. According to Eva Fischer in the chapter on VJing, "the viewer response [in VJing] plays an important role. Interaction with the audience, albeit on a subconscious level, has a big influence on the result of the performance, which never occurs in the same manner twice" (115). In that atmosphere, I believe the more the audience gets involved in my performance, the better my project works. Therefore, I wanted both the audio and the video to deliver a sense of strong impact and shock: the sound would contain popular trap beats and elements of pop music, while the visuals would use simple patterns and shapes in strongly contrasting colors. Power and energy are what I use to shorten the distance between my performance and the audience and to let them enjoy themselves during the performance.
For some specific patterns in the visual part, I gained inspiration from Ryoji Ikeda and his work Formula and Data Matrix:
I really appreciate how he deals with simple lines and cubes. As the geometric concept of topology is an essential part of my project, the relationships and movement of different shapes in Ryoji Ikeda's work offered me some stylistic ideas.
Development & Technical Implementation
Audio part: Because I want the visual part to reflect geometric relationships, which are quite abstract, the audio part is more realistic. Several real-life sounds are included (WeChat, ELEME) to create resonance with the audience and make them feel closer to the performance. I asked a friend to play the delivery man and recorded his voice acting out a take-out delivery scenario: "Hello, your take-out food is ready. I'm at the front door of the building. Can you come and get it? Wait, what? Put it in the storage room? OK, sure. It will be in the storage room. Remember to pick it up. Thank you!" I am sure most people have encountered a situation like this, so the audience should be familiar enough with the content of the recording. I wanted to keep the voice and words clear, so it opens the performance: the real-life scenario serves as an intro, leading the audience into what follows and making sure they are immersed in the environment and atmosphere I created. I also asked my friend Bongani, who makes rap music, for one of his rap songs, ELEME. I cut out the part with the lyrics "say 你好,你的外卖到了" (say hello, your take-out food is ready) to add some rhythm to the delivery situation and make the audio more energetic.
The following part is about WeChat. I recorded several WeChat-related sounds, including notifications, phone calls, money transfers, messages being sent, and so on. These sounds were combined and composed in GarageBand, with some beats added to create a cadence. A friend also helped me edit a piece of the soundtrack with JL, and I mixed several elements together. From simple to complicated, the density of the sounds changes over time. It reflects the relationship between users and WeChat: at first, WeChat helps the user communicate and offers convenience; after a while, the user becomes busier and busier with WeChat, and WeChat starts to control the user's daily life.
After the WeChat part, I added a recording from the talk show Rock & Roast as a transition. The host describes a situation in a nightclub in which a man uses the background music and the atmosphere to his advantage while socializing. The host's performance is ironic: he wants to show the contrast between himself, a person too shy to step out of his comfort zone and socialize, and the man in the nightclub, reflecting a problem in how young people socialize today. I use this as the transition from the demonstration of real-life situations to a more reflective register. In the following part, I created some wavy, mysterious background soundtracks to deepen the atmosphere of the performance, while adding some breaking trap beats to the base layer to avoid being too dull.
At the end of the performance, I made another melody based on my own playing of the guzheng, a traditional Chinese instrument. Chinese styles and elements have become more and more popular in music and performance, and it is special and cool to combine them with modern techniques, which is one reason I kept this in my performance. What is more, I did not want my performance to end on a sad, heavy note; instead, I wanted to leave some space for the audience to refresh and reflect, taking their time to review what they had just experienced: How do they feel? What do they gain from social media? What do they lose?
Visual part: As I mentioned before, topology is essential to my project. jit.gl was the best choice for composing the visuals, as it lets me draw and create geometric shapes. At the beginning, I was thinking about building various cool effects from many components in Max; however, Eric suggested that jit.gl already contains a lot for me to play with. As a result, my Max patch is quite easy to understand: I copied and pasted several jit.gl worlds and made them overlap to create different effects. By changing the gridshape's drawing mode to lines and giving it a closed shape (sphere, cone, or circle), I drew lines and wireframe shapes.
In the WeChat part, I recorded several screen captures, including sending a WeChat message and making a WeChat phone call, and used them as the background. To create an increasingly chaotic effect over time, I also combined two 1EASEMAPPERs to get a colorful, geometric background. The 1EASEMAPPERs can be attached to several elements to make plain shapes colorful at certain moments.
A 3D model was included as well. I asked a friend to make a 3D model of "ASSC," referring to Anti Social Social Club, for the reflective part (after the transition); however, the final effect did not meet my expectations, which I discuss further in the Performance section of this documentation.
Performance
Overall, the performance went well, with few mistakes, which I was very happy about. Because I did not have enough time to manage too many knobs and variables, the audio part was mostly ready-made, and I focused on generating the visual part live during the performance.
Scale, dim, and rotatexyz were the three parameters I adjusted most frequently throughout the performance. In the intro, I changed the dim of the gridshape to morph it from lines (1 side) to triangles (3 sides) and so on, matching the beat of the background music. In the following part, I adjusted the x and y scale and used rotatexyz to move the shapes and make them overlap, so that special colors and shapes emerged. One effective trick I found was changing the MIXR mode, which makes a great difference to the colors and effects. Using the MIXR mode, I turned the screen capture of the WeChat phone call into the background of the performance, since it has a black background with a few colorful lines; when shapes from jit.gl overlapped with it, special shapes came out.
I planned a lot, but quite a few problems still came up during the performance. Eric said I was trying to do too many things at the same time. When I practiced before the performance, the visuals ran fluently even when I adjusted several knobs at once; but the final result was not what I expected, as the screen kept seizing up the whole time. I was nervous and worried on stage, but I just wanted to do something (I don't even know what I was doing at the time) to save my performance, so I kept turning the knobs and scrolling the screen, which made things worse. Fortunately, a friend commented afterwards, "I didn't find it. I thought you meant to do that as an effect."
As mentioned in the last section, I tried to embed the 3D model into the project as well. However, because I failed to make a special texture for it, the model turned out plain and rather strange in the whole project, especially on a big screen, so I showed it only briefly and removed it during the performance.
Conclusion
At the starting point of this project I brainstormed a lot of ideas to put into it. I succeeded in focusing on some real-life examples (WeChat, ELEME) and making them as specific as I could; however, the narration of the whole performance needs improvement. The performance was divided into several parts, which did not flow together as a whole. I tried to add transitions and phone-call sounds to link these parts, but the emotional arc across the whole piece still needs work. I am not fully confident in the final result, as my original intention was to make something that goes beyond simply being cool or lit and conveys points that can be discussed further.
On the technical side, I think I relied too much on software like GarageBand and JL rather than Max when making the audio; I need to spend more time developing audio in Max and adding effects there. In the visual patch, I have too many things (five jit.gl worlds and several effects), and I believe there are ways to simplify it.
The experience of VJing was really interesting, and I learned a lot, both technically and practically. A lot of accidents happened along the way: Max broke down frequently; some settings on the objects that should theoretically work did not... But fortunately, I gained my first experience of VJing on stage in a real club instead of simply being in the audience as before. A music label planning to organize performances even asked me to collaborate and perform with them. The experience taught me a lot and helped me better understand VJ culture and VJs' work.
Our project's name is Wild. We got the inspiration from the movie Blade Runner, in which artificial intelligence is designed to be nearly identical to humans; it is really hard to tell them apart. These beings are built by humans, and they are stronger than humans; when they fight humans for limited resources, humans are hurt by the creatures we ourselves created. This echoes the first project I did in Interaction Lab: we designed a short scene play about a future "memory tree" from which people can upload and download memories and knowledge, and when one man is too greedy and takes too much, the memory tree hurts him. In this project, we want to present humans' over-dependence on technology. Technology is a double-edged sword: it can make our lives more convenient, but when we depend on it too much, it hurts us. "Wild" is named after the scene in which all the objects, as well as the music, lose control and go wild.
Perspective and Context
Our project consists of three parts. The first part is the music, which Joyce made using GarageBand. We wanted to contrast how people rely on technology with what happens when we lose control of it, so we decided to make the first part of the music peaceful and mild; at the turning point, the music crashes and turns crazy and wild. In this way, we create a connection between the music and the visual scene. The second part is the Kinect. I got the inspiration for using the Kinect to create interaction between the performer and the project from this contemporary ballet dance.
I really love how the dancer's movements influence the objects and projection next to him while, at the same time, the objects influence him. Rather than simply controlling the piece, the performer himself becomes part of it; he and the project become a whole. Therefore, I decided to put myself into the performance. I originally planned to use the Kinect's depth function to control the movement of the object in Max, but because my computer is a Mac, many functions, including depth detection, could not be used. So, in the end, I just used shape capture to capture the silhouette of my body. In this project, I use my hands to receive the 0s and 1s, representing how humans receive digital information through technology (since the technologies we have developed so far are all based on the binary system). The digits disappear where my body is, but later, with an explosive sound in the background music, the digits lose control and move randomly across the screen.
The third part is the Max patch. In the beginning, we wanted to design an object that shifts from regular to crazy motion, to represent the loss of control, but we found a single object too boring and dull, so we decided to use more videos to enrich the content. We open with a scene of a bustling city to indicate urban development; then green 0s and 1s drop from the sky, and the whole city falls under the control of digits. With this, we want to show that the city runs with the help of technology. We then use several short clips of how we depend on technology in daily life. Then the main object appears, changing and moving regularly with the music. Later, we add the Kinect scene, and with a sudden glass-breaking sound, we quickly cover the whole screen with a white canvas, which marks the breaking point. After this, our core object goes wild: it turns from smooth to jagged, and the whole scene goes crazy. We add various videos to show what happens when we rely too much on technology and it crashes.
Within our group, Joyce made the music and I was in charge of the Kinect and Processing parts; then we developed the Max patch together. Through this process, I learned to connect different ideas into a more diverse project, and I really enjoyed cooperating with her.
While dealing with the Kinect, the biggest issue we met was data transmission from the Kinect to the computer. The amount of data was too big for most USB-to-USB-C adapters, so when we tried to connect the Kinect through an adapter, nothing appeared on the screen. We tried several different adapters, but only one of them could transfer the data to my computer. So in the end, we switched from the MIDI Mix to the Kinect during the performance, and after the Kinect part, switched back to the MIDI Mix.
Developing the Max patch was the most challenging part. According to Ana Carvalho and Cornelia Lund (eds.), the VJ is just playing videos and music that already exist. We wanted to avoid using too many video shots from the real world, so I created videos of smog and random numbers with Processing.
To create an adjustable object, we decided to use jit.gl to generate a 3D object directly. We learned from a tutorial on YouTube and then improved on it.
Performance
It was the first time we performed in a club. We were a bit nervous at first, but we found that this was normal; everyone was nervous. During the performance, I was in charge of the movement of the center cube while Joyce switched the videos and the music. Basically everything went smoothly, except that before starting, Processing could not work normally and kept reporting an error. I guessed it was because we had been running both Processing and Max for too long, so we restarted it immediately; luckily, it worked. In the middle of the performance, the music also became a little loud. Such unexpected things happen a lot during performances, so learning to deal with them calmly is necessary for every live performer.
Another important thing is to organize the patch clearly. It is dark in the club, so when we wanted to use the MIDI Mix to control the patches, it was hard to see what we had written on the tags. Assigning the knobs on the MIDI Mix according to the arrangement of the objects in the patch is therefore important.
At the end of the performance, when we adjusted the position and shape of the cube in the center, it did not change as it had in practice. We planned to have only the circle rotating, but a small yellow cube appeared in the center, which actually made the project look even better. So, although we have no idea why it appeared there, we are still satisfied with it.
Conclusion
I think this was an interesting project that broadened my horizons: I learned to produce music with GarageBand, and I explored the use of the Kinect. But there is a lot we could do to make the project better.
First, about the Kinect: I want to try using p5 or Runway to capture the movement of the body, because I found that the control stage was not as dark as I thought. I could also try depth and other Kinect functions on a Windows laptop. I also want to turn the Kinect toward the audience so that there is more interaction. Maybe an underground club is not the most suitable place for this kind of interaction, but it is a fun direction for developing the project.
Second, about the Max part: we still use too many real-scene videos in our project. I think there is more we can do in the patch itself. For now, our development of the objects in Max relies simply on Max's built-in functions, and it is hard for us to make a controllable, beautiful object in Max directly. When I asked my classmates about their ways of using Max, I found that both of the projects I loved most used the js object. Therefore, I want to try more ways of combining Max with other resources to create a more diverse project.
The biggest issue with our project is that we were trying to convey something "meaningful" to the audience, when actually there was no need to do so. As we learned at the beginning of the semester, synaesthesia varies because everyone has different life experiences and ways of thinking, so feelings toward the same piece of music may differ. The meaning of art works the same way: it can vary from person to person, and it does not need to be educational. We can jump out of traditional thinking and create something crazier and more abstract. So, in the future, rather than just putting street views on the screen, I want to discover more forms of presenting my ideas. Also, when designing the music, we should keep in mind that the venue is a club, so music with stronger beats and rhythm would be more suitable.
All in all, I think the connection between our music and visual art is strong, and the audience's feedback tells us that we made an exciting live performance.