For this assignment we were asked to create a digital, audio-visual, sample-based instrument. I found five different sounds from freesound.org and the London Philharmonia Orchestra's website. For each sound, I was tasked with designing a visual, animated representation and sketching it on paper. The instrument and corresponding animations were created using p5.js. According to the composer Edgard Varèse, mentioned in this week's reading, "music is organized sound." I wanted to experiment with the concept of music as organized sound by creating an instrument out of an object we use every day: the elevator. By taking sounds from an elevator and mixing them with other instruments, we can generate unique sounds in an interactive way.
Process
First, I found the sound files that I wanted to use for this project. I found elevator music on freesound.org, which served as the inspiration for the theme. It loops very well, which makes it great background music. The image below shows a preliminary sketch on paper of what I wanted to do.
I added if statements that detect when each audio clip is playing and trigger the corresponding animation: the keyPressed function plays the audio file, and the draw loop then tests whether it is still playing. To change the color of the background, I used the random function together with an interval timer. To roll the ball across the floor, I mapped the audio clip's progress to horizontal movement. When the elevator ding is played, I draw the inside of the elevator until the clip ends. Finally, I created a balloons object to keep track of the location, color, and speed of each balloon, which are recalculated on each pass of the draw loop.
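To make that pattern concrete, below is a minimal p5.js sketch of one such trigger; the file name, key binding, and visuals are placeholders rather than the exact code from my project.

let ding;
let bgColor = 220;

function preload() {
  ding = loadSound('elevator-ding.mp3'); // placeholder file name
}

function setup() {
  createCanvas(400, 400);
  // Change the background color on an interval, like the color cycling described above.
  setInterval(() => { bgColor = color(random(255), random(255), random(255)); }, 1000);
}

function keyPressed() {
  if (key === '1' && !ding.isPlaying()) {
    ding.play(); // keyPressed starts the clip...
  }
}

function draw() {
  background(bgColor);
  if (ding.isPlaying()) {
    // ...and draw checks that condition, mapping the clip's progress to
    // horizontal movement, like the rolling ball.
    const x = map(ding.currentTime(), 0, ding.duration(), 0, width);
    ellipse(x, height - 20, 40, 40);
  }
}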
Reflection
This first assignment was interesting because we got to create our own instrument using just audio clips. With programming, it is easier than ever before to create digital instruments from sound files, and there are countless possibilities that musicians and artists alike can explore with the technology we have today. I hope to learn more about the intersection of music and programming in this class.
For my final project, I wanted to explore audio generation with Magenta. My original idea was to use NASA data to generate sounds based on observations of the universe, but I came up with a better idea shortly after. Having played the violin for ten years, I have found that it is a difficult instrument because it requires accurate finger placement and complex bowing technique. I wanted to create an interface that lets people make music without prior musical experience. My inspiration also came from Google's Piano Genie project, which allows anyone to improvise on the piano.
Process
The goal of this project was to use notes played from the violin to produce a sequence of notes on the computer. Below are the steps I needed to complete in order to make the project come to life.
I began by experimenting with a variety of pitch detection algorithms, including the McLeod Pitch Method, YIN (and YIN-FFT), Probabilistic YIN, and Probabilistic MPM. I ultimately decided to use a machine learning algorithm included in the ml5.js library. The ml5.js pitch detection algorithm uses CREPE, a deep convolutional neural network that translates the audio signal directly into a pitch estimate. Below is a diagram of the layers and dimensions of the model, taken from the paper.
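In code, the detector is wired up roughly like this; the model path and polling loop are a simplified sketch based on the standard ml5.js example, not my exact code.

let mic, pitch;

function setup() {
  noCanvas();
  mic = new p5.AudioIn();
  mic.start(() => {
    // './model/' stands in for the directory holding the CREPE model files.
    pitch = ml5.pitchDetection('./model/', getAudioContext(), mic.stream, getPitch);
  });
}

function getPitch() {
  pitch.getPitch((err, frequency) => {
    if (frequency) {
      console.log('estimated pitch (Hz):', frequency);
    }
    getPitch(); // keep polling for new estimates
  });
}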
After running pitch detection and checking that the RMS level is greater than 0.05, we call a function in Piano Genie that asks the model for a prediction. This can be triggered either by bowing a note on the violin or by typing 1-8 on the keyboard. I created a mapping based on the string that is played: G calls buttons 1 and 4 on the model, D calls 2 and 5, A calls 3 and 6, and E calls 4 and 7. These pairs usually produce chords spaced about an octave apart. Most of the time the resulting notes are harmonious, but occasionally they sound awful. Below is an explanation of the Piano Genie model from their website.
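The string-to-button mapping looks roughly like the sketch below, using the PianoGenie class from @magenta/music; the checkpoint URL and the playNote() helper are placeholders, not my exact code.

import * as mm from '@magenta/music';

// Placeholder: URL of a hosted Piano Genie checkpoint (see the Magenta.js docs).
const GENIE_CHECKPOINT = 'https://storage.googleapis.com/.../piano_genie/...';
const genie = new mm.PianoGenie(GENIE_CHECKPOINT);

// Each open string triggers a pair of the eight Piano Genie buttons (1-8).
const STRING_TO_BUTTONS = { G: [1, 4], D: [2, 5], A: [3, 6], E: [4, 7] };

async function setup() {
  await genie.initialize();
}

function onDetectedString(stringName, rms) {
  if (rms <= 0.05) return; // ignore quiet input, as described above
  for (const button of STRING_TO_BUTTONS[stringName]) {
    const midiPitch = genie.next(button - 1); // PianoGenie buttons are 0-indexed
    playNote(midiPitch); // hypothetical helper that plays the note
  }
}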
Training
I trained the model on the following works from classicalarchives.com, which include violin and piano parts:
Sonata No.1 for Solo Violin in G-, BWV1001
Partita No.1 for Solo Violin in B-, BWV1002
Sonata No.2 for Solo Violin in A-, BWV1003
Partita No.2 for Solo Violin in D-, BWV1004
Sonata No.3 for Solo Violin in C, BWV1005
Partita No.3 for Solo Violin in E, BWV1006
Violin Sonatas and Other Violin Works, BWV1014-1026
Violin Sonata in G, BWV1019a (alternate movements of BWV1019)
Violin Sonata in G-, BWV1020 (doubtful, perhaps by C.P.E. Bach)
Violin Suite in A, BWV1025 (after S.L. Weiss)
Below is a sample of a Bach Sonata:
I ran the included model training code, which can be found here. I attempted to run the training script on the Intel AI DevCloud, but the magenta library requires libasound2-dev and libjack-dev to work, and these cannot be installed since apt-get is blocked on the server. I scraped the files off the Classical Archives website and converted them into NoteSequences, which can be read by TensorFlow. I then evaluated the model and converted it into a TensorFlow.js model. During the conversion I ran into some dependency trouble: the script wanted me to use a TensorFlow 2.0 nightly build, but it wasn't available for Mac, so I had to create a new Python 3.6 environment and install the dependencies manually.
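The training itself used Magenta's Python tools, but to illustrate what a NoteSequence is, here is a small browser-side sketch that loads a MIDI file into one with Magenta.js; the file URL is a placeholder.

import * as mm from '@magenta/music';

// Fetch a MIDI file and convert it into a NoteSequence, the same protobuf
// format the training pipeline consumes. The URL is a placeholder.
mm.urlToNoteSequence('violin-sonata.mid').then((seq) => {
  console.log('total notes:', seq.notes.length);
  // Each note carries a pitch, start/end time, and velocity.
  console.log('first note:', seq.notes[0]);
});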
Challenges
Along the way, I ran into a couple of issues that I was mostly able to resolve or work around. First, I had an issue with the AudioContext in Chrome. Ever since the autoplay changes introduced a few years ago, microphone input and audio output are restricted as a countermeasure against obnoxious video advertisements. Generally, this is good, but in my case the microphone would not work 50% of the time in Chrome, even when audioContext.resume() was called. This could be because p5.js or ml5.js has not been updated to support these changes, or it could be my own fault. Ultimately, I switched to Firefox, which has more permissive policies, and that fixed the issue.
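For reference, this is the standard workaround I tried, resuming the audio context on an explicit user gesture; in my case it still failed intermittently in Chrome.

function setup() {
  createCanvas(400, 100);
  text('Click anywhere to enable audio', 10, 50);
}

function mousePressed() {
  // p5.sound's helper for satisfying the browser's autoplay policy;
  // calling getAudioContext().resume() directly does the same thing.
  userStartAudio();
}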
Another issue I had was that ml5.js and Magenta conflicted with each other when run together. I could not figure out why this was occurring; I assume it was because they use the same TensorFlow.js backend, which may have caused issues with the graphics context. Rather than fixing the error, my only real option was to silence it.
I was generally pretty happy with the results I produced. The model is not very good at generating rhythm, but it does a good job of generating Bach-style chords. The pitch detection model also needs some modifications to pick up notes more accurately. Much of the work was already done by the Piano Genie team, who created the model; I only adapted it to work with the violin. The violin is rarely used in machine learning experiments because it is difficult to capture discrete notes from it, whereas the piano has MIDI support, which lets it work almost universally. I hope that as machine learning grows, more instruments will be supported.
For this week, we had to train a CycleGAN model on the Intel AI DevCloud. I created a photo-to-comic style transfer using a custom dataset collected from the internet. I was inspired by @ericcanete's Instagram artwork, so I created a model using images of his work. I also used a dataset from the University of Illinois that was scraped from Flickr and includes portraits of people with a variety of facial expressions and backgrounds. CycleGAN allows us to make image-to-image translations using two unpaired domains of training data. I ran the scripts provided to us with some modifications; training took about two days and ran for 112 epochs. Below is a sample of the training and testing images I used for both domains.
Domain A (1,389 items):
Domain B (1,308 items):
Results
Left side: input image. Right side: output image.
Conclusion
This was not so much a comic style transfer as a line-art style transfer. Ironically enough, the model does not work well on faces, the very thing it was trained on: the black eyes turn into creepy white eyes, and the portraits look somewhat spooky. The model is also somewhat overkill, since there are probably much simpler algorithms for converting images into black-and-white line art. Still, it is interesting that we can teach a machine a certain style of art without explicitly programming it.
For my final project, I would like to explore music generation using machine learning. Modern music is heavily influenced by the tools available to artists, and even though these computer-generated effects often go unnoticed by the listener, they play an important role in music creation. Machine learning is relatively new and has only recently been applied to audio. Originally, my idea was to use radio signal data from NASA to generate the music of the universe. However, this data is not readily available, and the idea has been done before. As a lifelong violinist, I want to explore machine learning with violin music instead. I hope to train a machine learning model that can take simple, random notes and transcribe them into a more complex and pleasant melody.
Inspiration
After doing some research, I found that there are not many applications of machine learning to the violin. I think this is because the violin is a very complex instrument that has not been well integrated with technology: electric instruments such as the piano and guitar have become popular, while the electric violin has not caught on as much. Many people find the violin a difficult instrument to play, and there are fewer and fewer players every year. I want to create a version of the violin that can be played by people with varying musical backgrounds, from beginner to advanced. Below is an example of Google's implementation of a musical neural network that can generate music from just a few buttons.
Project Plan
I plan to use Magenta.js because it has a web framework available and examples that relate to my project. I will need to find a way to transcribe real-time input from the violin into a frequency and an amplitude. This will be fed into Magenta, which will find a suitable note to output; I would then need a way to play those notes using a sound library (see the sketch below). I intend to train a model using violin MIDI files from a wide variety of composers throughout history. If all goes to plan, a player with minimal violin experience should be able to create basic melodies that sound reasonably decent.
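As a rough sketch of that last step, here is one way a predicted MIDI pitch could be turned into sound with p5.sound; the oscillator settings and the playNote() helper are assumptions about what the sound-library half of the pipeline might look like.

let osc;

function setup() {
  noCanvas();
  osc = new p5.Oscillator(440, 'triangle');
  osc.start();
  osc.amp(0); // stay silent until a note arrives
}

// Hypothetical helper: play one predicted MIDI pitch for a short duration.
function playNote(midiPitch, durationSec = 0.3) {
  osc.freq(midiToFreq(midiPitch)); // p5.sound helper: MIDI number -> Hz
  osc.amp(0.5, 0.02);              // quick fade in
  osc.amp(0, 0.1, durationSec);    // fade out after the note's duration
}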
Possible Resources
MIDI for the browser – https://github.com/mudcube/MIDI.js
Tuner App for Chrome – https://github.com/googlearchive/guitar-tuner
For this week, we were allowed to either develop a project using the ml5.js CVAE library or generate visuals using Deep Dream. After reading some articles, I learned that the person who developed Deep Dream discovered it by accident. I found it interesting that art could come out of scientific research by simply tweaking a couple of lines of code; these images represent artificial intelligence research in an artistic way. I really enjoyed the psychedelic images shown in class, so I tried to create some visuals using pre-trained models and services.
I found a website named deepdreamgenerator.com that allowed me to take a style and apply it to an image.
Style + Image:
+
Result:
Another cool website was http://deepdream.psychic-vr-lab.com/deepdream, which creates trippy images reminiscent of psychedelic drug experiences.
Input Image of a city wall in Xi'an:
Output Image:
Photo I took from the Bund:
After playing around with different variables to optimize the quality using deep_dream.py, I created this image with the following parameters:
step = 0.03 # Gradient ascent step size
num_octave = 3 # Number of scales at which to run gradient ascent
octave_scale = 1.4 # Size ratio between scales
iterations = 20 # Number of ascent steps per scale
max_loss = 5.  # Interrupt gradient ascent if the loss grows beyond this value
Deep Dream Result:
Deep Dream art has died off in popularity, but the concept is still very cool, and these generated images show how closely artificial intelligence and art are related. Many programs of this type are variations on Google's Deep Dream, and they all produce similar results. I wonder if style transfer could be combined with deep dreaming to make a more interesting version of both models. I hope to understand Deep Dream better and use it in my projects in the future.