In exploring Magenta as a tool for musical expression (and, more generally, generative music as a whole), I keep returning to a few concepts:
The concept of musical “wrongness” in generative compositions — notes and rhythms that sound out of place within the context of the piece, to what extent those are bugs or features, and to what extent they need to be sanded smooth to create a “successful” piece
The line between human creativity and computer “creativity” (sorry for all the scare quotes) and how far an algorithm can take a piece of music before a human needs to intervene to turn it into something recognizable as music
My previous experiment with MelodyRNN showed me that, while the generated output doesn’t quite stand on its own, it makes for excellent source material for sampling. I’ve always liked working with improvisations from other (human) musicians as source material for sampling — it immediately takes the piece in a more interesting direction when the building blocks are not the size and shape I would have constructed myself. With that in mind, I turned to the 16-bar trio models in MusicVAE to compose my piece for me.
Using the example MIDI files from the pre-trained dataset, I generated three different interpolations at three different temperatures (0.1, 0.5, 1.5). I imported the nine resultant MIDI files to Ableton and played them all at the same time — sort of a digital riff on Ornette Coleman’s Free Jazz double quartet — and mapped each to a different patch for melody, bass or drums. I then added a bit of randomization to the velocity to make it sound more expressive.
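Under the hood, MusicVAE encodes each input sequence to a latent vector and decodes points sampled along the line between two of them, with temperature scaling the randomness of each sample. A toy sketch of that idea — the function names and the additive-noise model are my simplification, not Magenta's actual API:

```javascript
// Toy sketch of MusicVAE-style latent interpolation (illustrative, not the
// real Magenta API). We slide between two latent vectors zA and zB and add
// temperature-scaled noise before each decode; higher temperature means
// stranger output, which is why 1.5 sounds wilder than 0.1.
function lerp(zA, zB, t) {
  return zA.map((a, i) => a + (zB[i] - a) * t);
}

function interpolateLatents(zA, zB, numSteps, temperature, noiseFn) {
  const points = [];
  for (let s = 0; s < numSteps; s++) {
    const t = numSteps === 1 ? 0 : s / (numSteps - 1);
    const z = lerp(zA, zB, t);
    // Add sampling noise scaled by temperature to each latent dimension.
    points.push(z.map(v => v + temperature * noiseFn()));
  }
  return points;
}
```

With the noise zeroed out, the endpoints decode to (the latents of) the two source sequences and the midpoint is their average.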
The results are strange as hell, but far from unlistenable:
It’s been a long week, and I felt like I needed some new age music. Luckily, the Lakh Dataset has an absolutely mammoth collection of Enya MIDI files (75+). Seemed like a perfect little dataset to train on for my exploration of Magenta’s MelodyRNN.
I tried transposing them all into the same key, but in the MIDI transcriptions most of the songs were written in C major with an uncountable number of accidental sharps and flats, so I decided to roll the dice and see what MelodyRNN could do.
I trained the network on 77 files (some of which were different transcriptions of the same song) for 4,000 steps, ending with a loss of ~0.6 and a perplexity of ~1.8. I then ran the model to produce 10 outputs of 128 steps each. The results were off-kilter but musically satisfying.
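Those two training numbers are actually the same measurement in different units: for a cross-entropy loss, perplexity is just e raised to the loss, which makes a handy sanity check when reading training logs.

```javascript
// Perplexity is the exponential of the cross-entropy loss, so a loss of
// ~0.6 should correspond to a perplexity of ~1.8 -- which matches the
// numbers MelodyRNN reported here.
function perplexity(crossEntropyLoss) {
  return Math.exp(crossEntropyLoss);
}
// perplexity(0.6) ≈ 1.82
```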
The two tracks below are two iterations of the resultant MIDI files. “enyaBot Piano” plays through each of the 10 files once on a GarageBand piano, while “enyaBot” loops each of the MIDI tracks one at a time on one of GarageBand’s ten “Classic” synthesizers.
I was pleasantly surprised by the results. Even where the output sounds musically “wrong,” it sounds wrong in a consistent (and fairly interesting) way. I find something particularly compelling about the way the computer plays back these “wrong” notes — with no variation in velocity, every note is given the same weight, which makes it sound like it’s playing mistakes with 100% confidence.
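That flat, fully confident delivery comes from every MIDI note carrying the same velocity. The usual "humanization" fix — the kind of velocity randomization I applied in Ableton for the MusicVAE piece — can be sketched as a simple jitter, clamped to MIDI's valid 1–127 range (the function and the `spread` parameter are my own illustration, not an Ableton API):

```javascript
// Sketch of velocity "humanization": jitter each MIDI velocity by up to
// +/- spread, clamped to the valid 1-127 range. Passing in the random
// source (rng) keeps the function testable.
function humanizeVelocities(velocities, spread, rng = Math.random) {
  return velocities.map(v => {
    const jitter = Math.round((rng() * 2 - 1) * spread);
    return Math.min(127, Math.max(1, v + jitter));
  });
}
```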
The Octovox is a four-player live vocal processing unit. Each player stands on one side of a truncated square pyramid, and each side has five inputs controlling two voices: one tracks the pitch of the player’s voice and maps it to a synthesizer, while the other retunes their voice to an algorithmically determined note. On the right of each side, a slider controls the volume of the pitch-tracking synthesizer and a knob adjusts its cutoff filter; on the left, a slider adjusts the volume of the vocoder and a knob adjusts the delay level; and a button in the center cycles through the Markov chain that determines the pitch of the vocoder.
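The button's pitch logic amounts to stepping a first-order Markov chain over notes. A minimal sketch of that mechanism — the transition table below is a made-up example, not the Octovox's actual chain:

```javascript
// Sketch of a first-order Markov chain over MIDI pitches, the mechanism
// behind the Octovox's center button. This transition table is invented
// for illustration; the real instrument's chain lives in the Max patch.
const transitions = {
  60: [62, 64, 67], // from C4, move to D4, E4, or G4
  62: [60, 64],
  64: [62, 67],
  67: [60],
};

function nextPitch(current, rng = Math.random) {
  const options = transitions[current];
  // Pick uniformly among the current state's allowed successors.
  return options[Math.floor(rng() * options.length)];
}
```

Each button press would call `nextPitch` with the current vocoder note and retune the voice to the result.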
Here’s a picture of the Max patch:
Here’s a picture of the circuitry inside (which should probably come with some kind of trigger warning):
There are plenty of live videos of this instrument in use from the NIME performance (probably the most thoroughly documented three minutes of my life), but I’d rather share this video of rehearsal / play-testing from the night before the show, as I feel it captures the exploratory nature of the instrument even better than the live performance:
The purpose of the Octovox is to encourage exploration and improvisation using the voice as an instrument. It’s a continuation of the same line of thinking that has inspired almost all of my work this semester (and, honestly, almost all of my work at ITP). One of Mimi’s questions during our final critiques cut to the core of that intention beautifully — when she asked if the instrument was intended more for “singers” or “non-singers,” I realized my goal was to demonstrate that there’s fundamentally no difference.
Training a machine learning algorithm (a combination of Markov chains and RNNs) on the Sacred Harp songbook
Here’s what it sounds like:
Here’s what it looks like (graphic notation, hence “shape note”):
Here’s what I did with it last semester (performing the algorithm live with Max/MSP and p5.speech):
And with web audio (sampled solfege and played back with Tone.Markov):
At the time I couldn’t get the vocoder to work in the web browser so I faked it, and then the code fell apart before I could properly document it, so this is the only proper video I have of the project.
Combining Tone.js, Three.js (for spatialization), and text-to-speech (a library other than p5.speech, possibly Voiceful)
I recently downloaded a country music sample pack for another project, and I’ve wanted to use it to make something akin to Henry Flynt’s Hillbilly Tape Music. Building on a project from another class a few weeks back, I took a fiddle sample and played it through two separate players, with playback speed determined by mouse position when the mouse is pressed down (it plays back at a normal rate when the mouse is not pressed). In the interest of creating a “proper piece of music” with a beginning, middle and end (rather than just a weird interactive piece that plays until the user gets bored and closes the tab), I drew an array of dots to the canvas and provided instructions to connect them all before stopping the piece. This ensures that the user explores the entirety of the interface and all of the various combinations of playback speed, while still allowing them room to explore and play the piece their own way. Thus, the piece relies on both a generative mechanical system (the out-of-phase playback) and a generative social system (the indeterminate pattern of connecting the dots).
I can’t remember where I first read it, but the definition of music that I’ve found most useful in my practice is “the deliberate organization of sound over time.” That definition, which cuts to the core of what is both magical and universal about music, served as the loose inspiration for this project, which bears the working title “Sound Over Time.”
In this piece, I used the second(), minute(), and hour() functions of p5.js and mapped each one to the pitch of an oscillator. Each oscillator plays through an array of 60 notes (a little over eight octaves of a C major scale), with its note changing every second, minute, or hour respectively. The piece never repeats over the course of the day, and starts over from the beginning every night at midnight.
In the interest of making the work interactive, and of allowing myself and the user to hear the whole piece without waiting a full day, I added a series of sliders in the top right corner that can be used to override the global clock time.
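The mapping itself is simple: build the 60-note C major array once, then index into it with whatever time values are live — the real clock, or the slider overrides. A sketch under a couple of assumptions (I start the scale at MIDI C0 = 12, and index the hour oscillator directly, so it only ever reaches the first 24 notes; how the original handles that is my guess):

```javascript
// Sketch of "Sound Over Time"'s clock-to-pitch mapping: 60 notes of a
// C major scale (a little over eight octaves), indexed by the current
// second, minute, and hour. Starting pitch C0 = MIDI 12 is an assumption.
const C_MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]; // semitone offsets in one octave

function scaleNote(i) {
  // Every 7 scale degrees climbs one octave (12 semitones).
  return 12 + 12 * Math.floor(i / 7) + C_MAJOR_STEPS[i % 7];
}

function clockNotes(h, m, s) {
  return {
    hourNote: scaleNote(h),   // only reaches the first 24 notes
    minuteNote: scaleNote(m),
    secondNote: scaleNote(s),
  };
}
```

The slider override then just means calling `clockNotes` with slider values instead of `hour()`, `minute()`, and `second()`.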
This is a placeholder post. I’ve got a lot to say about Sampulator and Keezy, but I’ve been having trouble getting on the internet today, so it’ll have to wait.
For my project on sampling, I wanted to use a recording I made last month in Germany of a collective improvisation. I recorded myself and a group of four other people (most of whom I had never met before) playing music in a stairwell for over an hour. We had a guitar and a ukulele, but mostly everyone just clapped and sang. It’s hard to put into words what a powerful experience this was, and I wanted to zoom in on a few minutes toward the end, where I improvised a new arrangement of a song I wrote nearly ten years ago.
The irony is not lost on me that I was in Germany for a web audio conference and my biggest musical takeaway was an acoustic jam session. With that in mind, I wanted to take this document and filter it through the medium of web audio. When you push play, two recordings begin playing simultaneously — a Tone.Player and a Tone.GrainPlayer. When the mouse is pressed, the playback rate of each recording is mapped to the X- and Y- axes, and when the mouse is released, it snaps back to regular time, but the recordings are (almost necessarily) out of phase.
Coming into school several hours later than I had planned today, I ran into Sukanya, who mentioned writing her homework blog post about a (terrifying) company called Faception. I remembered reading about them last year in Allison Parrish’s Electronic Rituals, Oracles and Fortune Telling class in an article called “Physiognomy’s New Clothes”; so I returned to that syllabus and found an article I had never gotten around to reading called “How Algorithms Rule Our Working Lives.”
The article, adapted from a book called Weapons of Math Destruction, focuses primarily on a company called Kronos (why do all these companies have such ominous names?) that develops systems to assist in the hiring process for large companies like chain stores and restaurants. Part of the screening process is a legally dubious “personality quiz” of the kind often used by psychiatrists to diagnose and treat personality disorders. This portion of the quiz flags applicants according to their answers, effectively weeding out any potential hires who have ever shown signs of mental illness.
[note: going back to add more to this; just wanted to hit “save” so I have something to post on the homework wiki, even if it’s incomplete]
This sketch is still relatively bare-bones, but it incorporates an idea that I think has potential. The basic premise is two synths, an AM synth and an FM synth, which play up and down a scale along the X and Y axes, respectively. However, the mic input is mapped to a Chebyshev distortion, which warps the waveform whenever the mic picks up a signal.
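The distortion itself has a tidy mathematical core: an order-n Chebyshev waveshaper pushes each sample through the nth Chebyshev polynomial, T_n(x) = cos(n · acos(x)) for x in [-1, 1] — to my understanding, this is the polynomial family Tone.Chebyshev builds its distortion curve from.

```javascript
// What an order-n Chebyshev waveshaper does to a sample x in [-1, 1]:
// evaluate the nth Chebyshev polynomial, T_n(x) = cos(n * acos(x)).
// Order 1 leaves the signal untouched; higher orders fold the waveform,
// adding harmonics -- which is why mic input makes the synths snarl.
function chebyshev(n, x) {
  return Math.cos(n * Math.acos(x));
}
// chebyshev(2, 0.5) === T_2(0.5) = 2 * 0.5^2 - 1 = -0.5
```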