[Owen Caldwell]: Midterm Portfolio – #1 Alt Text Comic

The Project

ABOVE: A night sky not like our own: swirling, turbulent strokes of blue, stars and moon pocket warm yellow strokes in concentric circles. TO THE LEFT: A dark cypress tree thrashes toward the night sky, commanding attention, sharp and branching like a cluster of stalagmites as it looms over a rural town beyond. TO THE RIGHT: Rolling hills meet the sky and flank a dozen quaint buildings. In the center of town, only a thin steeple breaks the horizon line at the painting’s lower third. Blue brushstrokes make up most of the land, with boxy pale blues for the houses, dots of yellow light, and shimmering blue-green trees just to the right of the town.

The Starry Night by Vincent van Gogh, June 1889.

Audio Description:

On a gray background, Truman Capote’s head and upper torso fill a square frame. He has a brooding expression, eyes closed, elbows propping him on a table below. His right index finger presses against his temple. Deep wrinkles cross a bald forehead above solemn eyelids. Arms perpendicular, the fingers of his other hand meet at the wrist. He uses his left thumb to support the weight of his chin. Semicircular black glasses dangle loosely in the hand that pushes against his temple. Just below the glasses, on Capote’s wrist, a black leather wrist strap. He’s dressed in a white dress shirt cuffed at the wrists and a black suit with white pinstripes. To Capote’s right, a close light reflects off the details in his skin and his blazer and casts dark left-bound shadows.

Irving Penn’s portrait of the author Truman Capote. 1965. Black and white photograph on a gelatin silver print.

Audio Description:

A red blob on a black square background. Two black holes within the blob appear to be in orbit, each surrounded by a white outer layer. The holes have white curved tails pointing in opposite directions. Curved white reflections surround the holes at the perimeter of the blob.

Ian Cheng’s digital illustration titled “3FACE.” 2022.

Audio Description:

Project Description

I decided to redo my alt text project for the midterm, and I chose a range of visual art pieces to describe in both audio and alt text.

Documentation

For this project I chose a place in my camera roll that I thought would be interesting: in fact, the most interesting thing I did all summer, which was my mid-June trip to Japan with my friends Hashim and Cole.

A snapshot of Owen's camera roll showing various images of his trip to Japan: In one image, Hashim and Cole stand together holding an apple and a drink. In another, Cole is sitting in front of a delicious meal. There are a few images of a colorful city skyline, and a few images of signs written in Japanese.
 
I picked a few photos that told a neat story from that trip, and I wrote alt text one image at a time:
 
Emily, Cole, and Owen arrive in Tokyo, Japan full of wonder and excitement. They walk on a catwalk across a bustling freeway as enormous skyscrapers tower above, piercing a hazy blue sky.
 

Tired, hungry, and running low on hope, the gang stops their hunt to create an inceptive collage on Instagram using their three phones, one taking a photo of the other's screen. Beyond the haphazard layers of color and graphic buttons, the words "New York" and "Rio de Janeiro" scuttle the aloof faces of Owen and Cole, who stare longingly into Emily's lens with mouths agape.

A triumphant Cole, in his signature striped shirt and round silver glasses, raises his ceramic tea cup to the camera, seemingly unaware of his messy brown hair. A great big helping of ramen, complete with seaweed, red fish eggs, rice, wasabi, and sesame. Sitting in this restaurant feels like a warm hug with its traditional Japanese wood decor and warm lighting.
 

I decided to write captions based on the alt text, as opposed to writing the captions first and then the alt text. I made sure that the captions and alt text don’t repeat the same information. My philosophy was that the alt text should convey the emotion and details of the image, while the caption should tell a story. Here are the alt texts and captions, in order from left to right:

Alt Text 1: Emily, Cole, and Owen arrive in Tokyo, Japan full of wonder and excitement. They walk on a catwalk across a bustling freeway as enormous skyscrapers tower above, piercing a hazy blue sky.

Caption 1: Emily, Cole, and Owen hunt for food in Shibuya, Tokyo.

Alt Text 2: Tired, hungry, and running low on hope, the gang stops their hunt to create an inceptive collage on Instagram using their three phones, one taking a photo of the other’s screen. Beyond the haphazard layers of color and graphic buttons, the words “New York” and “Rio de Janeiro” scuttle the aloof faces of Owen and Cole, who stare longingly into Emily’s lens with mouths agape.

Caption 2: After two hours and no luck, the gang stops to document their looks of defeat.

Alt Text 3: A triumphant Cole, in his signature striped shirt and round silver glasses, raises his ceramic tea cup to the camera, seemingly unaware of his messy brown hair. A great big helping of ramen, complete with seaweed, red fish eggs, rice, wasabi, and sesame. Sitting in this restaurant feels like a warm hug with its traditional Japanese wood decor and warm lighting.

Caption 3: We made it to salvation at long last!

Reflection Questions

  • What is the theme of the work? How is that theme particularly expressed through the modality of the week?

Going to Japan was such a visual experience for my friends and me. My friend Hashim was taking pictures the whole time, and I think the interest we all felt was in the little things, from infrastructure, to the types of plants that were there, to people, to the way houses are constructed. Everything was totally different from America. Figuring out how to put that experience into words was the big effort of the project, especially for a landscape like the first image, where my own experience was about taking in all of those details. How might I describe that feeling? I’m not saying I did a great job at that necessarily, but that is ultimately the goal.

  • Which elements of the work are beautifully/wonderfully/perfectly expressed through the modality?

Description of actions. A sighted audience might be able to infer not just the contents of an image (two people on a bridge, or a guy in front of a bowl of food) but also its story. In this modality, putting that story into words is a must. For alt text, the author must take the visual storytelling that cues in the sighted audience and write it out, and I think something very interesting can come from that if the work is done well.

  • Which elements are lost or inexpressible through the modality of the week?

I think even the best alt-text authors won’t be able to capture the nitty-gritty of an image. In the same way that translations of certain texts into English lose a level of detail that is only expressible in the original language, trying to perfectly describe something visual in words is impossible.

  • Who does this project exclude, who would not be able to interact with this work, and who is this modality not accessible for?

Trying to describe something visual perfectly in words may require additional linguistic complexity, in which case you actually end up cutting out the portion of your audience that has learning disabilities. Existing standards don’t necessarily account for people with learning disabilities who may not be able to easily read complex alt-text descriptions.

  • Now that you’ve identified who is excluded, what is one way you could remix this piece to include another population? (You don’t have to make this part, but think about it and write about it).

One solution would be to incorporate plain language, which minimizes writing complexities like specialized words or longer sentences, and would make the alt text simple and direct, hopefully without losing the description’s complexity. Another, more experimental idea is to use touch to convey information about the image. If the image were a map, for example, the project could include textured lines to indicate a place or a set of directions, with accompanying alt text if needed to describe in greater detail what the map is communicating.

Additional Modality (if applicable) 

What modality did you apply? 

I combined Alt Text and Audio Description.

How did you decide on this modality?

I was thinking about how museums would go about describing pieces for blind and visually impaired people, because when describing work, especially abstract pieces, there are a ton of different directions you could go. I just wanted to explore this and try it myself.

What does the beholder gain from this additional modality? Why? 

I think a human voice is superior to the auto-generated voice of a screen reader. Higher-quality, human-written description combined with a human voice reading the piece should be (and I hope is) the standard in museums.

Does the beholder lose anything from this modality? What? 

The reader becomes the sole representational perspective of the piece for whoever is blind or low-vision. If a piece is poorly described, there is no way to corroborate that perspective. Without a range of perspectives, the audience member loses their agency over the piece.

Show documentation of this modality

I selected my images from Artstor, downloaded them, then put them in a Google Slides presentation. I listened to some audio from MoMA to get a sense of how much description I should do and how to describe certain images. I wrote the audio descriptions in the speaker notes section below each slide.

A screen grab of a Google Slides presentation. A slide of Starry Night is enlarged, with speaker notes below the slide.

I recorded my descriptions in Voice Memos. iPhone mics are surprisingly high quality.

A screen grab of Voice Memos.

I trimmed the silence off the ends of my recordings.

A screen grab of the "trimming" function in the Voice Memos app.

Finally, I placed my images and image titles, added alt text, and embedded an audio player for each piece.
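
As a rough illustration of that last step, here is a minimal sketch of how one piece (image, title, alt text, and audio player) could be assembled as HTML. It is only an assumption about how such a page might be put together, not a record of how this portfolio was built: the `Piece` shape, the `renderPiece` and `escapeHtml` helpers, and the file paths are all hypothetical. The alt text in the example is shortened from the "3FACE" description above.

```typescript
// Hypothetical shape for one portfolio piece; field names are illustrative only.
interface Piece {
  title: string;    // title line shown under the image
  imageSrc: string; // path to the image file (placeholder)
  altText: string;  // written description read by screen readers
  audioSrc: string; // path to the recorded audio description (placeholder)
}

// Escape characters that would otherwise break HTML text or attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the markup for one piece: the image carries the alt text,
// the caption carries the title, and the audio element gets playback controls.
function renderPiece(piece: Piece): string {
  return [
    "<figure>",
    `  <img src="${escapeHtml(piece.imageSrc)}" alt="${escapeHtml(piece.altText)}">`,
    `  <figcaption>${escapeHtml(piece.title)}</figcaption>`,
    `  <audio controls src="${escapeHtml(piece.audioSrc)}"></audio>`,
    "</figure>",
  ].join("\n");
}

// Example usage with placeholder file paths.
const example: Piece = {
  title: "Ian Cheng, \"3FACE.\" 2022.",
  imageSrc: "images/3face.jpg",
  altText: "A red blob on a black square background.",
  audioSrc: "audio/3face.m4a",
};

console.log(renderPiece(example));
```

Keeping the alt text on the image and the recording in a separate audio element means a screen reader user can choose either the written description or the human-voiced one.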