Moacir P. de Sá Pereira on Data, Data Visualization, and Literary Analysis

By Jordan Williamson

Moacir P. de Sá Pereira gave a talk titled “Data, Data Visualization, and Literary Analysis: From NYWalker to Essay” on April 4 in Bobst Library. Prof. Pereira, an assistant professor and Faculty Fellow of English, works in the digital humanities and on 20th- and 21st-century American literature. He describes his methodology as “mixed”: an “idiosyncratic approach” that “uses digital tools to analyze aesthetic objects.”

The primary digital tool Prof. Pereira used in the talk was one of his own making, NYWalker, a program that can be used to track and map references to places in texts. He began his talk by analyzing Langston Hughes’s poem “Could Be” with NYWalker, to show how the movement from data to data visualization to literary criticism might work. The poem contains seven references to places, with two repeated. NYWalker was able to pull those places out of the poem and locate them on a map of the United States, revealing that two of the places mentioned in the poem, one in Cincinnati and one in Detroit, have been replaced by Interstate 75. With this new information in mind, Prof. Pereira noted, our reading of the poem might change, making it “an elegy for African American communities destroyed” by infrastructure projects.

Next he moved on to a larger project, an analysis of Colum McCann’s novel Let the Great World Spin. According to Prof. Pereira, the novel contains about 780 references to 310 places. Using a tool in NYWalker that, in his words, allows users to “add semantic sugar,” he showed how he sorted each mention of a place by section of the novel. A user could also, for example, categorize a reference by speaker, or distinguish between a place’s use in the plot and its mention in a discussion or interior monologue. The tool allows users to encode information that a computer could not discover on its own and that a human, having read the book, might catch but not necessarily register as meaningful.
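To make the idea of annotated place references concrete, here is a minimal sketch in Python of how such records might be stored and sorted. It is not NYWalker's actual data model, which the talk did not spell out: the `PlaceMention` class, its field names, the `group_by` helper, and the section labels are all hypothetical, chosen only to mirror the categories mentioned above (section, speaker, and plot versus discussion or interior monologue).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceMention:
    """One reference to a place, plus reader-supplied annotations.

    All field names are assumptions made for this illustration.
    """
    place: str
    section: str               # which section of the novel the mention is in
    speaker: str = "narrator"  # who refers to the place
    mode: str = "plot"         # "plot", "discussion", or "monologue"


def group_by(mentions, attribute):
    """Sort mentions into lists keyed by one annotation, e.g. "section"."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[getattr(mention, attribute)].append(mention)
    return dict(groups)


# Invented sample records; they are not data from the talk.
mentions = [
    PlaceMention("Dublin", "Section A", mode="plot"),
    PlaceMention("New York", "Section A", mode="monologue"),
    PlaceMention("New York", "Section B", mode="plot"),
]

by_section = group_by(mentions, "section")
by_mode = group_by(mentions, "mode")
```

Because the annotations live on each record, the same list of mentions can be regrouped by any category a reader decides is worth tracking, without re-entering the places themselves.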

Having sorted the novel’s references by section, and with the understanding that each section corresponds to a character, Prof. Pereira was able to establish a geographical center for each section and character. One character, Ciaran, had a particularly surprising center. His section describes his move from Dublin to New York, and with the references to places mapped out, the center of his narrative “lunges” from Dublin to New York at distinct points. This finding offers, for Prof. Pereira, “the opening for returning to the text.” The reader can ask “why is this the case,” with respect to something discovered through data visualization, and then “build to an article or essay.”
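One way a geographical center of this kind could be computed is sketched below in Python. The talk did not say which method NYWalker uses, so this is an assumption: each place is treated as a point on a unit sphere, the points are averaged, and the result is converted back to latitude and longitude. The function names and the short sequence of mentions are hypothetical, and the coordinates for Dublin and New York are approximate.

```python
import math


def geographic_center(points):
    """Return the (lat, lon) center of a list of (lat, lon) pairs in degrees.

    Averages the points as 3D unit vectors, which behaves better than
    averaging raw latitudes and longitudes over long distances.
    """
    if not points:
        raise ValueError("at least one point is required")
    x = y = z = 0.0
    for lat, lon in points:
        lat_r, lon_r = math.radians(lat), math.radians(lon)
        x += math.cos(lat_r) * math.cos(lon_r)
        y += math.cos(lat_r) * math.sin(lon_r)
        z += math.sin(lat_r)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)


def running_centers(points):
    """Center of the first 1, 2, ..., n mentions, in reading order."""
    return [geographic_center(points[:i]) for i in range(1, len(points) + 1)]


# Approximate coordinates; the order of mentions is invented.
DUBLIN = (53.35, -6.26)
NEW_YORK = (40.71, -74.01)
sequence = [DUBLIN, DUBLIN, NEW_YORK, NEW_YORK, NEW_YORK]

centers = running_centers(sequence)
```

Plotting the running centers in reading order is one way to see a shift like the one described for Ciaran: the center sits at Dublin while only Dublin has been mentioned, then moves west across the Atlantic as references to New York accumulate.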

This is an example of what Prof. Pereira calls “everyday criticism” informed by digital methodologies. It is, he says, “mixed methodological, iterative, and process-driven.” What seems important to him is the mixing of the quantitative and qualitative modes of inquiry and assessment, the acknowledgement that data capture and visualization is “not enough to stop the work,” and the fact that one cannot know, when capturing and analyzing data, exactly what will happen. One will always wind up with new questions.

Prof. Pereira closed the talk with what he called a “changed map” of the process of reading. This new map proceeded from the initial reading to data capture and then to data visualization and then back to the first step again. When returning to the first step, though, one rereads with the data in mind, which allows one to look for new sources of data to mine and new questions to ask. Rather than foreclosing possibilities of interpretation or assessing the text from some objective point of view, the changed map Prof. Pereira advocates is almost endlessly generative of opportunities to capture and use information. The process is “not enough to stop the work,” indeed, and is useful precisely because of that.

For more information about Moacir P. de Sá Pereira’s work, please visit