Andrew Tran

Andrew Ba Tran

Investigative Data Reporter, The Washington Post

Why journalists shouldn’t be afraid of numbers

Andrew Ba Tran is a data reporter on the Investigative Rapid Response team at the Washington Post. His data journalism has won him two Pulitzer Prizes: one in 2014 for the Boston Globe’s coverage of the Boston Marathon bombings, and one in 2018 for the Post’s coverage of conservative activists’ failed efforts to plant a false story about Senate candidate Roy Moore within the Post itself.

Tran was born in Chicago to parents who came to the U.S. as Vietnamese refugees. Finding it too cold in the Windy City, the family moved to Texas when Tran was very young. 

Tran studied government at the University of Texas. To “scratch that itch” of wanting to write, he worked at the college paper, the Daily Texan. He loved the grind of reporting and the challenge of telling a story in a short timeframe. After graduation, Tran started his career in journalism as a staff writer at the South Florida Sun Sentinel. 

Tran is also an adjunct professor of qualitative methods and data visualization at American University. He shares his knowledge of R, a statistical programming language, through his website “R for Journalists.”


By Agnes Cheung

When I first started in journalism, I wanted to become an investigative journalist like Lane DeGregory at the Tampa Bay Times, covering really big stories. While I was an online producer at the Boston Globe, I was feeling a bit stuck. I happened to be working with Matt Carroll, the data journalist who was featured in the movie “Spotlight.” I was really impressed with how he was able to tell so many different stories using different slices of data analysis. I wanted to get into that more.

Now, as a member of the Rapid Response team, I investigate various topics in the news using data. I’ve covered everything from climate change disasters to police shootings, the rise of gun violence, and the trends in right-wing extremism. I work quickly to analyze data and tell stories with it.

Data Stories Are Human Stories

Data journalism is data plus journalism: a journalist analyzing data to tell better stories. Without the journalism part, it is really just presenting data. But what does the data mean?

In August 2022, we published a piece about the impact of heat waves by ZIP code. We made a map using open source and proprietary data from the nonprofit First Street Foundation. Our additional analysis showed how many properties would be affected.

But who cares about percentages of buildings at risk? So we humanized this information by estimating how many people would be affected by heat waves and wildfires at the county level using census and demographic data — and within that, using poverty levels and racial demographics to see whether people of color were more at risk.
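As a rough illustration of that kind of estimate, the sketch below multiplies each county's share of at-risk properties by its population and computes a poverty rate alongside it. This is not the Post's actual method: the county names, field names, and figures are all hypothetical, and it assumes people are spread evenly across properties, which a real analysis would need to check.

```python
# Hypothetical county records: a share of properties at risk (from a risk
# data set) joined to census population counts. All values are made up.
counties = [
    {"county": "A", "share_at_risk": 0.40, "population": 10_000, "below_poverty": 2_500},
    {"county": "B", "share_at_risk": 0.10, "population": 50_000, "below_poverty": 5_000},
]


def estimate_people_at_risk(rows):
    """Turn property-level risk shares into rough counts of people.

    Assumes residents are distributed evenly across properties, so the
    share of properties at risk also applies to the population.
    """
    results = {}
    for row in rows:
        results[row["county"]] = {
            "people_at_risk": round(row["share_at_risk"] * row["population"]),
            "poverty_rate": row["below_poverty"] / row["population"],
        }
    return results


estimates = estimate_people_at_risk(counties)
for county, figures in estimates.items():
    print(county, figures["people_at_risk"], f"{figures['poverty_rate']:.0%}")
```

Comparing the estimated counts with the poverty rates is one simple way to start asking whether risk falls more heavily on some communities than others.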

People could not only see a big, pretty map; they could also look up their ZIP code and see how they would be affected relative to their neighbors and in the big picture.

It was important to let people see themselves in the data.

Sharing Data Across Newsrooms

Back in the ’80s and ’90s, data journalism was called “computer-assisted reporting.” It was very technical. Newsrooms had to use a lot of extra, expensive software to do statistical analysis. Journalists had to take college courses to learn to use the computer. 

At the Boston Globe, I got to the point where Excel couldn’t open the really big data sets we got from sources. Google told me that I needed R or Python, which are free, open-source programming languages. The languages themselves haven’t changed much since I started, but developers have since streamlined a lot of the complicated coding that I used to do by hand.
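To show why a scripting language copes with files a spreadsheet cannot open, here is a minimal Python sketch that tallies one column while reading a CSV a row at a time, so the whole file never has to fit in memory. The column names and the sample rows are hypothetical, and the tiny in-memory sample stands in for a large file.

```python
import csv
import io
from collections import Counter


def count_by_column(lines, column):
    """Tally the values in one column, reading a single row at a time."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts


# A small in-memory sample standing in for a file too big for a spreadsheet.
sample = io.StringIO("state,year\nTX,2020\nMA,2020\nTX,2021\n")
counts = count_by_column(sample, "state")
print(counts.most_common())
```

With a real file, the same function could be given an open file handle, for example `open("big_file.csv", newline="")`, where the filename is a placeholder.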

In the past, when it came to data for topics such as elections or gun violence, a lot of journalists and newsrooms had to reinvent the wheel every single time, even though the data sets were the same. Now, more journalists share their data work and their processes. That is a big change in data journalism. 

When data editor and reporter Jeremy Singer-Vine was at BuzzFeed News, he posted that he was scraping gun background check data from the FBI. The data normally came as a PDF, but he ran a script that turned it into a cleaned-up, workable .csv file. He even updated the data once a month. He showed his work so everyone understood how he did it.

I worked with that data a lot when it first came out, and later again at the Post. I was repurposing his code. That work was mine, and I could run it myself whenever I needed it, but it was all inspired by him.

Over the years, I’ve heard from other journalists who used data or code that I shared as inspiration for their stories. It’s one big, happy family: The journalism community now is more open than it used to be.

Losing the Fear of Numbers

Journalists can no longer be allergic to numbers. Modern journalists know that when it comes to investigative, explanatory and feature writing, a lot of really high-impact stories can come from great data journalism. It is another reporting tool.

Data literacy leaves journalists less susceptible to being misled by bad data or by sources who might have an agenda. You need to develop a radar when it comes to numbers.

You should always validate your data; you should always have a second pair of eyes; you should always have someone look over your work. It can be easy to go down the rabbit hole. You don’t want to burn all that time that you and other reporters have invested in the story.
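A few automated checks can back up that second pair of eyes. The sketch below is one possible starting point, not a complete validation routine: it flags duplicate identifiers, missing values, and negative counts in a list of records. The field names and sample rows are hypothetical.

```python
def validate_rows(rows, key, numeric_field):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        # The same identifier appearing twice often means double counting.
        if row[key] in seen:
            problems.append(f"row {i}: duplicate {key} {row[key]!r}")
        seen.add(row[key])

        value = row.get(numeric_field)
        if value is None:
            problems.append(f"row {i}: missing {numeric_field}")
        elif value < 0:
            problems.append(f"row {i}: negative {numeric_field} ({value})")
    return problems


# Hypothetical records with one of each kind of problem.
records = [
    {"zip": "00001", "properties": 120},
    {"zip": "00002", "properties": None},
    {"zip": "00001", "properties": 95},
    {"zip": "00003", "properties": -4},
]
problems = validate_rows(records, key="zip", numeric_field="properties")
for problem in problems:
    print(problem)
```

Checks like these catch mechanical errors only; whether the numbers make sense for the story still takes a person who knows the subject.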

To me, the important thing about journalism is to report on those who are impacted disproportionately. 

With the heat wave story, I would love to see the affected communities get the help they need to set up systems to prepare for the heat waves: systems to check whether there are enough air conditioning units, to mitigate power failures, and to help the elderly living alone or others who are more vulnerable. There is money out there to help prepare for these types of situations; I’d love for those affected to see that financial assistance.

I worked on a story about the disproportionate distribution of post-disaster funds from the Federal Emergency Management Agency (FEMA). Caseworkers were telling reporter Hannah Dreier that because of Jim Crow laws, Black homeowners didn’t have the documentation to prove that they owned the land they had inherited from their grandparents, who had been enslaved. Their relief fund applications were rejected for lack of papers that never existed.

Because of our reporting, Elizabeth Warren and other lawmakers passed some bills that forced FEMA to accept different types of documents. The last I heard, around 200,000 more people got relief money from FEMA.

That’s the type of impact I look for in my work. I believe that if the story subjects think that their situation was communicated effectively to the national audience, then change could happen.

