Methodology and data
After the blazing summer of 2021, Boston residents have heat on their minds. But, as with many aspects of urban life, not everyone feels the effects of the heat equally. In recent years, researchers in Boston have begun to identify so-called urban heat islands: neighborhoods that, because of their geography and historical disinvestment, get far hotter than others. For this project, we wanted to understand the heat island effect and its relationship with redlining and social vulnerability in Boston through the lens of data.
Luckily, there is a wealth of data available on this subject, though much of it comes in different forms and contexts. We undertook the challenge of cleaning, synthesizing, and visualizing each piece of data in a way that accurately represents the urban heat island effect in Boston.
This was a multi-faceted project, with many large datasets to work through. In 2021, the city published a “Canopy Change assessment,” which, among other datasets, provided heat metrics, canopy percentage for each census tract, and the potential area where trees could be planted.
To begin the project, we sifted through each of those datasets in CSV form in Excel, though in some cases we eventually turned to GeoJSON files for easier geographical representation. We also examined the city’s analysis, which included visualizations of some of the data we were planning to use. That served as a good starting point.
As we examined the information, we noticed that the city broke down its geographic data into two different forms: small hexagons and census tracts. Each could help us represent points like average temperature in a specific location at a specific time, but each came with its own set of challenges. The hexagons, while smaller than the census tracts and therefore more detailed, often did not fall within the boundaries of the city’s neighborhoods. That would make calculating neighborhood averages difficult. The census tracts were less detailed, but they fell more neatly within those boundaries. There is also generally more census tract data available beyond this subject matter (housing, population, etc.), which we felt would better lend itself to our ideas for visualizations.
For the heat metrics, Andrew made two experimental graphs in Tableau, one with the hexagons and one with the census tracts, in which we visualized mean temperature data. That data came from a 2019 project called Wicked Hot Boston, which was conducted by researchers in the city, at Boston’s Museum of Science, and at Northeastern University’s Helmuth Lab. On two different days in July of that year, researchers fanned out across the city with temperature-reading devices, gathering data at 6 a.m., 3 p.m., and 7 p.m. The data they collected was recorded by census tract and hexagon.
For this project, we chose to focus on the high afternoon temperature in each census tract because we felt it best displayed the phenomenon we intended to chart: the idea that on an average day, different neighborhoods reach different temperatures because of their layout and historical policies. Heat disparities are most evident at the hottest part of the day.
After checking through the heat data to make sure it did not require any cleaning (it did not), we overlaid it on a map layer of the city’s census tracts in Tableau. We started with the mean afternoon high data points, and tweaked the color range to show the differences between data points in as much detail as possible. Temperatures ranged from 90 to 100 degrees Fahrenheit, with neighborhoods like Roxbury, Chinatown, and East Boston registering among the hottest.
Andrew constructed two more maps in the same manner, using the morning and evening census tract heat data respectively, so we could see the change in heat over the course of a day. From there, we created a custom string parameter, then a calculated field using that parameter. Our intention was to create a filter that would allow a viewer to use a drop-down menu to switch between the three maps. This highlighted a key pattern: the neighborhoods that get the hottest in the afternoon stay hot longer than neighborhoods that only reach an average heat in the afternoon.
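For readers who want to see the logic behind that toggle outside of Tableau, here is a minimal sketch in Python. In Tableau, a calculated field like this is typically a CASE expression that checks the parameter's value and returns the matching temperature field; the function below mimics that behavior. The tract labels, column names, and temperatures are placeholders we made up for illustration, not values from the Wicked Hot Boston data.

```python
# Placeholder readings in degrees Fahrenheit. These are illustrative values,
# NOT measurements from the city's dataset.
TRACTS = [
    {"tract": "A", "morning": 75.0, "afternoon": 98.0, "evening": 90.0},
    {"tract": "B", "morning": 72.0, "afternoon": 93.0, "evening": 84.0},
]

# The allowed values of the (hypothetical) string parameter.
VALID_TIMES = ("morning", "afternoon", "evening")


def selected_temperature(row, time_of_day):
    """Return the temperature column that matches the viewer's selection.

    This plays the role of the calculated field: one output value whose
    source column depends on the parameter.
    """
    if time_of_day not in VALID_TIMES:
        raise ValueError(f"unknown time of day: {time_of_day!r}")
    return row[time_of_day]


def map_values(rows, time_of_day):
    """Build the tract-to-temperature lookup a single map view would color by."""
    return {row["tract"]: selected_temperature(row, time_of_day) for row in rows}


if __name__ == "__main__":
    for choice in VALID_TIMES:
        print(choice, map_values(TRACTS, choice))
```

Because all three views share one output field, a single color scale and a single map sheet can serve every time of day, which is the main reason to use a parameter rather than three separate worksheets.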
Andrew used a spatial file in Tableau that included every individual tree in Boston, and put it on top of the afternoon high temperature data (by census tract). He adjusted the opacity and neighborhood labels to make sure both map layers could be seen clearly. He ultimately ruled out labeling the neighborhoods (for which he was using a separate spatial file), because it made the map look too cluttered. The point of this graph was to show exactly where the trees are in Boston, so we can understand the correlation between tree density and heat. Viewers will notice that in areas like Chinatown, the heat is intense and there are very few trees. In Jamaica Plain, by contrast, the tree density is among the highest in Boston, and the temperature is correspondingly cooler.
Charlie constructed a map in Tableau showcasing the open canopy space in the city. The idea was to figure out where trees could be planted in Boston to counteract the heat island effect, since the city is touting investment in canopy as a way to help fight the issue. He did so by taking the open space percentage GeoJSON file provided by the city, using the geography provided to create a marks layer with the “TC P P” metric to represent the open canopy space, and laying it on top of a copy of the previously constructed map showing census tracts and afternoon high temperature. The map ultimately shows that the areas that need the trees most desperately have little to no room for them. There is lots of space on the outskirts of the city and in areas that are already wooded.
Charlie made a map in Tableau that was intended to showcase the link between heat islands and redlining. He did so by placing maps of average afternoon high temperatures, the number of people of color in each census tract, and the number of low- to no-income people living in each census tract next to one another. This allowed us to show the link between poorer, more diverse neighborhoods and high temperatures. He constructed a toggle using a parameter and a calculated field in an effort to allow the reader to see the figures in more detail.
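The comparison that map invites, hot tracts with little room for new trees, can also be expressed as a simple filter. The sketch below is only an illustration of that idea: the column name plantable_pct is our stand-in for the city's open-space metric, the thresholds are arbitrary, and the rows are made-up placeholders rather than real tract data.

```python
def hot_with_little_room(rows, temp_threshold_f, space_threshold_pct):
    """Return the tracts that are at or above the temperature threshold
    and at or below the plantable-space threshold."""
    return [
        row["tract"]
        for row in rows
        if row["afternoon_f"] >= temp_threshold_f
        and row["plantable_pct"] <= space_threshold_pct
    ]


if __name__ == "__main__":
    # Placeholder rows for illustration only; not values from the city's files.
    sample = [
        {"tract": "A", "afternoon_f": 99.0, "plantable_pct": 3.0},
        {"tract": "B", "afternoon_f": 92.0, "plantable_pct": 25.0},
        {"tract": "C", "afternoon_f": 98.0, "plantable_pct": 18.0},
    ]
    # Hypothetical thresholds: 97 degrees Fahrenheit and 5 percent open space.
    print(hot_with_little_room(sample, 97.0, 5.0))
```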
Finally, we wanted to construct at least one graph. To do so, Andrew cleaned heat and canopy percentage data from two separate CSV files in Excel that were formatted as GeoJSON. He isolated the relevant metrics after painstakingly sorting the city’s 200+ census tracts into the neighborhoods they fit into. Once he had that information, he was able to see, for example, how hot each individual neighborhood was when the data was collected. He ended up using the cleaned data to build a scatter plot in Tableau that showed the relationship between temperature and percentage of tree canopy in each census tract and neighborhood. He created a filter on the scatter plot so that viewers can isolate the data by neighborhood if they wish. The scatter plot shows exactly what we were hoping to show with our visualizations: the areas with less canopy get hot early, are among the hottest in the afternoon, and remain hot well into the evening. Areas with a larger canopy stay cooler in the morning, reach an average temperature in the afternoon, and cool down rather quickly.
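The two steps described above, averaging tracts by neighborhood and checking how temperature moves with canopy, can be sketched in a few lines of Python. This is not the workflow we used (that was Excel and Tableau); it is an illustration under assumptions. The neighborhood names, column names, and numbers are placeholders, and the Pearson correlation is our choice of summary statistic for the kind of relationship a scatter plot shows.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Placeholder rows for illustration only; not values from the city's files.
ROWS = [
    {"tract": "T1", "neighborhood": "North", "canopy_pct": 10.0, "afternoon_f": 98.0},
    {"tract": "T2", "neighborhood": "North", "canopy_pct": 14.0, "afternoon_f": 96.0},
    {"tract": "T3", "neighborhood": "South", "canopy_pct": 30.0, "afternoon_f": 92.0},
    {"tract": "T4", "neighborhood": "South", "canopy_pct": 34.0, "afternoon_f": 90.0},
]


def neighborhood_means(rows, field):
    """Average one numeric field across the tracts in each neighborhood."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["neighborhood"]].append(row[field])
    return {name: mean(values) for name, values in groups.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    print(neighborhood_means(ROWS, "afternoon_f"))
    canopy = [row["canopy_pct"] for row in ROWS]
    heat = [row["afternoon_f"] for row in ROWS]
    # A value near -1 means temperature falls as canopy rises.
    print(round(pearson(canopy, heat), 3))
```

Note that a simple mean treats every tract equally regardless of its size or population; a weighted average would be a reasonable alternative if tract areas were available.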
For our reporting process, we began by looking for experts who could give us a base-level understanding of the urban heat island phenomenon. Charlie spoke with Hessam Azarijafari, a researcher with MIT’s Concrete Sustainability Hub, who authored a study titled “Urban-Scale Evaluation of Cool Pavement Impacts on the Urban Heat Island Effect and Climate Change.” The big takeaway from our interview was that there are other ways to combat the heat island effect beyond just planting trees. Azarijafari advocated specifically for the use of cool pavements and green roofs, which may be able to decrease the heat island effect by reflecting heat instead of absorbing it. He also explained that asphalt causes these heat islands in part because it absorbs heat in the morning and afternoon, then releases it later in the evening, keeping the air temperature hot.
Andrew spoke with David Meshoulam, who helped us understand some of the challenges associated with bolstering the tree canopy. He offered helpful history, and gave good suggestions in terms of what data to look for. Andrew also spoke with two residents of heat island communities. He found them by going to some of the hotter census tracts in Roxbury and East Boston and stopping people walking by on the sidewalk. These residents were essential for the narrative arc of our story.
To check out the data, click the link and open the following files: