Climate Heatmaps Made Easy

Investigating Paleoclimate Data with Pandas and Seaborn

Willy Hagi
Towards Data Science

--

Some time ago Dr. Ed Hawkins, the creator of the Climate Spirals, released the Warming Stripes graph of annual global temperature for 1850–2017. The concept is simple but very informative: each stripe represents the temperature of a single year, and as the time series advances the changes become plain for anyone to see.

Essentially, the stripes together make a Heatmap, and Heatmaps are easy to make with a little help from Seaborn. The purpose here is to extend the Warming Stripes back a few more centuries and look at the temperature changes from a slightly different perspective, but for that you need data.

The original Warming Stripes made by Dr. Ed Hawkins

Climate for Cavemen

Paleoclimatology, as the name suggests, is the study of the climate of distant eras, from a few centuries ago all the way back to the first thousand years after our planet came into existence.

You might be wondering: how can we know? It’s a little complicated, but paleoclimatology thrives on data known as proxies: indirect measures of temperature and rainfall preserved in tree rings, ice cores, corals and other natural sources, which are carefully analyzed to reconstruct climate information.

There are many places you can go to download paleoclimatological time series, but your first stop should probably be NOAA’s National Centers for Environmental Information (NCEI) database. For scientists and data scientists with a taste for climate, this is a gold mine (though it is not the only one).

Putting It All Together

The data you’ll use is a time series reconstruction of Northern Hemisphere temperature from 1000–1998, presented in the seminal work of Mann, Bradley and Hughes (1999). You might know it from the Hockey Stick graph, and now you’ll get the opportunity to make your own version of that graph here.

To start things up, import the usual suspects:
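A minimal sketch of those imports, assuming the conventional aliases (np, pd, plt and sns) that the function calls later in this article rely on:

```python
import numpy as np                # array helpers such as np.newaxis
import pandas as pd               # reading and handling the time series
import matplotlib.pyplot as plt   # figure and axes handling
import seaborn as sns             # styling, lineplot and heatmap
```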

Reading Files

With Pandas you can read the text file containing the time series with read_csv():
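Here is a hedged sketch of that call. To keep it runnable without the data file, it reads a few placeholder rows from an in-memory buffer; the numbers are invented and are not the real reconstruction. It uses sep=r"\s+", which splits on whitespace just like delim_whitespace=True does.

```python
import io
import pandas as pd

# Placeholder rows standing in for the real file (year, temperature anomaly).
# These values are invented for illustration only.
sample = io.StringIO("1000 -0.10\n1001 -0.20\n1002 -0.15\n")

# sep=r"\s+" splits on any run of whitespace;
# header=None says the file has no header row.
mbh99 = pd.read_csv(sample, sep=r"\s+", header=None)
print(mbh99.head())
```

With the real file you would pass its path instead of the buffer, and on older pandas versions delim_whitespace=True can replace the sep argument.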

The delim_whitespace=True argument tells Pandas that the columns are separated by whitespace (pandas 2.2 and later deprecate it in favor of sep=r"\s+"), and header=None tells it that the file has no header row, so the columns get no particular names. The time series in the data file looks like this:

Two simple columns

The columns are identified by an index number, with ‘0’ for the first column (the year) and ‘1’ for the second (the temperature data). The good news is that Pandas lets you rename the columns with any str name you prefer, such as:
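A small sketch of the renaming step. The names 'year' and 'temp' are illustrative choices, and the frame is a stand-in with invented values:

```python
import pandas as pd

# Stand-in frame with the default integer column labels (values are invented).
mbh99 = pd.DataFrame({0: [1000, 1001, 1002], 1: [-0.10, -0.20, -0.15]})

# Assign readable names, one per column, in order.
mbh99.columns = ["year", "temp"]
print(mbh99.columns.tolist())
```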

Your Own Hockey Stick

One amazing Seaborn feature is the ability to set the style and context of your graphs to suit different tastes, like:

There are many options; the ones selected here are the ticks style and the talk context, together with a value for lines.linewidth from the Matplotlib parameters. Check the Seaborn documentation for more.
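One way to write this, as a sketch: the ticks style and talk context come from the text, while the line width of 2 is an assumed value.

```python
import matplotlib as mpl
import seaborn as sns

# 'ticks' draws tick marks on a plain white background.
sns.set_style("ticks")

# 'talk' scales fonts and lines for slides; rc overrides a Matplotlib parameter.
# The value 2 is an assumption, not a setting taken from the article.
sns.set_context("talk", rc={"lines.linewidth": 2})

print(mpl.rcParams["lines.linewidth"])
```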

With the style ready, it’s time to plot your Hockey Stick graph with a combination of Seaborn and Matplotlib:

The Hockey Stick, right at the end (where we are now)

What’s happening up there? With sns.lineplot() you plotted the temperature data pretty much as you would with the standard plt.plot() from Matplotlib, while the red and blue shaded regions represent (with some poetic license) the Medieval Warm Period and the Little Ice Age, drawn with help from axvspan(). The set_xlabel() and set_ylabel() functions are familiar to anyone who has worked with Matplotlib, but here you saw that they also work well with LaTeX for the $\degree$ symbol.

But enough of Hockey, where are the Stripes?

Paleoclimate Stripes

Seen as a Heatmap, the original Warming Stripes graph shows no color bar, axis labels or ticks, so it’s important to suppress these to stay faithful to the original in your own stripes. Seaborn provides enough options to make a good approximation.

One thing to pay attention to is that the Seaborn heatmap requires a 2D array as input data, while your time series is a simple 1D array:
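A quick way to check this, sketched with a stand-in frame of zeros that has the same length as a 1000–1998 annual series (the column names are illustrative):

```python
import numpy as np
import pandas as pd

# Stand-in frame: one row per year from 1000 to 1998 (values are placeholders).
mbh99 = pd.DataFrame({"year": np.arange(1000, 1999), "temp": np.zeros(999)})

# A single column is one-dimensional.
print(mbh99["temp"].values.shape)  # (999,)
```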

One way to get around this problem is to use NumPy and make a new dimension with np.newaxis. With this solved it's easy to make your own Stripes:
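Both steps can be sketched as follows, again with a synthetic stand-in for the data. The heatmap arguments are the ones described in this section; the figure size and column names are assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic stand-in for the reconstruction (NOT the real MBH99 values).
years = np.arange(1000, 1999)
mbh99 = pd.DataFrame({"year": years,
                      "temp": -0.2 + 0.1 * np.sin(years / 50.0)})

# Add a leading axis: shape (999,) becomes (1, 999), one row of stripes.
stripes = mbh99["temp"].values[np.newaxis, :]

fig, ax = plt.subplots(figsize=(10, 2))
sns.heatmap(data=stripes, cmap="seismic", cbar=False,
            vmin=-0.4, vmax=0.1, center=0,
            xticklabels=False, yticklabels=False, ax=ax)
```

Each column of the single-row array becomes one vertical stripe, colored by that year's value.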

How comparable is the Hockey Stick era to the last 1000 years?

Using sns.heatmap() your arguments were:

  • data: the mbh99 temperature time series
  • cmap and cbar: the seismic colormap from the Matplotlib library (other cool ones are: RdYlBu_r, coolwarm, bwr, RdBu_r) and False to suppress the colorbar
  • vmin, vmax and center: the minimum (-0.4) and maximum (0.1) limits of the color scale, with the red and blue diverging at 0
  • xticklabels and yticklabels: all False to make just the stripes visible

There you are, your own Warming Stripes. Quite easy, isn’t it?

Conclusion

Paleoclimatology is the study of the climate of the distant past, from a few centuries ago to the first days of Earth. NCEI’s Paleoclimate Database allows anyone to download datasets from many different studies, such as Mann, Bradley and Hughes (1999), and you can get data from many other sources as well.

With a little Pandas and Seaborn you can read data and make Heatmaps for any time series you are interested in, such as the one you used here, and make your own version of the Warming Stripes (plus much more). Have fun!

PS: you can get the source code and the data easily right here.

References

[1] Mann, M. E., Bradley, R. S. and Hughes, M. K. (1999). Northern Hemisphere temperatures during the past millennium: inferences, uncertainties, and limitations. Geophysical Research Letters, 26(6), 759–762.


Meteorologist, climate data scientist and founder of the Amazon’s first-ever Climate Consulting Company. You can find me this way: linktr.ee/willyhagi