Using pollen to map plant species occurrence through time: a D3.js example

For the Global Pollen Project, I wanted to create an interactive map that integrates modern observations of plant species with evidence of their distribution in the past.

Past Occurrences of Betula

Colour scale from blue to yellow indicates the ‘last seen date’ at a location, with yellow the most recent.
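
The colour mapping described in the caption can be sketched without any dependencies. In D3 it would more commonly be written with `d3.scaleLinear` and a two-colour range; the hand-rolled version below only shows the idea. The function name `lastSeenColour`, the exact RGB endpoints, and the `maxAge` bound are assumptions made for illustration, not values taken from the map itself.

```javascript
// Endpoints of the colour ramp as [r, g, b]; the exact shades are assumptions.
const BLUE = [0, 0, 255];     // oldest "last seen" date
const YELLOW = [255, 255, 0]; // most recent "last seen" date

// Map an age (years before present) to a colour between yellow and blue.
// maxAge is a hypothetical upper bound for the ages shown on the map.
function lastSeenColour(age, maxAge) {
  // Clamp to [0, 1] so out-of-range ages still get a valid colour.
  const t = Math.min(1, Math.max(0, age / maxAge));
  // Interpolate each channel linearly from yellow (t = 0) to blue (t = 1).
  const channels = YELLOW.map((y, i) => Math.round(y + (BLUE[i] - y) * t));
  return `rgb(${channels.join(", ")})`;
}
```

The returned string can be used directly as an SVG `fill` value for each plotted point.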

Modern occurrence data is relatively easy to access: the Global Biodiversity Information Facility (GBIF) provides an open API that developers can use to query species occurrences. The GPP automatically assigns a GBIF taxon ID to each family, genus, and species.
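
As a rough sketch of what such a query might look like, the snippet below builds a URL for GBIF's occurrence search endpoint and reduces a response to plottable points. The helper names `gbifOccurrenceUrl` and `toPoints` are hypothetical, and this is not necessarily how the GPP calls the API; the taxon key has to be supplied by the caller.

```javascript
// Build a GBIF occurrence-search URL for one taxon key.
function gbifOccurrenceUrl(taxonKey, limit = 300, offset = 0) {
  const params = new URLSearchParams({
    taxonKey: String(taxonKey),
    hasCoordinate: "true", // only records that can be placed on a map
    limit: String(limit),
    offset: String(offset),
  });
  return `https://api.gbif.org/v1/occurrence/search?${params}`;
}

// Reduce a search response to the fields needed for plotting.
// Assumes the response carries a `results` array of occurrence records.
function toPoints(response) {
  return response.results
    .filter((r) => r.decimalLatitude != null && r.decimalLongitude != null)
    .map((r) => ({
      lat: r.decimalLatitude,
      lon: r.decimalLongitude,
      year: r.year ?? null, // not every record has a year
    }));
}
```

In a browser this could be combined as `fetch(gbifOccurrenceUrl(key)).then((r) => r.json()).then(toPoints)`, paging through results by increasing `offset`.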

Obtaining past occurrences from paleoecological (long-term ecological) data was trickier. Neotoma is a great database for paleoecological data, including pollen, diatoms and ostracods. Neotoma also provides an open API, and it calibrates the age-depth models of sedimentary data, so the dates assigned to the presence of taxa are more reliable. Using pollen data from Neotoma, it is possible to plot distributions through time.
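
To plot a 'last seen date' per location, dated pollen records need to be reduced to the most recent sample at each site. A minimal sketch of that step follows. The record shape (`site`, `lat`, `lon`, `age`) is a simplified assumption for illustration and is not Neotoma's actual response format; `lastSeenBySite` is a hypothetical helper name.

```javascript
// Each record is a hypothetical, simplified shape:
// { site, lat, lon, age } with age in calibrated years before present.
function lastSeenBySite(records) {
  const bySite = new Map();
  for (const rec of records) {
    const current = bySite.get(rec.site);
    // A smaller age means a more recent sample, so keep the minimum.
    if (current === undefined || rec.age < current.age) {
      bySite.set(rec.site, rec);
    }
  }
  // One record per site: the youngest sample containing the taxon.
  return Array.from(bySite.values());
}
```

Each resulting record holds the coordinates and the youngest age at which the taxon was recorded, which is what the colour scale on the map encodes.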

Comparing modern observations from GBIF with pollen presence in Neotoma data raises three issues.

First, Neotoma does not return recursive results up the taxonomic hierarchy. For example, if a pollen grain is identified as Betula, this grain will not be returned when searching the API for Betulaceae. GBIF, on the other hand, has a solid taxonomic backbone and accounts for hierarchical relations.
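
One possible workaround, sketched here under assumptions rather than taken from the project's code, is to expand a higher taxon into its descendants on the client and search for each name separately. The `CHILDREN` lookup is hand-written for illustration; in practice it would have to come from a taxonomic source such as the GBIF backbone. `expandTaxon` is a hypothetical helper name.

```javascript
// Hand-written parent -> children lookup, for illustration only.
const CHILDREN = new Map([
  ["Betulaceae", ["Alnus", "Betula", "Carpinus", "Corylus", "Ostrya", "Ostryopsis"]],
]);

// Return the taxon name plus every descendant name found in the lookup.
function expandTaxon(name, children = CHILDREN) {
  const names = [name];
  for (const child of children.get(name) ?? []) {
    // Recurse so that deeper levels (e.g. species) are included if present.
    names.push(...expandTaxon(child, children));
  }
  return names;
}
```

Searching for every name returned by `expandTaxon("Betulaceae")` would then pick up grains identified only to genus, such as Betula.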

Second, the presence of pollen depends not only on the presence of a taxon, but also on the preservation characteristics of the grain, the environment in which it was deposited, and the ability to distinguish this grain from others at family, genus and species resolutions. There are also radically different spatial patterns of sampling effort between modern observations and sediment core data.

Third, the paleoecological data is sorted by ‘pollen types’, rather than linked to botanical species designations. The interpretation in these maps therefore rests on certain assumptions about the relations between pollen morphology and species.

The example on this page plots only the Neotoma pollen data. Check out the code at my GitHub repo.