The long-term ecological detective: Holocene ecosystem functioning
Analysis of microfossils and inorganic compounds from lake sediment cores and peat cores can provide rich information on past biodiversity and environmental conditions.
Jeffers et al (2011) identified ...
To get started, we first load and open the Bristlecone library in an F# script file (.fsx):
open Bristlecone
open Bristlecone.Language
open Bristlecone.Time
Then, we may define units of measure that are not SI units (which are included with F#) and are not provided in the Time module of Bristlecone.
[<Measure>] type area
[<Measure>] type indiv // pollen accumulation proxy units -> individuals per area
[<Measure>] type conc // nitrogen concentration proxy units (e.g., δ15N or %TN, scaled)
[<Measure>] type d15N // proxy measure of nitrogen
[<Measure>] type grain
[<Measure>] type cm
We can then use these units of measure when defining the model system. Before we can write model equations, we need to define the states, parameters, and measures that we need to apply within them.
// States
let N = state<conc> "available_N" // Reconstructed available N proxy (δ15N or %TN scaled)
let X = state<indiv / area> "population" // Pollen accumulation proxy for population density
let obsN = measure "observed_N" // After conversion with an alpha conversion factor
let obsPAR = measure<(grain / cm^2) / year> "observed_PAR"
// Core ecological parameters
let λ = parameter "λ" notNegative 0.01<conc/year> 1.0<conc/year> // External N input
let γN = parameter "γ[N]" notNegative 0.001<1/year> 0.1<1/year> // N loss rate
let r = parameter "r" notNegative 0.001<(indiv/area)/conc> 20.<(indiv/area)/conc> // Intrinsic growth rate
let γX = parameter "γ[X]" notNegative 0.01<1/year> 0.2<1/year> // mortality
// Measurement model parameters
let αδ15N = parameter<d15N> "αδ15N" noConstraints -2.0<d15N> 2.0<d15N> // intercept for δ15N proxy
let βδ15N = parameter<d15N/conc> "βδ15N" noConstraints 0.1<d15N/conc> 2.0<d15N/conc> // slope linking latent N to δ15N
// Likelihood function parameters
let ρ = parameter "ρ" noConstraints -0.500 0.500
let σx = parameter "σ[x]" notNegative 10. 50.
let σy = parameter "σ[y]" notNegative 0.001 0.100
In this scenario, we will assess eight hypotheses that relate to the form of dependency between an individual plant taxon and nitrogen availability. To scaffold the hypotheses, we first define a base model that contains the two ordinary differential equations for the two state variables; it takes three pluggable model components as arguments using a standard F# function definition.
let baseModel
(uptake : ModelExpression<conc> -> ModelExpression<indiv/area> -> ModelExpression<conc/year> * ModelExpression<1>)
(feedback: ModelExpression<indiv/area> -> ModelExpression<conc/year>)
(density : ModelExpression<indiv/area> -> ModelExpression<1>) =
// ODEs
let ``dN/dt``: ModelExpression<conc/year> =
let uptake, uptakeMult = uptake This (State X)
P λ
- uptake * uptakeMult
- P γN * This<conc>
+ feedback(State X)
let ``dX/dt``: ModelExpression<(indiv / area)/year> =
let uptake, _ = uptake (State N) This
P r * uptake * density This - P γX * This
// Measure: convert modeled N to proxy comparison scale
let nToProxy : ModelExpression<d15N> = P αδ15N + P βδ15N * State N
let nFromProxy : ModelExpression<conc> = (Measure obsN - P αδ15N) / P βδ15N
Model.empty
// Add the core ecological ODEs:
|> Model.addRateEquation X ``dX/dt``
|> Model.addRateEquation N ``dN/dt``
|> Model.estimateParameter λ
|> Model.estimateParameter γN
|> Model.estimateParameter r
|> Model.estimateParameter γX
// Add the conversion from d15N to N availability:
|> Model.addMeasure obsN nToProxy
|> Model.initialiseHiddenStateWith N nFromProxy
|> Model.estimateParameter αδ15N
|> Model.estimateParameter βδ15N
// Add the likelihood function:
|> Model.useLikelihoodFunction (ModelLibrary.Likelihood.bivariateGaussian (Require.state X) (Require.measure obsN))
|> Model.estimateParameter ρ
|> Model.estimateParameter σx
|> Model.estimateParameter σy
In this model, we naively assume that the pollen accumulation rate approximates the individuals of the plant taxon per unit area. However, for nitrogen the use of a measurement model is desirable, at a minimum because raw d15N (permil) does not
Next, we define the three interchangable components that we will plug in. The models as stated in Jeffers (2011) are not truly nested, as some combinations are not ecologically plausable. In this instance, we define two sets of model components; one for N-dependent systems, and one for N-independent systems. The two sets are encoded in a record for each of the three modes below.
Feedback component.
A plant-soil feedback may be enabled or disabled, and is only defined in one mathematical form.
let feedbackMode =
// Conversion factor (individuals into N)
let σ = parameter "α" notNegative 0.01<conc/(indiv/area)> 10.<conc/(indiv/area)>
let none _ = Constant 0.<conc/year>
let positive (X: ModelExpression<indiv/area>) = P σ * P γX * X
{|
NDependent =
Components.modelComponent "Feedback" [
Components.subComponent "None" none
Components.subComponent "Positive feedback" positive
|> Components.estimateParameter σ
]
NIndependent =
Components.modelComponent "Feedback" [
Components.subComponent "Positive feedback" positive
|> Components.estimateParameter σ
]
|}
N-dependency
We define the N-dependency ("uptake") as a tuple, where the first element
is the form of uptake, and the second is a multiplier used to turn on or
off the uptake term within the dN/dt equation. This is required because
when growth is N-independent, the growth term in dX/dt must be multipled
by 1, whereas uptake must be turned off. Three forms of N-dependency are
specified: N-independent, linear N-dependency, and saturating N-dependency.
let uptakeMode =
// Parameters:
let a = parameter "a" notNegative 1e-6<area/(indiv year)> 1e-3<area/(indiv year)> // Uptake rate constant
let b = parameter "b" notNegative 0.5</conc> 2.0</conc> // Half-saturation (MM)
// If independent, need to substitute 1 into the growth equation instead of 0.
let independent _ _ =
Constant 1.<conc/year>, Constant 0.
// tuple of (uptake rate (0 if none), uptake multiplier)
let linear (N: ModelExpression<conc>) (X: ModelExpression<indiv/area>) = P a * N * X, Constant 1.
let michaelisMenten (N: ModelExpression<conc>) (X: ModelExpression<indiv/area>) : ModelExpression<conc/year> * ModelExpression<1> =
(P a * N * X) / (Constant 1. + (P b * N)), Constant 1.
{|
NDependent =
Components.modelComponent "Uptake" [
Components.subComponent "Linear" linear
|> Components.estimateParameter a
Components.subComponent "Saturating (Michaelis-Menten)" michaelisMenten
|> Components.estimateParameter a
|> Components.estimateParameter b
]
NIndependent =
Components.modelComponent "Uptake" [
Components.subComponent "Independent" independent
]
|}
Density-dependency
...
let densityMode =
let K = parameter "K_growth" notNegative 1.<indiv/area> 1e6<indiv/area>
let c = parameter "c_density" notNegative 0.01<area/indiv> 1e-3<area/indiv>
let logisticDD (X: ModelExpression<indiv/area>) : ModelExpression<1> =
Constant 1.<1> - X / P K
let expDD (X: ModelExpression<indiv/area>) : ModelExpression<1> =
Exponential (-(P c) * X)
let constant (_: ModelExpression<indiv/area>) : ModelExpression<1> =
Constant 1.<1>
{|
NDepenentGrowth =
Components.modelComponent "Density" [
Components.subComponent "Constant" constant
Components.subComponent "Logistic" logisticDD
|> Components.estimateParameter K
]
NIndependentGrowth =
Components.modelComponent "Density" [
Components.subComponent "Logistic" logisticDD
|> Components.estimateParameter K
Components.subComponent "ExpDecay" expDD
|> Components.estimateParameter c
]
|}
Finally, we can scaffold all of the possible model-component combinations into a set of hypotheses that represents the product of all of the components.
The system is not truly nested, so we create two nested sets and concatenate them together.
let hypothesesDependent =
Hypotheses.createFromModel baseModel
|> Hypotheses.apply uptakeMode.NDependent
|> Hypotheses.apply feedbackMode.NDependent
|> Hypotheses.apply densityMode.NDepenentGrowth
|> Hypotheses.compile
let hypothesesIndependent =
Hypotheses.createFromModel baseModel
|> Hypotheses.apply uptakeMode.NIndependent
|> Hypotheses.apply feedbackMode.NIndependent
|> Hypotheses.apply densityMode.NIndependentGrowth
|> Hypotheses.compile
let hypotheses =
List.append hypothesesDependent hypothesesIndependent
The resulting list of hypotheses mirrors the 10 nitrogen-based models presented in Table 1 of Jeffers et al (2011). Bristlecone assigns each model a code based on the components within it. For our 10 models that vary uptake mechanism (UP), feedback (FE), and density-dependence (DE), they are:
for h in hypotheses do
printfn "%s" h.ReferenceCode
|
Fitting the models
We can fit the ecological models to data by defining an estimation engine that contains the method that will be applied for model fitting:
let engine =
Bristlecone.mkContinuous ()
|> Bristlecone.withCustomOptimisation ( Optimisation.MonteCarlo.Filzbach.filzbach
{ Optimisation.MonteCarlo.Filzbach.FilzbachSettings.Default with BurnLength = Optimisation.EndConditions.atIteration 200000<iteration> })
|> Bristlecone.withConditioning Conditioning.NoConditioning
|> Bristlecone.withSeed 1500 // We are setting a seed for this example - see below
|> Bristlecone.withTimeConversion DateMode.Conversion.RadiocarbonDates.toYears
let endCond = Optimisation.EndConditions.atIteration 100000<iteration>
Next, we must load in some real data. We leverage FSharp.Data here to read in a csv file containing the raw data.
open FSharp.Data
type PalaeoData = CsvProvider<"data/loch-dubhan/ld.tsv">
let data = PalaeoData.Load "data/loch-dubhan/ld.tsv"
let ts =
[
X.Code, data.Rows |> Seq.map(fun r -> float r.Par_betula, DatingMethods.Radiocarbon (float r.``Scaled_age_cumulative (cal yr bp)`` * 1.<Time.``cal yr BP``>)) |> TimeSeries.fromCalibratedRadiocarbonObservations
obsN.Code, data.Rows |> Seq.map(fun r -> float r.D15N, DatingMethods.Radiocarbon (float r.``Scaled_age_cumulative (cal yr bp)`` * 1.<Time.``cal yr BP``>)) |> TimeSeries.fromCalibratedRadiocarbonObservations
] |> Map.ofList
We can run an individual model fit like so:
let result =
Bristlecone.tryFit engine endCond ts hypotheses.[5].Model
module Bristlecone from Bristlecone
<namespacedoc><summary>The core library of Bristlecone, containing model-fitting functions.</summary></namespacedoc>
Main functionality of Bristlecone, including functions to scaffold `ModelSystem`s and for model-fitting (tests and real fits).
--------------------
namespace Bristlecone
<summary> An F# Domain Specific Language (DSL) for scripting with Bristlecone. </summary>
val Measure: sId: MeasureId<'u> -> ModelExpression<'u>
--------------------
type MeasureAttribute = inherit Attribute new: unit -> MeasureAttribute
--------------------
new: unit -> MeasureAttribute
val parameter: code: string -> con: Parameter.Constraint -> lower: float<'u> -> upper: float<'u> -> IncludedParameter<'u>
<summary> Define an estimatable parameter for a Bristlecone model. </summary>
--------------------
[<Measure>] type parameter
<summary> Terms for scaffolding a model system for use with Bristlecone. </summary>
<summary>Likelihood functions to represent a variety of distributions and data types.</summary>
<namespacedoc><summary>Pre-built model parts for use in Bristlecone</summary></namespacedoc>
<summary> Log likelihood function for dual simultaneous system, assuming Gaussian error for both x and y. Requires parameters 'σ[x]', 'σ[y]' and 'ρ' to be included in any `ModelSystem` that uses it. </summary>
<summary> Creates a nested component that can be inserted into a base model. </summary>
<summary>Types to represent a hypothesis, given that a hypothesis is a model system that contains some alternate formulations of certain components.</summary>
<summary> Compile: run the writer(s), add parameters into the model builder, and wrap in Hypothesis </summary>
module List from Bristlecone
--------------------
module List from Microsoft.FSharp.Collections
--------------------
type List<'T> = | op_Nil | op_ColonColon of Head: 'T * Tail: 'T list interface IReadOnlyList<'T> interface IReadOnlyCollection<'T> interface IEnumerable interface IEnumerable<'T> member GetReverseIndex: rank: int * offset: int -> int member GetSlice: startIndex: int option * endIndex: int option -> 'T list static member Cons: head: 'T * tail: 'T list -> 'T list member Head: 'T with get member IsEmpty: bool with get member Item: index: int -> 'T with get ...
<summary>Compiles a reference code that may be used to identify (although not necessarily uniquely) this hypothesis</summary>
<returns>A string in the format XX_XXX_YY_YYY... where XX_XXX is a singe component with XX the component and XXX the implementation.</returns>
<summary>A basic estimation engine for ordinary differential equations, using a Nelder-Mead optimiser.</summary>
<summary> A module containing Monte Carlo Markov Chain (MCMC) methods for optimisation. An introduction to MCMC approaches is provided by [Reali, Priami, and Marchetti (2017)](https://doi.org/10.3389/fams.2017.00006) </summary>
<summary> An adaptation of the Filzbach method (originally by Drew Purves) </summary>
<summary> A Monte Carlo Markov Chain sampler based on the 'Filzbach' algorithm from Microsoft Research Cambridge. </summary>
<summary> End the optimisation procedure when a minimum number of iterations is exceeded. </summary>
<summary> Choose how the start point is chosen when solving the model system </summary>
<summary> Use a mersenne twister random number generator with a specific seed. </summary>
<summary> Conversion functions that translate from one time unit into another. These functions are intended primarily for use in estimation engines when translating from data time to model time. </summary>
namespace FSharp
--------------------
namespace Microsoft.FSharp
namespace FSharp.Data
--------------------
namespace Microsoft.FSharp.Data
<summary>Typed representation of a CSV file.</summary> <param name='Sample'>Location of a CSV sample file or a string containing a sample CSV document.</param> <param name='Separators'>Column delimiter(s). Defaults to <c>,</c>.</param> <param name='InferRows'>Number of rows to use for inference. Defaults to <c>1000</c>. If this is zero, all rows are used.</param> <param name='Schema'>Optional column types, in a comma separated list. Valid types are <c>int</c>, <c>int64</c>, <c>bool</c>, <c>float</c>, <c>decimal</c>, <c>date</c>, <c>datetimeoffset</c>, <c>timespan</c>, <c>guid</c>, <c>string</c>, <c>int?</c>, <c>int64?</c>, <c>bool?</c>, <c>float?</c>, <c>decimal?</c>, <c>date?</c>, <c>datetimeoffset?</c>, <c>timespan?</c>, <c>guid?</c>, <c>int option</c>, <c>int64 option</c>, <c>bool option</c>, <c>float option</c>, <c>decimal option</c>, <c>date option</c>, <c>datetimeoffset option</c>, <c>timespan option</c>, <c>guid option</c> and <c>string option</c>. You can also specify a unit and the name of the column like this: <c>Name (type<unit>)</c>, or you can override only the name. If you don't want to specify all the columns, you can reference the columns by name like this: <c>ColumnName=type</c>.</param> <param name='HasHeaders'>Whether the sample contains the names of the columns as its first line.</param> <param name='IgnoreErrors'>Whether to ignore rows that have the wrong number of columns or which can't be parsed using the inferred or specified schema. Otherwise an exception is thrown when these rows are encountered.</param> <param name='SkipRows'>Skips the first n rows of the CSV file.</param> <param name='AssumeMissingValues'>When set to true, the type provider will assume all columns can have missing values, even if in the provided sample all values are present. Defaults to false.</param> <param name='PreferOptionals'>When set to true, inference will prefer to use the option type instead of nullable types, <c>double.NaN</c> or <c>""</c> for missing values. Defaults to false.</param> <param name='Quote'>The quotation mark (for surrounding values containing the delimiter). Defaults to <c>"</c>.</param> <param name='MissingValues'>The set of strings recognized as missing values specified as a comma-separated string (e.g., "NA,N/A"). Defaults to <c>NaN,NA,N/A,#N/A,:,-,TBA,TBD</c>.</param> <param name='CacheRows'>Whether the rows should be caches so they can be iterated multiple times. Defaults to true. Disable for large datasets.</param> <param name='Culture'>The culture used for parsing numbers and dates. Defaults to the invariant culture.</param> <param name='Encoding'>The encoding used to read the sample. You can specify either the character set name or the codepage number. Defaults to UTF8 for files, and to ISO-8859-1 the for HTTP requests, unless <c>charset</c> is specified in the <c>Content-Type</c> response header.</param> <param name='ResolutionFolder'>A directory that is used when resolving relative file references (at design time and in hosted execution).</param> <param name='EmbeddedResource'>When specified, the type provider first attempts to load the sample from the specified resource (e.g. 'MyCompany.MyAssembly, resource_name.csv'). This is useful when exposing types generated by the type provider.</param>
Loads CSV from the specified uri
CsvProvider<...>.Load(reader: System.IO.TextReader) : CsvProvider<...>
Loads CSV from the specified reader
CsvProvider<...>.Load(stream: System.IO.Stream) : CsvProvider<...>
Loads CSV from the specified stream
<summary> The rows with data </summary>
module Seq from Bristlecone
--------------------
module Seq from Microsoft.FSharp.Collections
val float: value: 'T -> float (requires member op_Explicit)
--------------------
type float = System.Double
--------------------
type float<'Measure> = float
<summary> Contains types representing common dating methods in long term data analysis. </summary>
union case DatingMethods.Radiocarbon.Radiocarbon: float<'u> -> DatingMethods.Radiocarbon<'u>
--------------------
type Radiocarbon<'u> = | Radiocarbon of float<'u> static member (+)<'u> : e1: Radiocarbon<'u0> * e2: float<'u0> -> Radiocarbon<'u0> static member (-)<'u> : e1: Radiocarbon<'u0> * e2: Radiocarbon<'u0> -> float<'u0> static member AddDays: days: int<day> -> date: Radiocarbon<'u> -> Radiocarbon<'u> static member AddMonths: months: int<month> -> date: Radiocarbon<'u> -> Radiocarbon<'u> static member AddYears: date: Radiocarbon<'u> -> years: int<year> -> Radiocarbon<'u> static member FractionalDifference<'u> : isSigned: bool -> d1: Radiocarbon<'u0> -> d2: Radiocarbon<'u0> -> TimeDifference<float<'u0>> static member TotalYearsElapsed<'u> : d1: Radiocarbon<'u0> -> d2: Radiocarbon<'u0> -> float<'u0> static member Unwrap<'u> : Radiocarbon<'u0> -> float<'u0> member Value: float<'u> with get
<summary>Represents a date made by radiocarbon measurement</summary>
val Time<'t> : ModelExpression<'t>
--------------------
module Time from Bristlecone
module TimeSeries from Bristlecone.Time
<summary>Contains functions and types to create and manipulate `TimeSeries` values, which represent observations ordered in time.</summary>
--------------------
type TimeSeries<'T,'date,'timeunit,'timespan> = TimeSeries.TimeSeries<'T,'date,'timeunit,'timespan>
module Map from Bristlecone
--------------------
module Map from Microsoft.FSharp.Collections
--------------------
type Map<'Key,'Value (requires comparison)> = interface IReadOnlyDictionary<'Key,'Value> interface IReadOnlyCollection<KeyValuePair<'Key,'Value>> interface IEnumerable interface IStructuralEquatable interface IComparable interface IEnumerable<KeyValuePair<'Key,'Value>> interface ICollection<KeyValuePair<'Key,'Value>> interface IDictionary<'Key,'Value> new: elements: ('Key * 'Value) seq -> Map<'Key,'Value> member Add: key: 'Key * value: 'Value -> Map<'Key,'Value> ...
--------------------
new: elements: ('Key * 'Value) seq -> Map<'Key,'Value>
<summary> Fit a time-series model to data. Please note: it is strongly recommended that you test that the given `EstimationEngine` can correctly identify known parameters for your model. Refer to the `Bristlecone.testModel` function, which can be used to generate known data and complete this process. </summary>
<param name="engine">The engine encapsulates all settings that form part of the estimation method. Importantly, this includes the random number generator used for all stages of the analysis; if this is set using a fixed seed, the result will be reproducable.</param>
<param name="endCondition">You must specify a stopping condition, after which the optimisation process will cease. Bristlecone includes built-in end conditions in the `Bristlecone.Optimisation.EndConditions` module.</param>
<param name="timeSeriesData"></param>
<param name="model"></param>
<returns></returns>
bristlecone