Why this API?¶

Here we explain the motivation for the API.

Imports¶

In [1]:

Copied!

import matplotlib.pyplot as plt
import numpy as np
import openscm_units

import continuous_timeseries as ct
import matplotlib.pyplot as plt
import numpy as np
import openscm_units

import continuous_timeseries as ct

Handy pint aliases¶

In [2]:

Copied!

UR = openscm_units.unit_registry
Q = UR.Quantity
UR = openscm_units.unit_registry
Q = UR.Quantity

Set up matplotlib to work with pint¶

For details, see the pint docs (stable docs, last version that we checked at the time of writing) or our docs on unit-aware plotting.

In [3]:

Copied!

UR.setup_matplotlib(enable=True)
UR.setup_matplotlib(enable=True)

Introducing the problem¶

Imagine you have timeseries of emissions, like the below.

In [4]:

Copied!

emissions = Q(np.array([0, 1, 3, 5, 10, 10, 9, 7, 6]), "GtC / yr")
years = Q(np.array([1850, 1900, 1950, 2000, 2010, 2020, 2030, 2040, 2050]), "yr")
emissions = Q(np.array([0, 1, 3, 5, 10, 10, 9, 7, 6]), "GtC / yr")
years = Q(np.array([1850, 1900, 1950, 2000, 2010, 2020, 2030, 2040, 2050]), "yr")

At first glance, this seems quite straightforward. However, there are quite a few unanswered questions with such data. For example:

do these values represent the average emissions for each year?
should they be interpolated linearly?

If we take a more complicated example, like emissions over the COVID period (around 2020), the issue becomes clearer.

In [5]:

Copied!

emissions_covid = Q(np.array([10.2, 10.3, 9.5, 10.1, 10.3, 10.5]), "GtC / yr")
years_covid = Q(np.array([2018, 2019, 2020, 2021, 2022, 2023]), "yr")
emissions_covid = Q(np.array([10.2, 10.3, 9.5, 10.1, 10.3, 10.5]), "GtC / yr")
years_covid = Q(np.array([2018, 2019, 2020, 2021, 2022, 2023]), "yr")

If we naively plot these emissions, the problem is clearer.

In [6]:

Copied!

fig, ax = plt.subplots()
ax.plot(years_covid.m, emissions_covid.m, marker="o")
ax.grid()
fig, ax = plt.subplots()
ax.plot(years_covid.m, emissions_covid.m, marker="o")
ax.grid()

No description has been provided for this image

When you first look at this plot, everything seems fine. However, if you look more closely, you realise something: the way this data is plotted, it looks like emissions started to drop sharply in 2019, not 2020. Put another way, this makes it look like the COVID dip was centred around 1 Jan 2020, when we know it is centred more around July 2020.

Given we know that emissions data is generally the average of the emissions over the year, we can make this clearer with our plot. For example, the plotting below.

In [7]:

Copied!





fig, ax = plt.subplots()
x_vals = np.hstack([years_covid, years_covid[-1] + Q(1, "yr")]).m
y_vals = np.hstack([emissions_covid, emissions_covid[-1]]).m
ax.step(x_vals, y_vals, where="post", marker="o")
ax.grid()
fig, ax = plt.subplots()
x_vals = np.hstack([years_covid, years_covid[-1] + Q(1, "yr")]).m
y_vals = np.hstack([emissions_covid, emissions_covid[-1]]).m
ax.step(x_vals, y_vals, where="post", marker="o")
ax.grid()

As you can see, this is a bit of mucking around and we are currently just assuming that having a constant value over the year is the right choice, rather than actually knowing that to be the case.

The last key motivating issue is the question of integration. Recall the points we have.

In [8]:

Copied!

years
years

Out[8]:

Magnitude	[1850 1900 1950 2000 2010 2020 2030 2040 2050]
Units	yr

In [9]:

Copied!

emissions
emissions

Out[9]:

Magnitude	[ 0 1 3 5 10 10 9 7 6]
Units	gigatC/yr

It is clear that we have to consider the size of the timesteps in order to integrate the emissions. So, we want an API that makes that easy.

On top of this, the decision about whether to linearly interpolate between the emissions values or treat them as stepwise constant (i.e. assume that emissions are constant between the defining points) will have a big difference on the result, yet we do not have any information about what choice was intended based on the data. So, we want an API that solves this too.

The proposed solution¶

Our proposed API to solve this is the Timeseries class, along with the associated TimeseriesContinous and TimeseriesDiscrete classes.

Discrete representation¶

In the case of our COVID emissions example, we would capture the timeseries in its discrete form as shown below.

In [10]:

Copied!





covid_emissions = ct.TimeseriesDiscrete(
    name="co2_emissions",
    time_axis=ct.TimeAxis(years_covid),
    values_at_bounds=ct.ValuesAtBounds(emissions_covid),
)
covid_emissions
covid_emissions = ct.TimeseriesDiscrete(
    name="co2_emissions",
    time_axis=ct.TimeAxis(years_covid),
    values_at_bounds=ct.ValuesAtBounds(emissions_covid),
)
covid_emissions

Out[10]:

continuous_timeseries.TimeseriesDiscrete

name

co2_emissions

time_axis

bounds

Magnitude	[2018 2019 2020 2021 2022 2023]
Units	yr

values_at_bounds

values

Magnitude	[10.2 10.3 9.5 10.1 10.3 10.5]
Units	gigatC/yr

Continuous representation¶

To go to a continuous representation, we have to specify the interpolation option we want to use. For emissions, the most accurate choice is generally piecewise constant, because emissions are typically reported as total emissions for the year (i.e. the number is the average over the year, which isn't well captured by linear interpolation, at least not naively).

In [11]:

Copied!





covid_emissions_linear_interpolation = covid_emissions.to_continuous_timeseries(
    interpolation=ct.InterpolationOption.PiecewiseConstantPreviousLeftClosed
)
covid_emissions_linear_interpolation.plot(time_axis=covid_emissions.time_axis)
covid_emissions_linear_interpolation
covid_emissions_linear_interpolation = covid_emissions.to_continuous_timeseries(
    interpolation=ct.InterpolationOption.PiecewiseConstantPreviousLeftClosed
)
covid_emissions_linear_interpolation.plot(time_axis=covid_emissions.time_axis)
covid_emissions_linear_interpolation

Out[11]:

continuous_timeseries.TimeseriesContinuous

name	co2_emissions
time_units	yr
values_units	gigatC/yr
function	PPolyPiecewiseConstantPreviousLeftClosed(x=array([2018, 2019, 2020, 2021, 2022, 2023]), y=array([10.2, 10.3, 9.5, 10.1, 10.3, 10.5]))
domain	(2018 yr, 2023 yr)

We visualise the implications of different interpolation options below.

In [12]:

Copied!





fig, axs = plt.subplot_mosaic(
    [["piecewise_constant_previous"], ["piecewise_constant_next"], ["other"]],
    figsize=(12, 8),
)

discrete_points_plotted = []
for interp_option, ax, marker in (
    (
        ct.InterpolationOption.PiecewiseConstantPreviousLeftClosed,
        axs["piecewise_constant_previous"],
        "^",
    ),
    (
        ct.InterpolationOption.PiecewiseConstantPreviousLeftOpen,
        axs["piecewise_constant_previous"],
        "o",
    ),
    (
        ct.InterpolationOption.PiecewiseConstantNextLeftClosed,
        axs["piecewise_constant_next"],
        "^",
    ),
    (
        ct.InterpolationOption.PiecewiseConstantNextLeftOpen,
        axs["piecewise_constant_next"],
        "o",
    ),
    (ct.InterpolationOption.Linear, axs["other"], "x"),
):
    if ax not in discrete_points_plotted:
        covid_emissions.plot(
            ax=ax,
            marker="+",
            s=130,
            label="Input discrete points",
            color="red",
            zorder=3,
        )
        discrete_points_plotted.append(ax)

    continuous = covid_emissions.to_continuous_timeseries(interpolation=interp_option)
    continuous.plot(
        time_axis=covid_emissions.time_axis,
        ax=ax,
        alpha=0.7,
        label=f"{interp_option.name} interpolation",
    )
    ax.scatter(
        covid_emissions.time_axis.bounds,
        continuous.interpolate(covid_emissions.time_axis),
        marker=marker,
        s=130,
        alpha=0.4,
        label=f"{interp_option.name} at boundaries",
    )

for ax in axs.values():
    ax.legend(loc="center left", bbox_to_anchor=(1.05, 0.5))
    ax.grid()

fig.tight_layout()
fig, axs = plt.subplot_mosaic(
    [["piecewise_constant_previous"], ["piecewise_constant_next"], ["other"]],
    figsize=(12, 8),
)

discrete_points_plotted = []
for interp_option, ax, marker in (
    (
        ct.InterpolationOption.PiecewiseConstantPreviousLeftClosed,
        axs["piecewise_constant_previous"],
        "^",
    ),
    (
        ct.InterpolationOption.PiecewiseConstantPreviousLeftOpen,
        axs["piecewise_constant_previous"],
        "o",
    ),
    (
        ct.InterpolationOption.PiecewiseConstantNextLeftClosed,
        axs["piecewise_constant_next"],
        "^",
    ),
    (
        ct.InterpolationOption.PiecewiseConstantNextLeftOpen,
        axs["piecewise_constant_next"],
        "o",
    ),
    (ct.InterpolationOption.Linear, axs["other"], "x"),
):
    if ax not in discrete_points_plotted:
        covid_emissions.plot(
            ax=ax,
            marker="+",
            s=130,
            label="Input discrete points",
            color="red",
            zorder=3,
        )
        discrete_points_plotted.append(ax)

    continuous = covid_emissions.to_continuous_timeseries(interpolation=interp_option)
    continuous.plot(
        time_axis=covid_emissions.time_axis,
        ax=ax,
        alpha=0.7,
        label=f"{interp_option.name} interpolation",
    )
    ax.scatter(
        covid_emissions.time_axis.bounds,
        continuous.interpolate(covid_emissions.time_axis),
        marker=marker,
        s=130,
        alpha=0.4,
        label=f"{interp_option.name} at boundaries",
    )

for ax in axs.values():
    ax.legend(loc="center left", bbox_to_anchor=(1.05, 0.5))
    ax.grid()

fig.tight_layout()

/home/docs/checkouts/readthedocs.org/user_builds/continuous-timeseries/envs/latest/lib/python3.11/site-packages/continuous_timeseries/timeseries_discrete.py:196: InterpolationUpdateChangedValuesAtBoundsWarning: Using the interpolation PiecewiseConstantPreviousLeftOpen means that the y-values do not line up exactly with the x-values. In other words, in the output, y(x(1)) is not equal to the input y(1), y(x(2)) is not equal to the input y(2), y(x(n)) is not equal to the input y(n) etc. This may cause confusion. Either ignore this warning, suppress it (by passing `warn_if_values_at_bounds_could_confuse=False` or via Python's `warnings` module settings) or choose a different interpolation option.
  warnings.warn(msg, InterpolationUpdateChangedValuesAtBoundsWarning)
/home/docs/checkouts/readthedocs.org/user_builds/continuous-timeseries/envs/latest/lib/python3.11/site-packages/continuous_timeseries/timeseries_discrete.py:196: InterpolationUpdateChangedValuesAtBoundsWarning: Using the interpolation PiecewiseConstantNextLeftClosed means that the y-values do not line up exactly with the x-values. In other words, in the output, y(x(1)) is not equal to the input y(1), y(x(2)) is not equal to the input y(2), y(x(n)) is not equal to the input y(n) etc. This may cause confusion. Either ignore this warning, suppress it (by passing `warn_if_values_at_bounds_could_confuse=False` or via Python's `warnings` module settings) or choose a different interpolation option.
  warnings.warn(msg, InterpolationUpdateChangedValuesAtBoundsWarning)

These different interpolation options have an impact on integration too, which is often what we are most interested in when thinking about CO$_2$ emissions (cumulative CO$_2$ emissions are more useful than instantaneous in many cases).

In [13]:

Copied!





fig, axes = plt.subplots(nrows=3, sharex=True, figsize=(12, 8))

continuous_linear_integral_values = (
    covid_emissions.to_continuous_timeseries(
        interpolation=ct.InterpolationOption.Linear
    )
    .integrate(Q(0, "GtC"))
    .interpolate(covid_emissions.time_axis)
)

for interp_option in (
    (ct.InterpolationOption.PiecewiseConstantPreviousLeftClosed),
    (ct.InterpolationOption.PiecewiseConstantPreviousLeftOpen),
    (ct.InterpolationOption.PiecewiseConstantNextLeftClosed),
    (ct.InterpolationOption.PiecewiseConstantNextLeftOpen),
    (ct.InterpolationOption.Linear),
):
    continuous = covid_emissions.to_continuous_timeseries(interpolation=interp_option)
    continuous.plot(
        time_axis=covid_emissions.time_axis,
        ax=axes[0],
        alpha=0.7,
        label=interp_option.name,
    )

    integral = continuous.integrate(Q(0, "GtC"))
    integral.plot(
        time_axis=covid_emissions.time_axis,
        ax=axes[1],
        alpha=0.7,
        label=interp_option.name,
    )

    axes[2].plot(
        covid_emissions.time_axis.bounds,
        integral.interpolate(covid_emissions.time_axis)
        - continuous_linear_integral_values,
        label=interp_option.name,
    )

axes[0].set_title("Continuous")
axes[1].set_title("Integral")
axes[2].set_title("Difference from integral of linear interpolation")
axes[2].yaxis.set_units(UR.Unit("MtCO2"))
for ax in axes:
    ax.legend(loc="center left", bbox_to_anchor=(1.05, 0.5))
    ax.grid()

fig.tight_layout()
fig, axes = plt.subplots(nrows=3, sharex=True, figsize=(12, 8))

continuous_linear_integral_values = (
    covid_emissions.to_continuous_timeseries(
        interpolation=ct.InterpolationOption.Linear
    )
    .integrate(Q(0, "GtC"))
    .interpolate(covid_emissions.time_axis)
)

for interp_option in (
    (ct.InterpolationOption.PiecewiseConstantPreviousLeftClosed),
    (ct.InterpolationOption.PiecewiseConstantPreviousLeftOpen),
    (ct.InterpolationOption.PiecewiseConstantNextLeftClosed),
    (ct.InterpolationOption.PiecewiseConstantNextLeftOpen),
    (ct.InterpolationOption.Linear),
):
    continuous = covid_emissions.to_continuous_timeseries(interpolation=interp_option)
    continuous.plot(
        time_axis=covid_emissions.time_axis,
        ax=axes[0],
        alpha=0.7,
        label=interp_option.name,
    )

    integral = continuous.integrate(Q(0, "GtC"))
    integral.plot(
        time_axis=covid_emissions.time_axis,
        ax=axes[1],
        alpha=0.7,
        label=interp_option.name,
    )

    axes[2].plot(
        covid_emissions.time_axis.bounds,
        integral.interpolate(covid_emissions.time_axis)
        - continuous_linear_integral_values,
        label=interp_option.name,
    )

axes[0].set_title("Continuous")
axes[1].set_title("Integral")
axes[2].set_title("Difference from integral of linear interpolation")
axes[2].yaxis.set_units(UR.Unit("MtCO2"))
for ax in axes:
    ax.legend(loc="center left", bbox_to_anchor=(1.05, 0.5))
    ax.grid()

fig.tight_layout()

/home/docs/checkouts/readthedocs.org/user_builds/continuous-timeseries/envs/latest/lib/python3.11/site-packages/continuous_timeseries/timeseries_discrete.py:196: InterpolationUpdateChangedValuesAtBoundsWarning: Using the interpolation PiecewiseConstantPreviousLeftOpen means that the y-values do not line up exactly with the x-values. In other words, in the output, y(x(1)) is not equal to the input y(1), y(x(2)) is not equal to the input y(2), y(x(n)) is not equal to the input y(n) etc. This may cause confusion. Either ignore this warning, suppress it (by passing `warn_if_values_at_bounds_could_confuse=False` or via Python's `warnings` module settings) or choose a different interpolation option.
  warnings.warn(msg, InterpolationUpdateChangedValuesAtBoundsWarning)
/home/docs/checkouts/readthedocs.org/user_builds/continuous-timeseries/envs/latest/lib/python3.11/site-packages/continuous_timeseries/timeseries_discrete.py:196: InterpolationUpdateChangedValuesAtBoundsWarning: Using the interpolation PiecewiseConstantNextLeftClosed means that the y-values do not line up exactly with the x-values. In other words, in the output, y(x(1)) is not equal to the input y(1), y(x(2)) is not equal to the input y(2), y(x(n)) is not equal to the input y(n) etc. This may cause confusion. Either ignore this warning, suppress it (by passing `warn_if_values_at_bounds_could_confuse=False` or via Python's `warnings` module settings) or choose a different interpolation option.
  warnings.warn(msg, InterpolationUpdateChangedValuesAtBoundsWarning)

Timeseries representation¶

In general, users won't interact with TimeseriesContinous and TimeseriesDiscrete directly. Instead, they will mostly use the Timeseries class. This offers the same behaviour as above, with a simpler API.

In [14]:

Copied!





ts = ct.Timeseries.from_arrays(
    x=years_covid,
    y=emissions_covid,
    interpolation=ct.InterpolationOption.PiecewiseConstantPreviousLeftClosed,
    name="covid_emissions",
)
ts
ts = ct.Timeseries.from_arrays(
    x=years_covid,
    y=emissions_covid,
    interpolation=ct.InterpolationOption.PiecewiseConstantPreviousLeftClosed,
    name="covid_emissions",
)
ts

Out[14]:

continuous_timeseries.Timeseries

time_axis

bounds

Magnitude	[2018 2019 2020 2021 2022 2023]
Units	yr

timeseries_continuous

name	covid_emissions
time_units	yr
values_units	gigatC/yr
function	PPolyPiecewiseConstantPreviousLeftClosed(x=array([2018, 2019, 2020, 2021, 2022, 2023]), y=array([10.2, 10.3, 9.5, 10.1, 10.3, 10.5]))
domain	(2018 yr, 2023 yr)

In [15]:

Copied!

ts.plot()
ts.plot()

Out[15]:

<Axes: xlabel='yr', ylabel='gigatC/yr'>

In [16]:

Copied!

ts.integrate(Q(100, "GtC")).plot()
ts.integrate(Q(100, "GtC")).plot()

Out[16]:

<Axes: xlabel='yr', ylabel='gigatC'>

This API also solves our integration issue. If we pass in a timeseries with time steps of different sizes, the integral is nonetheless evaluated correctly (making it easy to avoid one of the most common gotcha's in climate science/scenario development).

In [17]:

Copied!





ts_varying_step_size = ct.Timeseries.from_arrays(
    x=Q([1750, 1800, 1825, 1900, 2000], "yr"),
    y=Q([0.0, 0.2, 0.3, 1.2, 10.3], "GtCO2 / yr"),
    interpolation=ct.InterpolationOption.Linear,
    name="varying_step_size",
)
ts_varying_step_size
ts_varying_step_size = ct.Timeseries.from_arrays(
    x=Q([1750, 1800, 1825, 1900, 2000], "yr"),
    y=Q([0.0, 0.2, 0.3, 1.2, 10.3], "GtCO2 / yr"),
    interpolation=ct.InterpolationOption.Linear,
    name="varying_step_size",
)
ts_varying_step_size

Out[17]:

continuous_timeseries.Timeseries

time_axis

bounds

Magnitude	[1750 1800 1825 1900 2000]
Units	yr

timeseries_continuous

name

varying_step_size

time_units

yr

values_units

gigatCO2/yr

function

ppoly

order	1
c	[[0.004 0.004 0.012 0.091] [0. 0.2 0.3 1.2 ]]
x	[1750. 1800. 1825. 1900. 2000.]

domain

(1750 yr, 2000 yr)

In [18]:

Copied!

ax = ts_varying_step_size.plot()
ax.grid()
ax = ts_varying_step_size.plot()
ax.grid()

In [19]:

Copied!

ax = ts_varying_step_size.integrate(Q(0, "GtC")).plot()
ax.grid()
ax = ts_varying_step_size.integrate(Q(0, "GtC")).plot()
ax.grid()

The timeseries representation also makes it trivial to do integral-preserving interpolation on the time axis of your choice.

In [20]:

Copied!





decadal_steps = ct.TimeAxis(
    np.arange(
        ts_varying_step_size.time_axis.bounds.m.min(),
        ts_varying_step_size.time_axis.bounds.m.max() + 0.01,
        10,
    )
    * ts_varying_step_size.time_axis.bounds.u
)

fig, axes = plt.subplots(nrows=2, sharex=True, figsize=(12, 8))
for label, updated_time_axis, updated_interpolation in (
    ("starting_point", None, None),
    (
        "same time - piecewise constant next left-closed",
        ts_varying_step_size.time_axis,
        ct.InterpolationOption.PiecewiseConstantNextLeftClosed,
    ),
    (
        "decadal steps - piecewise constant next left-closed",
        decadal_steps,
        ct.InterpolationOption.PiecewiseConstantNextLeftClosed,
    ),
):
    if updated_time_axis is not None:
        updated_time = ts_varying_step_size.interpolate(updated_time_axis)
    else:
        updated_time = ts_varying_step_size

    if updated_interpolation is not None:
        updated_time_and_interp = updated_time.update_interpolation_integral_preserving(
            updated_interpolation
        )
    else:
        updated_time_and_interp = updated_time

    updated_time_and_interp.plot(ax=axes[0], continuous_plot_kwargs=dict(label=label))

    updated_time_and_interp.integrate(Q(0, "GtC")).plot(
        ax=axes[1], continuous_plot_kwargs=dict(label=label)
    )

axes[0].set_title("Emissions")
axes[1].set_title("Cumulative emissions")
for ax in axes:
    ax.grid()
    ax.legend()

fig.tight_layout()
decadal_steps = ct.TimeAxis(
    np.arange(
        ts_varying_step_size.time_axis.bounds.m.min(),
        ts_varying_step_size.time_axis.bounds.m.max() + 0.01,
        10,
    )
    * ts_varying_step_size.time_axis.bounds.u
)

fig, axes = plt.subplots(nrows=2, sharex=True, figsize=(12, 8))
for label, updated_time_axis, updated_interpolation in (
    ("starting_point", None, None),
    (
        "same time - piecewise constant next left-closed",
        ts_varying_step_size.time_axis,
        ct.InterpolationOption.PiecewiseConstantNextLeftClosed,
    ),
    (
        "decadal steps - piecewise constant next left-closed",
        decadal_steps,
        ct.InterpolationOption.PiecewiseConstantNextLeftClosed,
    ),
):
    if updated_time_axis is not None:
        updated_time = ts_varying_step_size.interpolate(updated_time_axis)
    else:
        updated_time = ts_varying_step_size

    if updated_interpolation is not None:
        updated_time_and_interp = updated_time.update_interpolation_integral_preserving(
            updated_interpolation
        )
    else:
        updated_time_and_interp = updated_time

    updated_time_and_interp.plot(ax=axes[0], continuous_plot_kwargs=dict(label=label))

    updated_time_and_interp.integrate(Q(0, "GtC")).plot(
        ax=axes[1], continuous_plot_kwargs=dict(label=label)
    )

axes[0].set_title("Emissions")
axes[1].set_title("Cumulative emissions")
for ax in axes:
    ax.grid()
    ax.legend()

fig.tight_layout()

Summary¶

This notebook has explained why we have built this API. In short, it solves four key problems:

Removing the ambiguity about what to do between discrete time points (which removes ambiguity about the value of integrals and derivatives).
Making interpolation, integration and differentation trivial, even on time axes with uneven steps.
Making plotting in a way that actually represents the meaning of the data trivial.
Doing all of the above with unit awareness, to remove that entire class of headaches and gotchas.

Alongside solving these issues, it also has the added bonus that it makes it easy to integral-preserving interpolation, which can be very helpful for reporting over different time periods with different conventions.