om_code.omfitderiv.fitderiv

class fitderiv(t, d, cvfn='sqexp', logs=True, noruns=5, noinits=100, exitearly=False, bd=False, empirical_errors=False, optmethod='l_bfgs_b', nosamples=100, stats=True, statnames=False, showstaterrors=True, warn=False, linalgmax=3, iskip=False)

Bases: object

Smooths data and estimates the time derivative of the data using Gaussian processes.

Summary statistics - the maximal time derivative, the time at which the maximal time derivative occurs, the timescale found from inverting the maximal time derivative, the maximal value of the smoothed data, and the lag time (the time when the tangent from the point with the maximal time derivative crosses a line parallel to the time-axis that passes through the first data point) - are found and their errors estimated using bootstrapping. Appending ‘ err’ to the name of any statistic gives the name of its error.
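The definitions above can be sketched for a single smoothed curve. This is only an illustration, not the package's implementation: the function name summary_statistics and the example values are hypothetical, and the first smoothed value f[0] is assumed to stand in for the first data point.

```python
def summary_statistics(t, f, df):
    """Return the five summary statistics for one smoothed curve.

    t, f and df are equal-length sequences of times, smoothed values and
    first time derivatives (hypothetical inputs for illustration).
    """
    # Index of the maximal time derivative.
    i = max(range(len(df)), key=lambda k: df[k])
    max_df = df[i]
    # The tangent at (t[i], f[i]) has slope max_df. It crosses the
    # horizontal line y = f[0] at the lag time (assumption: f[0] plays
    # the role of the first data point).
    lag_time = t[i] - (f[i] - f[0]) / max_df
    return {
        "max df": max_df,
        "time of max df": t[i],
        "inverse max df": 1 / max_df,
        "max f": max(f),
        "lag time": lag_time,
    }


# Hypothetical curve with its steepest point at t = 2.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
f = [0.0, 0.1, 1.0, 1.9, 2.0]
df = [0.0, 0.5, 1.0, 0.5, 0.0]
stats = summary_statistics(t, f, df)
```

With these values the tangent at t = 2 has slope 1 and reaches the level of the first point one time unit earlier, so the lag time is 1.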

A summary statistic is given as the median of a distribution of the statistic calculated from time series sampled from the optimal Gaussian process. Its error is estimated as the interquartile range of this distribution.
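As a rough sketch of this summary step, the median and interquartile range of a set of sampled values can be computed with the standard library. The sample values and the helper median_and_iqr are hypothetical, and the quantile method used by the package itself may differ.

```python
import statistics


def median_and_iqr(values):
    """Return the median and interquartile range of a list of numbers."""
    # method="inclusive" is an assumption; other quantile conventions
    # give slightly different quartiles for small samples.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q3 - q1


# Hypothetical values of 'max df', one per sampled time series.
max_df_samples = [1, 2, 3, 4, 5, 6, 7, 8, 9]
median, iqr = median_and_iqr(max_df_samples)

# Stored in the same naming pattern as the statistics dictionary.
ds = {"max df": median, "max df err": iqr}
```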

After a successful optimisation, the following attributes are generated:

t: array

The times specified as input.

d: array

The data specified as input.

f: array

The mean of the Gaussian process with the optimal hyperparameters at each time point.

fvar: array

The variance of the optimal Gaussian process at each time point.

df: array

The inferred first time-derivative.

dfvar: array

The inferred variance of the first time-derivative.

ddf: array

The inferred second time-derivative.

ddfvar: array

The inferred variance of the second time-derivative.

ds: dictionary

The summary statistics and their estimated errors.

Examples

A typical workflow is:

>>> from om_code.omfitderiv import fitderiv
>>> q = fitderiv(t, od)
>>> q.plotfit('df')

or potentially

>>> import matplotlib.pyplot as plt
>>> plt.plot(q.t, q.d, 'r.', q.t, q.f, 'b')

Methods

calculatestats([nosamples, statnames, ...])

Calculates the statistics from the smoothed data and the inferred time derivative.

fitderivsample(nosamples[, newt])

Generate sample values for the latent function and its first two derivatives (returned as a tuple).

plotfit([char, errorfac, xlabel, ylabel, ...])

Plots either the data and the mean of the optimal Gaussian process or the inferred time derivatives.

printstats([showerrors, performprint])

Creates and potentially displays a dictionary of the statistics calculated from the smoothed data and its inferred time-derivatives.

run(cvfn, ta, da, ma, noruns, noinits, ...)

Instantiates and runs the Gaussian process.

Runs a Gaussian process to both smooth time-series data and estimate its time-derivatives.

Parameters
t: array

The time points.

d: array

The data corresponding to the time points with any replicates given as columns.

cvfn: string

The type of kernel function for the Gaussian process: either ‘sqexp’ (squared exponential), ‘matern’ (Matern with nu= 5/2), or ‘nn’ (neural network).

logs: boolean

If True, the Gaussian process is used to smooth the natural logarithm of the data and the time-derivative is therefore of the logarithm of the data.

noruns: integer, optional

The number of attempts to be made at optimising the kernel’s hyperparameters.

noinits: integer, optional

The number of random attempts made to find good initial choices for the hyperparameters before running their optimisation.

exitearly: boolean, optional

If True, stop at the first successful attempt at optimising the hyperparameters; otherwise take the best choice from all successful optimisations.

bd: dictionary, optional

Specifies the limits on the hyperparameters for the Gaussian process as powers of ten. For example, bd= {0: [-1, 4], 2: [2, 6]} confines the first hyperparameter to be between 1e-1 and 1e4 and the third hyperparameter to be between 1e2 and 1e6.

empirical_errors: boolean, optional

If True, measurement errors are empirically estimated by the variance across replicates at each time point. If False, the variance of the measurement error is assumed to be the same for all time points and its magnitude is a hyperparameter that is optimised.

optmethod: string, optional

The algorithm used to optimise the hyperparameters, either ‘l_bfgs_b’ or ‘tnc’.

nosamples: integer, optional

The number of bootstrap samples taken to estimate errors in statistics.

stats: boolean, optional

If True, calculate summary statistics for both the smoothed data and the inferred time-derivative.

statnames: list of strings

To customise the names of the statistics. The default names are: ‘max df’ for the maximal time derivative; ‘time of max df’ for the time at which the maximal time derivative occurs; ‘inverse max df’ for the timescale found from inverting the maximal time derivative; ‘max f’ for the maximal value of the smoothed data; ‘lag time’ for the lag time defined as the time when the tangent from the point with the maximal time derivative crosses a line parallel to the time-axis that passes through the first data point.

showstaterrors: boolean, optional

If True, display estimated errors for the statistics.

warn: boolean, optional

If False, warnings created by covariance matrices that are not positive semi-definite are suppressed.

linalgmax: integer, optional

The number of times that errors raised by the underlying linear algebra modules during the optimisation, caused by poor choices of the hyperparameters, should be ignored.

iskip: integer, optional

If non-zero, only every iskip’th data point is used to increase speed.
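A few of the parameters above can be read more concretely with a small sketch. It only illustrates the descriptions given here and is not the package's code: the data are invented, the sample variance is an assumed choice of estimator, and plain slicing is an assumed reading of how iskip thins the data.

```python
import statistics

# bd: limits are given as powers of ten, keyed by hyperparameter index.
bd = {0: [-1, 4], 2: [2, 6]}
bounds = {index: (10.0 ** lo, 10.0 ** hi) for index, (lo, hi) in bd.items()}

# d: one row per time point, one column per replicate (hypothetical data).
t = [0.0, 1.0, 2.0, 3.0]
d = [
    [1.0, 1.2, 0.8],
    [2.1, 1.9, 2.0],
    [4.0, 4.4, 3.6],
    [8.2, 7.8, 8.0],
]

# empirical_errors=True: a measurement variance for each time point,
# estimated across the replicates (sample variance is an assumption).
empirical_var = [statistics.variance(row) for row in d]

# iskip=2: keep every second time point and its replicates.
iskip = 2
t_thinned = t[::iskip]
d_thinned = d[::iskip]
```

Here bounds[2] is (100.0, 1000000.0), matching the range 1e2 to 1e6 quoted for the third hyperparameter.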


__init__(t, d, cvfn='sqexp', logs=True, noruns=5, noinits=100, exitearly=False, bd=False, empirical_errors=False, optmethod='l_bfgs_b', nosamples=100, stats=True, statnames=False, showstaterrors=True, warn=False, linalgmax=3, iskip=False)

Runs a Gaussian process to both smooth time-series data and estimate its time-derivatives.
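The roles of noruns, noinits and exitearly can be pictured with a generic multi-start optimisation. The sketch below is an assumption-laden illustration rather than the package's algorithm: step_halving is a stand-in for the real optimisers (‘l_bfgs_b’ or ‘tnc’), the one-parameter objective is a hypothetical substitute for the function being optimised, and the search range of -10 to 10 is arbitrary.

```python
import random


def step_halving(objective, x, step=1.0, tol=1e-8):
    """Stand-in local optimiser for one parameter (not l_bfgs_b or tnc)."""
    while step > tol:
        for candidate in (x - step, x + step):
            if objective(candidate) < objective(x):
                x = candidate
                break
        else:
            step /= 2  # neither neighbour is better, so refine the step
    return x


def multistart(objective, noruns=5, noinits=100, exitearly=False, seed=0):
    """Return the best (parameter, objective value) pair over the runs."""
    rng = random.Random(seed)
    best = None
    for _ in range(noruns):
        # Random search for a promising starting point.
        starts = [rng.uniform(-10, 10) for _ in range(noinits)]
        start = min(starts, key=objective)
        x = step_halving(objective, start)
        if best is None or objective(x) < best[1]:
            best = (x, objective(x))
        if exitearly:
            break  # accept the first successful run
    return best


def objective(x):
    """Hypothetical objective with its minimum at x = 2."""
    return (x - 2.0) ** 2


x_best, value_best = multistart(objective)
x_first, _ = multistart(objective, exitearly=True)
```

With exitearly=True only one run is made, which is faster but gives up the comparison across runs.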


calculatestats(nosamples=100, statnames=None, showerrors=True)

Calculates the statistics from the smoothed data and the inferred time derivative. The default names are ‘max df’, ‘time of max df’, ‘inverse max df’, ‘max f’, and ‘lag time’.

Parameters
nosamples: integer

The number of bootstrap samples used to estimate errors.

statnames: list of strings, optional

A list of alternative names for the statistics.

showerrors: boolean, optional

If True, display the estimated errors.

fitderivsample(nosamples, newt=None)

Generate sample values for the latent function and its first two derivatives (returned as a tuple).

Parameters
nosamples: integer

The number of samples.

newt: array, optional

Time points for which the samples should be made. If None, the original time points are used.

Returns
samples: a tuple of arrays

The first element of the tuple gives samples of the latent function; the second element gives samples of the first time derivative; and the third element gives samples of the second time derivative.

plotfit(char='f', errorfac=1, xlabel='time', ylabel=False, figtitle=False)

Plots either the data and the mean of the optimal Gaussian process or the inferred time derivatives.

Parameters
char: string

The variable to plot: either ‘f’, ‘df’, or ‘ddf’.

errorfac: float, optional

The size of the errorbars is errorfac times the standard deviation of the optimal Gaussian process.

xlabel: string, optional

A label for the x-axis.

ylabel: string, optional

A label for the y-axis.

figtitle: string, optional

A title for the figure.
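The effect of errorfac can be sketched as a band around the mean whose half-width is errorfac times the standard deviation. The helper error_band and the numbers below are hypothetical; only the relation between errorfac, the variance and the band comes from the description above.

```python
import math


def error_band(mean, var, errorfac=1):
    """Return lower and upper curves errorfac standard deviations from the mean."""
    lower = [m - errorfac * math.sqrt(v) for m, v in zip(mean, var)]
    upper = [m + errorfac * math.sqrt(v) for m, v in zip(mean, var)]
    return lower, upper


# Hypothetical mean and variance, in the style of the f and fvar attributes.
f = [1.0, 2.0, 3.0]
fvar = [0.25, 1.0, 0.0]
lower, upper = error_band(f, fvar, errorfac=2)
```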

printstats(showerrors=True, performprint=True)

Creates and potentially displays a dictionary of the statistics calculated from the smoothed data and its inferred time-derivatives.

Parameters
showerrors: boolean, optional

If True, display the errors.

performprint: boolean, optional

If True, display the statistics.

Returns
statd: dictionary

The statistics and their errors.

run(cvfn, ta, da, ma, noruns, noinits, exitearly, optmethod, stats, nosamples, statnames, showstaterrors)

Instantiates and runs the Gaussian process.