om_code.omfitderiv.fitderiv
- class fitderiv(t, d, cvfn='sqexp', logs=True, noruns=5, noinits=100, exitearly=False, bd=False, empirical_errors=False, optmethod='l_bfgs_b', nosamples=100, stats=True, statnames=False, showstaterrors=True, warn=False, linalgmax=3, iskip=False)
Bases: object
Smooths data and estimates the time derivative of the data using Gaussian processes.
Summary statistics - the maximal time derivative, the time at which the maximal time derivative occurs, the timescale found from inverting the maximal time derivative, the maximal value of the smoothed data, and the lag time (the time when the tangent from the point with the maximal time derivative crosses a line parallel to the time-axis that passes through the first data point) - are found and their errors estimated using bootstrapping. All statistics can be postfixed by ‘ err’ to find this error.
A summary statistic is given as the median of a distribution of the statistic calculated from time series sampled from the optimal Gaussian process. Its error is estimated as the interquartile range of this distribution.
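The lag-time construction and the median and interquartile-range summary described above can be sketched in plain Python. This is an illustration only, not the package's implementation: the helper names `lag_time` and `summarise`, the toy numbers, and the choice of the "inclusive" quantile method are all assumptions.

```python
import statistics


def lag_time(t, f, df):
    """Time at which the tangent at the point of maximal derivative
    crosses the horizontal line through the first data point."""
    i = max(range(len(df)), key=lambda k: df[k])  # index of maximal derivative
    # Tangent: y = f[i] + df[i] * (time - t[i]); solve y = f[0] for time.
    return t[i] - (f[i] - f[0]) / df[i]


def summarise(values):
    """Median of a sampled statistic and its interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q3 - q1


# Hypothetical smoothed curve and derivative on five time points.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
f = [0.0, 0.1, 1.0, 2.0, 2.2]
df = [0.05, 0.5, 1.0, 0.6, 0.1]
lag = lag_time(t, f, df)

# Hypothetical values of one statistic computed from five sampled curves.
median, error = summarise([1.0, 2.0, 3.0, 4.0, 5.0])
```

In practice the list passed to `summarise` would hold one value of the statistic per time series sampled from the optimal Gaussian process.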
After a successful optimisation, the following attributes are generated:
- t: array
The times specified as input.
- d: array
The data specified as input.
- f: array
The mean of the Gaussian process with the optimal hyperparameters at each time point.
- fvar: array
The variance of the optimal Gaussian process at each time point.
- df: array
The inferred first time-derivative.
- dfvar: array
The inferred variance of the first time-derivative.
- ddf: array
The inferred second time-derivative.
- ddfvar: array
The inferred variance of the second time-derivative.
- ds: dictionary
The summary statistics and their estimated errors.
Examples
A typical workflow is:
>>> from om_code.omfitderiv import fitderiv
>>> q = fitderiv(t, od)
>>> q.plotfit('df')
or potentially
>>> import matplotlib.pyplot as plt
>>> plt.plot(q.t, q.d, 'r.', q.t, q.f, 'b')
Methods
- calculatestats([nosamples, statnames, ...])
Calculates the statistics from the smoothed data and the inferred time derivative.
- fitderivsample(nosamples[, newt])
Generates sample values for the latent function and its first two derivatives (returned as a tuple).
- plotfit([char, errorfac, xlabel, ylabel, ...])
Plots either the data and the mean of the optimal Gaussian process or the inferred time derivatives.
- printstats([showerrors, performprint])
Creates and potentially displays a dictionary of the statistics calculated from the smoothed data and its inferred time derivatives.
- run(cvfn, ta, da, ma, noruns, noinits, ...)
Instantiates and runs a Gaussian process.
Runs a Gaussian process to both smooth time-series data and estimate its time-derivatives.
- Parameters
- t: array
The time points.
- d: array
The data corresponding to the time points, with any replicates given as columns.
- cvfn: string
The type of kernel function for the Gaussian process: ‘sqexp’ (squared exponential), ‘matern’ (Matern with nu = 5/2), or ‘nn’ (neural network).
- logs: boolean
If True, the Gaussian process is used to smooth the natural logarithm of the data and the time-derivative is therefore of the logarithm of the data.
- noruns: integer, optional
The number of attempts to be made at optimising the kernel’s hyperparameters.
- noinits: integer, optional
The number of random attempts made to find good initial choices for the hyperparameters before running their optimisation.
- exitearly: boolean, optional
If True, stop at the first successful attempt at optimising the hyperparameters; otherwise, take the best choice from all successful optimisations.
- bd: dictionary, optional
Specifies the limits on the hyperparameters for the Gaussian process, given as powers of ten. For example, bd= {0: [-1, 4], 2: [2, 6]} confines the first hyperparameter to lie between 1e-1 and 1e4 and the third hyperparameter to lie between 1e2 and 1e6.
- empirical_errors: boolean, optional
If True, measurement errors are empirically estimated by the variance across replicates at each time point. If False, the variance of the measurement error is assumed to be the same for all time points and its magnitude is a hyperparameter that is optimised.
- optmethod: string, optional
The algorithm used to optimise the hyperparameters, either ‘l_bfgs_b’ or ‘tnc’.
- nosamples: integer, optional
The number of bootstrap samples taken to estimate errors in statistics.
- stats: boolean, optional
If True, calculate summary statistics for both the smoothed data and the inferred time derivative.
- statnames: list of strings
To customise the names of the statistics. The default names are: ‘max df’ for the maximal time derivative; ‘time of max df’ for the time at which the maximal time derivative occurs; ‘inverse max df’ for the timescale found from inverting the maximal time derivative; ‘max f’ for the maximal value of the smoothed data; ‘lag time’ for the lag time defined as the time when the tangent from the point with the maximal time derivative crosses a line parallel to the time-axis that passes through the first data point.
- showstaterrors: boolean, optional
If True, display estimated errors for the statistics.
- warn: boolean, optional
If False, warnings created by covariance matrices that are not positive semi-definite are suppressed.
- linalgmax: integer, optional
The number of times that errors raised by the underlying linear algebra modules, caused by poor choices of the hyperparameters during the optimisation, should be ignored.
- iskip: integer, optional
If non-zero, only every iskip’th data point is used to increase speed.
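For the `bd` parameter above, the example reads the limits as exponents of ten. A minimal sketch of that conversion follows; `bounds_from_exponents` is a hypothetical helper written for illustration and is not part of the package.

```python
def bounds_from_exponents(bd):
    """Convert {index: [low_exponent, high_exponent]} into actual limits."""
    return {
        index: (10.0 ** low, 10.0 ** high)
        for index, (low, high) in bd.items()
    }


# The dictionary from the parameter description: keys index the hyperparameters.
bd = {0: [-1, 4], 2: [2, 6]}
limits = bounds_from_exponents(bd)
# limits[0] spans 1e-1 to 1e4; limits[2] spans 1e2 to 1e6.
```

Hyperparameters whose index is missing from `bd` (here, index 1) are left out of the result.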
- calculatestats(nosamples=100, statnames=None, showerrors=True)
Calculates the statistics from the smoothed data and the inferred time derivative. The default names are ‘max df’, ‘time of max df’, ‘inverse max grad’, ‘max f’, and ‘lag time’.
- Parameters
- nosamples: integer
The number of bootstrap samples used to estimate errors.
- statnames: list of strings, optional
A list of alternative names for the statistics.
- showerrors: boolean, optional
If True, display the estimated errors.
- fitderivsample(nosamples, newt=None)
Generate sample values for the latent function and its first two derivatives (returned as a tuple).
- Parameters
- nosamples: integer
The number of samples.
- newt: array, optional
Time points for which the samples should be made. If None, the original time points are used.
- Returns
- samples: a tuple of arrays
The first element of the tuple gives samples of the latent function; the second element gives samples of the first time derivative; and the third element gives samples of the second time derivative.
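One way to use the returned tuple is to compute a custom statistic for each sample and then summarise its distribution. The sketch below uses small made-up lists rather than real output, and assumes that each array has one row per time point and one column per sample; that layout is an assumption to check against the arrays actually returned. The helpers `column` and `time_of_max` are hypothetical.

```python
import statistics


def column(rows, j):
    """Extract column j from a list of rows."""
    return [row[j] for row in rows]


def time_of_max(t, curve):
    """Time point at which a single sampled curve is largest."""
    return t[max(range(len(curve)), key=lambda k: curve[k])]


# Stand-in for the second tuple element (samples of the first derivative):
# three time points (rows) by three samples (columns). Values are made up.
t = [0.0, 1.0, 2.0]
df_samples = [
    [0.1, 0.2, 0.9],
    [0.8, 0.7, 0.3],
    [0.2, 0.9, 0.1],
]
nosamples = len(df_samples[0])

# One value of the statistic per sampled curve, then its median.
times = [time_of_max(t, column(df_samples, j)) for j in range(nosamples)]
typical_time = statistics.median(times)
```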
- plotfit(char='f', errorfac=1, xlabel='time', ylabel=False, figtitle=False)
Plots either the data and the mean of the optimal Gaussian process or the inferred time derivatives.
- Parameters
- char: string
The variable to plot: ‘f’, ‘df’, or ‘ddf’.
- errorfac: float, optional
The size of the error bars is errorfac times the standard deviation of the optimal Gaussian process.
- xlabel: string, optional
A label for the x-axis.
- ylabel: string, optional
A label for the y-axis.
- figtitle: string, optional
A title for the figure.
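The error bars scaled by `errorfac` can be reproduced from a mean and a variance array, such as the `f` and `fvar` attributes. A minimal sketch with made-up numbers follows; `error_band` is a hypothetical helper, not a method of the class, and how `plotfit` draws the band internally is not shown here.

```python
import math


def error_band(mean, var, errorfac=1.0):
    """Lower and upper limits at errorfac standard deviations from the mean."""
    lower = [m - errorfac * math.sqrt(v) for m, v in zip(mean, var)]
    upper = [m + errorfac * math.sqrt(v) for m, v in zip(mean, var)]
    return lower, upper


f = [1.0, 2.0, 3.0]        # stand-in for the Gaussian process mean
fvar = [0.04, 0.09, 0.16]  # stand-in for its variance
lower, upper = error_band(f, fvar, errorfac=2.0)
```

The same calculation applies to `df` with `dfvar` and to `ddf` with `ddfvar`.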
- printstats(showerrors=True, performprint=True)
Creates and potentially displays a dictionary of the statistics calculated from the smoothed data and its inferred time-derivatives.
- Parameters
- showerrors: boolean, optional
If True, display the errors.
- performprint: boolean, optional
If True, display the statistics.
- Returns
- statd: dictionary
The statistics and their errors.
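Assuming each error sits under the name of its statistic postfixed by ‘ err’, as described for the summary statistics, values and errors can be paired up. The dictionary below is a made-up stand-in for the returned `statd`, and `pair_with_errors` is a hypothetical helper.

```python
def pair_with_errors(statd, suffix=" err"):
    """Map each statistic name to (value, error); error is None if absent."""
    return {
        name: (value, statd.get(name + suffix))
        for name, value in statd.items()
        if not name.endswith(suffix)
    }


# Made-up numbers, for illustration only.
statd = {"max df": 0.5, "max df err": 0.04, "lag time": 1.2, "lag time err": 0.3}
paired = pair_with_errors(statd)
```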