Using omniplate to analyse data from plate readers¶
The software corrects fluorescence for autofluorescence, calculates the fluorescence per cell, and estimates the growth rate and other growth characteristics.
We recommend running IPython via a Jupyter notebook or a terminal and then importing omniplate.
Getting started¶
Loading the software
Getting help
How the data is stored¶
The dataframes
Working with a subset of the data
Restricting the range of time to be analysed
Checking the data is as expected¶
Checking the data
Checking the contents of the wells
Ignoring wells
Analysing OD data¶
Analysing OD data: correcting for non-linearities and for the media
Analysing OD data: plotting
Analysing OD data: estimating growth rates
Troubleshooting estimating growth rates
Finding the local maximum growth rate
Analysing fluorescence data¶
Correcting autofluorescence: GFP
Correcting autofluorescence: mCherry
Checking which corrections have been performed
Estimating the time-derivative of the fluorescence
Specialising to mid-log, or exponential, phase¶
Quantifying the behaviour during mid-log growth
Making life easier¶
Saving figures
Extracting numerical values from a column
Getting a smaller dataframe for plotting directly
Exporting and importing the dataframes¶
Exporting and importing the dataframes
An example of processing the data to create a new column¶
Analysing multiple data sets simultaneously¶
Loading and processing more than one data set
Renaming conditions and strains
Averaging over experiments
Importing processed data for more than one experiment
We first import some standard packages, although they are only necessary if you wish to develop additional analyses.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline
import seaborn as sns
# makes figures look better in Jupyter
sns.set_context('talk')
Loading the software¶
%run import_local # ignore this command
import omniplate as om
print(om.__version__)
2.0
There are two ways of starting the software.
You can start without choosing a data set and optionally can specify a working directory in which to save output and a data directory from which to load data. Both can be the same.
p= om.platereader(datadir="data", wdir="data")
Data directory is /Users/pswain/Dropbox/packages/ompkg/data. Working directory is /Users/pswain/Dropbox/packages/ompkg/data. Files available - see .files - are: --- {0: 'ExampleData.tsv', 1: 'ExampleData.xlsx', 2: 'ExampleDataContents.xlsx', 3: 'ExampleData_r.tsv', 4: 'ExampleData_s.tsv', 5: 'ExampleData_sc.tsv', 6: 'Glu.xlsx', 7: 'GluContents.xlsx', 8: 'GluGal.xlsx', 9: 'GluGalContents.xlsx', 10: 'GluGal_r.tsv', 11: 'GluGal_s.tsv', 12: 'GluGal_sc.tsv', 13: 'Glu_r.tsv', 14: 'Glu_s.tsv', 15: 'Glu_sc.tsv', 16: 'HxtA.xlsx', 17: 'HxtAContents.xlsx', 18: 'HxtB.xlsx', 19: 'HxtBContents.xlsx', 20: 'Sunrise.xlsx', 21: 'SunriseContents.xlsx'}
You can refer to files by name or by their index in p.files:
p.files[1]
'ExampleData.xlsx'
Datasets are data previously saved by omniplate as tsv or csv files.
p.datasets[2]
'Glu'
To load data generated by the plate reader, two files are needed: the first is the raw data, either as produced by the plate reader or already parsed into a tidy format; the second describes the contents of the wells of the plate and should be an Excel spreadsheet.
p.load('ExampleData.xlsx', 'ExampleDataContents.xlsx')
Loading ExampleData.xlsx Experiment: ExampleData --- Conditions: 0.25% Mal 0.5% Mal 1% Mal 1.5% Mal 2% Mal 2% Raf 3% Mal 4% Mal Strains: Mal12:GFP Mal12:mCherry Mal12:mCherry,Gal10:GFP Null WT Data types: OD GFP AutoFL mCherry Ignored wells: None Warning: wells with no strains have been changed to "Null".
The alternative tidy format that omniplate expects, saved in a tsv or csv file, is:
time well OD GFP AutoFL mCherry
0 0.0 A1 0.25 46.0 18.0 19.0
1 0.23 A1 0.27 45.0 17.0 17.0
etc., and where time is in hours.
To load tidy data, change the platereadertype:
p= om.platereader("ExampleData.tsv", "ExampleDataContents.xlsx",
platereadertype="tidy", datadir= "data")
Loading ExampleData.tsv Columns must be labelled 'time', 'well', 'OD', etc., and time must be in units of hours. Experiment: ExampleData --- Conditions: 0.25% Mal 0.5% Mal 1% Mal 1.5% Mal 2% Mal 2% Raf 3% Mal 4% Mal Strains: Mal12:GFP Mal12:mCherry Mal12:mCherry,Gal10:GFP Null WT Data types: OD GFP AutoFL mCherry Ignored wells: None Warning: wells with no strains have been changed to "Null".
You can also change the sheet of the Excel file from which the data are read,
p= om.platereader("ExampleData.xlsx", "ExampleDataContents.xlsx", dsheets= [1])
and the working or data directory, e.g.,
p.changewdir("newdata")
Getting help¶
To open this webpage, use
p.webhelp
Typing
help(p)
gives information on all the methods available in platereader.
For example,
help(p.correctauto)
gives information on a particular method.
You can use
p.info
to see the current status of the data processing;
p.log
to see a log of all previous processing steps and p.savelog() to save this log to a file.
The dataframes¶
Although you will not often have to look at the data directly, omniplate stores data in three Pandas dataframes:
- p.r: contains the data stored by well
- p.s: contains time-series of processed data, which are created from all the relevant wells
- p.sc: contains summary statistics, such as the maximum growth rate
You can see what columns are in these dataframes using, for example:
p.r.columns
There is also a dataframe that stores the contents of the wells: p.wellsdf.
Use .head() to show only the beginning of a dataframe:
p.r.head()
 | time | well | OD | GFP | AutoFL | mCherry | experiment | condition | strain
---|---|---|---|---|---|---|---|---|---|
0 | 0.000000 | A1 | 0.2555 | 46.0 | 18.0 | 19.0 | ExampleData | 2% Raf | Mal12:GFP |
1 | 0.232306 | A1 | 0.2725 | 45.0 | 17.0 | 17.0 | ExampleData | 2% Raf | Mal12:GFP |
2 | 0.464583 | A1 | 0.2767 | 46.0 | 19.0 | 18.0 | ExampleData | 2% Raf | Mal12:GFP |
3 | 0.696840 | A1 | 0.2870 | 45.0 | 19.0 | 20.0 | ExampleData | 2% Raf | Mal12:GFP |
4 | 0.929111 | A1 | 0.2939 | 47.0 | 20.0 | 20.0 | ExampleData | 2% Raf | Mal12:GFP |
p.s.head()
 | experiment | condition | strain | time | mean_OD | mean_GFP | mean_AutoFL | mean_mCherry | OD_err | GFP_err | AutoFL_err | mCherry_err
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ExampleData | 0.25% Mal | Mal12:GFP | 0.000000 | 0.260733 | 45.000000 | 17.333333 | 18.666667 | 0.012772 | 1.00000 | 0.57735 | 2.081666 |
1 | ExampleData | 0.25% Mal | Mal12:GFP | 0.232306 | 0.264400 | 43.666667 | 17.000000 | 17.666667 | 0.011888 | 0.57735 | 0.00000 | 0.577350 |
2 | ExampleData | 0.25% Mal | Mal12:GFP | 0.464583 | 0.268533 | 45.000000 | 19.000000 | 18.666667 | 0.008025 | 1.00000 | 0.00000 | 1.154701 |
3 | ExampleData | 0.25% Mal | Mal12:GFP | 0.696840 | 0.275833 | 44.333333 | 19.000000 | 18.666667 | 0.009900 | 0.57735 | 0.00000 | 1.154701 |
4 | ExampleData | 0.25% Mal | Mal12:GFP | 0.929111 | 0.279233 | 45.000000 | 19.333333 | 19.666667 | 0.009209 | 0.00000 | 0.57735 | 0.577350 |
p.sc.head()
 | experiment | strain | condition | OD_measured | GFP_measured | AutoFL_measured | mCherry_measured
---|---|---|---|---|---|---|---|
0 | ExampleData | Mal12:GFP | 0.25% Mal | True | True | True | True |
1 | ExampleData | Mal12:GFP | 0.5% Mal | True | True | True | True |
2 | ExampleData | Mal12:GFP | 1% Mal | True | True | True | True |
3 | ExampleData | Mal12:GFP | 1.5% Mal | True | True | True | True |
4 | ExampleData | Mal12:GFP | 2% Mal | True | True | True | True |
Working with a subset of the data¶
For almost all of platereader's methods you can work with a subset of data.
For example,
- you can include particular conditions by specifying
conditions = ['2% Glu', '1% Glu']
- you can include all conditions containing the word Glu, by specifying
conditionincludes= 'Glu'
- you can exclude all conditions containing either the words Glu or Gal, by specifying
conditionexcludes= ['Glu', 'Gal']
You can simultaneously set experiments, experimentincludes, and experimentexcludes, as well as strains, strainincludes, and strainexcludes, as in the sketch below.
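For example, a minimal sketch combining several of these selectors in one call to plot (names taken from the example data):
# plot OD for GFP-tagged strains in all maltose conditions, excluding raffinose
p.plot(y= 'OD', conditionincludes= 'Mal', conditionexcludes= 'Raf',
       strainincludes= 'GFP')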
Restricting the range of time to be analysed¶
Data at the beginning or at the end of the experiment may be corrupted. You can ignore such data using, for example
p.restricttime(tmin= 2, tmax= 15)
or
p.restricttime(tmin= 2)
Although this command will remove data from the p.s dataframe, it will not remove data from p.r and so can be reversed.
Checking the data¶
You can display the data by the wells in the plate.
In this example, columns 11 and 12 contain only media and no cells.
p.plot(y= 'OD', plate= True)
p.plot(y= 'GFP', plate= True)
Checking the contents of the wells¶
You can search the p.wellsdf dataframe by experiment, condition, and strain, or by well:
p.showwells(strains= 'Mal12:GFP', conditions= '1% Mal')
experiment condition strain well
ExampleData 1% Mal Mal12:GFP D1
ExampleData 1% Mal Mal12:GFP D2
ExampleData 1% Mal Mal12:GFP D3
p.contentsofwells(['A1', 'D1'])
A1 --
experiment condition strain
ExampleData 2% Raf Mal12:GFP
D1 --
experiment condition strain
ExampleData 1% Mal Mal12:GFP
With showwells, you can more easily see which strains are in which conditions for each experiment by setting concise= True:
p.showwells(concise= True, strainincludes= 'GFP',
sortby= 'strain')
experiment condition strain replicates
ExampleData 0.25% Mal Mal12:GFP 3
ExampleData 0.5% Mal Mal12:GFP 3
ExampleData 1% Mal Mal12:GFP 3
ExampleData 1.5% Mal Mal12:GFP 3
ExampleData 2% Mal Mal12:GFP 3
ExampleData 2% Raf Mal12:GFP 3
ExampleData 3% Mal Mal12:GFP 3
ExampleData 4% Mal Mal12:GFP 3
ExampleData 0.25% Mal Mal12:mCherry,Gal10:GFP 3
ExampleData 0.5% Mal Mal12:mCherry,Gal10:GFP 3
ExampleData 1% Mal Mal12:mCherry,Gal10:GFP 3
ExampleData 1.5% Mal Mal12:mCherry,Gal10:GFP 3
ExampleData 2% Mal Mal12:mCherry,Gal10:GFP 3
ExampleData 2% Raf Mal12:mCherry,Gal10:GFP 3
ExampleData 3% Mal Mal12:mCherry,Gal10:GFP 3
ExampleData 4% Mal Mal12:mCherry,Gal10:GFP 3
Ignoring wells¶
You can exclude particular wells from all subsequent analysis:
p.ignorewells(['B2', 'B3'])
p.plot(y= 'OD', plate= True)
Analysing OD data: correcting for non-linearities and for the media¶
Plate readers typically have a non-linear relationship between OD and cell number, particularly at high ODs.
You can use your own calibration data or omniplate's default calibration data to correct for this non-linearity.
The corrected ODs are then in arbitrary units.
Omniplate also, optionally, subtracts the background OD of control wells, those with only media and marked as containing a Null strain.
p.correctOD()
Fitting dilution data for OD correction for non-linearities. Using default data.
Corrected for the background OD of the media. ExampleData - 0.25% Mal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. ExampleData - 0.5% Mal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. ExampleData - 1% Mal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. ExampleData - 1.5% Mal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. ExampleData - 2% Mal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. ExampleData - 2% Raf: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. ExampleData - 3% Mal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. ExampleData - 4% Mal: Correcting OD for the OD of the medium.
The function used for this correction is plotted unless figs= False.
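For example, a minimal sketch suppressing this figure:
p.correctOD(figs= False)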
You can change the calibration data used:
p.correctOD(ODfname= 'ODcorrection_Raffinose_Haploid.txt')
Analysing OD data: plotting¶
All plotting uses Seaborn's relplot function, and you can pass most of the keyword arguments that Seaborn allows.
p.plot(y= 'OD', wells= True, strainincludes= 'Gal10:GFP',
conditionincludes= '5',
prettify_dict={"time" : "time (h)"})
If you do not plot individual wells, errors are shown by shading using the standard deviation over all relevant wells.
p.plot(y= 'OD', conditionincludes= '0.25')
p.plot(y= 'OD', strainincludes= 'Gal10:GFP',
hue= 'condition', style= 'strain')
You can also use seaborn directly:
fg= sns.relplot(x= "time", y= "OD", data= p.r, kind= "line",
hue= "condition", col= "strain", errorbar="sd")
Analysing OD data: estimating growth rates¶
You can estimate (specific) growth rates using all replicates for each strain in each condition using a Gaussian process-based algorithm.
If the maximum growth rate is a local maximum, it is marked with a yellow circle.
Local maxima are identified by the prominence of the local peaks. If a peak is falsely identified, you can customise how local maxima are detected by specifying the degree of prominence with the peakprominence parameter, which has a default value of 0.05, as in the sketch below.
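For instance, a hedged sketch requiring more prominent peaks, assuming peakprominence is passed directly to getstats (the value is illustrative):
p.getstats(peakprominence= 0.1)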
If you would prefer not to have local maximum growth rates shown on the plots, set plotlocalmax= False.
p.getstats(strains="Mal12:GFP", conditionincludes= "0.25")
Fitting log_OD for ExampleData: Mal12:GFP in 0.25% Mal Taking natural logarithm of the data. Using a (twice differentiable) Matern covariance function.
log(max likelihood)= 1.160789e+02 hparam[0]= 1.868106e+03 [1.000000e-05, 1.000000e+05] hparam[1]= 9.514160e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- hparam[0] determines the amplitude of variation hparam[1] determines the stiffness hparam[2] determines the variance of the measurement error
Three hyperparameters are estimated during the fitting, and you can change the lower and upper bounds for these hyperparameters. The bounds are specified in log10 space.
For example, to change the bounds on the measurement noise, hyperparameter 2, to $10^{-3}$ and $10^{0}$, use the "bd" option:
p.getstats(strains="Mal12:GFP", conditionincludes= "0.25",
options={"bd": {2: (-3,0)}})
Fitting log_OD for ExampleData: Mal12:GFP in 0.25% Mal Taking natural logarithm of the data. Using a (twice differentiable) Matern covariance function.
log(max likelihood)= 2.262978e+02 hparam[0]= 1.068147e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.219873e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-03 [1.000000e-03, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- hparam[0] determines the amplitude of variation hparam[1] determines the stiffness hparam[2] determines the variance of the measurement error
You can specify the bounds on multiple hyperparameters, such as hyperparameters 0 and 1, using
p.getstats(strainincludes="Mal12:GFP", conditionincludes= "0.25",
options={"bd" : {0: (-2, 2), 1: (-4, -1)}})
Growth statistics for the processed strains are saved in p.sc.
For example, the maximum growth rate and its error are given by:
p.sc[["experiment", "condition", "strain", "max_gr", "max_gr_err"]].head()
 | experiment | condition | strain | max_gr | max_gr_err
---|---|---|---|---|---|
0 | ExampleData | 0.25% Mal | Mal12:GFP | 0.238089 | 0.007046 |
1 | ExampleData | 0.5% Mal | Mal12:GFP | NaN | NaN |
2 | ExampleData | 1% Mal | Mal12:GFP | NaN | NaN |
3 | ExampleData | 1.5% Mal | Mal12:GFP | NaN | NaN |
4 | ExampleData | 2% Mal | Mal12:GFP | NaN | NaN |
We can get the growth rate for all strains using, for example:
p.getstats(figs=False)
Fitting log_OD for ExampleData: Mal12:GFP in 0.25% Mal Taking natural logarithm of the data. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 1.170348e+02 hparam[0]= 4.031928e+02 [1.000000e-05, 1.000000e+05] hparam[1]= 7.982820e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry in 0.25% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.557829e+02 hparam[0]= 2.299176e+03 [1.000000e-05, 1.000000e+05] hparam[1]= 8.368615e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry,Gal10:GFP in 0.25% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.054205e+02 hparam[0]= 8.156116e-01 [1.000000e-05, 1.000000e+05] hparam[1]= 1.154880e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: WT in 0.25% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.564110e+02 hparam[0]= 1.769838e+02 [1.000000e-05, 1.000000e+05] hparam[1]= 4.528240e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:GFP in 0.5% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 3.920577e+02 hparam[0]= 1.585845e+02 [1.000000e-05, 1.000000e+05] hparam[1]= 3.474849e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry in 0.5% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.607274e+02 hparam[0]= 2.746075e-01 [1.000000e-05, 1.000000e+05] hparam[1]= 8.445697e+00 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry,Gal10:GFP in 0.5% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.052167e+02 hparam[0]= 9.017991e-01 [1.000000e-05, 1.000000e+05] hparam[1]= 1.081475e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: WT in 0.5% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.556597e+02 hparam[0]= 3.025010e+02 [1.000000e-05, 1.000000e+05] hparam[1]= 4.717890e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. 
--- Fitting log_OD for ExampleData: Mal12:GFP in 1% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.009709e+02 hparam[0]= 1.143425e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.124104e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry in 1% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.613205e+02 hparam[0]= 8.089581e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.784135e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry,Gal10:GFP in 1% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.043046e+02 hparam[0]= 2.593596e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.609164e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: WT in 1% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.624566e+02 hparam[0]= 6.468665e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 2.076316e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:GFP in 1.5% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.016234e+02 hparam[0]= 1.215822e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.278155e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry in 1.5% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.578402e+02 hparam[0]= 2.473220e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.728221e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry,Gal10:GFP in 1.5% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 3.957857e+02 hparam[0]= 1.527151e+04 [1.000000e-05, 1.000000e+05] hparam[1]= 1.157736e+02 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: WT in 1.5% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.627814e+02 hparam[0]= 7.495055e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 2.307803e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. 
--- Fitting log_OD for ExampleData: Mal12:GFP in 2% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 3.986799e+02 hparam[0]= 2.071288e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.523154e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry in 2% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.638987e+02 hparam[0]= 1.895763e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.498187e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry,Gal10:GFP in 2% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 3.944877e+02 hparam[0]= 6.621335e+04 [1.000000e-05, 1.000000e+05] hparam[1]= 1.475438e+02 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: WT in 2% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.635570e+02 hparam[0]= 9.091073e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 2.590318e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:GFP in 2% Raf Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 3.989004e+02 hparam[0]= 8.616926e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 2.462003e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry in 2% Raf Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.610439e+02 hparam[0]= 3.049051e+03 [1.000000e-05, 1.000000e+05] hparam[1]= 1.259862e+02 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry,Gal10:GFP in 2% Raf Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.093619e+02 hparam[0]= 3.263728e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 2.319406e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: WT in 2% Raf Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.642165e+02 hparam[0]= 1.304896e+01 [1.000000e-05, 1.000000e+05] hparam[1]= 2.907961e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. 
--- Fitting log_OD for ExampleData: Mal12:GFP in 3% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.036215e+02 hparam[0]= 1.294332e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.483970e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry in 3% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.648341e+02 hparam[0]= 3.565919e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 2.079773e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry,Gal10:GFP in 3% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.063823e+02 hparam[0]= 2.398131e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.732800e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: WT in 3% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.639243e+02 hparam[0]= 1.280668e+01 [1.000000e-05, 1.000000e+05] hparam[1]= 3.118357e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:GFP in 4% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.029342e+02 hparam[0]= 5.291649e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 2.328964e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry in 4% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.649733e+02 hparam[0]= 3.586448e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 2.152214e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: Mal12:mCherry,Gal10:GFP in 4% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 4.005103e+02 hparam[0]= 1.297051e+03 [1.000000e-05, 1.000000e+05] hparam[1]= 9.948548e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- Fitting log_OD for ExampleData: WT in 4% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. log(max likelihood)= 2.602374e+02 hparam[0]= 1.396726e+01 [1.000000e-05, 1.000000e+05] hparam[1]= 2.826651e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. 
--- hparam[0] determines the amplitude of variation hparam[1] determines the stiffness hparam[2] determines the variance of the measurement error
p.plot(y= "gr", strains="Mal12:GFP", conditionincludes= "1",
hue= "condition", style="strain",
prettify_dict= {"gr" : "growth rate"})
Troubleshooting estimating growth rates¶
The optimisation routine that fits the hyperparameters of the Gaussian process to the OD data can be customised through the options argument of getstats.
The bounds on hyperparameter 2 reflect how noisy you think the measurements are.
Occasionally with the default bounds, we find that the growth rate fluctuates in small waves over time or has two peaks separated by a shallow trough. This behaviour often means that the software is fitting the noise in the data.
Changing the bounds on hyperparameter 2 with the "bd" option, particularly raising the lower bound so that the minimal level of noise is higher, usually stops these fluctuations so that the growth rate varies smoothly.
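For example, a hedged sketch raising the lower bound on the noise, with the bounds again given in log10 space (the exact values depend on your data):
# raise the minimal measurement noise so the fitted growth rate varies smoothly
p.getstats(options= {"bd": {2: (-2, 0)}})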
Finding the local maximum growth rate¶
Scipy's find_peaks is used to find the local maximum growth rate, which is useful if the highest growth rate is at t= 0.
You can specify properties that a peak must satisfy to be considered a maximum using the options of find_peaks.
First, try
p.getstats(showpeakproperties= True, width= 0,
prominence= 0)
which will display the width and prominence of the local peaks so that you can get an idea of the values to set. These values are given in numbers of x and y points, not in real units.
Once you have a minimum value of one of these properties that a true local maximum should satisfy, use, for example,
p.getstats(width= 15)
so that any local maximum must have a width of at least 15 points.
Correcting autofluorescence: GFP¶
Omniplate corrects fluorescence measurements for autofluorescence using a reference strain that does not have a fluorophore.
We correct for autofluorescence with correctauto, the preferred method, which requires at least three replicate wells each for the fluorescently tagged strain and the reference strain.
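For example, a hedged sketch of this preferred method, with arguments following the call made later in this tutorial:
p.correctauto('GFP', strains= 'Mal12:GFP', conditionincludes= '1',
              refstrain= 'WT')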
The alternative is the legacy method, correctauto_l, which requires only a single well and can use measurements at two fluorescence wavelengths.
For example, using the legacy method:
p.correctauto_l(["GFP", "AutoFL"], strains="Mal12:GFP", conditionincludes="1",
refstrain= "WT", options= {"figs" :False})
Correcting autofluorescence using WT as the reference. Using two fluorescence wavelengths. Correcting autofluorescence using GFP and AutoFL. Correcting for background fluorescence of media. ExampleData: Processing reference strain WT for GFP in 1% Mal. Fitting log_GFP for ExampleData: Mal12:GFP in 1% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. --- Correcting for background fluorescence of media. ExampleData: Processing reference strain WT for GFP in 1.5% Mal. Fitting log_GFP for ExampleData: Mal12:GFP in 1.5% Mal Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. ---
The results are stored in the p.s dataframe and can be plotted by specifying cGFPperOD. The corrected fluorescence for the wild-type strain, which should be zero, gives an indication of the size of the errors.
p.plot(y= "cGFPperOD", strains="Mal12:GFP", conditionincludes="1",
hue= "condition", style="strain",
prettify_dict={"cGFPperOD" : "fluorescence per cell",
"time" : "time (h)"})
Correcting autofluorescence: mCherry¶
In the legacy code, mCherry is corrected for autofluorescence by using the fluorescence of a reference strain at the same OD as the strain of interest.
You can specify whether or not omniplate should use Gaussian processes.
p.correctauto_l("mCherry", conditionincludes= "1", strainincludes= "mCherry",
refstrain= "WT", options = {"useGPs" : False})
Correcting autofluorescence using WT as the reference. Using one fluorescence wavelength. Correcting autofluorescence using mCherry. ExampleData: Processing reference strain WT for mCherry in 1% Mal.
ExampleData: Processing reference strain WT for mCherry in 1.5% Mal.
p.plot(y= "cmCherryperOD", conditionincludes="1", strainincludes= "mCherry",
hue= "strain", style= "condition", nonull= True)
Estimating the time-derivative of the fluorescence¶
Omniplate will also estimate the time derivative of the fluorescence per cell using getstats. We set logs= False because we want the derivative of the fluorescence and not the derivative of the logarithm of the fluorescence:
p.getstats('cmCherryperOD', conditions= ['1% Mal'],
strains= ['Mal12:mCherry'],
options={"logs": False,
"bd": {2 : (-1,3)}})
Fitting cmCherryperOD for ExampleData: Mal12:mCherry in 1% Mal GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function.
log(max likelihood)= -9.777698e+02 hparam[0]= 7.977984e+04 [1.000000e-05, 1.000000e+05] hparam[1]= 2.264585e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 2.418165e+01 [1.000000e-01, 1.000000e+03] --- hparam[0] determines the amplitude of variation hparam[1] determines the stiffness hparam[2] determines the variance of the measurement error
The results are stored in the p.s and p.sc dataframes:
p.s.query('condition == "1% Mal" and strain == "Mal12:mCherry"')[['time',
'd/dt_cmCherryperOD']].head()
 | time | d/dt_cmCherryperOD
---|---|---|
1155 | 0.000000 | 5.979299 |
1156 | 0.232306 | 5.953111 |
1157 | 0.464583 | 5.909191 |
1158 | 0.696840 | 5.836178 |
1159 | 0.929111 | 5.730639 |
Quantifying the behaviour during mid-log growth¶
Using nunchaku (Huo & Swain), omniplate will automatically identify the segment of the time series where growth is exponential and calculate statistics of all variables in the s dataframe in this mid-log segment.
Black squares mark the segment of growth identified as mid-log.
p.getmidlog(conditions="1% Mal")
Finding mid-log growth for ExampleData : Mal12:GFP in 1% Mal
WARNING:root:Nunchaku: failed to estimate MLE of sigma when the number of segments is 3. /Users/pswain/wip/nunchaku/nunchaku/nunchaku.py:957: IntegrationWarning: The maximum number of subdivisions (50) has been achieved. If increasing the limit yields no improvement it is advised to analyze the integrand in order to determine the difficulties. If the position of a local difficulty can be determined (singularity, discontinuity) one will probably gain from splitting up the interval and calling the integrator on the subranges. Perhaps a special-purpose integrator should be used. res = quad( WARNING:root:Nunchaku: Numerical integration failed when finding model evidence with 3 segments. WARNING:root:Nunchaku: failed to estimate MLE of sigma when the number of segments is 4. WARNING:root:Nunchaku: Numerical integration failed when finding model evidence with 4 segments.
Finding mid-log growth for ExampleData : Mal12:mCherry in 1% Mal
WARNING:root:Nunchaku: failed to estimate MLE of sigma when the number of segments is 4. /Users/pswain/wip/nunchaku/nunchaku/nunchaku.py:957: IntegrationWarning: The maximum number of subdivisions (50) has been achieved. If increasing the limit yields no improvement it is advised to analyze the integrand in order to determine the difficulties. If the position of a local difficulty can be determined (singularity, discontinuity) one will probably gain from splitting up the interval and calling the integrator on the subranges. Perhaps a special-purpose integrator should be used. res = quad( WARNING:root:Nunchaku: Numerical integration failed when finding model evidence with 4 segments.
Finding mid-log growth for ExampleData : Mal12:mCherry,Gal10:GFP in 1% Mal
WARNING:root:Nunchaku: failed to estimate MLE of sigma when the number of segments is 3. /Users/pswain/wip/nunchaku/nunchaku/nunchaku.py:957: IntegrationWarning: The maximum number of subdivisions (50) has been achieved. If increasing the limit yields no improvement it is advised to analyze the integrand in order to determine the difficulties. If the position of a local difficulty can be determined (singularity, discontinuity) one will probably gain from splitting up the interval and calling the integrator on the subranges. Perhaps a special-purpose integrator should be used. res = quad( WARNING:root:Nunchaku: Numerical integration failed when finding model evidence with 3 segments. WARNING:root:Nunchaku: failed to estimate MLE of sigma when the number of segments is 4. WARNING:root:Nunchaku: Numerical integration failed when finding model evidence with 4 segments.
Finding mid-log growth for ExampleData : WT in 1% Mal
WARNING:root:Nunchaku: failed to estimate MLE of sigma when the number of segments is 4.
Error finding midlog data for ExampleData: WT in 1% Mal.
All mid-log statistics are stored in the sc dataframe.
p.sc[p.sc.condition=="1% Mal"][["strain", "condition"] +
[col for col in p.sc.columns if "midlog" in col]]
 | strain | condition | mean_midlog_time | mean_midlog_mean_OD | mean_midlog_mean_GFP | mean_midlog_mean_AutoFL | mean_midlog_mean_mCherry | mean_midlog_OD_err | mean_midlog_GFP_err | mean_midlog_AutoFL_err | ... | max_midlog_cmCherry | max_midlog_cmCherry_err | max_midlog_cmCherryperOD | max_midlog_cmCherryperOD_err | max_midlog_smoothed_cmCherryperOD | max_midlog_smoothed_cmCherryperOD_err | max_midlog_d/dt_cmCherryperOD | max_midlog_d/dt_cmCherryperOD_err | max_midlog_d/dt_d/dt_cmCherryperOD | max_midlog_d/dt_d/dt_cmCherryperOD_err
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | Mal12:GFP | 1% Mal | 3.484080 | 0.486105 | 204.774194 | 33.655914 | 17.236559 | 0.010673 | 2.340206 | 0.437000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
10 | Mal12:mCherry | 1% Mal | 4.993834 | 0.622652 | 54.250000 | 25.604167 | 49.979167 | 0.005673 | 0.707107 | 0.559793 | ... | 55.086287 | 5.027763 | 60.755559 | 8.223167 | 53.658084 | 4.99627 | 4.368963 | 0.710656 | 3.784758 | 1.127147 |
18 | Mal12:mCherry,Gal10:GFP | 1% Mal | 3.948619 | 0.502835 | 44.847619 | 21.409524 | 42.685714 | 0.002589 | 0.656037 | 0.453040 | ... | 52.080075 | 5.714467 | 57.834751 | 13.996024 | NaN | NaN | NaN | NaN | NaN | NaN |
26 | Null | 1% Mal | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
34 | WT | 1% Mal | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
5 rows × 126 columns
You can display the results with plot.
p.plot(x="mean_midlog_mean_GFP", y="strain",
prettify_dict={"mean_midlog_mean_GFP" : "mean GFP at midlog"})
Saving figures¶
You can at any time save all figures that are currently displayed (but not those displayed inline in Jupyter):
p.savefigs()
which saves each figure as a separate page in a single PDF file in the working directory.
Extracting numerical values from a column¶
It may be useful to create a new column in the dataframes by extracting the numerical values given in another, for example from the name of an experiment or condition. Calling
p.addnumericcolumn('new column name', 'original column')
extracts any numbers from each entry in the original column, makes a new column in all the dataframes called 'new column name', and places these numbers in the appropriate entry in this new column.
For example:
p.addnumericcolumn('concentration', 'condition')
p.sc[['condition', 'strain', 'concentration']].head()
 | condition | strain | concentration
---|---|---|---|
0 | 0.25% Mal | Mal12:GFP | 0.25 |
1 | 0.5% Mal | Mal12:GFP | 0.50 |
2 | 1% Mal | Mal12:GFP | 1.00 |
3 | 1.5% Mal | Mal12:GFP | 1.50 |
4 | 2% Mal | Mal12:GFP | 2.00 |
You can also specify which number in the column entry you would like using picknumber, and you can find numbers next to a substring of interest using leftsplitstring and rightsplitstring, as in the sketch below.
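A minimal sketch, assuming picknumber indexes which number in each entry to extract (here the first):
p.addnumericcolumn('first_number', 'condition', picknumber= 0)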
Getting a smaller dataframe for plotting directly¶
You can get a subset of the data as a dataframe, from either the r (the raw time-series), s (the processed time-series), or sc (the summary statistics) dataframes, using, for example:
subdf= p.getdataframe('s', conditionincludes= 'Raf', strainincludes= 'Gal')
subdf.head()
 | experiment | condition | strain | time | mean_OD | mean_GFP | mean_AutoFL | mean_mCherry | OD_err | GFP_err | ... | cmCherry_err | cmCherryperOD | cmCherryperOD_err | smoothed_cmCherryperOD | smoothed_cmCherryperOD_err | d/dt_cmCherryperOD | d/dt_cmCherryperOD_err | d/dt_d/dt_cmCherryperOD | d/dt_d/dt_cmCherryperOD_err | concentration
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2835 | ExampleData | 2% Raf | Mal12:mCherry,Gal10:GFP | 0.000000 | 0.176736 | 37.333333 | 17.000000 | 25.333333 | 0.003436 | 0.57735 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 |
2836 | ExampleData | 2% Raf | Mal12:mCherry,Gal10:GFP | 0.232306 | 0.183252 | 36.000000 | 17.000000 | 24.666667 | 0.006882 | 0.00000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 |
2837 | ExampleData | 2% Raf | Mal12:mCherry,Gal10:GFP | 0.464583 | 0.195068 | 38.666667 | 18.333333 | 24.000000 | 0.004063 | 0.57735 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 |
2838 | ExampleData | 2% Raf | Mal12:mCherry,Gal10:GFP | 0.696840 | 0.204218 | 37.333333 | 17.666667 | 24.333333 | 0.004065 | 0.57735 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 |
2839 | ExampleData | 2% Raf | Mal12:mCherry,Gal10:GFP | 0.929111 | 0.211668 | 39.000000 | 18.000000 | 26.333333 | 0.003215 | 0.00000 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2.0 |
5 rows × 35 columns
Exporting and importing the dataframes¶
The three dataframes
p.r
p.s
p.sc
can all be saved as tsv (or json or csv) files. A logfile of the commands that you've used will also be saved.
p.exportdf()
Exported ExampleData. Exported ExampleData.log.
The files can be imported similarly:
q= om.platereader(datadir= 'data', ls= False)
q.importdf('ExampleData')
Imported ExampleData_r.tsv Imported ExampleData_s.tsv Imported ExampleData_sc.tsv Experiment: ExampleData --- Conditions: 0.25% Mal 0.5% Mal 1% Mal 1.5% Mal 2% Mal 2% Raf 3% Mal 4% Mal Strains: Mal12:GFP Mal12:mCherry Mal12:mCherry,Gal10:GFP Null WT Data types: OD GFP AutoFL mCherry Ignored wells: None
An example of processing the data to create a new column¶
Here is a more complex example where we wish to plot the fluorescence at the time when the growth rate is maximal.
We first create a new column, mean_GFP_at_max_gr, in the dataframe p.sc.
s= 'Mal12:GFP'
# store results as a list of dictionaries to eventually convert into a dataframe
results= []
for e in p.allexperiments:
    for c in p.allconditions[e]:
        # find the time of maximum growth rate for the condition
        tm= p.sc.query('experiment == @e and condition == @c and strain == @s')['time_of_max_gr'].values[0]
        # take the relevant sub-dataframe for the condition
        df= p.s.query('experiment == @e and condition == @c and strain == @s')
        # find the mean GFP at the data point closest to time tm
        i= np.argmin(np.abs(df['time'].values - tm))
        results.append({'mean_GFP_at_max_gr' : df['mean_GFP'][df.index[i]],
                        'experiment' : e, 'condition' : c, 'strain' : s})
# convert to dataframe
rdf= pd.DataFrame(results)
# add to the existing dataframe by experiment, condition, and strain
p.sc= pd.merge(p.sc, rdf, how= 'outer')
# check results
p.sc[['experiment', 'condition', 'strain', 'local_max_gr', 'mean_GFP_at_max_gr']].head()
 | experiment | condition | strain | local_max_gr | mean_GFP_at_max_gr
---|---|---|---|---|---|
0 | ExampleData | 0.25% Mal | Mal12:GFP | 0.011276 | 45.000000 |
1 | ExampleData | 0.5% Mal | Mal12:GFP | 0.284654 | 134.000000 |
2 | ExampleData | 1% Mal | Mal12:GFP | 0.326539 | 165.666667 |
3 | ExampleData | 1.5% Mal | Mal12:GFP | 0.319979 | 168.333333 |
4 | ExampleData | 2% Mal | Mal12:GFP | 0.310096 | 190.000000 |
# plot results
p.plot(x= 'max_gr', y= 'mean_GFP_at_max_gr', hue= 'condition',
style= 'experiment', conditionincludes= 'Mal')
Loading and processing more than one data set¶
It is also possible to load more than one data set and simultaneously process the data.
p= om.platereader(['HxtA.xlsx', 'HxtB.xlsx'],
['HxtAContents.xlsx', 'HxtBContents.xlsx'],
datadir= 'data')
Loading HxtA.xlsx Loading HxtB.xlsx Experiment: HxtA --- Conditions: 0.05% Gal 1% Gal 2% Gal 2% Glu Strains: BY4741 Hxt1:GFP Hxt2:GFP Hxt3:GFP Hxt4:GFP Null Data types: OD GFP_80Gain AutoFL_80Gain GFP_65Gain AutoFL_65Gain Ignored wells: None Experiment: HxtB --- Conditions: 0.05% Gal 1% Gal 2% Gal 2% Glu Strains: BY4741 Hxt5:GFP Hxt6:GFP Hxt7:GFP-L Hxt7:GFP-N Null Data types: OD GFP_80Gain AutoFL_80Gain GFP_65Gain AutoFL_65Gain Ignored wells: None Warning: wells with no strains have been changed to "Null".
By default, correctOD corrects the OD for all experiments, but you can restrict the correction to particular experiments, as sketched after the output below:
p.correctOD()
Fitting dilution data for OD correction for non-linearities. Using default data.
Corrected for the background OD of the media. HxtA - 0.05% Gal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. HxtA - 1% Gal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. HxtA - 2% Gal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. HxtA - 2% Glu: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. HxtB - 0.05% Gal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. HxtB - 1% Gal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. HxtB - 2% Gal: Correcting OD for the OD of the medium.
Corrected for the background OD of the media. HxtB - 2% Glu: Correcting OD for the OD of the medium.
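For example, a hedged sketch correcting only one experiment, assuming correctOD accepts the experiments selector used by other methods:
p.correctOD(experiments= 'HxtA')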
p.plot(y= 'OD', hue= 'condition', strains= 'BY4741', conditionincludes= 'Gal',
style= 'strain', size= 'experiment')
You can also run routines by experiment:
p.getstats(strainincludes= 'Hxt6', conditionincludes= 'Glu',
experiments= 'HxtB')
Fitting log_OD for HxtB: Hxt6:GFP in 2% Glu Taking natural logarithm of the data. GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function.
log(max likelihood)= 4.822662e+02 hparam[0]= 4.356040e+00 [1.000000e-05, 1.000000e+05] hparam[1]= 1.383084e+01 [1.000000e-04, 1.000000e+04] hparam[2]= 1.000000e-02 [1.000000e-02, 1.000000e+00] Warning: hyperparameter 2 is at a lower bound. --- hparam[0] determines the amplitude of variation hparam[1] determines the stiffness hparam[2] determines the variance of the measurement error
p.plot(x= 'mean_OD', y= 'gr', strainincludes= 'Hxt6', conditionincludes= 'Glu',
experiments= 'HxtB')
p.correctauto('GFP_65Gain', strains= ['Hxt6:GFP'], refstrain= 'BY4741',
conditions= '2% Glu', experiments= 'HxtB',
options= {"figs" : True})
Correcting autofluorescence using BY4741 as the reference. Using Bayesian approach for GFP_65Gain. 2% Glu: Processing 1 strains. HxtB: Hxt6:GFP in 2% Glu; 4 replicates Fitting log(flperOD) for HxtB: Hxt6:GFP in 2% Glu GP Warning: input data is not sorted. Sorting. Using a (twice differentiable) Matern covariance function. HxtB: Hxt6:GFP in 2% Glu log(max likelihood)= -4.538247e+04 hparam[0]= 1.326311e+05 [1.000000e-02, 1.000000e+08] hparam[1]= 1.115326e+00 [1.000000e+00, 1.000000e+04] hparam[2]= 1.000000e+01 [1.000000e-02, 1.000000e+01] Warning: hyperparameter 2 is at an upper bound. ---
p.plot(y= 'bcGFP_65GainperOD', strains= 'Hxt6:GFP', conditions= '2% Glu',
experiments= 'HxtB', ylim= [0, None], style= 'condition')
Renaming conditions and strains¶
When combining multiple experiments, you may wish for a more consistent or convenient convention for naming strains and conditions.
You can replace names with alternatives using, for example,
p.rename({'77.WT' : 'WT', '409.Hxt4' : 'Hxt4'})
to simplify the strain names 77.WT to WT and 409.Hxt4 to Hxt4. The dictionary maps old names to new names.
Averaging over experiments¶
The plate reader often does not increment time perfectly between measurements, and different experiments can have slightly different time points despite the plate reader having the same settings. These unique times prevent the plotting module seaborn from averaging over experiments.
If experiments have measurements that start at the same time point and have the same interval between measurements, then setting a commontime for all experiments will allow seaborn to perform averaging.
The commontime array runs from tmin to tmax with an interval dt between time points. These parameters are automatically calculated, but may also be specified, as in the sketch below.
Each entry in the time column of the dataframes is assigned a commontime value: the closest commontime point to that time.
p.addcommonvar('time')
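A minimal sketch of specifying the grid yourself, assuming tmin, tmax, and dt are accepted as keyword arguments (described above but not demonstrated):
p.addcommonvar('time', tmin= 0, tmax= 20, dt= 0.25)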
With commontime defined, you can average over experiments by plotting with x= 'commontime'.
For example, the strain BY4741 is in both experiments:
# no averaging over experiments
p.plot(x= 'time', y= 'OD', strains= 'BY4741', hue= 'condition',
style= 'experiment',
title= 'no averaging over experiments')
# average over experiments
p.plot(x= 'commontime', y= 'OD', strains= 'BY4741', hue= 'condition',
style= 'strain',
title= 'averaging over experiments')
Importing processed data for more than one experiment¶
Exported data for multiple experiments can be simultaneously imported:
p= om.platereader(datadir= 'data', ls=False)
p.importdf(['Glu', 'GluGal'])
Imported Glu_r.tsv Imported Glu_s.tsv Imported Glu_sc.tsv Imported GluGal_r.tsv Imported GluGal_s.tsv Imported GluGal_sc.tsv Experiment: Glu --- Conditions: 0% Glu 0.0625% Glu 0.125% Glu 0.25% Glu 0.5% Glu 1% Glu media Strains: BY4741 Null Data types: OD Ignored wells: None Experiment: GluGal --- Conditions: 0% Gal 0% Glu 0.0625% Gal 0.0625% Glu 0.125% Gal 0.125% Glu 0.2% Gal 0.2% Glu 0.25% Gal 0.25% Glu 0.3% Gal 0.3% Glu 0.35% Gal 0.35% Glu 0.5% Gal 0.5% Glu 1% Gal 1% Gal-NoSC 1% Glu 1% Glu-NoSC 2% Gal 2% Glu media Strains: BY4741 Null Data types: OD Ignored wells: None
The dataframes from the experiments are combined into amalgamated p.r, p.s, and p.sc dataframes.
Data from these dataframes can be plotted as usual:
p.plot(x= 'mean_OD', y= 'gr', conditions= '1% Glu', hue= 'experiment',
title= '1% glucose')
p.plot(x= 'max_gr', y= 'condition', hue= 'experiment', style= 'strain',
conditionincludes= 'Glu')
p.plot(x= 'max_gr', y= 'log2_OD_ratio', hue= 'experiment', style= 'strain',
conditionincludes= 'Glu')
p.plot(x= 'max_gr', y= 'log2_OD_ratio', col= 'experiment', style= 'strain',
conditionincludes= 'Glu', aspect= 0.7)