mdap.MD_Pdist
- class mdap.MD_Pdist(data_type=None, Xname=None, Xindex=1, Yname=None, Yindex=1, Zname=None, Zindex=1, Xinterval=1, Yinterval=1, Zinterval=1, data_proc=None, first_iter=1, last_iter=None, bins=(100, 100), p_units='kT', T=298, histrange_x=None, histrange_y=None, no_pbar=False, timescale=1000000, *args, **kwargs)
The methods of this class generate probability distributions from input MD data files.
- __init__(data_type=None, Xname=None, Xindex=1, Yname=None, Yindex=1, Zname=None, Zindex=1, Xinterval=1, Yinterval=1, Zinterval=1, data_proc=None, first_iter=1, last_iter=None, bins=(100, 100), p_units='kT', T=298, histrange_x=None, histrange_y=None, no_pbar=False, timescale=1000000, *args, **kwargs)
TODO: add XYZ interval to proc (default 1)
- Parameters:
data_type (str) – ‘time’ for 1 dataset timeseries, or ‘pdist’ for everything else.
Xname (str or list of str) – target data for x axis, default None.
Xindex (int) – If X.ndim > 2, use this to index.
Yname (str or list of str) – target data for y axis, default None.
Yindex (int) – If Y.ndim > 2, use this to index.
Zname (str or list of str) – target data for z axis, default None. Use this to put a dataset on the Z axis instead of the pdist. The result is best plotted as a scatter plot with Z as the marker color. Only the XYZ datasets are returned, not the pdist, because the weights/pdist are not considered.
Zindex (int) – If Z.ndim > 2, use this to index.
Xinterval, Yinterval, Zinterval (int) – Interval for processing each dataset, e.g. 10 = every 10th frame.
data_proc (function or tuple of functions) – Of the form f(data) where data has rows=segments, columns=frames until tau, depth=data dims. The input function must return a processed array of the same shape and formatting.
first_iter (int) – First iteration to include; by default, plotting starts from iteration 1 data.
last_iter (int) – Last iteration data to include, default is the last recorded iteration in the west.h5 file. Note that instant type pdists only depend on last_iter.
bins (tuple of ints (TODO: maybe the tuple isn’t user friendly for 1 dim?)) – Histogram bins in pdist data to be generated for x and y datasets, default both 100.
p_units (str) – Can be ‘kT’ (default), ‘kcal’, ‘raw’, or ‘raw_norm’. ‘kT’ gives -ln(P); ‘kcal’ gives -RT ln(P) in kcal/mol, where RT = 0.5922 kcal/mol at T = 298 K. ‘raw’ is the raw probabilities and ‘raw_norm’ is the raw probabilities normalized by P(max).
T (int) – Temperature in Kelvin, used when p_units is ‘kcal’.
histrange_x, histrange_y (list or tuple of 2 floats or ints) – Optional custom histogram ranges for the x and y axes.
no_pbar (bool) – Optionally do not include the progress bar for pdist generation.
timescale (int) – Default ps to µs (10**6). Converts frames to time.
TODO (maybe also binsfromexpression?)
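The data_proc parameter above expects a callable that receives an array with rows = segments, columns = frames, and depth = data dimensions, and returns an array of the same shape. The following is a minimal sketch of such a callable, assuming the data arrive as a NumPy array; the function name `wrap_degrees` and the sample values are hypothetical and not part of mdap.

```python
import numpy as np


def wrap_degrees(data):
    """Hypothetical data_proc function: wrap angles in degrees onto [0, 360).

    The output has the same shape as the input, as data_proc requires.
    """
    data = np.asarray(data, dtype=float)
    return np.mod(data, 360.0)


# Illustrative input only: 2 segments x 3 frames x 1 data dimension.
example = np.array([[[-170.0], [-10.0], [20.0]],
                    [[90.0], [179.0], [-90.0]]])
processed = wrap_degrees(example)

# The shape contract from the parameter description holds.
assert processed.shape == example.shape
```

A function like this could then be supplied as `data_proc=wrap_degrees` when constructing the class.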
Methods
__init__([data_type, Xname, Xindex, Yname, ...]) – TODO: add XYZ interval to proc (default 1)
aux_to_pdist_1d(iteration) – Take the auxiliary dataset for a single iteration and generate a weighted 1D probability distribution.
aux_to_pdist_2d(iteration) – Take the auxiliary dataset for a single iteration and generate a weighted 2D probability distribution.
average_datasets_3d([interval]) – Unique case where Zname is specified and the XYZ datasets are returned.
average_datasets_4d([interval]) – Unique case where Zname is specified and the XYZ datasets are returned.
average_pdist_1d() – 1 dataset: average pdist for a range of iterations.
average_pdist_2d() – 2 datasets: average pdist for a range of iterations.
evolution_pdist() – Returns the pdist for 1 coordinate for the specified range of iterations.
find_iter_seg_from_xy_vals(val_x, val_y) – Find and return the (iter, seg) closest to the input data value(s).
get_all_weights() – Returns a 1D array of the weights for every frame of each tau, for all segments of all specified iterations.
get_coords(path, data_name, data_index) – Get a list of data coordinates for plotting traces.
get_full_coords(walker_tuple, data_name[, ...]) – Returns a full 1D set of data for a single trace (path).
get_parents(walker_tuple) – Get the parent of an input (iteration, walker).
get_total_data_array(name[, index, ...]) – Loop through all specified iterations and get a 1D raw data array.
instant_datasets_3d() – Unique case where Zname is specified and the XYZ datasets are returned.
instant_pdist_1d() – Returns the x and y pdist datasets for a single iteration.
instant_pdist_2d() – Returns the xyz pdist datasets for a single iteration.
make_new_h5([new_weights]) – TODO: actually make a new h5 file, see bstate filter code, integrate all.
pdist() – Main public method with pdist generation controls.
pdist_1d() – Returns: X (ndarray)
pdist_2d() – Returns: X (ndarray)
pdist_3d() – Returns: X (ndarray)
plot_trace(walker_tuple[, color, linewidth, ...]) – Plot trace.
reshape_total_data_array(array) – Take an input 1D array of the data values at every segment for each iteration, and reshape it to make pdists.
succ_pdist_weight_filter() – TODO: Filter weights to be zero for all unsuccessful trajectories.
timeseries() – Returns: X (ndarray)
trace_walker(walker_tuple[, first_iter]) – Get the trace path of an input (iteration, walker).
w_succ() – Find and return all successfully recycled (iter, seg) pairs.
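To illustrate what a weighted 1D probability distribution (as produced by methods such as aux_to_pdist_1d) and the p_units options amount to numerically, here is a minimal NumPy sketch. It is not the mdap implementation: the helper name `weighted_pdist_1d`, its arguments, and the sample data are hypothetical, and details such as how empty bins are handled or whether the minimum is shifted to zero are assumptions that may differ in the library.

```python
import numpy as np

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K); R_KCAL * 298 is about 0.5922


def weighted_pdist_1d(values, weights, bins=100, hist_range=None,
                      p_units="kT", T=298):
    """Hypothetical helper: weighted histogram converted to the chosen units."""
    hist, edges = np.histogram(values, bins=bins, range=hist_range,
                               weights=weights)
    prob = hist / hist.sum()                  # 'raw' probabilities
    centers = (edges[:-1] + edges[1:]) / 2    # bin midpoints for plotting
    if p_units == "raw":
        return centers, prob
    if p_units == "raw_norm":
        return centers, prob / prob.max()     # normalized by P(max)
    with np.errstate(divide="ignore"):
        neg_lnp = -np.log(prob)               # empty bins become +inf
    if p_units == "kT":
        return centers, neg_lnp               # -ln(P)
    if p_units == "kcal":
        return centers, R_KCAL * T * neg_lnp  # -RT ln(P) in kcal/mol
    raise ValueError(f"unknown p_units: {p_units!r}")


# Illustrative data only: four frames with their weights.
values = np.array([0.1, 0.2, 0.6, 0.7])
weights = np.array([0.5, 0.25, 0.125, 0.125])

x, y_kt = weighted_pdist_1d(values, weights, bins=2, hist_range=(0, 1))
_, y_norm = weighted_pdist_1d(values, weights, bins=2, hist_range=(0, 1),
                              p_units="raw_norm")
```

In this sketch the two bins carry total weights of 0.75 and 0.25, so the ‘kT’ values are -ln(0.75) and -ln(0.25), and the ‘raw_norm’ values are 1 and 1/3.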