mdap.MD_Pdist

class mdap.MD_Pdist(data_type=None, Xname=None, Xindex=1, Yname=None, Yindex=1, Zname=None, Zindex=1, Xinterval=1, Yinterval=1, Zinterval=1, data_proc=None, first_iter=1, last_iter=None, bins=(100, 100), p_units='kT', T=298, histrange_x=None, histrange_y=None, no_pbar=False, timescale=1000000, *args, **kwargs)

These class methods generate probability distributions from input MD data files.

__init__(data_type=None, Xname=None, Xindex=1, Yname=None, Yindex=1, Zname=None, Zindex=1, Xinterval=1, Yinterval=1, Zinterval=1, data_proc=None, first_iter=1, last_iter=None, bins=(100, 100), p_units='kT', T=298, histrange_x=None, histrange_y=None, no_pbar=False, timescale=1000000, *args, **kwargs)

TODO: add XYZ interval to proc (default 1)

Parameters:

data_type (str) – ‘time’ for 1 dataset timeseries, or ‘pdist’ for everything else.
Xname (str or list of str) – target data for x axis, default None.
Xindex (int) – If X.ndim > 2, use this to index.
Yname (str or list of str) – target data for y axis, default None.
Yindex (int) – If Y.ndim > 2, use this to index.
Zname (str or list of str) – target data for z axis, default None. Use this if you want a dataset instead of the pdist on the Z axis. This is best plotted as a scatter plot with Z as the marker color. Only the XYZ datasets are returned instead of the pdist, because the weights/pdist are not considered in this case.
Zindex (int) – If Z.ndim > 2, use this to index.
Xinterval, Yinterval, Zinterval (int) – Interval for processing dataset. E.g. 10 = every 10 frames.
data_proc (function or tuple of functions) – Of the form f(data) where data has rows=segments, columns=frames until tau, depth=data dims. The input function must return a processed array of the same shape and formatting.
first_iter (int) – First iteration of data to include, default is iteration 1.
last_iter (int) – Last iteration data to include, default is the last recorded iteration in the west.h5 file. Note that instant type pdists only depend on last_iter.
bins (tuple of ints (TODO: maybe the tuple isn’t user friendly for 1 dim?)) – Histogram bins in pdist data to be generated for x and y datasets, default both 100.
p_units (str) – Can be ‘kT’ (default), ‘kcal’, ‘raw’, or ‘raw_norm’. ‘kT’ gives -ln(P); ‘kcal’ gives -RT ln(P) in kcal/mol, where RT = 0.5922 kcal/mol at T = 298 K. ‘raw’ gives the raw probabilities and ‘raw_norm’ gives the raw probabilities normalized by P(max).
T (int) – Temperature in Kelvin, used when p_units is ‘kcal’.
histrange_x, histrange_y (list or tuple of 2 floats or ints) – Optional custom histogram ranges for the x and y axes.
no_pbar (bool) – Optionally do not include the progress bar for pdist generation.
timescale (int) – Default ps to µs (10**6). Converts frames to time.
TODO (maybe also binsfromexpression?)
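
The shape contract described for data_proc can be illustrated with a minimal sketch. The function name `to_degrees` and the toy array are hypothetical and not part of mdap; the only assumption taken from the parameter description is that the input has shape (segments, frames, dims) and that the output must keep that shape.

```python
import numpy as np

def to_degrees(data):
    """Hypothetical data_proc function: convert radians to degrees.

    `data` is assumed to be laid out as (segments, frames until tau, data dims),
    as described for the data_proc parameter. The result has the same shape.
    """
    return np.rad2deg(data)

# Toy input: 3 segments, 4 frames per tau, 2 data dimensions, all equal to pi.
toy = np.full((3, 4, 2), np.pi)
processed = to_degrees(toy)

# The processed array keeps the input shape, which is what data_proc requires.
print(processed.shape)
```

A function like this would presumably be passed as `data_proc=to_degrees`, or as one element of a tuple of functions when several datasets are processed.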

Methods

`__init__`([data_type, Xname, Xindex, Yname, ...])	TODO: add XYZ interval to proc (default 1)
`aux_to_pdist_1d`(iteration)	Take the auxiliary dataset for a single iteration and generate a weighted 1D probability distribution.
`aux_to_pdist_2d`(iteration)	Take the auxiliary dataset for a single iteration and generate a weighted 2D probability distribution.
`average_datasets_3d`([interval])	Unique case where Zname is specified and the XYZ datasets are returned.
`average_datasets_4d`([interval])	Unique case where Zname is specified and the XYZ datasets are returned.
`average_pdist_1d`()	1 dataset: average pdist for a range of iterations.
`average_pdist_2d`()	2 datasets: average pdist for a range of iterations.
`evolution_pdist`()	Returns the pdist for 1 coordinate for the range of iterations specified.
`find_iter_seg_from_xy_vals`(val_x, val_y)	Find and return (iter, seg) closest to input data value(s).
`get_all_weights`()	Returns a 1D array of the weight for every frame of each tau for all segments of all iterations specified.
`get_coords`(path, data_name, data_index)	Get a list of data coordinates for plotting traces.
`get_full_coords`(walker_tuple, data_name[, ...])	Returns a full 1D set of data for a single trace (path).
`get_parents`(walker_tuple)	Get parent of an input (iteration, walker).
`get_total_data_array`(name[, index, ...])	Loop through all iterations specified and get a 1d raw data array.
`instant_datasets_3d`()	Unique case where Zname is specified and the XYZ datasets are returned.
`instant_pdist_1d`()	Returns the x and y pdist datasets for a single iteration.
`instant_pdist_2d`()	Returns the xyz pdist datasets for a single iteration.
`make_new_h5`([new_weights])	TODO: actually make a new h5 file, see bstate filter code, integrate all.
`pdist`()	Main public method with pdist generation controls.
`pdist_1d`()	returns: X (ndarray)
`pdist_2d`()	returns: X (ndarray)
`pdist_3d`()	returns: X (ndarray)
`plot_trace`(walker_tuple[, color, linewidth, ...])	Plot trace.
`reshape_total_data_array`(array)	Take an input 1d array of the data values at every segment for each iteration, and reshape them to make pdists.
`succ_pdist_weight_filter`()	TODO: Filter weights to be zero for all non-successful trajectories.
`timeseries`()	returns: X (ndarray)
`trace_walker`(walker_tuple[, first_iter])	Get trace path of an input (iteration, walker).
`w_succ`()	Find and return all successfully recycled (iter, seg) pairs.
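
As a rough illustration of what a weighted 1D probability distribution and the p_units options amount to, the following standalone NumPy sketch may help. It does not call mdap and is not its implementation: the helper names `weighted_pdist_1d` and `convert_p_units`, the gas constant value, and the toy data are all assumptions made for this example, and whether mdap additionally shifts the minimum to zero is not stated in this reference.

```python
import numpy as np

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K); assumed value for this sketch

def weighted_pdist_1d(values, weights, bins=100, histrange=None):
    """Weighted, normalized 1D histogram; returns (bin midpoints, probabilities)."""
    hist, edges = np.histogram(values, bins=bins, range=histrange, weights=weights)
    midpoints = (edges[:-1] + edges[1:]) / 2
    return midpoints, hist / hist.sum()

def convert_p_units(prob, p_units="kT", T=298):
    """Convert probabilities following the formulas listed for p_units."""
    if p_units == "raw":
        return prob
    if p_units == "raw_norm":
        return prob / prob.max()
    with np.errstate(divide="ignore"):  # empty bins give -ln(0) = inf
        neg_ln_p = -np.log(prob)
    if p_units == "kT":
        return neg_ln_p
    if p_units == "kcal":
        return R_KCAL * T * neg_ln_p
    raise ValueError(f"unknown p_units: {p_units!r}")

# Toy data: four frames with statistical weights that sum to 1.
values = np.array([0.1, 0.2, 0.6, 0.9])
weights = np.array([0.1, 0.2, 0.3, 0.4])

x, p = weighted_pdist_1d(values, weights, bins=2, histrange=(0, 1))
print(x, p)
print(convert_p_units(p, "kT"))
print(convert_p_units(p, "kcal", T=298))
```

With these definitions, `R_KCAL * 298` is about 0.5922 kcal/mol, matching the RT value quoted for p_units.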