mdap.MD_Pdist

class mdap.MD_Pdist(data_type=None, Xname=None, Xindex=1, Yname=None, Yindex=1, Zname=None, Zindex=1, Xinterval=1, Yinterval=1, Zinterval=1, data_proc=None, first_iter=1, last_iter=None, bins=(100, 100), p_units='kT', T=298, histrange_x=None, histrange_y=None, no_pbar=False, timescale=1000000, *args, **kwargs)

Methods of this class generate probability distributions (pdists) from input MD data files.

__init__(data_type=None, Xname=None, Xindex=1, Yname=None, Yindex=1, Zname=None, Zindex=1, Xinterval=1, Yinterval=1, Zinterval=1, data_proc=None, first_iter=1, last_iter=None, bins=(100, 100), p_units='kT', T=298, histrange_x=None, histrange_y=None, no_pbar=False, timescale=1000000, *args, **kwargs)

TODO: add XYZ interval to proc (default 1)

Parameters:
  • data_type (str) – ‘time’ for 1 dataset timeseries, or ‘pdist’ for everything else.

  • Xname (str or list of str) – target data for x axis, default None.

  • Xindex (int) – If X.ndim > 2, use this to index.

  • Yname (str or list of str) – target data for y axis, default None.

  • Yindex (int) – If Y.ndim > 2, use this to index.

  • Zname (str or list of str) – target data for z axis, default None. Use this to place a dataset on the Z axis instead of the pdist. The result is best plotted as a scatter plot with Z as the marker color. Because the weights/pdist are not considered in this case, only the XYZ datasets are returned instead of the pdist.

  • Zindex (int) – If Z.ndim > 2, use this to index.

  • Xinterval, Yinterval, Zinterval (int) – Interval for processing dataset. E.g. 10 = every 10 frames.

  • data_proc (function or tuple of functions) – Of the form f(data) where data has rows=segments, columns=frames until tau, depth=data dims. The input function must return a processed array of the same shape and formatting.

  • first_iter (int) – First iteration of data to include, default 1.

  • last_iter (int) – Last iteration data to include, default is the last recorded iteration in the west.h5 file. Note that instant type pdists only depend on last_iter.

  • bins (tuple of ints) – Number of histogram bins for the x and y datasets when generating the pdist, default 100 for both. (TODO: maybe the tuple isn’t user friendly for 1 dim?)

  • p_units (str) – Can be ‘kT’ (default), ‘kcal’, ‘raw’, or ‘raw_norm’. kT = -lnP, kcal/mol = -RT(lnP), where RT = 0.5922 kcal/mol at the default T of 298 K. ‘raw’ is the raw probabilities and ‘raw_norm’ is the raw probabilities normalized by P(max).

  • T (int) – Temperature in Kelvin, used when p_units is ‘kcal’.

  • histrange_x, histrange_y (list or tuple of 2 floats or ints) – Optional custom histogram ranges for the x and y axes.

  • no_pbar (bool) – Optionally do not include the progress bar for pdist generation.

  • timescale (int) – Factor used to convert frames to time. The default, 10**6, converts ps to µs.

  • TODO (maybe also binsfromexpression?)
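The p_units options above can be illustrated with a minimal sketch. The helper name `convert_p_units` and the constant `R_KCAL` are hypothetical and are not part of mdap; the sketch only applies the formulas stated for p_units (kT = -lnP, kcal/mol = -RT lnP) and assumes that ‘raw_norm’ means division by the largest probability.

```python
import numpy as np

# Gas constant in kcal/(mol*K); R_KCAL * 298 is approximately 0.5922 kcal/mol.
R_KCAL = 0.0019872


def convert_p_units(hist, p_units="kT", T=298):
    """Hypothetical helper: convert raw probabilities to the requested units."""
    hist = np.asarray(hist, dtype=float)
    if p_units == "raw":
        return hist
    if p_units == "raw_norm":
        # Assumption: normalize so that the most probable bin equals 1.
        return hist / np.max(hist)
    # Empty bins (P = 0) map to +inf; silence the divide-by-zero warning.
    with np.errstate(divide="ignore"):
        neg_ln_p = -np.log(hist)
    if p_units == "kT":
        return neg_ln_p
    if p_units == "kcal":
        return R_KCAL * T * neg_ln_p
    raise ValueError(f"unknown p_units: {p_units!r}")
```

For example, `convert_p_units([0.5, 0.25], "kcal")` scales -lnP by RT at 298 K, while `"raw_norm"` returns probabilities relative to the most populated bin.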

Methods

__init__([data_type, Xname, Xindex, Yname, ...])

TODO: add XYZ interval to proc (default 1)

aux_to_pdist_1d(iteration)

Take the auxiliary dataset for a single iteration and generate a weighted 1D probability distribution.
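As a rough illustration of what a weighted 1D probability distribution means here, the sketch below bins data values with per-value weights using `numpy.histogram`. The function name `weighted_pdist_1d` and the normalization step are assumptions made for illustration, not the mdap implementation.

```python
import numpy as np


def weighted_pdist_1d(values, weights, bins=100, histrange=None):
    """Hypothetical sketch: weighted 1D histogram of one iteration's data."""
    hist, edges = np.histogram(values, bins=bins, range=histrange, weights=weights)
    # Assumption: normalize so the histogram sums to 1.
    total = hist.sum()
    if total > 0:
        hist = hist / total
    # Bin midpoints are convenient x values for plotting.
    midpoints = (edges[:-1] + edges[1:]) / 2
    return midpoints, hist
```

Each value contributes its weight, rather than a count of 1, to the bin it falls in.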

aux_to_pdist_2d(iteration)

Take the auxiliary dataset for a single iteration and generate a weighted 2D probability distribution.
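The 2D case can be sketched the same way with `numpy.histogram2d`. The function name `weighted_pdist_2d` is hypothetical, and whether mdap transposes or normalizes the result is not stated here, so both choices below are assumptions.

```python
import numpy as np


def weighted_pdist_2d(x, y, weights, bins=(100, 100), histrange=None):
    """Hypothetical sketch: weighted 2D histogram of two datasets."""
    hist, x_edges, y_edges = np.histogram2d(
        x, y, bins=bins, range=histrange, weights=weights
    )
    # Assumption: normalize so the histogram sums to 1.
    total = hist.sum()
    if total > 0:
        hist = hist / total
    x_mid = (x_edges[:-1] + x_edges[1:]) / 2
    y_mid = (y_edges[:-1] + y_edges[1:]) / 2
    # hist[i, j] is indexed as (x bin, y bin); plotting routines that expect
    # (row = y, column = x) need hist.T.
    return x_mid, y_mid, hist
```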

average_datasets_3d([interval])

Unique case where Zname is specified and the XYZ datasets are returned.

average_datasets_4d([interval])

Unique case where Zname is specified and the XYZ datasets are returned.

average_pdist_1d()

1 dataset: average pdist for a range of iterations.

average_pdist_2d()

2 datasets: average pdist for a range of iterations.

evolution_pdist()

Returns the pdist for 1 coordinate for the range of iterations specified.

find_iter_seg_from_xy_vals(val_x, val_y)

Find and return (iter, seg) closest to input data value(s).
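One way such a lookup could work is a nearest-point search over flattened data. The sketch below is an assumption about the general technique, not the mdap code: the function `closest_iter_seg` and its flat `iters`/`segs` label arrays are hypothetical, and it uses an unscaled Euclidean distance.

```python
import numpy as np


def closest_iter_seg(x, y, iters, segs, val_x, val_y):
    """Hypothetical sketch: return the (iter, seg) whose (x, y) is nearest the target."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Squared Euclidean distance from every data point to the target values.
    dist_sq = (x - val_x) ** 2 + (y - val_y) ** 2
    i = int(np.argmin(dist_sq))
    return int(iters[i]), int(segs[i])
```

If the two axes have very different scales, rescaling each axis before computing the distance may give more intuitive matches.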

get_all_weights()

Returns a 1D array of the weight for every frame of each tau for all segments of all iterations specified.

get_coords(path, data_name, data_index)

Get a list of data coordinates for plotting traces.

get_full_coords(walker_tuple, data_name[, ...])

Returns a full 1D set of data for a single trace (path).

get_parents(walker_tuple)

Get parent of an input (iteration, walker).

get_total_data_array(name[, index, ...])

Loop through all iterations specified and get a 1d raw data array.

instant_datasets_3d()

Unique case where Zname is specified and the XYZ datasets are returned.

instant_pdist_1d()

Returns the x and y pdist datasets for a single iteration.

instant_pdist_2d()

Returns the xyz pdist datasets for a single iteration.

make_new_h5([new_weights])

TODO: actually make a new h5 file, see bstate filter code, integrate all.

pdist()

Main public method with pdist generation controls.

pdist_1d()

returns:
  • X (ndarray)

pdist_2d()

returns:
  • X (ndarray)

pdist_3d()

returns:
  • X (ndarray)

plot_trace(walker_tuple[, color, linewidth, ...])

Plot trace.

reshape_total_data_array(array)

Take an input 1d array of the data values at every segment for each iteration, and reshape them to make pdists.

succ_pdist_weight_filter()

TODO: Filter weights to be zero for all non-successful trajectories.

timeseries()

returns:
  • X (ndarray)

trace_walker(walker_tuple[, first_iter])

Get trace path of an input (iteration, walker).
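Tracing a walker amounts to following parent links backwards through the iterations. The sketch below shows the idea on a plain dictionary; the function `trace_path` and the `parents` mapping (keyed by (iteration, segment), giving the parent segment index in the previous iteration) are hypothetical stand-ins for data that would normally be read from the HDF5 file.

```python
def trace_path(parents, walker_tuple, first_iter=1):
    """Hypothetical sketch: list (iteration, segment) pairs from first_iter to the walker."""
    iteration, seg = walker_tuple
    path = []
    while True:
        path.append((iteration, seg))
        if iteration <= first_iter:
            break
        # Step back one iteration to this segment's parent.
        seg = parents[(iteration, seg)]
        iteration -= 1
    # The walk ran backwards in time, so reverse into chronological order.
    path.reverse()
    return path


# Toy lineage: segment 0 of iteration 3 descends from segment 1 of iteration 2,
# which descends from segment 0 of iteration 1.
parents = {(2, 0): 0, (2, 1): 0, (3, 0): 1, (3, 1): 1}
print(trace_path(parents, (3, 0)))
```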

w_succ()

Find and return all successfully recycled (iter, seg) pairs.