mdap.md_pdist module
Convert MD analyzed data to pdists.
- TODO:
Consider adding an argument for custom weights.
- class mdap.md_pdist.MD_Pdist(data_type=None, Xname=None, Xindex=1, Yname=None, Yindex=1, Zname=None, Zindex=1, Xinterval=1, Yinterval=1, Zinterval=1, data_proc=None, first_iter=1, last_iter=None, bins=(100, 100), p_units='kT', T=298, histrange_x=None, histrange_y=None, no_pbar=False, timescale=1000000, *args, **kwargs)
Bases:
H5_Pdist
These class methods generate probability distributions from input MD data files.
- aux_to_pdist_1d(iteration)
Take the auxiliary dataset for a single iteration and generate a weighted 1D probability distribution.
- Parameters:
iteration (int) – Desired iteration to extract timeseries info from.
- Returns:
midpoints_x (ndarray) – Histogram midpoint bin values for target aux coordinate of dimension 0.
midpoints_y (ndarray) – Optional histogram midpoint bin values for target aux coordinate of dimension 1.
histogram (ndarray) – Raw histogram count values of each histogram bin. Can be later normalized as -lnP(x).
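The weighted 1D histogram and its later normalization to -lnP(x) can be sketched with plain NumPy. This is a minimal illustration, not the library's implementation: the function names `weighted_hist_1d` and `to_neg_ln_p`, their arguments, and the choice to shift the minimum of -lnP to zero are all assumptions made for the example.

```python
import numpy as np

def weighted_hist_1d(values, weights, bins=100, hist_range=None):
    """Hypothetical helper: weighted histogram plus bin midpoints."""
    hist, edges = np.histogram(values, bins=bins, range=hist_range, weights=weights)
    # Midpoints are the averages of adjacent bin edges.
    midpoints = (edges[:-1] + edges[1:]) / 2
    return midpoints, hist

def to_neg_ln_p(hist):
    """Hypothetical helper: convert raw weighted counts to -ln P.

    Dividing by the maximum shifts the most populated bin to 0 (an assumption);
    empty bins become +inf.
    """
    with np.errstate(divide="ignore", invalid="ignore"):
        return -np.log(hist / np.max(hist))
```

A call such as `weighted_hist_1d(data, weights, bins=100)` returns the midpoints and raw weighted counts; passing the counts to `to_neg_ln_p` gives values in units of kT.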
- aux_to_pdist_2d(iteration)
Take the auxiliary dataset for a single iteration and generate a weighted 2D probability distribution.
- Parameters:
iteration (int) – Desired iteration to extract timeseries info from.
- Returns:
midpoints_x (ndarray) – Histogram midpoint bin values for target aux coordinate of dimension 0.
midpoints_y (ndarray) – Optional histogram midpoint bin values for target aux coordinate of dimension 1.
histogram (ndarray) – Raw histogram count values of each histogram bin. Can be later normalized as -lnP(x).
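A weighted 2D histogram of this kind can be sketched with `numpy.histogram2d`. The helper name `weighted_hist_2d` and its arguments are hypothetical, and the transpose (so that rows follow the Y axis, as most plotting calls expect) is an assumption rather than something the documentation states.

```python
import numpy as np

def weighted_hist_2d(x, y, weights, bins=(100, 100), hist_range=None):
    """Hypothetical helper: weighted 2D histogram plus bin midpoints."""
    hist, xedges, yedges = np.histogram2d(
        x, y, bins=bins, range=hist_range, weights=weights
    )
    midpoints_x = (xedges[:-1] + xedges[1:]) / 2
    midpoints_y = (yedges[:-1] + yedges[1:]) / 2
    # histogram2d indexes as hist[x_bin, y_bin]; transpose so rows follow Y.
    return midpoints_x, midpoints_y, hist.T
```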
- average_datasets_3d(interval=1)
Special case used when Zname is specified: returns the X, Y, and Z datasets, averaged over the iteration range.
- Returns:
X, Y, Z – Raw data for each named coordinate.
- Return type:
arrays
- average_datasets_4d(interval=1)
Special case used when Zname is specified: returns the X, Y, and Z datasets, averaged over the iteration range, together with a fourth dataset C given by Cname.
- Returns:
X, Y, Z, C – Raw data for each named coordinate.
- Return type:
arrays
- average_pdist_1d()
Average pdist of one dataset over a range of iterations.
- Returns:
x and y axis values, x is the coordinate values and y is probabilities.
- Return type:
x, y
- average_pdist_2d()
Average pdist of two datasets over a range of iterations.
- Returns:
x and y axis values, and if using Y or evolution (with only X), also returns norm_hist. norm_hist is a 2-D matrix of the normalized histogram values.
- Return type:
x, y, norm_hist
- evolution_pdist()
Returns the pdist for one coordinate over the range of iterations specified.
- Returns:
x, y, norm_hist – x and y axis values, and if using Y or evolution (with only X), also returns norm_hist. norm_hist is a 2-D matrix of the normalized histogram values.
- Return type:
arrays
- find_iter_seg_from_xy_vals(val_x, val_y)
Find and return (iter, seg) closest to input data value(s).
- Parameters:
val_x (int or float) – X dataset value to search for.
val_y (int or float) – Y dataset value to search for.
- Returns:
iter_num, seg_num – Iteration, segment number.
- Return type:
int, int
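One way to picture the closest-point search is a nearest-neighbor lookup over the stored X and Y values. The sketch below assumes a plain Euclidean distance and a flat list of (iteration, segment) labels aligned with the data. Both are illustrative assumptions, and `closest_iter_seg` is a hypothetical name, not part of the module.

```python
import numpy as np

def closest_iter_seg(x_vals, y_vals, iter_seg_pairs, val_x, val_y):
    """Hypothetical helper: return the (iter, seg) label nearest to (val_x, val_y)."""
    x_vals = np.asarray(x_vals, dtype=float)
    y_vals = np.asarray(y_vals, dtype=float)
    # Squared Euclidean distance from every stored point to the query point.
    dist2 = (x_vals - val_x) ** 2 + (y_vals - val_y) ** 2
    return iter_seg_pairs[int(np.argmin(dist2))]
```

If the two coordinates have very different scales, an unscaled distance like this one is dominated by the larger coordinate, so rescaling the data first may be worthwhile.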
- get_all_weights()
Returns a 1D array of the weight for every frame of each tau for all segments of all iterations specified.
- Returns:
weights_expanded
- Return type:
array
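Expanding one weight per segment into one weight per frame can be illustrated with `numpy.repeat`. This sketch assumes every segment has the same number of frames per tau; `expand_weights` and `n_frames` are hypothetical names introduced for the example.

```python
import numpy as np

def expand_weights(seg_weights, n_frames):
    """Hypothetical helper: repeat each segment weight once per frame."""
    # Each segment's weight is copied n_frames times, keeping segment order.
    return np.repeat(np.asarray(seg_weights, dtype=float), n_frames)
```

The expanded array then lines up element by element with a flattened per-frame data array, so it can be passed directly as histogram weights.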
- get_coords(path, data_name, data_index)
Get a list of data coordinates for plotting traces. Only grabs the last frames.
- Parameters:
path (list of tuples) – Tuples are (iteration, walker) traces.
data_name (str) – Name of dataset.
data_index (int) – Index of dataset.
- Returns:
coordinates – Array of coordinates from the list of (iteration, walker) tuples.
- Return type:
1d array
- get_full_coords(walker_tuple, data_name, data_index=0, first_iter=1)
Returns a full 1D set of data for a single trace (path). This will be ordered from the first iter to the last.
- Parameters:
walker_tuple (tuple) – (iteration, walker) start point to trace from.
data_name (str) – Name of dataset.
data_index (int) – Index of dataset.
first_iter (int) – Iter to trace back to. Default 1.
- Returns:
coordinates – Array of coordinates from the list of (iteration, walker) tuples.
- Return type:
1d array
- get_parents(walker_tuple)
Get parent of an input (iteration, walker).
- Parameters:
walker_tuple (tuple) – (iteration, walker)
- Returns:
parent
- Return type:
iteration, walker
- get_total_data_array(name, index=0, interval=1, reshape=True)
Loop through all iterations specified and get a 1d raw data array.
TODO: this could be organized better. The helper functions for extracting and moving data around could be separated into another class, so that this pdist class is used strictly for making pdists from a standard data array input handled by the H5_Processing class.
- Parameters:
name (str) – Name of data from h5 file such as pcoord or an aux dataset.
index (int) – Index of the data from h5 file.
interval (int) – Interval at which to take data, for when sparser data is needed for efficiency.
reshape (bool) – Option to reshape into 1d array instead of each seg for all tau values.
- Returns:
data – Raw (unweighted) data array for the name specified.
- Return type:
1d array
- instant_datasets_3d()
Special case used when Zname is specified: returns the X, Y, and Z datasets for a single iteration.
- Returns:
X, Y, Z – Raw data for each named coordinate.
- Return type:
arrays
- instant_pdist_1d()
Returns the x and y pdist datasets for a single iteration.
- Returns:
Xdata, y – x (dataset) and y (pdist) axis values
- Return type:
arrays
- instant_pdist_2d()
Returns the xyz pdist datasets for a single iteration.
- Returns:
x, y, norm_hist – x and y axis values, and if using Y or evolution (with only X), also returns norm_hist. norm_hist is a 2-D matrix of the normalized histogram values.
- Return type:
arrays
- make_new_h5(new_weights=None)
TODO: actually make a new h5 file; see the bstate filter code and integrate it all.
If self.H5save_out and X/Y/Zsave_name are not None, saves a new h5 file named self.H5save_out, with the current X/Y/Zname data written into its auxdata under the name X/Y/Zsave_name.
- Parameters:
new_weights (numpy object array) – Updated weight values, e.g. from skip_basis or succ_only.
- pdist()
Main public method with pdist generation controls.
- pdist_1d()
- Returns:
X (ndarray)
Y (ndarray)
- pdist_2d()
- Returns:
X (ndarray)
Y (ndarray)
Z (ndarray)
- pdist_3d()
- Returns:
X (ndarray)
Y (ndarray)
Z (ndarray)
- plot_trace(walker_tuple, color='white', linewidth=1.0, linestyle='-', ax=None, find_iter_seg=False, mark_points=False, mp_size=80, mp_color=None, mp_markers=('o', 'v'), **kwargs)
Plot trace.
- Parameters:
walker_tuple (tuple) – (iteration, walker) start point to trace from. Can also find the closest iteration/seg using input as (X_value,Y_value). find_iter_seg must be True to use this setting.
color (str)
linewidth (float)
linestyle (str)
ax (mpl axes object)
find_iter_seg (bool) – Default False, which treats walker_tuple as (iter, seg). Set to True to look up (iter, seg) using the walker_tuple input as (X_value, Y_value).
mark_points (bool) – Default False. Set to True to mark the start and end points of the trace path.
mp_size (int) – Size of the marked points, default 80.
mp_color (str) – Color of the marked points, if None, defaults to color arg.
mp_markers (tuple) – Two item tuple: start point marker style, end point marker style.
**kwargs – Passed to mpl plt.plot line plots. E.g. alpha parameter.
- Returns:
aux or aux_x, aux_y – The coordinate values at each point in the trace.
- Return type:
1D arrays
- reshape_total_data_array(array)
Take an input 1d array of the data values at every segment for each iteration, and reshape them to make pdists.
- Parameters:
array (1d array) – Data values at every segment for each iteration.
- Returns:
array – Reshaped array with rows = segments, columns = frames up to tau, depth = data dimensions.
- Return type:
ndarray
- succ_pdist_weight_filter()
TODO: Filter weights to be zero for all unsuccessful trajectories. Make an array of zero weights and fill in the weights for successful trajectories only. Add an option to output a new h5 file?
- Returns:
succ_weights – Updated weight array.
- Return type:
numpy object array
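Since this method is still marked TODO, the following is only a sketch of the idea it describes: start from zero weights and copy over the weights of the segments to keep. The name `filter_weights`, the list-of-arrays input, and the `keep_pairs` list of (iter, seg) tuples are assumptions for illustration, and the sketch returns a list rather than a numpy object array.

```python
import numpy as np

def filter_weights(weights_per_iter, keep_pairs, first_iter=1):
    """Hypothetical helper: zero every weight except those listed in keep_pairs.

    weights_per_iter: list of 1D arrays, one per iteration, starting at first_iter.
    keep_pairs: iterable of (iteration, segment) tuples whose weights are kept.
    """
    # Start with all-zero weights of the same shapes as the input.
    filtered = [np.zeros_like(w) for w in weights_per_iter]
    for iteration, seg in keep_pairs:
        idx = iteration - first_iter
        filtered[idx][seg] = weights_per_iter[idx][seg]
    return filtered
```

Note that the filtered weights no longer sum to one per iteration, so any pdist built from them would need renormalizing if absolute probabilities matter.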
- timeseries()
- Returns:
X (ndarray)
Y (ndarray)
- trace_walker(walker_tuple, first_iter=1)
Get trace path of an input (iteration, walker).
- Parameters:
walker_tuple (tuple) – (iteration, walker)
first_iter (int) – Iter to trace back to. Default 1.
- Returns:
trace – Tuples are (iteration, walker) traces.
- Return type:
list of tuples
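Tracing a walker back through its parents can be sketched as a simple loop. The `parents` dictionary below, mapping (iteration, walker) to the parent's walker index in the previous iteration, is a hypothetical stand-in for however the h5 file stores parentage, and returning the path oldest-first is an assumption.

```python
def trace_path(parents, walker_tuple, first_iter=1):
    """Hypothetical helper: follow parent links back to first_iter.

    parents: dict mapping (iteration, walker) -> parent walker index
             in iteration - 1 (an illustrative data layout).
    """
    iteration, walker = walker_tuple
    trace = [(iteration, walker)]
    while iteration > first_iter:
        # Step back one iteration to the parent of the current walker.
        walker = parents[(iteration, walker)]
        iteration -= 1
        trace.append((iteration, walker))
    # Reverse so the path runs from first_iter to the starting walker.
    return trace[::-1]
```

For example, with `parents = {(3, 0): 1, (2, 1): 0}`, calling `trace_path(parents, (3, 0))` walks from iteration 3 back to iteration 1.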
- w_succ()
Find and return all successfully recycled (iter, seg) pairs.
- Returns:
succ
- Return type:
list of tuples (iter,wlk)
mdap.md_plot module
Plot MD pdists.
- TODO: specify option for:
timeseries (option for KDE side plot and option for stdev vs all reps)
pdist (1D hist, 1D KDE, + others from H5_Plot)
others?