Further Evaluation and Clustering Functions

MCBB.distance_matrix — Function

 distance_matrix(sol::myMCSol, prob::myMCProblem, distance_func::Function, weights::AbstractArray; matrix_distance_func::Union{Function, Nothing}=nothing, histogram_distance_func::Union{Function, Nothing}=wasserstein_histogram_distance, relative_parameter::Bool=false, histograms::Bool=false, use_ecdf::Bool=true, k_bin::Number=1, bin_edges::AbstractArray)

Calculate the distance matrix between all individual solutions.

Histogram Method

If called with the histograms flag set to true, the function computes, for each run in the solution sol and for each measure, a histogram of that measure over all system dimensions. The histogram binning is computed with the Freedman-Diaconis rule and is the same across all runs for each measure.
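The binning rule can be illustrated with a minimal sketch. It is not MCBB's implementation: the helper names fd_bin_width and shared_bin_edges, the pooling of all runs before binning, and the default cap of 50 bins are assumptions made for illustration. Only the Freedman-Diaconis formula itself (bin width = 2 * IQR / n^(1/3)) is standard.

```julia
using Statistics

# Freedman-Diaconis bin width, 2 * IQR / n^(1/3), scaled by a multiplier k_bin.
function fd_bin_width(data::AbstractVector{<:Real}; k_bin::Real=1)
    iqr = quantile(data, 0.75) - quantile(data, 0.25)
    return k_bin * 2 * iqr / length(data)^(1/3)
end

# Shared bin edges for one measure (hypothetical helper): pool the values of
# all runs so that every run is binned on the same grid. If the IQR is zero or
# the rule asks for too many bins, fall back to nbin_default bins (assumed cap).
function shared_bin_edges(runs; k_bin::Real=1, nbin_default::Int=50)
    pooled = reduce(vcat, runs)
    width = fd_bin_width(pooled; k_bin=k_bin)
    lo, hi = extrema(pooled)
    nbin = width > 0 ? ceil(Int, (hi - lo) / width) : nbin_default
    nbin = clamp(nbin, 1, nbin_default)
    return range(lo, stop=hi, length=nbin + 1)
end

runs = [collect(0.0:0.1:1.0), collect(0.5:0.1:1.5)]
edges = shared_bin_edges(runs)
```

Increasing k_bin widens the bins and so reduces their number, which matches the description of the k_bin keyword.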

The distance matrix is then computed between these histograms using a suitable histogram distance function, histogram_distance_func.

This is intended to prevent symmetric configurations in larger systems from being distinguished from each other. Example: consider a system of 10 identical oscillators. With this distance calculation, a state where oscillators 1-5 are synchronized and 6-10 are not synchronized ends up in the same cluster as a state where oscillators 6-10 are synchronized and 1-5 are not. If you don't want this kind of behaviour, use the regular distance_matrix function (histograms=false, the default).
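The oscillator example can be made concrete with a small sketch. The per-oscillator values below are hypothetical, and wasserstein_1d is a sample-based stand-in defined here, not MCBB's wasserstein_histogram_distance. It uses the fact that, for two equally sized one-dimensional samples, the 1-Wasserstein distance is the mean absolute difference of the sorted values.

```julia
# Stand-in distribution distance for two equally sized 1D samples.
wasserstein_1d(x, y) = sum(abs.(sort(x) .- sort(y))) / length(x)

# Hypothetical measure per oscillator: 1.0 = synchronized, 0.0 = not synchronized.
state_a = [1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # oscillators 1-5 synchronized
state_b = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # oscillators 6-10 synchronized

# Element-wise distance (the default distance_func) sees two very different states.
elementwise = sum(abs.(state_a .- state_b))        # 10.0

# A distance between the distributions of values ignores which oscillator is which.
distribution = wasserstein_1d(state_a, state_b)    # 0.0
```

Because the two states have the same distribution of values, a distribution-based distance places them in the same cluster, while the element-wise distance separates them.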

Sparse and memory mapped options

There are separate routines for computing very large matrices, using either memory-mapped arrays (see distance_matrix_mmap) or sparse arrays (see distance_matrix_sparse).

Arguments

sol: solution
prob: problem
distance_func: The function that calculates the distance between the measures/parameters of each pair of solutions. Signature should be (measure_1::Union{Array,Number}, measure_2::Union{Array,Number}) -> distance::Number. Example and default is (x,y)->sum(abs.(x .- y)).
weights: Instead of the actual measure, weights[i_measure]*measure is handed over to distance_func. Thus weights needs to be an array of length $N_{meas}+N_{par}$.
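How distance_func and weights interact can be sketched as follows. This is an illustration under assumptions, not the library's code: the data layout (each run as a tuple of two per-dimension measures and one parameter), the helper pairwise_distances, and summing the weighted per-measure distances into one matrix entry are all hypothetical.

```julia
# The example distance from the argument description above.
distance_func = (x, y) -> sum(abs.(x .- y))

# Hypothetical data: 3 runs, each with 2 measures (one value per system
# dimension) and 1 parameter, so weights has N_meas + N_par = 3 entries.
runs = [
    ([0.0, 0.0], [1.0, 1.0], 0.0),
    ([1.0, 1.0], [1.0, 1.0], 0.0),
    ([0.0, 0.0], [3.0, 1.0], 1.0),
]
weights = [1.0, 0.5, 2.0]

# Symmetric matrix of distances; each measure is multiplied by its weight
# before it is handed to distance_func (assumed: contributions are summed).
function pairwise_distances(runs, distance_func, weights)
    n = length(runs)
    D = zeros(n, n)
    for i in 1:n, j in i+1:n
        d = 0.0
        for k in eachindex(weights)
            d += distance_func(weights[k] .* runs[i][k], weights[k] .* runs[j][k])
        end
        D[i, j] = D[j, i] = d
    end
    return D
end

D = pairwise_distances(runs, distance_func, weights)
```

Setting a weight to zero removes the corresponding measure or parameter from the distance entirely, which is a simple way to test how much each one drives the clustering.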

Kwargs

relative_parameter: If true, the parameter values are rescaled to [0,1] during the distance calculation.
histograms::Bool: If true, the distance calculation is based on distance_matrix_histogram with the default histogram distance wasserstein_histogram_distance.
histogram_distance_func: The distance function between two histograms. Default is wasserstein_histogram_distance.
matrix_distance_func: The distance function between two matrices or arrays of length different from $N_{dim}$. Used e.g. for cross-correlation.
use_ecdf::Bool: If true, histogram_distance_func gets the empirical CDFs instead of the histograms.
k_bin::Number: Multiplier to increase ($k_{bin}>1$) or decrease the bin width and thus decrease or increase the number of bins. It is a multiplier to the Freedman-Diaconis rule. Default: $k_{bin}=1$
nbin_default::Int: If the IQR is very small and thus the number of bins larger than nbin_default, the number of bins is set back to nbin_default and the edges and width adjusted accordingly.
nbin::Int: If specified, ignore all other histogram binning calculations and use nbin bins for the histograms.
bin_edges::AbstractArray: If specified, ignore all other histogram binning calculations and use this as the edges of the histogram (has to have one more element than bins, hence all edges). Needs to be an Array with as many elements as measures. If one wants automatic binning for one observable, this element of the array has to be nothing. E.g.: [1:1:10, nothing, 2:0.5:5].
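The histogram-related keywords can be illustrated with a sketch of an ECDF-based distance on fixed bin edges. The helpers histogram_counts, histogram_ecdf, and ecdf_distance are hypothetical and written for this example. The source does not show how wasserstein_histogram_distance is implemented, so this only demonstrates the general idea that, for uniform bins, summing the absolute ECDF differences and multiplying by the bin width gives a 1-Wasserstein distance between the binned data.

```julia
# Count values per bin; the last bin includes the right edge.
function histogram_counts(values, edges)
    counts = zeros(Int, length(edges) - 1)
    for v in values
        i = searchsortedlast(edges, v)
        if i == length(edges) && v == last(edges)
            i -= 1
        end
        if 1 <= i <= length(counts)
            counts[i] += 1
        end
    end
    return counts
end

# Empirical CDF evaluated at the right edge of each bin.
histogram_ecdf(counts) = cumsum(counts) ./ sum(counts)

# Distance between two ECDFs on the same uniform grid.
ecdf_distance(c1, c2, edges) = sum(abs.(c1 .- c2)) * (edges[2] - edges[1])

# bin_edges holds one entry per measure; nothing requests automatic binning.
bin_edges = [0:1:4, nothing]
edges = bin_edges[1]

a = [0.5, 0.5, 1.5, 1.5]
b = [2.5, 2.5, 3.5, 3.5]
d = ecdf_distance(histogram_ecdf(histogram_counts(a, edges)),
                  histogram_ecdf(histogram_counts(b, edges)), edges)
```

Here b is a shifted by two bin widths, and the resulting distance is 2.0. Working on ECDFs rather than raw bin counts makes the distance grow with how far mass has to move, not just with whether the bins overlap.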

Returns an instance of DistanceMatrix or DistanceMatrixHist.