Probability Distributions


This module implements probability distributions for individual clusters, as well as distribution mixes that assign probability distributions to the clusters in a mixture model.

class repliclust.distributions.DistributionFromNumPy(name, **params)

Bases: SingleClusterDistribution

Wraps an arbitrary sampling method of the Generator class in numpy.random. The method must accept the argument ‘size’, which sets the number of samples to draw.
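The ‘size’ requirement can be illustrated with NumPy alone. The sketch below is not repliclust code. It only shows the kind of Generator method this class is described as using: one looked up by name and called with keyword parameters plus ‘size’. The helper name draw_from_generator is hypothetical.

```python
import numpy as np

def draw_from_generator(name, size, seed=None, **params):
    """Hypothetical helper: call a numpy.random.Generator method by name."""
    rng = np.random.default_rng(seed)
    method = getattr(rng, name)  # e.g. rng.gamma or rng.exponential
    # The method must accept `size`; all other parameters pass through.
    return method(size=size, **params)

# 100 samples in 2 dimensions from a gamma distribution.
samples = draw_from_generator("gamma", size=(100, 2), seed=0,
                              shape=2.0, scale=1.0)
print(samples.shape)  # (100, 2)
```

A Generator method that has no ‘size’ parameter, such as shuffle, would raise a TypeError in this sketch, which is why the requirement matters.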

class repliclust.distributions.DistributionFromPDF

Bases: SingleClusterDistribution

Sample from an arbitrary probability density function.
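This page does not say which sampling algorithm the class uses. As a generic illustration of drawing samples from a user-supplied density, here is a minimal rejection sampling sketch in one dimension. The function names, the bounded support, and the requirement to supply an upper bound pdf_max are all assumptions made for this example, not part of the repliclust API.

```python
import random

def rejection_sample(pdf, lower, upper, pdf_max, n_samples, seed=None):
    """Draw n_samples from a (possibly unnormalized) 1D density.

    pdf_max must be an upper bound on pdf over [lower, upper].
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < n_samples:
        x = rng.uniform(lower, upper)   # propose uniformly on the support
        u = rng.uniform(0.0, pdf_max)   # uniform height under the bound
        if u <= pdf(x):                 # keep points under the curve
            samples.append(x)
    return samples

# Example density: a triangle peaked at 0 and supported on [-1, 1].
def triangle_pdf(x):
    return max(0.0, 1.0 - abs(x))

draws = rejection_sample(triangle_pdf, -1.0, 1.0, pdf_max=1.0,
                         n_samples=1000, seed=0)
```

Rejection sampling only needs the density up to a constant factor, but it becomes inefficient when pdf_max is much larger than typical density values.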

class repliclust.distributions.Exponential

Bases: DistributionFromNumPy

Draw exponentially distributed data for a single cluster.

class repliclust.distributions.FixedProportionMix(distributions=[('normal', 1.0, {})])

Bases: DistributionMix

Assign probability distributions to clusters according to fixed proportions. For example, you can specify that 50% of clusters follow a multivariate normal distribution and 50% follow an exponential distribution.


Parameters:

distributions (list of tuple) – List of distributions to mix. Each distribution appears as a tuple (name, proportion, params), where name is a string giving the distribution name, proportion is a number giving the desired proportion of clusters with the named distribution, and params is a dict whose (key, value) pairs are the names and values of distributional parameters.
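To make the tuple format concrete, the sketch below builds the 50/50 mix from the example above as plain Python data. Only the name 'normal' appears in the documented default; the name string 'exponential' is an assumption. The helper normalize_mix is hypothetical, and whether repliclust itself rescales proportions is not stated here.

```python
# Each entry is (name, proportion, params).
distributions = [
    ("normal", 0.5, {}),
    ("exponential", 0.5, {}),  # name string is an assumption
]

def normalize_mix(distributions):
    """Hypothetical helper: check the tuples and rescale proportions to sum to 1."""
    for name, proportion, params in distributions:
        if not isinstance(name, str):
            raise TypeError("name must be a string")
        if proportion < 0:
            raise ValueError("proportion must be non-negative")
        if not isinstance(params, dict):
            raise TypeError("params must be a dict")
    total = sum(proportion for _, proportion, _ in distributions)
    if total <= 0:
        raise ValueError("proportions must sum to a positive number")
    return [(name, proportion / total, params)
            for name, proportion, params in distributions]

normalized = normalize_mix(distributions)
```

Under these assumptions, a list in this shape would be passed as the distributions argument of FixedProportionMix.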

assign_distributions(self, n_clusters):

Assign probability distributions to clusters.


List of probability distributions.




Desired proportion of clusters having the corresponding distribution. The i-th entry corresponds to the i-th element in _distribution.




Assign probability distributions to all the clusters of a probabilistic mixture model.


Parameters:

n_clusters (int) – The number of clusters for which to assign probability distributions.

Returns:

distributions – Probability distributions for the clusters of a probabilistic mixture model.

Return type:

list of SingleClusterDistribution
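One way to turn fixed proportions into a per-cluster assignment is to give each distribution the integer part of its quota, hand leftover clusters to the largest remainders, and shuffle the result. The sketch below does this. It is an illustration under those assumptions, not the library's implementation; the function name is hypothetical, and it returns distribution names rather than SingleClusterDistribution objects.

```python
import random

def allocate_by_proportion(names, proportions, n_clusters, seed=None):
    """Hypothetical sketch: assign one distribution name to each cluster."""
    total = sum(proportions)
    quotas = [p / total * n_clusters for p in proportions]
    counts = [int(q) for q in quotas]        # integer part of each quota
    leftover = n_clusters - sum(counts)
    # Give the remaining clusters to the largest fractional parts.
    order = sorted(range(len(quotas)),
                   key=lambda i: quotas[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    assignment = [name for name, c in zip(names, counts) for _ in range(c)]
    random.Random(seed).shuffle(assignment)  # avoid ordering by distribution
    return assignment

assignment = allocate_by_proportion(["normal", "exponential"], [0.5, 0.5],
                                    n_clusters=5, seed=0)
```

With an odd number of clusters and a 50/50 mix, the counts cannot match the proportions exactly, so one distribution receives one cluster more than the other.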

class repliclust.distributions.Normal

Bases: DistributionFromNumPy

Draw multivariate normal data for a single cluster.

class repliclust.distributions.StandardT(df=1)

Bases: SingleClusterDistribution

Draw t-distributed data for a single cluster.

repliclust.distributions.parse_distribution(distr_name: str, params: dict = {})

Return the SingleClusterDistribution object corresponding to the probability distribution with name distr_name.