User Guide

Generating synthetic data with repliclust is different from the experience you may have had with other cluster generators. Our software is based on data set archetypes, high-level geometric descriptions of whole classes of data sets.

To understand what we mean, try the demo.

But since you’re still here, let me explain. An archetype represents a probability distribution over mixture models with similar overall geometry. For example, data sets that look like “seven ” To generate an individual synthetic data set, repliclust first samples a probabilistic mixture model that matches the desired archetype. The actual data set(s) are then sampled from this mixture model. The figure below sketches our workflow.

_images/workflow.svg
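
The two-stage workflow (archetype, then mixture model, then data set) can be illustrated with a small conceptual sketch in plain Python. This is not repliclust's API: the class names `ToyArchetype` and `ToyMixtureModel`, their fields, and the choice of spherical Gaussian clusters are all assumptions made purely for illustration. The point is only that an archetype is a recipe for drawing mixture models, and each mixture model in turn draws data.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyMixtureModel:
    """Hypothetical mixture model: one center and one scale per cluster."""
    centers: list
    scales: list

    def sample_data(self, n_samples, rng):
        # Stage 2: draw labeled points from the concrete mixture model.
        X, y = [], []
        for _ in range(n_samples):
            k = rng.randrange(len(self.centers))  # pick a cluster uniformly
            point = [rng.gauss(c, self.scales[k]) for c in self.centers[k]]
            X.append(point)
            y.append(k)
        return X, y


@dataclass
class ToyArchetype:
    """Hypothetical archetype: a distribution over mixture models."""
    n_clusters: int
    dim: int
    spread: float  # cluster centers are drawn from [-spread, spread]

    def sample_mixture_model(self, rng):
        # Stage 1: draw one concrete mixture model matching the archetype.
        centers = [
            [rng.uniform(-self.spread, self.spread) for _ in range(self.dim)]
            for _ in range(self.n_clusters)
        ]
        scales = [rng.uniform(0.5, 1.5) for _ in range(self.n_clusters)]
        return ToyMixtureModel(centers, scales)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
archetype = ToyArchetype(n_clusters=7, dim=2, spread=10.0)
model = archetype.sample_mixture_model(rng)  # archetype -> mixture model
X, y = model.sample_data(200, rng)           # mixture model -> data set
```

Calling `sample_mixture_model` again on the same archetype yields a different mixture model with the same overall geometry (seven clusters in two dimensions), which is the sense in which one archetype describes a whole class of data sets.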

The following brief tutorials demonstrate how to use repliclust. The section Getting Started will have you generating your own data sets within minutes.