Mixture Models
A mixture model is a probability distribution that combines a set of component distributions to represent an overall distribution. In general, its probability density/mass function is a convex combination of the pdf/pmf of the individual components:
$$f(x; \Theta, \pi) = \sum_{k=1}^{K} \pi_k \, f(x; \theta_k)$$
A mixture model is therefore characterized by a set of component parameters $\Theta = \{\theta_1, \ldots, \theta_K\}$ and a prior distribution $\pi$ over these components.
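As a quick numerical check of the convex-combination definition, the sketch below builds a three-component normal mixture and compares `pdf(d, x)` with the prior-weighted sum of the component densities. The parameter values and the evaluation point `x` are illustrative choices, and the snippet assumes Distributions.jl is installed.

```julia
using Distributions

# Illustrative three-component mixture of normals.
d = MixtureModel(Normal[Normal(-2.0, 1.2), Normal(0.0, 1.0), Normal(3.0, 2.5)],
                 [0.2, 0.5, 0.3])

x = 0.5  # arbitrary evaluation point

# Weighted sum of component densities, written out by hand.
manual = sum(p * pdf(c, x) for (p, c) in zip(probs(d), components(d)))

println("manual = ", manual, ", pdf(d, x) = ", pdf(d, x))
```

The two printed values should agree up to floating-point rounding.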
Type Hierarchy
This package introduces a type MixtureModel, defined as follows, to represent a mixture model:
abstract type AbstractMixtureModel{VF<:VariateForm,VS<:ValueSupport} <: Distribution{VF, VS} end
struct MixtureModel{VF<:VariateForm,VS<:ValueSupport,Component<:Distribution} <: AbstractMixtureModel{VF,VS}
    components::Vector{Component}
    prior::Categorical
end
const UnivariateMixture = AbstractMixtureModel{Univariate}
const MultivariateMixture = AbstractMixtureModel{Multivariate}
Remarks:
- We introduce AbstractMixtureModel as a base type, which allows one to define mixture models with different internal implementations while still leveraging the common methods defined for AbstractMixtureModel.
  Distributions.AbstractMixtureModel - Type
  All subtypes of AbstractMixtureModel should implement the following methods:
  - ncomponents(d): the number of components
  - component(d, k): return the k-th component
  - probs(d): return a vector of prior probabilities over components
- The MixtureModel is a parametric type, with three type parameters:
  - VF: the variate form, which can be Univariate, Multivariate, or Matrixvariate.
  - VS: the value support, which can be Continuous or Discrete.
  - Component: the type of component distributions, e.g. Normal.
- We define two aliases: UnivariateMixture and MultivariateMixture.
With such a type system, the type for a mixture of univariate normal distributions can be written as
MixtureModel{Univariate,Continuous,Normal}
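The type relationships described above can be inspected directly. The following is a minimal sketch with illustrative parameter values. It prints `typeof(d)` rather than asserting an exact parametric type, since the concrete type parameters may differ between package versions and depend on the element type of the component vector.

```julia
using Distributions

d = MixtureModel(Normal[Normal(-2.0, 1.2), Normal(0.0, 1.0), Normal(3.0, 2.5)],
                 [0.2, 0.5, 0.3])

# Show the concrete type chosen by the constructor.
println(typeof(d))

# Membership in the abstract types and aliases described above.
is_mixture    = d isa AbstractMixtureModel
is_univariate = d isa UnivariateMixture
is_continuous = d isa Distribution{Univariate,Continuous}

# The methods every AbstractMixtureModel subtype is expected to implement.
K      = ncomponents(d)    # number of components
second = component(d, 2)   # the 2nd component, here Normal(0.0, 1.0)
prior  = probs(d)          # prior probabilities over components
```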
Constructors
Distributions.MixtureModel - Type
MixtureModel{VF<:VariateForm,VS<:ValueSupport,C<:Distribution,CT<:Real}
A mixture of distributions, parametrized on:
- VF, VS: variate and support
- C: distribution family of the mixture
- CT: the type for probabilities of the prior
Examples
using Distributions
# Construct a mixture of three normal distributions
# with prior probabilities [0.2, 0.5, 0.3].
MixtureModel(Normal[
    Normal(-2.0, 1.2),
    Normal(0.0, 1.0),
    Normal(3.0, 2.5)], [0.2, 0.5, 0.3])
# If all components are equally likely (uniform prior), the prior vector can be omitted.
MixtureModel(Normal[
    Normal(-2.0, 1.2),
    Normal(0.0, 1.0),
    Normal(3.0, 2.5)])
# Since all components have the same type, we can use a simplified syntax:
# pass the component type and a vector of parameter tuples.
MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)], [0.2, 0.5, 0.3])
# Again, the prior vector can be omitted when the prior is uniform.
MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)])
# The following example shows how to make a Gaussian mixture
# in which all components share the same unit variance.
MixtureModel(map(u -> Normal(u, 1.0), [-2.0, 0.0, 3.0]))
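To illustrate that the constructor forms above are interchangeable, the sketch below builds the same mixture with the explicit and the simplified syntax and compares their densities on a grid. It also checks that omitting the prior vector yields equal weights. The names `params`, `prior`, and the grid `xs` are illustrative.

```julia
using Distributions

params = [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)]
prior  = [0.2, 0.5, 0.3]

# Explicit component vector vs. simplified (type, parameter tuples) syntax.
explicit = MixtureModel(Normal[Normal(μ, σ) for (μ, σ) in params], prior)
compact  = MixtureModel(Normal, params, prior)

# Prior omitted: all components receive the same weight.
uniform = MixtureModel(Normal, params)

xs = -5.0:0.5:5.0
same_density = all(pdf(explicit, x) ≈ pdf(compact, x) for x in xs)

println("same density: ", same_density)
println("uniform prior: ", probs(uniform))
```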
Common Interface
All subtypes of AbstractMixtureModel (including MixtureModel) provide the following methods:
Distributions.components - Method
components(d::AbstractMixtureModel)
Get a list of components of the mixture model d.
Distributions.probs - Method
probs(d::AbstractMixtureModel)
Get the vector of prior probabilities of all components of d.
Distributions.component_type - Method
component_type(d::AbstractMixtureModel)
The type of the components of d.
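A short usage sketch for these accessors, with illustrative parameters: it walks over the components together with their prior weights and queries the component type.

```julia
using Distributions

d = MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)], [0.2, 0.5, 0.3])

# Pair each prior weight with its component.
for (p, c) in zip(probs(d), components(d))
    println("weight = ", p, "  component = ", c)
end

# Type of the component distributions (a Normal type for this mixture).
C = component_type(d)
println(C)
```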
In addition, for all subtypes of UnivariateMixture and MultivariateMixture, the following generic methods are provided:
Statistics.mean - Method
mean(d::Union{UnivariateMixture, MultivariateMixture})
Compute the overall mean (expectation).
Statistics.var - Method
var(d::UnivariateMixture)
Compute the overall variance (only for UnivariateMixture).
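For a univariate mixture with component means μ_k, standard deviations σ_k, and weights p_k, the standard identities are E[X] = Σ p_k μ_k and Var[X] = Σ p_k (σ_k² + μ_k²) − E[X]². The sketch below compares these hand-computed values with `mean(d)` and `var(d)`. The parameter values are illustrative.

```julia
using Distributions

μs = [-2.0, 0.0, 3.0]   # component means
σs = [1.2, 1.0, 2.5]    # component standard deviations
p  = [0.2, 0.5, 0.3]    # prior weights

d = MixtureModel([Normal(μ, σ) for (μ, σ) in zip(μs, σs)], p)

# Mixture mean: weighted average of the component means.
m_manual = sum(p .* μs)

# Mixture variance: weighted second moments minus the squared mean.
v_manual = sum(p .* (σs .^ 2 .+ μs .^ 2)) - m_manual^2

println("mean: ", mean(d), " vs ", m_manual)
println("var:  ", var(d), " vs ", v_manual)
```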
Base.length - Method
length(d::MultivariateMixture)
The length of each sample (only for MultivariateMixture).
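A minimal multivariate sketch, assuming two bivariate normal components with an identity covariance matrix (illustrative choices): `length(d)` reports the dimension of each sample, and a single draw has that many entries.

```julia
using Distributions

Σ = [1.0 0.0; 0.0 1.0]   # identity covariance, written out explicitly

d = MixtureModel([MvNormal([0.0, 0.0], Σ), MvNormal([3.0, 3.0], Σ)], [0.4, 0.6])

dim = length(d)   # dimension of each sample
m   = mean(d)     # weighted average of the component mean vectors
x   = rand(d)     # one draw, a vector with `dim` entries

println("dim = ", dim, ", mean = ", m)
```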
Distributions.pdf - Method
pdf(d::Union{UnivariateMixture, MultivariateMixture}, x)
Evaluate the (mixed) probability density function over x. Here, x can be a single sample or an array of multiple samples.
Distributions.logpdf - Method
logpdf(d::Union{UnivariateMixture, MultivariateMixture}, x)
Evaluate the logarithm of the (mixed) probability density function over x. Here, x can be a single sample or an array of multiple samples.
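The sketch below evaluates `pdf` and `logpdf` at a single point and at several points of a univariate mixture. For multiple points it uses broadcasting (`pdf.(d, xs)`), on the assumption that this form works whether or not the installed version accepts an array argument directly. The parameters and evaluation points are illustrative.

```julia
using Distributions

d = MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)], [0.2, 0.5, 0.3])

# Single sample.
p1  = pdf(d, 0.5)
lp1 = logpdf(d, 0.5)

# Multiple samples via broadcasting.
xs  = [-2.0, 0.0, 0.5, 3.0]
ps  = pdf.(d, xs)
lps = logpdf.(d, xs)

println("pdf:    ", ps)
println("logpdf: ", lps)
```

Working with `logpdf` rather than `log(pdf(...))` is usually preferable far in the tails, where the density itself can underflow.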
Base.rand - Method
rand(d::Union{UnivariateMixture, MultivariateMixture})
Draw a sample from the mixture model d.
rand(d::Union{UnivariateMixture, MultivariateMixture}, n)
Draw n samples from d.
Random.rand! - Method
rand!(d::Union{UnivariateMixture, MultivariateMixture}, r::AbstractArray)
Draw multiple samples from d and write them to r.
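A sampling sketch covering the three forms above. The seed, sample sizes, and parameters are illustrative, and `Random` is loaded explicitly so that `rand!` and `Random.seed!` are in scope.

```julia
using Distributions, Random

Random.seed!(42)   # illustrative seed, only for repeatable runs

d = MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)], [0.2, 0.5, 0.3])

x  = rand(d)           # a single draw
xs = rand(d, 10_000)   # a vector of draws

r = zeros(100)
rand!(d, r)            # overwrite a preallocated array in place

# With many draws, the sample mean should be close to mean(d).
sample_mean = sum(xs) / length(xs)
println("sample mean = ", sample_mean, ", mean(d) = ", mean(d))
```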
Estimation
There are several methods for the estimation of mixture models from data, and this problem remains an open research topic. This package does not provide facilities for estimating mixture models. One can resort to other packages, e.g. GaussianMixtures.jl, for this purpose.
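To give a sense of what such estimation involves, here is a minimal expectation-maximization (EM) sketch for a univariate Gaussian mixture. The function `fit_gmm_em` is a hypothetical helper written for this illustration; it is not part of Distributions.jl or GaussianMixtures.jl. It uses a crude quantile-based initialization and a fixed iteration count, with no convergence test and no safeguards against degenerate components, so a dedicated package remains the better choice for real data.

```julia
using Distributions, Random, Statistics

# Hypothetical helper (not a library function): EM for a 1-D Gaussian mixture.
function fit_gmm_em(x::AbstractVector{<:Real}, K::Integer; iters::Integer = 100)
    n = length(x)
    # Crude initialization: means at evenly spaced sample quantiles,
    # pooled standard deviation, equal weights.
    μ = [quantile(x, (k - 0.5) / K) for k in 1:K]
    σ = fill(std(x), K)
    p = fill(1 / K, K)
    γ = zeros(n, K)   # responsibilities: γ[i, k] ∝ p[k] * N(x[i]; μ[k], σ[k])
    for _ in 1:iters
        # E-step: unnormalized responsibilities, then normalize each row.
        for k in 1:K
            comp = Normal(μ[k], σ[k])
            for i in 1:n
                γ[i, k] = p[k] * pdf(comp, x[i])
            end
        end
        γ ./= sum(γ, dims = 2)
        # M-step: weighted updates of means, standard deviations, and weights.
        for k in 1:K
            nk = sum(view(γ, :, k))
            μ[k] = sum(γ[i, k] * x[i] for i in 1:n) / nk
            σ[k] = sqrt(sum(γ[i, k] * (x[i] - μ[k])^2 for i in 1:n) / nk)
            p[k] = nk / n
        end
    end
    return MixtureModel([Normal(μ[k], σ[k]) for k in 1:K], p)
end

# Usage: recover a known two-component mixture from simulated data.
Random.seed!(7)
truth  = MixtureModel([Normal(-3.0, 1.0), Normal(3.0, 1.0)], [0.4, 0.6])
data   = rand(truth, 2_000)
fitted = fit_gmm_em(data, 2; iters = 50)

println(components(fitted))
println(probs(fitted))
```

The result is an ordinary MixtureModel, so the methods of the common interface (`pdf`, `mean`, `rand`, and so on) apply to it directly.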