Bijectors
Functions
#
Bijectors._inv_link_chol_lkj
— _Method
_inv_link_chol_lkj(y)
Inverse link function for cholesky factor.
#
Bijectors._link_chol_lkj
— _Method
function _link_chol_lkj(w)
Link function for cholesky factor.
An alternative and maybe more efficient implementation was considered:
for i=2:K, j=(i+1):K z[i, j] = (w[i, j] / w[i-1, j]) * (z[i-1, j] / sqrt(1 - z[i-1, j]^2)) end
But this implementation will not work when w[i-1, j] = 0. Though it is a zero measure set, unit matrix initialization will not work.
For equivalence, following explanations is given by @torfjelde:
For (i, j)
in the loop below, we define
z₍ᵢ₋₁, ⱼ₎ = w₍ᵢ₋₁,ⱼ₎ * ∏ₖ₌₁ⁱ⁻² (1 / √(1 - z₍ₖ,ⱼ₎²))
and so
z₍ᵢ,ⱼ₎ = w₍ᵢ,ⱼ₎ * ∏ₖ₌₁ⁱ⁻¹ (1 / √(1 - z₍ₖ,ⱼ₎²)) = (w₍ᵢ,ⱼ₎ * / √(1 - z₍ᵢ₋₁,ⱼ₎²)) * (∏ₖ₌₁ⁱ⁻² 1 / √(1 - z₍ₖ,ⱼ₎²)) = (w₍ᵢ,ⱼ₎ * / √(1 - z₍ᵢ₋₁,ⱼ₎²)) * (w₍ᵢ₋₁,ⱼ₎ * ∏ₖ₌₁ⁱ⁻² 1 / √(1 - z₍ₖ,ⱼ₎²)) / w₍ᵢ₋₁,ⱼ₎ = (w₍ᵢ,ⱼ₎ * / √(1 - z₍ᵢ₋₁,ⱼ₎²)) * (z₍ᵢ₋₁,ⱼ₎ / w₍ᵢ₋₁,ⱼ₎) = (w₍ᵢ,ⱼ₎ / w₍ᵢ₋₁,ⱼ₎) * (z₍ᵢ₋₁,ⱼ₎ / √(1 - z₍ᵢ₋₁,ⱼ₎²))
which is the above implementation.
#
Bijectors.bijector
— Method
bijector(d::Distribution)
Returns the constrained-to-unconstrained bijector for distribution d
.
#
Bijectors.cholesky_lower
— Method
cholesky_lower(X)
Return the lower triangular Cholesky factor of X
as a Matrix
rather than LowerTriangular
.
This is a thin wrapper around |
#
Bijectors.cholesky_upper
— Method
cholesky_upper(X)
Return the upper triangular Cholesky factor of X
as a Matrix
rather than UpperTriangular
.
This is a thin wrapper around |
#
Bijectors.columnwise
— Method
Alias for Base.Fix1(eachcolmaphcat, f)
.
Represents a function f
which is applied to each column of an input.
#
Bijectors.combine
— Method
combine(m::PartitionMask, x_1, x_2, x_3)
Combines x_1
, x_2
, and x_3
into a single vector.
#
Bijectors.compute_r
— Method
compute_r(y_minus_z0::AbstractVector{<:Real}, α, α_plus_β_hat)
Compute the unique solution to the equation
subject to and .
Since and , the solution is unique and given by
where . For details see appendix A.2 of the reference.
References
\D. Rezende, S. Mohamed (2015): Variational Inference with Normalizing Flows. arXiv:1505.05770
#
Bijectors.couple
— Method
Returns the coupling law constructed from x
.
#
Bijectors.coupling
— Method
Returns the constructor of the coupling law.
#
Bijectors.elementwise
— Method
elementwise(f)
Alias for Base.Fix1(broadcast, f)
.
In the case where f::ComposedFunction
, the result is Base.Fix1(broadcast, f.outer) ∘ Base.Fix1(broadcast, f.inner)
rather than Base.Fix1(broadcast, f)
.
#
Bijectors.find_alpha
— Method
find_alpha(wt_y, wt_u_hat, b)
Compute an (approximate) real-valued solution to the equation
The uniqueness of the solution is guaranteed since . For details see appendix A.1 of the reference.
Initial bracket
For all , we have
Thus
which implies ]. To avoid floating point issues if , we use the more conservative interval ] as initial bracket, at the cost of one additional iteration step.
References
\D. Rezende, S. Mohamed (2015): Variational Inference with Normalizing Flows. arXiv:1505.05770
#
Bijectors.get_u_hat
— Method
get_u_hat(u::AbstractVector{<:Real}, w::AbstractVector{<:Real})
Return a tuple of vector that guarantees invertibility of the planar layer, and scalar .
Mathematical background
According to appendix A.1, vector defined by
guarantees that the planar layer is invertible for all and . We can rewrite as
Additionally, we obtain
References
\D. Rezende, S. Mohamed (2015): Variational Inference with Normalizing Flows. arXiv:1505.05770
#
Bijectors.has_constant_bijector
— Method
has_constant_bijector(dist_type::Type)
Returns true
if the distribution type dist_type
has a constant bijector, i.e. the return-value of bijector
does not depend on runtime information.
#
Bijectors.isclosedform
— Method
isclosedform(b::Transform)::bool
isclosedform(b⁻¹::Inverse{<:Transform})::bool
Returns true
or false
depending on whether or not evaluation of b
has a closed-form implementation.
Most transformations have closed-form evaluations, but there are cases where this is not the case. For example the inverse evaluation of PlanarLayer
requires an iterative procedure to evaluate.
#
Bijectors.isinvertible
— Method
isinvertible(t)
Return true
if t
is invertible, and false
otherwise.
#
Bijectors.logabsdetjac!
— Method
logabsdetjac!(b, x[, logjac])
Compute log(abs(det(J(b, x))))
and store the result in logjac
, where J(b, x)
is the jacobian of b
at x
.
#
Bijectors.logabsdetjac
— Method
logabsdetjac(b, x)
Return log(abs(det(J(b, x))))
, where J(b, x)
is the jacobian of b
at x
.
#
Bijectors.logabsdetjacinv
— Method
logabsdetjacinv(b, y)
Just an alias for logabsdetjac(inverse(b), y)
.
#
Bijectors.ordered
— Method
ordered(d::Distribution)
Return a Distribution
whose support are ordered vectors, i.e., vectors with increasingly ordered elements.
This transformation is currently only supported for otherwise unconstrained distributions.
#
Bijectors.output_length
— Method
output_length(f, len::Int)
Returns the output length of f
given the input length len
.
#
Bijectors.output_size
— Method
output_size(f, sz)
Returns the output size of f
given the input size sz
.
#
Bijectors.partition
— Method
partition(m::PartitionMask, x)
Partitions x
into 3 disjoint subvectors.
#
Bijectors.transform!
— Method
transform!(b, x[, y])
Transform x
using b
, storing the result in y
.
If y
is not provided, x
is used as the output.
#
Bijectors.transform
— Method
transform(b, x)
Transform x
using b
, treating x
as a single input.
#
Bijectors.transformed
— Method
transformed(d::Distribution)
transformed(d::Distribution, b::Bijector)
Couples distribution d
with the bijector b
by returning a TransformedDistribution
.
If no bijector is provided, i.e. transformed(d)
is called, then transformed(d, bijector(d))
is returned.
#
Bijectors.triu1_to_vec
— Method
triu1_to_vec(X::AbstractMatrix{<:Real})
Extracts elements from upper triangle of X
with offset 1
and returns them as a vector.
#
Bijectors.triu_mask
— Method
triu_mask(X::AbstractMatrix, k::Int)
Return a mask for elements of X
above the k
th diagonal.
#
Bijectors.triu_to_vec
— Method
triu_to_vec(X::AbstractMatrix{<:Real})
Extracts elements from upper triangle of X
and returns them as a vector.
#
Bijectors.vec_to_triu
— Method
vec_to_triu(x::AbstractVector{<:Real})
Constructs a matrix from a vector x
by filling the upper triangle.
#
Bijectors.vec_to_triu1
— Method
vec_to_triu1(x::AbstractVector{<:Real})
Constructs a matrix from a vector x
by filling the upper triangle with offset 1
.
#
Bijectors.with_logabsdet_jacobian!
— Method
with_logabsdet_jacobian!(b, x[, y, logjac])
Compute transform(b, x)
and logabsdetjac(b, x)
, storing the result in y
and logjac
, respetively.
If y
is not provided, then x
will be used in its place.
Defaults to calling with_logabsdet_jacobian(b, x)
and updating y
and logjac
with the result.
#
InverseFunctions.inverse
— Method
inverse(t::Transform)
Returns the inverse of transform t
.
Types
#
Bijectors.Bijector
— Type
Abstract type of a bijector, i.e. differentiable bijection with differentiable inverse.
#
Bijectors.CorrBijector
— Type
CorrBijector <: Bijector
A bijector implementation of Stan’s parametrization method for Correlation matrix: https://mc-stan.org/docs/2_23/reference-manual/correlation-matrix-transform-section.html
Basically, a unconstrained strictly upper triangular matrix y
is transformed to a correlation matrix by following readable but not that efficient form:
K = size(y, 1) z = tanh.(y) for j=1:K, i=1:K if i>j w[i,j] = 0 elseif 1==i==j w[i,j] = 1 elseif 1<i==j w[i,j] = prod(sqrt(1 .- z[1:i-1, j].^2)) elseif 1==i<j w[i,j] = z[i,j] elseif 1<i<j w[i,j] = z[i,j] * prod(sqrt(1 .- z[1:i-1, j].^2)) end end
It is easy to see that every column is a unit vector, for example:
w3' w3 == w[1,3]^2 + w[2,3]^2 + w[3,3]^2 == z[1,3]^2 + (z[2,3] * sqrt(1 - z[1,3]^2))^2 + (sqrt(1-z[1,3]^2) * sqrt(1-z[2,3]^2))^2 == z[1,3]^2 + z[2,3]^2 * (1-z[1,3]^2) + (1-z[1,3]^2) * (1-z[2,3]^2) == z[1,3]^2 + z[2,3]^2 - z[2,3]^2 * z[1,3]^2 + 1 -z[1,3]^2 - z[2,3]^2 + z[1,3]^2 * z[2,3]^2 == 1
And diagonal elements are positive, so w
is a cholesky factor for a positive matrix.
x = w' * w
Consider block matrix representation for x
x = [w1'; w2'; ... wn'] * [w1 w2 ... wn] == [w1'w1 w1'w2 ... w1'wn; w2'w1 w2'w2 ... w2'wn; ... ]
The diagonal elements are given by wk’wk = 1
, thus x
is a correlation matrix.
Every step is invertible, so this is a bijection(bijector).
The implementation doesn’t follow their "manageable expression" directly, because their equation seems wrong (7/30/2020). Insteadly it follows definition above the "manageable expression" directly, which is also described in above doc. |
#
Bijectors.Coupling
— Type
Coupling{F, M}(θ::F, mask::M)
Implements a coupling-layer as defined in [1].
Examples
julia> using Bijectors: Shift, Coupling, PartitionMask, coupling, couple
julia> m = PartitionMask(3, [1], [2]); # <= going to use x[2] to parameterize transform of x[1]
julia> cl = Coupling(Shift, m); # <= will do `y[1:1] = x[1:1] + x[2:2]`;
julia> x = [1., 2., 3.];
julia> cl(x)
3-element Vector{Float64}:
3.0
2.0
3.0
julia> inverse(cl)(cl(x))
3-element Vector{Float64}:
1.0
2.0
3.0
julia> coupling(cl) # get the `Bijector` map `θ -> b(⋅, θ)`
Shift
julia> couple(cl, x) # get the `Bijector` resulting from `x`
Shift([2.0])
julia> with_logabsdet_jacobian(cl, x)
([3.0, 2.0, 3.0], 0.0)
References
[1] Kobyzev, I., Prince, S., & Brubaker, M. A., Normalizing flows: introduction and ideas, CoRR, (), (2019).
#
Bijectors.Inverse
— Type
inverse(b::Transform)
Inverse(b::Transform)
A Transform
representing the inverse transform of b
.
#
Bijectors.LeakyReLU
— Type
LeakyReLU{T}(α::T) <: Bijector
Defines the invertible mapping
x ↦ x if x ≥ 0 else αx
where α > 0.
#
Bijectors.NamedCoupling
— Type
NamedCoupling{target, deps, F} <: AbstractNamedTransform
Implements a coupling layer for named bijectors.
See also: Coupling
Examples
julia> using Bijectors: NamedCoupling, Scale
julia> b = NamedCoupling(:b, (:a, :c), (a, c) -> Scale(a + c));
julia> x = (a = 1., b = 2., c = 3.);
julia> b(x)
(a = 1.0, b = 8.0, c = 3.0)
julia> (a = x.a, b = (x.a + x.c) * x.b, c = x.c)
(a = 1.0, b = 8.0, c = 3.0)
#
Bijectors.NamedTransform
— Type
NamedTransform <: AbstractNamedTransform
Wraps a NamedTuple
of key -> Bijector
pairs, implementing evaluation, inversion, etc.
Examples
julia> using Bijectors: NamedTransform, Scale
julia> b = NamedTransform((a = Scale(2.0), b = exp));
julia> x = (a = 1., b = 0., c = 42.);
julia> b(x)
(a = 2.0, b = 1.0, c = 42.0)
julia> (a = 2 * x.a, b = exp(x.b), c = x.c)
(a = 2.0, b = 1.0, c = 42.0)
#
Bijectors.OrderedBijector
— Type
OrderedBijector()
A bijector mapping ordered vectors in ℝᵈ to unordered vectors in ℝᵈ.
See also
-
-
Note that this transformation and its inverse are the opposite of in this reference.
-
#
Bijectors.PartitionMask
— Type
PartitionMask{A}(A_1::A, A_2::A, A_3::A) where {A}
This is used to partition and recombine a vector into 3 disjoint "subvectors".
Implements
-
partition(m::PartitionMask, x)
: partitionsx
into 3 disjoint "subvectors" -
combine(m::PartitionMask, x_1, x_2, x_3)
: combines 3 disjoint vectors into a single one
Note that PartitionMask
is not a Bijector
. It is indeed a bijection, but does not follow the Bijector
interface.
Its main use is in Coupling
where we want to partition the input into 3 parts, one part to transform, one part to map into the parameter-space of the transform applied to the first part, and the last part of the vector is not used for anything.
Examples
julia> using Bijectors: PartitionMask, partition, combine
julia> m = PartitionMask(3, [1], [2]) # <= assumes input-length 3
PartitionMask{Bool,SparseArrays.SparseMatrixCSC{Bool,Int64}}(
[1, 1] = true,
[2, 1] = true,
[3, 1] = true)
julia> # Partition into 3 parts; the last part is inferred to be indices `[3, ]` from
# the fact that `[1]` and `[2]` does not make up all indices in `1:3`.
x1, x2, x3 = partition(m, [1., 2., 3.])
([1.0], [2.0], [3.0])
julia> # Recombines the partitions into a vector
combine(m, x1, x2, x3)
3-element Array{Float64,1}:
1.0
2.0
3.0
Note that the underlying SparseMatrix
is using Bool
as the element type. We can also specify this to be some other type using the sp_type
keyword:
julia> m = PartitionMask{Float32}(3, [1], [2])
PartitionMask{Float32,SparseArrays.SparseMatrixCSC{Float32,Int64}}(
[1, 1] = 1.0,
[2, 1] = 1.0,
[3, 1] = 1.0)
#
Bijectors.PartitionMask
— Method
PartitionMask(n::Int, indices)
Assumes you want to split the vector, where indices
refer to the parts of the vector you want to apply the bijector to.
#
Bijectors.Permute
— Type
Permute{A} <: Bijector
A bijector implementation of a permutation. The permutation is performed using a matrix of type A
. There are a couple of different ways to construct Permute
:
Permute([0 1; 1 0]) # will map [1, 2] => [2, 1] Permute([2, 1]) # will map [1, 2] => [2, 1] Permute(2, 2 => 1, 1 => 2) # will map [1, 2] => [2, 1] Permute(2, [1, 2] => [2, 1]) # will map [1, 2] => [2, 1]
If this is not clear, the examples might be of help.
Examples
A simple example is permuting a vector of size 3.
julia> b1 = Permute([
0 1 0;
1 0 0;
0 0 1
])
Permute{Array{Int64,2}}([0 1 0; 1 0 0; 0 0 1])
julia> b2 = Permute([2, 1, 3]) # specify all elements at once
Permute{SparseArrays.SparseMatrixCSC{Float64,Int64}}(
[2, 1] = 1.0
[1, 2] = 1.0
[3, 3] = 1.0)
julia> b3 = Permute(3, 2 => 1, 1 => 2) # element-wise
Permute{SparseArrays.SparseMatrixCSC{Float64,Int64}}(
[2, 1] = 1.0
[1, 2] = 1.0
[3, 3] = 1.0)
julia> b4 = Permute(3, [1, 2] => [2, 1]) # block-wise
Permute{SparseArrays.SparseMatrixCSC{Float64,Int64}}(
[2, 1] = 1.0
[1, 2] = 1.0
[3, 3] = 1.0)
julia> b1.A == b2.A == b3.A == b4.A
true
julia> b1([1., 2., 3.])
3-element Array{Float64,1}:
2.0
1.0
3.0
julia> b2([1., 2., 3.])
3-element Array{Float64,1}:
2.0
1.0
3.0
julia> b3([1., 2., 3.])
3-element Array{Float64,1}:
2.0
1.0
3.0
julia> b4([1., 2., 3.])
3-element Array{Float64,1}:
2.0
1.0
3.0
julia> inverse(b1)
Permute{LinearAlgebra.Transpose{Int64,Array{Int64,2}}}([0 1 0; 1 0 0; 0 0 1])
julia> inverse(b1)(b1([1., 2., 3.]))
3-element Array{Float64,1}:
1.0
2.0
3.0
#
Bijectors.RationalQuadraticSpline
— Type
RationalQuadraticSpline{T} <: Bijector
Implementation of the Rational Quadratic Spline flow [1].
-
Outside of the interval
[minimum(widths), maximum(widths)]
, this mapping is given by the identity map. -
Inside the interval it’s given by a monotonic spline (i.e. monotonic polynomials connected at intermediate points) with endpoints fixed so as to continuously transform into the identity map.
For the sake of efficiency, there are separate implementations for 0-dimensional and 1-dimensional inputs.
Notes
There are two constructors for RationalQuadraticSpline
:
-
RationalQuadraticSpline(widths, heights, derivatives)
: it is assumed thatwidths
,
heights
, and derivatives
satisfy the constraints that makes this a valid bijector, i.e.
-
widths
: monotonically increasing andlength(widths) == K
, -
heights
: monotonically increasing andlength(heights) == K
, -
derivatives
: non-negative andderivatives[1] == derivatives[end] == 1
. -
RationalQuadraticSpline(widths, heights, derivatives, B)
: other than than the lengths, no assumptions are made on parameters. Therefore we will transform the parameters s.t.: -
widths_new
∈ [-B, B]ᴷ⁺¹, whereK == length(widths)
, -
heights_new
∈ [-B, B]ᴷ⁺¹, whereK == length(heights)
, -
derivatives_new
∈ (0, ∞)ᴷ⁺¹ withderivatives_new[1] == derivates_new[end] == 1
, where(K - 1) == length(derivatives)
.
Examples
Univariate
julia> using StableRNGs: StableRNG; rng = StableRNG(42); # For reproducibility.
julia> using Bijectors: RationalQuadraticSpline
julia> K = 3; B = 2;
julia> # Monotonic spline on '[-B, B]' with `K` intermediate knots/"connection points".
b = RationalQuadraticSpline(randn(rng, K), randn(rng, K), randn(rng, K - 1), B);
julia> b(0.5) # inside of `[-B, B]` → transformed
1.1943325397834206
julia> b(5.) # outside of `[-B, B]` → not transformed
5.0
julia> b = RationalQuadraticSpline(b.widths, b.heights, b.derivatives);
julia> b(0.5) # inside of `[-B, B]` → transformed
1.1943325397834206
julia> d = 2; K = 3; B = 2;
julia> b = RationalQuadraticSpline(randn(rng, d, K), randn(rng, d, K), randn(rng, d, K - 1), B);
julia> b([-1., 1.])
2-element Vector{Float64}:
-1.5660106244288925
0.5384702734738573
julia> b([-5., 5.])
2-element Vector{Float64}:
-5.0
5.0
julia> b([-1., 5.])
2-element Vector{Float64}:
-1.5660106244288925
5.0
References
[1] Durkan, C., Bekasov, A., Murray, I., & Papamakarios, G., Neural Spline Flows, CoRR, arXiv:1906.04032 [stat.ML], (2019).
#
Bijectors.Reshape
— Type
Reshape(in_shape, out_shape)
A Bijector
that reshapes the input to the output shape.
Example
julia> b = Reshape((2, 3), (3, 2)) Reshape{Tuple{Int64, Int64}, Tuple{Int64, Int64}}((2, 3), (3, 2)) julia> Array(transform(b, reshape(1:6, 2, 3))) 3×2 Matrix{Int64}: 1 4 2 5 3 6
#
Bijectors.Stacked
— Type
Stacked(bs)
Stacked(bs, ranges)
stack(bs::Bijector...)
A Bijector
which stacks bijectors together which can then be applied to a vector where bs[i]::Bijector
is applied to x[ranges[i]]::UnitRange{Int}
.
Arguments
-
bs
can be either aTuple
or anAbstractArray
of 0- and/or 1-dimensional bijectors-
If
bs
is aTuple
, implementations are type-stable using generated functions -
If
bs
is anAbstractArray
, implementations are not type-stable and use iterative methods
-
-
ranges
needs to be an iterable consisting ofUnitRange{Int}
-
length(bs) == length(ranges)
needs to be true.
-
Examples
b1 = Logit(0.0, 1.0) b2 = identity b = stack(b1, b2) b([0.0, 1.0]) == [b1(0.0), 1.0] # => true
#
Bijectors.Transform
— Type
Abstract type for a transformation.
Implementing
A subtype of Transform
of should at least implement transform(b, x)
.
If the Transform
is also invertible:
-
Required:
-
Either of the following:
-
transform(::Inverse{<:MyTransform}, x)
: thetransform
for its inverse. -
InverseFunctions.inverse(b::MyTransform)
: returns an existingTransform
.
-
-
logabsdetjac
: computes the log-abs-det jacobian factor.
-
-
Optional:
-
with_logabsdet_jacobian
:transform
andlogabsdetjac
combined. Useful in cases where we can exploit shared computation in the two.
-
For the above methods, there are mutating versions which can optionally be implemented:
#
Bijectors.VecCholeskyBijector
— Type
VecCholeskyBijector <: Bijector
A bijector to transform a Cholesky factor of a correlation matrix to an unconstrained vector.
Fields
-
mode :
Symbol
. Controls the inverse tranformation :-
if
mode === :U
returns aLinearAlgebra.Cholesky
holding theUpperTriangular
factor -
if
mode === :L
returns aLinearAlgebra.Cholesky
holding theLowerTriangular
factor
-
Reference
See also: VecCorrBijector
Example
julia> using StableRNGs; rng = StableRNG(42); julia> b = Bijectors.VecCholeskyBijector(:U); julia> X = rand(rng, LKJCholesky(3, 1, :U)) # Sample a correlation matrix. Cholesky{Float64, Matrix{Float64}} U factor: 3×3 UpperTriangular{Float64, Matrix{Float64}}: 1.0 0.937494 0.865891 ⋅ 0.348002 -0.320442 ⋅ ⋅ 0.384122 julia> y = b(X) # Transform to unconstrained vector representation. 3-element Vector{Float64}: -0.8777149781928181 -0.3638927608636788 -0.29813769428942216 julia> X*inv = inverse(b)(y); julia> X*inv.U ≈ X.U # (✓) Round-trip through `b` and its inverse. true julia> X_inv.L ≈ X.L # (✓) Also works for the lower triangular factor. true
#
Bijectors.VecCorrBijector
— Type
VecCorrBijector <: Bijector
A bijector to transform a correlation matrix to an unconstrained vector.
Reference
See also: CorrBijector
and 'VecCholeskyBijector'
Example
julia> using StableRNGs; rng = StableRNG(42); julia> b = Bijectors.VecCorrBijector(); julia> X = rand(rng, LKJ(3, 1)) # Sample a correlation matrix. 3×3 Matrix{Float64}: 1.0 -0.705273 -0.348638 -0.705273 1.0 0.0534538 -0.348638 0.0534538 1.0 julia> y = b(X) # Transform to unconstrained vector representation. 3-element Vector{Float64}: -0.8777149781928181 -0.3638927608636788 -0.29813769428942216 julia> inverse(b)(y) ≈ X # (✓) Round-trip through `b` and its inverse. true