Full formulas: Gaussian and Fisher-von Mises

Log-likelihood of a single Gaussian

Variables:

N(x) = The normal distribution, with mean mu and variance sigma^2.

C = the covariance matrix. It describes the relations between all dimensions x, y, z. C is diagonal if the axes are independent, with the variances on the diagonal.

d = the dimension of data (here, 3: [x,y,z])

We assume that the axes are independent: the model predicts each directional axis (x, y, z) separately, and there is no prior reason to expect correlations between them in the learned representation. This simplifies the covariance structure to a diagonal matrix without significantly affecting the model's performance.

Formulas:

Definition: The squared Mahalanobis distance for one Gaussian:

\[m^2 = (x - \mu)^T \cdot C^{-1} \cdot (x - \mu)\]

For independent variables:

\[m^2 = \sum_{i=1}^{d} \frac{(x_i - \mu_i)^2}{\sigma_i^2}\]
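As a quick check that the two forms agree when C is diagonal, here is a minimal sketch in plain Python. The function names and the sample values are illustrative assumptions, not part of the original text.

```python
def mahalanobis_sq_full(x, mu, C_inv):
    """Squared Mahalanobis distance (x - mu)^T C^{-1} (x - mu), given the inverse covariance."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    d = len(diff)
    return sum(diff[i] * C_inv[i][j] * diff[j] for i in range(d) for j in range(d))


def mahalanobis_sq_diag(x, mu, sigma):
    """Same distance for independent axes: sum of squared standardized residuals."""
    return sum(((xi - mi) / si) ** 2 for xi, mi, si in zip(x, mu, sigma))


# Illustrative values (assumed): a 3D point, a mean, and per-axis standard deviations.
x = [1.0, 2.0, 3.0]
mu = [0.0, 2.0, 1.0]
sigma = [1.0, 0.5, 2.0]

# For a diagonal C, the inverse is diagonal with entries 1 / sigma_i^2.
C_inv = [[1.0 / sigma[i] ** 2 if i == j else 0.0 for j in range(3)] for i in range(3)]

m2_full = mahalanobis_sq_full(x, mu, C_inv)
m2_diag = mahalanobis_sq_diag(x, mu, sigma)
```

Both calls give the same value here, which is the point of the diagonal simplification: no matrix inversion is needed.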

Definition: Probability function for a single multivariate Gaussian:

\[P(x) = N(x | \mu, C) = \frac{1}{\sqrt{\det(2\pi C)}}{e^{-0.5m^2}}\]

Knowing that:

\[\det(kX) = k^d \cdot \det(X),\]

We obtain

\[P(x) = \frac{1}{\sqrt{(2\pi)^d \cdot \det(C)}} \cdot e^{-0.5m^2}\]

For independent variables, the determinant of the diagonal matrix C is the product of its diagonal entries:

\[\det(C) = \prod_i \sigma_i^2\]

And thus we obtain:

\[P(x) = \frac{1}{\sqrt{(2\pi)^d \cdot \prod_i \sigma_i^2}} \cdot e^{-0.5m^2}\]

Finally, we can calculate the log-likelihood of a single Gaussian:

\[ \begin{align}\begin{aligned}\log(P(x)) = \log(1) - \log(\sqrt{(2\pi)^d \cdot \prod_i \sigma_i^2}) + (-0.5m^2)\\\log(P(x)) = - 0.5\log((2\pi)^d \cdot \prod_i \sigma_i^2) - 0.5m^2\\\log(P(x)) = -0.5 (d\log(2\pi) + 2\sum_i\log(\sigma_i) + m^2)\\\log(P(x)) = -0.5(d\log(2\pi) + m^2) - \sum_i \log(\sigma_i)\end{aligned}\end{align} \]
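The last line of the derivation translates almost directly into code. The following is a minimal sketch using only the Python standard library; the function name `gaussian_diag_logpdf` and the sample values are assumptions made for illustration.

```python
import math


def gaussian_diag_logpdf(x, mu, sigma):
    """Log-density of a Gaussian with independent axes.

    Implements -0.5 * (d * log(2*pi) + m^2) - sum_i log(sigma_i),
    where sigma holds per-axis standard deviations (all > 0).
    """
    d = len(x)
    m2 = sum(((xi - mi) / si) ** 2 for xi, mi, si in zip(x, mu, sigma))
    return -0.5 * (d * math.log(2.0 * math.pi) + m2) - sum(math.log(si) for si in sigma)


# Illustrative values (assumed).
x = [1.0, 2.0, 3.0]
mu = [0.0, 2.0, 1.0]
sigma = [1.0, 0.5, 2.0]

logp = gaussian_diag_logpdf(x, mu, sigma)

# Cross-check: with independent axes the density is a product of 1D normal densities.
direct = 1.0
for xi, mi, si in zip(x, mu, sigma):
    direct *= math.exp(-0.5 * ((xi - mi) / si) ** 2) / (si * math.sqrt(2.0 * math.pi))
```

Here `logp` and `math.log(direct)` agree up to floating-point error. Working in log space avoids the underflow that the direct product would hit for points far from the mean.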

Log-likelihood of a mixture of Gaussians

Variables:

z_i = P(z=i), the mixture coefficient: the probability that the current observation belongs to the i-th Gaussian.

Formulas:

Probability function for a mixture of multivariate Gaussians

\[ \begin{align}\begin{aligned}P(x) = \sum_k P(x | z=k) \cdot P(z=k)\\P(x) = \sum_k z_k \cdot N(x | \mu_k, C_k)\\P(x) = \sum_k \frac{z_k}{\sqrt{(2\pi)^d \cdot \det(C_k)}} \cdot e^{-0.5m_k^2}\end{aligned}\end{align} \]

Log-likelihood for a mixture of Gaussians:

\[\log(P(x)) = \log(\sum_k\frac{z_k}{\sqrt{(2\pi)^d \cdot \det(C_k)}} \cdot e^{-0.5m_k^2})\]

If the mixture coefficients z_k are already known (computed separately), they are easy to separate out in the computation by using x = exp(log(x)), which gives the form commonly known as logsumexp.

\[ \begin{align}\begin{aligned}\log(P(x)) = \log(\sum_k\exp(\log(\frac{z_k}{\sqrt{(2\pi)^d \cdot \det(C_k)}} \cdot e^{-0.5m_k^2})))\\ = \text{logsumexp}[\log(z_k) - 0.5(d\log(2\pi) + \log(\det(C_k)) + m_k^2)]\end{aligned}\end{align} \]

For independent variables:

\[ \begin{align}\begin{aligned}= \text{logsumexp}[\log(z_k) - 0.5(d\log(2\pi) + m_k^2 ) - 0.5\log(\det(C_k))]\\= \text{logsumexp}[\log(z_k) - 0.5( d\log(2\pi) + m_k^2 ) - \sum_i\log(\sigma_{k,i})]\end{aligned}\end{align} \]

Note that everything after log(z_k) is the log-pdf of a single Gaussian (the k-th component)!

\[= \text{logsumexp}[\log(z_k) + \text{logpdf}_k]\]
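A minimal sketch of this last formula, using only the Python standard library. The helper names (`logsumexp`, `gaussian_diag_logpdf`, `mixture_loglik`) and the two-component example are assumptions for illustration. The `logsumexp` helper subtracts the largest term before exponentiating, which is the usual way to keep the sum from underflowing.

```python
import math


def logsumexp(values):
    """log(sum(exp(v))) computed stably by factoring out the maximum."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def gaussian_diag_logpdf(x, mu, sigma):
    """Log-density of a Gaussian with independent axes (sigma = standard deviations)."""
    d = len(x)
    m2 = sum(((xi - mi) / si) ** 2 for xi, mi, si in zip(x, mu, sigma))
    return -0.5 * (d * math.log(2.0 * math.pi) + m2) - sum(math.log(si) for si in sigma)


def mixture_loglik(x, weights, mus, sigmas):
    """logsumexp over components of log(z_k) + logpdf_k."""
    terms = [
        math.log(z) + gaussian_diag_logpdf(x, mu, sigma)
        for z, mu, sigma in zip(weights, mus, sigmas)
    ]
    return logsumexp(terms)


# Illustrative two-component mixture in 3D (assumed values).
weights = [0.3, 0.7]
mus = [[0.0, 0.0, 0.0], [1.0, -1.0, 2.0]]
sigmas = [[1.0, 1.0, 1.0], [0.5, 2.0, 1.0]]

x = [0.5, -1.0, 2.0]
ll = mixture_loglik(x, weights, mus, sigmas)

# Naive evaluation for comparison; fine here, but underflows for distant points.
naive = math.log(sum(
    z * math.exp(gaussian_diag_logpdf(x, mu, sigma))
    for z, mu, sigma in zip(weights, mus, sigmas)
))

# A point far from every component: exp() of each term underflows to 0,
# yet the logsumexp form still returns a finite value.
ll_far = mixture_loglik([100.0, 100.0, 100.0], weights, mus, sigmas)
```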

References:

https://en.wikipedia.org/wiki/Mahalanobis_distance

https://stephens999.github.io/fiveMinuteStats/intro_to_em.html

https://www.ee.columbia.edu/~stanchen/spring16/e6870/slides/lecture3.pdf

https://github.com/jych/cle/blob/master/cle/cost/__init__.py

Log-likelihood of a single Fisher-von Mises distribution

Variables:

v = the normalized target

mu = the mean

kappa = the concentration parameter

d = the dimension

C = the distribution normalizing constant

I_n = the modified Bessel function of the first kind at order n (see wiki or Wolfram).

Formulas:

Probability function:

\[P(v | \mu, \kappa) = C e^{\kappa \cdot \mu^T \cdot v}\]

Where

\[C(\kappa) = \frac{\kappa^{\frac{d}{2}-1}}{(2\pi)^\frac{d}{2} \cdot I_{\frac{d}{2}-1}(\kappa)}\]

In our case, d=3, and the normalizing constant reduces to the following expression (see the references at the end of this section).

\[C = \frac{\kappa}{2\pi\cdot (e^\kappa - e^{-\kappa})}\]
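The reduction can be checked by hand. The short derivation below assumes the standard closed form of the half-order modified Bessel function, \(I_{1/2}(\kappa) = \sqrt{2/(\pi\kappa)} \cdot \sinh(\kappa)\), which is not stated in the text above.

\[ \begin{aligned}C(\kappa) &= \frac{\kappa^{\frac{1}{2}}}{(2\pi)^{\frac{3}{2}} \cdot I_{\frac{1}{2}}(\kappa)} = \frac{\kappa^{\frac{1}{2}}}{(2\pi)^{\frac{3}{2}} \cdot \sqrt{\frac{2}{\pi\kappa}} \cdot \sinh(\kappa)}\\ &= \frac{\kappa}{4\pi \cdot \sinh(\kappa)} = \frac{\kappa}{2\pi \cdot (e^\kappa - e^{-\kappa})}\end{aligned} \]

The last step uses \(\sinh(\kappa) = (e^\kappa - e^{-\kappa})/2\).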

log-likelihood:

\[ \begin{align}\begin{aligned}\log(P(v)) = \log(C e^{\kappa \cdot \mu^T \cdot v})\\ = \log(C) + \kappa \cdot \mu^T \cdot v\end{aligned}\end{align} \]

Where

\[\log(C) = \log(\kappa) - \log(2\pi) - \log(e^\kappa - e^{-\kappa})\]
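A minimal sketch of the d=3 log-likelihood in plain Python. The function names and sample vectors are assumptions for illustration. One detail goes beyond the formula as written: evaluating e^kappa directly overflows for large kappa, so the sketch rewrites log(e^kappa - e^{-kappa}) as kappa + log(1 - e^{-2 kappa}), which is algebraically identical.

```python
import math


def vmf3_log_norm_const(kappa):
    """log(C) for d=3: log(kappa) - log(2*pi) - log(e^kappa - e^-kappa), kappa > 0.

    Uses log(e^k - e^-k) = k + log1p(-exp(-2k)) to avoid overflow for large kappa.
    """
    return (
        math.log(kappa)
        - math.log(2.0 * math.pi)
        - (kappa + math.log1p(-math.exp(-2.0 * kappa)))
    )


def vmf3_logpdf(v, mu, kappa):
    """log P(v | mu, kappa) = log(C) + kappa * mu^T v, for unit vectors v and mu."""
    dot = sum(vi * mi for vi, mi in zip(v, mu))
    return vmf3_log_norm_const(kappa) + kappa * dot


# Illustrative unit vectors (assumed values).
mu = [0.0, 0.0, 1.0]
v = [0.0, 0.6, 0.8]

ll = vmf3_logpdf(v, mu, 2.0)

# Direct evaluation of the formula as written, for comparison at a moderate kappa.
naive = (
    math.log(2.0)
    - math.log(2.0 * math.pi)
    - math.log(math.exp(2.0) - math.exp(-2.0))
    + 2.0 * 0.8
)

# At kappa = 1000 the direct form would raise OverflowError in math.exp;
# the rewritten form stays finite.
ll_large = vmf3_logpdf(v, mu, 1000.0)
```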

References:

https://en.wikipedia.org/wiki/Von_Mises%E2%80%93Fisher_distribution

http://www.mitsuba-renderer.org/~wenzel/files/vmf.pdf