I realized there is a lack of educational resources on this platform that concentrate on the theory of variational inference. If you’re interested in the math behind variational inference or, more generally, probabilistic programming, you may find this post useful.
According to Bayes’ rule,
p(z|x) = p(x|z) p(z) / p(x)
where p(z|x) denotes the posterior, p(x|z) the likelihood, p(z) the prior, and p(x) the evidence, also called the marginal likelihood. Computing the posterior is known as inference. In many models inference is analytically intractable because the evidence, p(x) = ∫p(x,z)dz, is an integral over a high-dimensional latent space that usually has no closed form and is too expensive to evaluate numerically. Additionally…
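To make the role of the evidence concrete, here is a minimal sketch of Bayes’ rule on a grid. The model is an assumption chosen for illustration and is not taken from this post: a single latent z with a N(0, 1) prior, and observations x_i drawn from N(z, 1). The function name `grid_posterior` and the data values are hypothetical as well.

```python
import numpy as np

def grid_posterior(x, z_grid, prior_std=1.0, noise_std=1.0):
    """Approximate p(z|x) on a 1-D grid for z ~ N(0, prior_std^2), x_i | z ~ N(z, noise_std^2)."""
    dz = z_grid[1] - z_grid[0]
    # Log prior and log likelihood, with additive constants dropped
    # (they cancel when we divide by the evidence).
    log_prior = -0.5 * (z_grid / prior_std) ** 2
    log_lik = -0.5 * (((x[:, None] - z_grid[None, :]) / noise_std) ** 2).sum(axis=0)
    log_joint = log_prior + log_lik
    # Subtract the max before exponentiating for numerical stability.
    joint = np.exp(log_joint - log_joint.max())
    # Riemann sum for the integral of p(x, z) over z; proportional to the evidence p(x).
    evidence = joint.sum() * dz
    return joint / evidence

x = np.array([0.8, 1.1, 1.4])            # illustrative data, not from the post
z_grid = np.linspace(-5.0, 5.0, 2001)
posterior = grid_posterior(x, z_grid)
dz = z_grid[1] - z_grid[0]
posterior_mean = (z_grid * posterior).sum() * dz
```

For this conjugate model the exact posterior mean is sum(x) / (n + 1) = 3.3 / 4 = 0.825, so the grid result can be checked by hand. The same approach with 2001 points per axis would need 2001^d evaluations of the joint for a d-dimensional z, which is why brute-force integration of the evidence stops being practical as the latent dimension grows.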