Parameter estimation in text modeling [unfinished]

In this post, we would like to present some parameter estimation methods commonly used with discrete probability distributions, which are very popular in text modeling. We then explain the Latent Dirichlet Allocation (LDA) model in detail.

I. Introduction
There are two inference problems in parameter estimation:
(1) how to estimate values for a set of distribution parameters theta that best explain a set of observed data.
(2) how to calculate the probability of a new observation given previously observed data.

We introduce Bayes' rule to solve the problems above. Bayes' rule is defined as:

p(theta|X) = p(X|theta) p(theta) / p(X)

and its terms may be called:

posterior = likelihood x prior / evidence
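As a quick sanity check, Bayes' rule can be verified numerically on a toy discrete example. The spam/word scenario and all numbers below are made up purely for illustration:

```python
# Toy Bayes' rule check (all numbers made up for illustration):
# a document is spam with prior 0.2, and the word "free" appears
# with probability 0.6 in spam and 0.1 in non-spam documents.
p_spam = 0.2                 # prior p(theta)
p_free_given_spam = 0.6      # likelihood p(x|theta)
p_free_given_ham = 0.1

# evidence p(x), by the law of total probability
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# posterior p(theta|x) = likelihood * prior / evidence
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(round(p_spam_given_free, 3))  # 0.6
```

Observing the word raises the probability of spam from the prior 0.2 to a posterior of 0.6, exactly as the rule prescribes.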

We will introduce maximum likelihood estimation, maximum a posteriori estimation, and Bayesian estimation, along with central concepts like conjugate distributions and Bayesian networks, to tackle these two problems.

II. Parameter estimation methods

1. Maximum likelihood estimation
Maximum likelihood (ML) estimation tries to find the parameters that maximize the likelihood of the observed data:

theta-ML = argmax_theta p(X|theta)

The common way to obtain the parameter estimates is to differentiate the (log-)likelihood above and set the derivative to zero. The probability of a new observation x-hat given the data X can then be found using the approximation:

p(x-hat|X) ≈ p(x-hat|theta-ML)

Note that, by definition of the ML estimate, p(X|theta) <= p(X|theta-ML) for every theta.
However, ML estimation doesn't consider any prior belief about the parameters, which can lead to a poor probability estimate for a new observation when the data distribution is very unbiased (e.g., sparse or skewed samples).
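As a concrete illustration (a sketch, not code from the referenced paper), consider ML estimation for a unigram text model: for a multinomial distribution, the ML estimate of each word's probability is simply its relative frequency, and the probability of a new word is approximated by p(x-hat|X) ≈ p(x-hat|theta-ML). The toy corpus below is made up:

```python
from collections import Counter

def ml_estimate(corpus):
    """ML estimate for a unigram (multinomial) model:
    theta_ML[w] = count(w) / total number of tokens."""
    counts = Counter(corpus)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# toy corpus, made up for illustration
corpus = ["text", "model", "text", "data", "text", "model"]
theta_ml = ml_estimate(corpus)
print(theta_ml["text"])          # 0.5  (3 of 6 tokens)

# predictive probability of a new observation: p(x-hat|X) ~ p(x-hat|theta_ML).
# An unseen word gets probability 0 -- exactly the weakness of ignoring
# prior belief that motivates MAP and Bayesian estimation below.
print(theta_ml.get("topic", 0.0))  # 0.0
```

The zero probability assigned to the unseen word shows why a prior over theta is useful, which is the starting point of the next two sections.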

2. Maximum a posteriori estimation
3. Bayesian estimation


Reference
Gregor Heinrich, Parameter estimation for text analysis: http://www.arbylon.net/publications/text-est.pdf
