Sistema de Consulta Abierta
Open query system with semantic analysis module
Public methods | |
LdaGibbsSampler (int[][] documents, int V) | |
void | initialState (int K) |
void | gibbs (int K, double alpha, double beta) |
double[][] | getTheta () |
double[][] | getPhi () |
int[][] | getZ () |
int[][] | getDocuments () |
int | getV () |
int | getK () |
void | configure (int iterations, int burnIn, int thinInterval, int sampleLag) |
Static public methods | |
static void | hist (double[] data, int fmax) |
static void | main (String[] args) |
static String | shadeDouble (double d, double max) |
Gibbs sampler for estimating the best assignments of topics for words and documents in a corpus. The algorithm is introduced in Tom Griffiths' paper "Gibbs sampling in the generative model of Latent Dirichlet Allocation" (2002).
|
inline |
Initialise the Gibbs sampler with data.
V | vocabulary size |
documents | corpus data, one array of term indices per document |
|
inline |
Configure the Gibbs sampler.
iterations | number of total iterations |
burnIn | number of burn-in iterations |
thinInterval | update statistics interval |
sampleLag | sample interval (-1 for just one sample at the end) |
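As a rough illustration of how these settings might interact, the sketch below lists the iterations at which statistics would be read out. The rule it uses (`i > burnIn`, `sampleLag > 0`, `i % sampleLag == 0`) is an assumption made for illustration, and `SampleSchedule` is a hypothetical helper, not part of `LdaGibbsSampler`. `thinInterval` is not modelled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LdaGibbsSampler.
class SampleSchedule {
    /**
     * Returns the iteration indices at which statistics would be sampled,
     * assuming a sample is read out after burn-in at every sampleLag-th
     * iteration. A non-positive sampleLag yields no intermediate samples.
     */
    static List<Integer> sampledIterations(int iterations, int burnIn, int sampleLag) {
        List<Integer> sampled = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            if (i > burnIn && sampleLag > 0 && i % sampleLag == 0) {
                sampled.add(i);
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        // 100 iterations, 20 burn-in iterations, one sample every 10 iterations.
        System.out.println(sampledIterations(100, 20, 10));
    }
}
```

With `sampleLag = -1` the list is empty, which corresponds to the documented case of taking just one sample at the end.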
|
inline |
Retrieve estimated topic–word associations. If sample lag > 0 then the mean value of all sampled statistics for phi[][] is taken.
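For orientation, a single topic-word estimate in collapsed Gibbs sampling for LDA is commonly computed from the assignment counts with the symmetric prior `beta` as smoothing. The sketch below shows that computation under assumed array names (`nw`, `nwsum`) and layout; `PhiEstimate` is hypothetical, and the class documented here may organise its counts differently.

```java
import java.util.Arrays;

// Hypothetical helper; array names and layout are assumptions.
class PhiEstimate {
    /**
     * Smoothed topic-word estimate:
     *   phi[k][w] = (nw[w][k] + beta) / (nwsum[k] + V * beta)
     * where nw[w][k] counts assignments of term w to topic k and
     * nwsum[k] is the total number of words assigned to topic k.
     */
    static double[][] phi(int[][] nw, int[] nwsum, double beta) {
        int V = nw.length;     // vocabulary size
        int K = nwsum.length;  // number of topics
        double[][] phi = new double[K][V];
        for (int k = 0; k < K; k++) {
            for (int w = 0; w < V; w++) {
                phi[k][w] = (nw[w][k] + beta) / (nwsum[k] + V * beta);
            }
        }
        return phi;
    }

    public static void main(String[] args) {
        int[][] nw = {{3, 0}, {1, 2}, {0, 2}}; // V = 3 terms, K = 2 topics
        int[] nwsum = {4, 4};
        System.out.println(Arrays.deepToString(phi(nw, nwsum, 0.5)));
    }
}
```

Each row of the result is a distribution over the vocabulary and sums to 1.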
|
inline |
Retrieve estimated document–topic associations. If sample lag > 0 then the mean value of all sampled statistics for theta[][] is taken.
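The "mean value of all sampled statistics" can be pictured as a running sum of per-sample estimates divided by the number of samples. The sketch below illustrates this for theta under assumed names (`nd`, `ndsum`, `ThetaAverager`); it is a minimal illustration, not the class's actual bookkeeping.

```java
// Hypothetical accumulator; class, method and array names are assumptions.
class ThetaAverager {
    private final double[][] thetaSum; // running sum of sampled estimates
    private int numSamples;            // number of samples added so far

    ThetaAverager(int M, int K) {
        thetaSum = new double[M][K];
    }

    /** Adds one sampled estimate (nd[m][k] + alpha) / (ndsum[m] + K * alpha). */
    void addSample(int[][] nd, int[] ndsum, double alpha) {
        for (int m = 0; m < thetaSum.length; m++) {
            int K = thetaSum[m].length;
            for (int k = 0; k < K; k++) {
                thetaSum[m][k] += (nd[m][k] + alpha) / (ndsum[m] + K * alpha);
            }
        }
        numSamples++;
    }

    /** Returns the mean over all samples added so far. */
    double[][] mean() {
        if (numSamples == 0) {
            throw new IllegalStateException("no samples added");
        }
        double[][] theta = new double[thetaSum.length][];
        for (int m = 0; m < thetaSum.length; m++) {
            theta[m] = new double[thetaSum[m].length];
            for (int k = 0; k < theta[m].length; k++) {
                theta[m][k] = thetaSum[m][k] / numSamples;
            }
        }
        return theta;
    }
}
```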
|
inline |
Added by Doori Lee.
|
inline |
Main method: select an initial state, then repeat a large number of times: (1) select an element; (2) update it conditional on the other elements. If appropriate, output a summary for each run.
K | number of topics |
alpha | symmetric prior parameter on document–topic associations |
beta | symmetric prior parameter on topic–term associations |
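The "update conditional on other elements" step can be sketched as resampling the topic of one word token from its full conditional distribution, as in standard collapsed Gibbs sampling for LDA. The code below is a minimal sketch under assumed count arrays (`nw`, `nd`, `nwsum`); `GibbsStep` and its signature are hypothetical and may differ from what `gibbs` does internally.

```java
import java.util.Random;

// Hypothetical sketch of one collapsed Gibbs update; names are assumptions.
class GibbsStep {
    /**
     * Resamples the topic of one word token (term w in document m, currently
     * assigned to oldTopic) and updates the counts in place.
     * nw[w][k]: count of term w in topic k; nd[m][k]: count of topic k in
     * document m; nwsum[k]: total number of words assigned to topic k.
     */
    static int resample(int m, int w, int oldTopic, int[][] nw, int[][] nd,
                        int[] nwsum, double alpha, double beta, Random rng) {
        int V = nw.length;
        int K = nwsum.length;
        // Remove the token's current assignment from the counts.
        nw[w][oldTopic]--;
        nd[m][oldTopic]--;
        nwsum[oldTopic]--;
        // Cumulative unnormalised probabilities. The document-length
        // denominator is the same for every k, so it is dropped.
        double[] p = new double[K];
        double total = 0.0;
        for (int k = 0; k < K; k++) {
            total += (nw[w][k] + beta) / (nwsum[k] + V * beta) * (nd[m][k] + alpha);
            p[k] = total;
        }
        // Draw a topic from the cumulative distribution.
        double u = rng.nextDouble() * total;
        int newTopic = 0;
        while (newTopic < K - 1 && u >= p[newTopic]) {
            newTopic++;
        }
        // Add the token back under its new topic.
        nw[w][newTopic]++;
        nd[m][newTopic]++;
        nwsum[newTopic]++;
        return newTopic;
    }

    public static void main(String[] args) {
        int[][] nw = {{1, 0}, {0, 1}}; // V = 2 terms, K = 2 topics
        int[][] nd = {{1, 1}};         // one document with two tokens
        int[] nwsum = {1, 1};
        int t = resample(0, 0, 0, nw, nd, nwsum, 0.5, 0.1, new Random(42));
        System.out.println("new topic: " + t);
    }
}
```

A full sweep would apply this update to every token of every document, and the sampler would repeat such sweeps for the configured number of iterations.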
|
inlinestatic |
Print table of multinomial data
data | vector of evidence |
fmax | max frequency in display |
|
inline |
Initialisation: the sampler must start with an assignment of observations to topics. Many alternatives are possible; I chose to perform random assignments with equal probabilities.
K | number of topics |
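A random initialisation with equal probabilities might look like the sketch below: each token receives a topic drawn uniformly from the `K` topics, and the count arrays are built from those assignments. `RandomInit` and its field names are assumptions chosen for illustration, not the class's actual fields.

```java
import java.util.Random;

// Hypothetical sketch of uniform random initialisation; names are assumptions.
class RandomInit {
    final int[][] z;    // z[m][n]: topic of the n-th token in document m
    final int[][] nw;   // nw[w][k]: count of term w assigned to topic k
    final int[][] nd;   // nd[m][k]: count of tokens in document m with topic k
    final int[] nwsum;  // nwsum[k]: total tokens assigned to topic k
    final int[] ndsum;  // ndsum[m]: number of tokens in document m

    RandomInit(int[][] documents, int V, int K, Random rng) {
        int M = documents.length;
        nw = new int[V][K];
        nd = new int[M][K];
        nwsum = new int[K];
        ndsum = new int[M];
        z = new int[M][];
        for (int m = 0; m < M; m++) {
            int N = documents[m].length;
            z[m] = new int[N];
            for (int n = 0; n < N; n++) {
                int topic = rng.nextInt(K); // every topic is equally likely
                z[m][n] = topic;
                nw[documents[m][n]][topic]++;
                nd[m][topic]++;
                nwsum[topic]++;
            }
            ndsum[m] = N;
        }
    }
}
```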
|
inlinestatic |
Driver with example data.
args |
|
inlinestatic |
Create a string representation whose gray value indicates magnitude (cf. Hinton diagrams in statistics).
d | value |
max | maximum value |
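One way to render a magnitude as a text "gray value" is to map `d / max` onto a small set of increasingly dense character patterns. The sketch below does this with eleven levels; the patterns, the clamping of out-of-range values, and the `Shade` class are all assumptions for illustration and may not match the output of `shadeDouble`.

```java
// Hypothetical sketch; the shade patterns and clamping are assumptions.
class Shade {
    // Eleven patterns of increasing density, each five characters wide.
    private static final String[] SHADES = {
        "     ", ".    ", ":    ", ":.   ", "::   ", "::.  ",
        ":::  ", ":::. ", ":::: ", "::::.", ":::::"
    };

    /** Maps d in [0, max] to one of the patterns, rounding to the nearest level. */
    static String shade(double d, double max) {
        int level = (int) Math.floor(d * 10 / max + 0.5);
        level = Math.max(0, Math.min(10, level)); // clamp out-of-range values
        return "[" + SHADES[level] + "]";
    }

    public static void main(String[] args) {
        for (double d = 0.0; d <= 1.0; d += 0.25) {
            System.out.println(d + " " + shade(d, 1.0));
        }
    }
}
```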