anobiidae's blog


4.3 Fast LDA Algorithm


4.3 Fast LDA Algorithm

The sampling step for FastLDA begins with a sorted list of topics in descending order by Njk, the most popular topic for a document to the least popular.

FastLDAのsampling stepは、まずNjkを逆順にするtopicのソート済みlistから始める。popularなdocumentのpopularなtopic。


A random value u is sampled u ∼ Uniform[0, 1].



The algorithm then considers topics in order, calculating the length of segments sk l as it goes.

アルゴリズムはtopicを順列で考えるとき、それが必要になったときにs_k_l segmentの長さを計算する。


Each time the next topic is considered the bound Zk is improved.



As soon as the sum of segments calculated so far is greater then u, the algorithm can stop and return the topic associated with the segment u falls on.

segmentの合計がuよりも大きくなり次第、アルゴリズムは止まり、segment uの落ちる関係に合わせてtopicをreturnする。


Graphically, the algorithm scans down the line in Figure 3 calculating only sk l and Zk for the k topics visited so far.

図示すれば、アルゴリズムは図3内の線上において、k topic のs_k_lとZkだけを計算している。


When the  algorithm finds a segment whose end point is past u it stops and returns the associated topic.



By intelligently ordering the comparisons as to whether u is within a segment, we need to do 2K comparisons in the worst case.