4.3 Fast LDA Algorithm

The sampling step for FastLDA begins with a sorted list of topics in descending order by Njk, the most popular topic for a document to the least popular.

FastLDAのsampling stepは、まずNjkを逆順にするtopicのソート済みlistから始める。popularなdocumentのpopularなtopic。

A random value u is sampled u ∼ Uniform[0, 1].

乱数値uは0～1の正規乱数である。

The algorithm then considers topics in order, calculating the length of segments sk l as it goes.

アルゴリズムはtopicを順列で考えるとき、それが必要になったときにs_k_l segmentの長さを計算する。

Each time the next topic is considered the bound Zk is improved.

それぞれ次のtopicを考えるときに、Zkを減少させる。

As soon as the sum of segments calculated so far is greater then u, the algorithm can stop and return the topic associated with the segment u falls on.

segmentの合計がuよりも大きくなり次第、アルゴリズムは止まり、segment uの落ちる関係に合わせてtopicをreturnする。

Graphically, the algorithm scans down the line in Figure 3 calculating only sk l and Zk for the k topics visited so far.

図示すれば、アルゴリズムは図３内の線上において、k topic のs_k_lとZkだけを計算している。

When the algorithm finds a segment whose end point is past u it stops and returns the associated topic.

アルゴリズムがsegmentを見つけた時の最終点が過去のuとなり、アルゴリズムは終了して関連するtopicをreturnする。

By intelligently ordering the comparisons as to whether u is within a segment, we need to do 2K comparisons in the worst case.

uがsegmentに含まれるかの知的な順列の比較処理は最悪時間で2Kを要する。

anobiidae's blog

個人的に自然言語処理とかに関して、調べ物したもののメモとかを残す場所。

4.3 Fast LDA Algorithm