@myui
Created September 5, 2018 06:09

https://en.wikipedia.org/wiki/Okapi_BM25

|D| is the length of the document D in words

create or replace view doc_len
as
select 
  docid, count(1) as cnt 
from
  exploded
group by
  docid
;

avgdl is the average document length in the text collection from which documents are drawn

select avg(cnt) as avgdl from doc_len;

okapi_bm25

Inputs: tf(q_i, D), |D|, avgdl, N, n(q_i)
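The remaining per-term and collection-level statistics can be derived from the same `exploded` table used for `doc_len`. This is a minimal sketch, not a definitive implementation: it assumes `exploded` holds one row per word occurrence, and the column name `word` and the view names `term_freq` and `doc_freq` are hypothetical.

```sql
-- Assumption: exploded has one row per (docid, word) occurrence.
-- The column name `word` is hypothetical; adjust it to the actual schema.

-- tf(q_i, D): number of occurrences of each word in each document
create or replace view term_freq
as
select
  docid, word, count(1) as tf
from
  exploded
group by
  docid, word
;

-- n(q_i): number of documents containing each word
create or replace view doc_freq
as
select
  word, count(distinct docid) as df
from
  exploded
group by
  word
;

-- N: total number of documents in the collection
select count(distinct docid) as n_docs from exploded;
```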

Hyperparameters: k_1, b
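For reference, the quantities above combine as follows in the scoring function given in the linked Wikipedia article. The IDF shown is one common variant (the `+ 1` inside the logarithm keeps it non-negative); other formulations omit it, so treat the exact form as an assumption.

```latex
\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot
  \frac{tf(q_i, D)\,(k_1 + 1)}
       {tf(q_i, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{avgdl}\right)}

\mathrm{IDF}(q_i) = \ln\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right)
```

Commonly cited defaults are k_1 in [1.2, 2.0] and b = 0.75.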
