Lightweight text indexer for PHP
Uses the dba extension (with db4)
Hello my name is Arnold and I'm not crazy. Arnold's kids are crazy though.
Convert the string to lowercase, then remove all stop words and all words shorter than 3 characters.
hello my name is arnold and i'm not crazy. arnold~~'s~~ kids are crazy though.
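The normalisation step can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the function name tokenize and the STOP_WORDS list are hypothetical, since the stop-word list the indexer uses is not given here. Removing stop words is easiest once the string is split, so the sketch also splits and de-duplicates.

```php
<?php
// Hypothetical stop-word list (assumption); the real list is not shown in this document.
const STOP_WORDS = ['and', 'are', 'not', 'the', 'though'];

/**
 * Lowercase the text, split it into words, drop stop words and words
 * shorter than 3 characters, and remove duplicates.
 */
function tokenize(string $text): array
{
    $text = strtolower($text);
    // Split on anything that is not a letter or digit. Apostrophes split too,
    // so "arnold's" becomes "arnold" and "s", and the "s" is dropped as too short.
    $words = preg_split('/[^a-z0-9]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $words = array_filter($words, function (string $word): bool {
        return strlen($word) >= 3 && !in_array($word, STOP_WORDS, true);
    });
    // array_unique keeps the first occurrence; array_values reindexes from 0.
    return array_values(array_unique($words));
}
```

With this stop-word list, the example sentence above yields hello, name, arnold, crazy and kids.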
Split the string into words and pass the array through array_unique. Add each word to the index under the key word:<word>, storing the item's group:id in its value.
word:hello ["mygroup:1"]
word:name ["mygroup:1"]
word:arnold ["mygroup:1"]
word:crazy ["mygroup:1"]
word:kids ["somegroup:7", "mygroup:1"]
Store the item's full word list in the index under the key item:group:id.
item:mygroup:1 ["hello", "name", "arnold", "crazy", "kids"]
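The two writes above might look like the sketch below. To keep it runnable without the dba extension, a plain PHP array stands in for the database; that stand-in, and the function name indexItem, are assumptions made for illustration. With dba, each read and write would go through the handle and the lists would need to be serialised to strings first.

```php
<?php
/**
 * Add an item to the index.
 * $index is a plain array standing in for the dba database (assumption).
 */
function indexItem(array &$index, string $group, string $id, array $words): void
{
    $itemKey = "$group:$id";
    foreach ($words as $word) {
        // Fetch the current list for this word, or start a new one.
        $ids = $index["word:$word"] ?? [];
        if (!in_array($itemKey, $ids, true)) {
            $ids[] = $itemKey;
        }
        $index["word:$word"] = $ids;
    }
    // Remember which words this item was indexed under, for later updates.
    $index["item:$itemKey"] = array_values($words);
}

// Usage: "kids" is already present for another item.
$index = ['word:kids' => ['somegroup:7']];
indexItem($index, 'mygroup', '1', ['hello', 'name', 'arnold', 'crazy', 'kids']);
```

Keeping the item:group:id entry alongside the word entries is what makes it possible to work out later which words an item no longer contains.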
My crazy kids
Convert the string to lowercase, then remove all stop words and all words shorter than 3 characters.
my crazy kids
Split the string into words and pass the array through array_unique. Look up each word.
word:crazy ["mygroup:1", "mygroup:22", "somegroup:99"]
word:kids ["somegroup:7", "mygroup:1"]
Use array_intersect to get the items that match all words.
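A lookup along these lines could be sketched as follows. As before, a plain array stands in for the dba database and the function name search is hypothetical. The words passed in are assumed to be normalised already.

```php
<?php
/**
 * Return the group:id entries that contain every one of the given words.
 * $index is a plain array standing in for the dba database (assumption).
 */
function search(array $index, array $words): array
{
    $result = null;
    foreach ($words as $word) {
        $ids = $index["word:$word"] ?? [];
        // The first word seeds the result; every later word narrows it down.
        $result = $result === null ? $ids : array_intersect($result, $ids);
        if ($result === []) {
            break; // no item can match all words any more
        }
    }
    // array_intersect preserves keys, so reindex before returning.
    return array_values($result ?? []);
}

$index = [
    'word:crazy' => ['mygroup:1', 'mygroup:22', 'somegroup:99'],
    'word:kids'  => ['somegroup:7', 'mygroup:1'],
];
$matches = search($index, ['crazy', 'kids']); // ['mygroup:1']
```

A word that is missing from the index produces an empty list, so the whole search returns no matches.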
Hello my name is Arnold and I'm not smart. Arnold's kids are very smart though.
Convert the string to lowercase, then remove all stop words and all words shorter than 3 characters.
hello my name is arnold and i'm not smart. arnold~~'s~~ kids are very smart though.
Split the string into words and pass the array through array_unique. Get the old words for group:id.
Use array_diff(old, new) to get the words that were removed. Remove group:id from the array of each removed word, and delete the word's key from the index entirely if no other item contains the word.
word:crazy ["mygroup:1"]
Use array_diff(new, old) to get the words that are new and add them to the index.
word:very ["mygroup:9", "mygroup:1"]
word:smart ["mygroup:1"]
Update item:group:id
item:mygroup:1 ["hello", "name", "arnold", "smart", "kids", "very"]
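Put together, the update steps could be sketched like this. It is an illustration under the same assumptions as the rest of this walkthrough: a plain array stands in for the dba database, and the function name updateItem is hypothetical.

```php
<?php
/**
 * Re-index an item whose text has changed.
 * $index is a plain array standing in for the dba database (assumption).
 */
function updateItem(array &$index, string $group, string $id, array $newWords): void
{
    $itemKey  = "$group:$id";
    $oldWords = $index["item:$itemKey"] ?? [];

    // Words the item no longer contains: drop the item from each list.
    foreach (array_diff($oldWords, $newWords) as $word) {
        $ids = array_values(array_diff($index["word:$word"] ?? [], [$itemKey]));
        if ($ids === []) {
            unset($index["word:$word"]); // no other item uses this word
        } else {
            $index["word:$word"] = $ids;
        }
    }

    // Words that are new for the item: append the item to each list.
    foreach (array_diff($newWords, $oldWords) as $word) {
        $ids = $index["word:$word"] ?? [];
        $ids[] = $itemKey;
        $index["word:$word"] = $ids;
    }

    $index["item:$itemKey"] = array_values($newWords);
}

$index = [
    'word:crazy'     => ['mygroup:1'],
    'word:very'      => ['mygroup:9'],
    'item:mygroup:1' => ['hello', 'name', 'arnold', 'crazy', 'kids'],
];
updateItem($index, 'mygroup', '1', ['hello', 'name', 'arnold', 'smart', 'kids', 'very']);
```

Only the words that actually changed are touched, so entries such as word:hello are never rewritten during an update.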