Skip to content

Instantly share code, notes, and snippets.

@antirez

antirez/LPOS.md Secret

Last active June 23, 2020 23:36
Show Gist options
  • Save antirez/3591c5096bc79cad8b5a992e08304f48 to your computer and use it in GitHub Desktop.
Save antirez/3591c5096bc79cad8b5a992e08304f48 to your computer and use it in GitHub Desktop.

Background: this command was proposed as LRANK in PR 7379. I (antirez) modified the implementation to provide the matchpos and ALL option, and renamed the command into LPOS. This Gist was written as a draft of the command documentation, in order to get feedbacks from the community of Redis users about this command. A note about time complexity. This command is O(N) like a few Redis commadns operating on Redis lists. Normally we don't add commands that have worse than logarithmic complexity in Redis without thinking twice, however there are a few reasons why I'm positive about LPOS:

  1. List operations are often used against very small lists, making them effectively O(1), because certain applications always have a maximum size.
  2. It is a fundamental operation. When it's needed, people will resort to Lua scripting which is an order of magnitude slower than a direct implementation.
  3. The new development branch that shows threaded slow operations, not blocking the server when O(N) commands are called, is very promising. This may be one of the commands implemented in that way.

The implementation of LPOS can be found here: https://github.com/antirez/redis/commits/lpos

If you have feedbacks, please add a comment to this Gist or into the original issue.

LPOS key element [FIRST rank] [COUNT num-matches] [MAXLEN len]

The command returns the index of matching elements inside a Redis list. By default, when no options are given, the command will scan the list from head to tail, looking for the first match of "element". If the element is found, its index (the position) is returned, otherwise if no match is found, NULL is returned.

> RPUSH mylist a b c 1 2 3 c c
> LPOS mylist c
2

The optional arguments and options are able to modify the command behavior. The FIRST option specifies the "rank" of the first element to return, in case there are multiple matches. A rank of 1 means to return the first match, 2 to return the second match, and so forth.

For instance in the above example the element "c" is present multiple times, if I want the index of the second match, I'll write:

> LPOS mylist c FIRST 2
6

That is, the second occurrence of "c" is at position 6. A negative "rank" as the FIRST argument tells LPOS to invert the search direction, starting from the tail to the head. However the indexes are normally still reported in the "natural" way, that is, considering the first element starting from the head of the list at index 0, the next element at index 1, and so forth.

So, we want to say, give me the first element starting from the tail of the list:

> LPOS mylist c -1
7

Sometimes we want to return not just the Nth matching element, but the position of all the first N matching elements. This can be achieved using the COUNT option.

> LPOS mylist c COUNT 2
[2,6]

We can combine COUNT and FIRST, so that COUNT will try to return up to the specified number of matches, but starting from the first one specified.

> LPOS mylist c FIRST -1 COUNT 2
[7,6]

When COUNT is used, if the number of matches may be 0, to just tell the command we want all the matches found returned as an array of indexes.

> LPOS mylist COUNT 0
[2,6,7]

When COUNT is used and no match is found, an empty array is returned. However when COUNT is not used and there are no matches, the command returns NULL.

Finally the MAXLEN option, tells the command to compare the provided element only with the given number of list items. So for instance specifying MAXLEN 1000 will make sure that the command performs only 1000 comparison, effectively running the algorithm on a subset of the list. This is useful to limit the maximum complexity of the command.

@antirez
Copy link
Author

antirez commented Jun 11, 2020

Ok I think I found a good approach, that uses COUNT, converts all the non mandatory arguments to options, and has the 1/-1 symmetry. I updated the implementation and now I'm updating this gist to reflect it.

@CaddyDz
Copy link

CaddyDz commented Jun 11, 2020

This looks like a public RFC!

@antirez
Copy link
Author

antirez commented Jun 11, 2020

Hello @mike-plivo, I'm all for clients that don't mask much Redis internals, but here the solution for clients that want to be higher level, is to just always pass the COUNT option, and the return value will be always the array.

@mgravell
Copy link

With the current edits, there's a slight incongruity because you can't do anything useful with MAXLEN in the case of a miss. You can't check the next chunk, for example.

Purely for discussion, but how about:

LPOS key element [START start] [SKIP skip] [COUNT count] [MAXLEN MAXLEN]`

Where start is an index (0 for first, -1 for end, negative triggers backwards crawl), SKIP is the number of matches to bypass non-negative, etc. This resolves the previous nuances and means that I can do:

LPOS ... MAXLEN 5000
...
LPOS ... START 5000 ... MAXLEN 5000
...

Etc to look iteratively through an uber-list.

@mgravell
Copy link

Also: does COUNT -1 have the semantics that ALL used to?

@antirez
Copy link
Author

antirez commented Jun 11, 2020

@mgravell: Yep we still have "ALL" with COUNT 0 (not -1), since to specify an empty count looks very odd and is unlikely that this is programatically done. However now that you mention this, I would like to check in other Redis commands with a COUNT of 0 does. I remeber that in certain instances it returns all the elements, but in others may not be like that. Better to check and see if -1 is more appropriate. Thaniks.

About MAXLEN, I avoided the START option because the MAXLEN main goal is to reduce the time complexity of the command and avoid it to be unbound. So if you have a large list, but you believe you are interested only in "near" matches, you can do that. START would in some way defeat the original goal of making LPOS "less O(N)".

Thanks!

@antirez
Copy link
Author

antirez commented Jun 11, 2020

XRANGE count 0 means all the elements. In other instances seems like that COUNT <= 0 is not allowed at all.
Looks like COUNT of 0 makes sense as a way to tell the implementation to return all the elements after all.

@mgravell
Copy link

👍

@swayamraina
Copy link

swayamraina commented Jun 11, 2020

IMHO, COUNT = 0 should mean return no match and while COUNT < 0 is illegal. And COUNT = n means return n matches.
Not specifying the COUNT should mean user wants all matches (Currently, not giving COUNT means default value of 1)

regarding @mgravell suggestion on excluding the COUNT keyword, the approach hides the meaning of the request while adding the keyword makes the command verbose plus gives a separator to the command which can be used for future modifications in the command and reordering of options.

Though agreeing to @antirez, the MAXLEN option should be added as part of building a guardrail for abuse.

@antirez
Copy link
Author

antirez commented Jun 11, 2020

@swayamraina what you suggest is not possible becuase COUNT is what triggers the array reply.

@swayamraina
Copy link

Taking this into note that presence of COUNT triggers the array reply, using * or all seems a good choice then.

Also, it seems LPOS will be used for list search of a single element. Will multi-value search make sense in near future or such complex search will have to be done via lua?
What i mean is most of the time one would want to check if certain set of values are present in the list or not i.e. if list contains any or all of the provided values in list.

@antirez
Copy link
Author

antirez commented Jun 12, 2020

Looks strange that the majority of use cases are about multi-value searches. But it will be possible to extend the command later if really needed. IMHO it's better to have a very simple usage for the base case. Then to be more complex only if needed, like with a special option to specify N patterns. We did this for MIGRATE with the "KEYS" option for instance.

@antirez
Copy link
Author

antirez commented Jun 12, 2020

Also note that COUNT 0 meaning "return all the items" is already used in Redis. For instance for the XRANGE command.

@andrewsensus
Copy link

@antirez I think this example didn't get converted from when [FIRST rank] was [matchpos]. This also exists in the documentation here https://redis.io/commands/lpos

- > LPOS mylist c -1
+ > LPOS mylist c FIRST -1
7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment