Created
June 12, 2020 19:43
-
-
Save greggirwin/cc6bc916c6b85446db9aff25f0a543b3 to your computer and use it in GitHub Desktop.
Minimal sorted series support, allowing fast binary searches.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Red [ | |
comment: { | |
Minimal sorted series support, allowing fast binary searches. | |
I tinkered with a "dialected" interface, and more support | |
functions, like `[remove take at index?]` but quickly saw | |
that they were of little value, and that using a parameter | |
for the command was downright misleading to read, no matter | |
how clever the implementation. `Find` is not strictly | |
necessary either, but it does make it nicely consistent | |
with the standard `find` function. | |
} | |
] | |
sorted: context [ | |
e.g.: :none | |
;!! This has an issue, in that it can't deal with heterogeneous data in | |
;!! a block, where the types can't be compared via `<`. | |
; sorted-search as a name? | |
; ?? offer comparison alternatives?: [case strict same] | |
; `Same` makes no sense, as you can't sort by identity | |
binary-search: function [ | |
"Returns the index where a value is (success), or the index of the insertion point as a negative offset (not found)" | |
series [series!] "Pre-sorted series" | |
value "Value to find" | |
/? "Return NONE if the value is not found" | |
][ | |
if empty? series [return either ? [none][-1]] | |
low: 1 | |
high: length? series | |
while [low <= high][ | |
; Normally I would use ROUND/DOWN, but this is one place we do | |
; care about performance, because this func could get called a | |
; lot depending on the application. | |
;mid: round/down low + (high - low / 2) | |
mid: to integer! low + (high - low / 2) | |
; Pick is faster than path syntax, and just as clear here. | |
cmp-val: pick series mid | |
either cmp-val = value [return mid][ ;!! Exit point | |
either cmp-val < value [low: mid + 1] [high: mid - 1] | |
] | |
] | |
either ? [none][negate low] | |
] | |
e.g. [ | |
binary-search [1 2 3 4 5 6] 4 | |
binary-search [1 2 3 4 5 6] 7 | |
binary-search [1 2 3 4 5 6] 0 | |
binary-search [1 2 4 5 6] 3 | |
] | |
; /first and /last are just an idea, based on the need for a record | |
; manager to find the first matching value, so it can step through | |
; all matching keys. That can be done at a higher level than this | |
; function, along with comparison options. | |
find: function [ | |
"Returns the series where a value is found, not necessarily the first, or NONE" | |
;"Returns the series where a value is found, or NONE" | |
series [series!] "Pre-sorted series" | |
value [any-type!] | |
/first "Find the first value" | |
/last "Find the last value" | |
][ | |
if integer? idx: binary-search/? series value [ | |
case [ | |
first [ ; step back until we find a non-match | |
while [series/(idx - 1) = value][idx: idx - 1] | |
] | |
last [ ; step forward until we find a non-match | |
while [series/(idx + 1) = value][idx: idx + 1] | |
] | |
;'else [at series idx] | |
] | |
at series idx | |
] | |
] | |
;!! Note the name, and use of system/words/insert | |
insert: function [ | |
"Inserts a value at its sorted location; returns series past the insertion" | |
series [series!] "Pre-sorted series" | |
value [any-type!] | |
][ | |
system/words/insert/only | |
at series absolute binary-search series value | |
value | |
] | |
] | |
e.g.: :none | |
e.g.: :do | |
e.g. [ | |
blk: [] | |
print "# INSERT" | |
repeat i 20 [print sorted/insert blk random 5] | |
print mold blk | |
print "# FIND" | |
repeat i 5 [ | |
print [i mold sorted/find blk i] | |
] | |
print "/first" | |
repeat i 5 [ | |
print [i mold sorted/find/first blk i] | |
] | |
print "/last" | |
repeat i 5 [ | |
print [i mold sorted/find/last blk i] | |
] | |
] |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment