@ilyash-b
Created February 9, 2022 11:02
Autovivification for Next Generation Shell - experiment
# Following line will be needed when files will be auto-wrapped in "ns { ... }"
{ global deep_get, deep_set, deep_get_step, deep_set_step, deep_index_to_container_type }

doc %STATUS - experimental
F deep_get_step(node, _) {
    throw LookupFail().set(container=node)
}

doc %STATUS - experimental
F deep_set_step(container, idx_or_key, next_idx_or_key) throw NotImplemented()

doc %STATUS - experimental
F deep_get_step(node, key) {
    guard node =~ AnyOf(Hash, HashLike)
    node[key]
}

doc %STATUS - experimental
F deep_set_step(node, key, val) {
    guard node =~ AnyOf(Hash, HashLike)
    node[key] = val
}

doc %STATUS - experimental
F deep_index_to_container_type(key) Hash

doc %STATUS - experimental
F deep_get_step(node:NormalTypeInstance, field:Str) {
    node.(field)
}

doc %STATUS - experimental
F deep_set_step(node:NormalTypeInstance, field:Str, val) {
    node.(field) = val
}

doc %STATUS - experimental
F deep_get_step(node, idx:Int) {
    guard node =~ AnyOf(Arr, ArrLike)
    node[idx]
}

doc %STATUS - experimental
F deep_set_step(node, idx:Int, val) {
    guard node =~ AnyOf(Arr, ArrLike)
    section "Ensure array is long enough" for(i=len(node);i<=idx;i+=1) {
        node.push(null)
    }
    node[idx] = val
}

doc %STATUS - experimental
F deep_index_to_container_type(idx:Int) Arr

doc %STATUS - experimental
F deep_get(root, path, default=null) block b {
    node = root
    for p in path {
        try {
            node = deep_get_step(node, p)
        } catch(lf:LookupFail) {
            guard lf.container === node
            b.return(default)
        }
    }
    node
}

doc %STATUS - experimental
F deep_set(root, path:Arr, val) {
    assert(path, 'path is expected to have at least one element, deep_set can not set the root')
    path_iter = Iter(path)
    container = null
    idx_or_key = null
    node = section "Navigate to last present element" block b {
        node = root
        while path_iter {
            try {
                container = node
                idx_or_key = path_iter.next()
                node = deep_get_step(node, idx_or_key)
            } catch(lf:LookupFail) {
                guard lf.container === node
                b.return(node)
            }
        }
        node
    }
    # echo("P $path_iter")
    for p in path_iter {
        # echo("P $path_iter $p")
        next_container = deep_index_to_container_type(p)()
        node = deep_set_step(container, idx_or_key, next_container)
        container = next_container
        idx_or_key = p
        # echo("for end: container $container idx_or_key $idx_or_key")
    }
    deep_set_step(container, idx_or_key, val)
    root
}

test("get existing hash->hash", {
    h = {"a": {"b": 1}}
    deep_get(h, ['a', 'b']).assert(1)
})

test("get non-existing hash->hash", {
    h = {"a": {"b": 1}}
    deep_get(h, ['a', 'bx']).assert(null)
    deep_get(h, ['a', 'b', 'c']).assert(null)
})

test("set first level on custom type", {
    type T
    t = T()
    t.deep_set(['field1'], 'val1').assert(T, 'deep_set should return object of type T')
    t.assert({"field1": "val1"})
})

test("set two levels on custom type (hash)", {
    type T
    t = T()
    t.deep_set(['field1', 'key1'], 'val1').assert(T, 'deep_set should return object of type T')
    t.assert({"field1": {"key1": "val1"}})
})

test("set two levels on custom type (arr)", {
    type T
    t = T()
    t.deep_set(['field1', 2], 'val1').assert(T, 'deep_set should return object of type T')
    t.assert({"field1": [null, null, "val1"]})
})
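
For readers who do not know NGS, the behavior of the file above can be modeled roughly in Python. This is only a sketch, not a translation: it uses plain dict and list in place of Hash, Arr and custom types, None in place of null, and the helper name container_for is an assumption standing in for deep_index_to_container_type. It also re-checks each step instead of switching to "create mode" after the first failed lookup, which gives the same result because freshly created containers are empty.

```python
class LookupFail(Exception):
    """Raised when a single step cannot be taken from a node."""


def deep_get_step(node, key):
    # Dicts accept any present key; lists accept in-range integer indexes.
    if isinstance(node, dict) and key in node:
        return node[key]
    if isinstance(node, list) and isinstance(key, int) and 0 <= key < len(node):
        return node[key]
    raise LookupFail(key)


def deep_set_step(node, key, val):
    if isinstance(node, dict):
        node[key] = val
    elif isinstance(node, list) and isinstance(key, int):
        # Pad with None so the index exists (the NGS code pushes null).
        while len(node) <= key:
            node.append(None)
        node[key] = val
    else:
        raise TypeError(f"cannot set {key!r} on {type(node).__name__}")
    return val


def container_for(key):
    # Hypothetical helper: an integer index implies a list, anything else a dict.
    return [] if isinstance(key, int) else {}


def deep_get(root, path, default=None):
    node = root
    for key in path:
        try:
            node = deep_get_step(node, key)
        except LookupFail:
            return default
    return node


def deep_set(root, path, val):
    assert path, "path needs at least one element"
    node = root
    # Walk every element except the last, creating containers where missing.
    for key, next_key in zip(path, path[1:]):
        try:
            node = deep_get_step(node, key)
        except LookupFail:
            node = deep_set_step(node, key, container_for(next_key))
    deep_set_step(node, path[-1], val)
    return root
```

In this model, setting through an existing scalar raises TypeError, while reading through one quietly returns the default, which mirrors the two behaviors discussed in the comments below.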
@rdje

rdje commented Feb 9, 2022

1 basic question.

What's the meaning of an underscore '_' as the rightmost argument of a function/method definition ?

Like the one in the definition of deep_get_step().

@rdje

rdje commented Feb 9, 2022

A few comments

You're really fast ! :)

The code looks operational already ! :)

Let's say we have a path P which can split in two, say P1 and P2, so that

P == P1 P2

If you deep_set(P1, aScalar), aScalar = Int or Str

Trying to later do a deep_set(P, aValue) or deep_get(P) should throw an exception.

Before being able to deep_set(P, aValue) or deep_get(P), the user shall change the type of the value stored at P1 from aScalar to an aggregate of the right type.

I am saying right type because depending on the type of the very first component of P2, P might/should still fail.

Let's say, the user changes the value at P1 to a Hash, then if P2 starts with an Int the access set/get past P1 shall fail and vice versa,
if the value at P1 is changed to an Arr, then if P2 starts with a Str, the access set/get past P1 shall also fail.

Not sure if I am clear here.

That is why, to me, the call on line 117 (deep_get(h, ['a', 'b', 'c'])) should not return null but throw an exception, because the value at a->b is a scalar and we try to get past this scalar using a->b->c.

That's the only feedback I can provide right now.

Well done ! :)

-Richard
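
The stricter behavior proposed in this comment could look roughly like the following Python sketch (not NGS). It assumes, as the comment does, that a Str step requires a Hash (dict here) and an Int step requires an Arr (list here). The names strict_deep_get and PathTypeError are hypothetical. A missing key or index still yields the default; only a container of the wrong kind raises.

```python
class PathTypeError(Exception):
    """Raised when a path step does not match the kind of container found."""


def strict_deep_get(root, path, default=None):
    node = root
    for depth, key in enumerate(path):
        # Assumption from the comment: Int steps need a list, other steps a dict.
        expected = list if isinstance(key, int) else dict
        if not isinstance(node, expected):
            raise PathTypeError(
                f"step {depth} ({key!r}) needs a {expected.__name__}, "
                f"found {type(node).__name__}"
            )
        if expected is list:
            if not 0 <= key < len(node):
                return default  # index simply not there yet
        elif key not in node:
            return default  # key simply not there yet
        node = node[key]
    return node
```

With h = {"a": {"b": 1}}, strict_deep_get(h, ["a", "bx"]) returns the default, while strict_deep_get(h, ["a", "b", "c"]) raises because the step "c" lands on a scalar.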

@ilyash-b
Author

What's the meaning of an underscore '_' as the rightmost argument of a function/method definition ?

I just followed the convention from other languages where a leading underscore marks an unused parameter. It should probably be _idx_or_val.

You're really fast ! :)

These just happened to be less busy days. It's not indicative :)

Trying to later do a deep_set(P, aValue) or deep_get(P) should throw an exception.

The following does throw. Maybe not a very clear exception, but it does.

t.deep_set(["field1", 2], "val1")
t.deep_set(["field1", 2, "x"], "val1b")

Before being able to deep_set(P, aValue) or deep_get(P), the user shall change the type of the value stored at P1 from aScalar to an aggregate of the right type.

Sounds like that's how it works now.

That is why, to me, the call on line 117 (deep_get(h, ['a', 'b', 'c'])) should not return null but throw an exception, because the value at a->b is a scalar and we try to get past this scalar using a->b->c.

My view on that: the whole purpose of get family functions is that the caller knows the data is not necessarily there. If you know it's there, just use the regular h.a.b.c.

Let's say, the user changes the value at P1 to a Hash, then if P2 starts with an Int the access set/get past P1 shall fail and vice versa,
if the value at P1 is changed to an Arr, then if P2 starts with a Str, the access set/get past P1 shall also fail.

Need to think about that. Interesting point.

Unrelated note: autovivification does not happen on mere reference. It looks like this behavior in Perl got bad feedback, which I agree with. New containers will only be added when using deep_set().

I would like to see some scripts in NGS which use this facility to get more feeling about the facility and the implementation. It does bother me that I did not have (or did not notice) use cases for this previously.

@rdje

rdje commented Feb 10, 2022

An additional comment.

Your current implementation accepts only a single argument of type Arr to specify the path to the final node.

In my initial suggestion I also mentioned the possibility to use a mixture of Int, Str and Arr, meaning that in my view the following two deep_get() calls shall be equivalent

deep_get(7, "A", 16, ["aa", "bb", 10, "cc"], "lastcomp") ≡ deep_get(7, "A", 16, "aa", "bb", 10, "cc", "lastcomp")

The same for deep_set()

deep_set(7, "A", 16, ["aa", "bb", 10, "cc"], "lastcomp", aValue) ≡ deep_set(7, "A", 16, "aa", "bb", 10, "cc", "lastcomp", aValue)

For deep_set() there is no ambiguity, at least to me, about which arguments are part of the path's elements and which one is the actual value to store, since the latter is the last argument, that is the rightmost.

Do you think allowing the path to be specified using potentially more than one argument and not just a single Arr argument is doable ?
Is it acceptable for you ? performance-wise, ...

-Richard
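
To make the proposed calling convention concrete, here is a small Python sketch (not NGS) of how mixed path components could be normalized into a single flat path before the existing single-Arr logic runs. The function names flatten_path and split_set_args are hypothetical, and only one level of nesting is flattened, matching the "Arr of Int or Str" description above.

```python
def flatten_path(*components):
    """Turn a mix of Int, Str and lists of those into one flat path list."""
    path = []
    for component in components:
        if isinstance(component, (list, tuple)):
            path.extend(component)  # one level only, as in the proposal
        else:
            path.append(component)
    return path


def split_set_args(*args):
    """For a variadic deep_set: the rightmost argument is the value to store."""
    if len(args) < 2:
        raise TypeError("need at least one path component and a value")
    *components, value = args
    return flatten_path(*components), value
```

For example, flatten_path(7, "A", 16, ["aa", "bb", 10, "cc"], "lastcomp") and flatten_path(7, "A", 16, "aa", "bb", 10, "cc", "lastcomp") both produce the same list. One design consequence: with this convention a list can never be stored as a path element itself, and for deep_set a list-valued last argument is always the value, never a path piece.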

@rdje

rdje commented Feb 10, 2022

You wrote

...New containers will only be added when using deep_set()

That is fine by me if you stick to this idea, but I would say that by using deep_get() you're explicitly asking NGS to use the autovivification feature. That means you're OK with, or maybe even want, the whole path being constructed if it doesn't yet exist, with the final node initialized to null.

It is not that NGS is doing autovivification behind the user's back. The user has to explicitly ask for it, so he knows what to expect.

My view here is that as long as the autovivification feature is precisely explained both for deep_set() and for deep_get(), I would find it a bit limiting or constraining to say we won't autovivify on get() because it might create new paths, but that is the whole point.

deep_set() and deep_get() are functions provided explicitly for a particular type of activity. They are not part of the NGS language syntax, unlike in Perl5.

If the node at path does not yet exist, meaning if its value when fetched is null then we should have the following equivalence

deep_get(path) ≡ deep_set(path, null)

For me both deep_set() and deep_get() should autovivify when called, but you're the judge.

@rdje

rdje commented Feb 10, 2022

You could even provide a variant of deep_get() for probing for a path without constructing the said path.

It could be called

deep_peek()

This would be deep_get() without the side effect of constructing the path when consuming it.

In short, get() would have a well explained and understood side effect, which could be exactly what the user is looking for, while peek() would travel along the path without affecting anything. That is, it would check for the path in ninja or stealth mode, without leaving a trace.

That is deep_peek() won't autovivify but deep_get() will.

I really think you should not rule out this get/peek pair as the idea is to give people (me ? :) ) choices and not corner them as long as things are well explained so there is no surprise with what to expect.

What do you think ?
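
The get/peek pair described in this comment could behave roughly as in the Python sketch below (not NGS). It is limited to nested dicts to stay short, and the name vivifying_get is a hypothetical stand-in for the autovivifying deep_get() being proposed, not the gist's current deep_get().

```python
def deep_peek(root, path, default=None):
    """Look along the path without creating anything."""
    node = root
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def vivifying_get(root, path):
    """Proposed behavior: build the missing path, leaf initialized to None."""
    node = root
    for key in path[:-1]:
        node = node.setdefault(key, {})  # create intermediate containers
    return node.setdefault(path[-1], None)
```

Starting from h = {}, deep_peek(h, ["a", "b"]) returns the default and leaves h empty, whereas vivifying_get(h, ["a", "b"]) also returns None but leaves h as {"a": {"b": None}}.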

@rdje

rdje commented Feb 10, 2022

Another comment to something you wrote earlier

You wrote

My view on that: the whole purpose of get family functions is that the caller knows the data is not necessarily there. If you know it's there, just use the regular h.a.b.c.

I assume that in

h.a.b.c

The keys a, b and c are identifiers, that is they match the following RE

/[[:alpha:]_]\w*/

Whereas using any of the deep_*() functions you would free the user from that constraint and allow any string of characters to be used when doing deep parkour along path which may be specified in pieces or components where each could be one of

  • Int
  • Str
  • Arr of either of the above. Ints and Strs may appear in the same Arr

At least that is how I see it.

@ilyash-b
Author

The keys a, b and c are identifiers

Correct.

constraint

Nope. There is no constraint because of alternative syntax:

  • For Hash, like in JavaScript, one can use h["what ever"]
  • For objects, including user defined ones, the dot access syntax h.a becomes h.("what ever")

@ilyash-b
Author

For short, using get() would have a well explained and understood side effect

Somehow I think that anything with get should not have side effects. We can maybe add deep_make() or deep_ensure() or anything of that sort to act like deep_set() but without the final step of setting the leaf value.
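
A rough Python sketch (not NGS) of what such a deep_ensure() might do, assuming it mirrors deep_set() up to but excluding the final assignment. It is restricted to nested dicts for brevity; the function does not exist in NGS as far as this thread shows, so the name and behavior are only the suggestion above made concrete.

```python
def deep_ensure(root, path):
    """Create every container on the way to the leaf, but not the leaf itself."""
    node = root
    for key in path[:-1]:
        if not isinstance(node, dict):
            raise TypeError(f"cannot descend into {type(node).__name__} at {key!r}")
        if key not in node:
            node[key] = {}  # vivify only what is missing
        node = node[key]
    return root
```

For instance, deep_ensure({}, ["a", "b", "c"]) yields {"a": {"b": {}}}: the containers exist, but the key "c" is still absent.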

@ilyash-b
Author

deep_set(path, null)

mm... absence of an element is different than having it with null value in NGS.
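
The same distinction can be shown in a Python sketch (not NGS), with None standing in for null and nested dicts for Hash. The sentinel name MISSING is an assumption introduced here; passing it as the default lets a caller tell "present with a null value" apart from "absent".

```python
MISSING = object()  # hypothetical sentinel, distinct from every real value


def deep_get(root, path, default=None):
    node = root
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


h = {"a": {"b": None}}
present_but_null = deep_get(h, ["a", "b"], MISSING)  # key exists, value is None
absent = deep_get(h, ["a", "c"], MISSING)            # key does not exist
```

Here present_but_null is None while absent is MISSING, so deep_get(path) and deep_set(path, null) cannot be treated as equivalent without losing that information.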

@ilyash-b
Author

I guess it's time to try the "discussions" feature of GitHub. Did not have the chance yet.

@ilyash-b
Author

Moving discussion to ngs-lang/ngs#550
