In the light client (or any other client), the user may want to subscribe to a subset of events (rather than all of them) using `/subscribe?event=X`. For example, I want to subscribe to all transactions associated with a particular account. The same goes for fetching: the user may want to fetch transactions based on some filter (rather than fetching all the blocks). For example, I want to get all transactions for a particular account in the last two weeks (`tx's block time >= time.Now().Add(-14 * 24 * time.Hour)`).
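A minimal sketch of the fetch filter described above, assuming a hypothetical `Tx` record with an account and a block time (neither the type nor `FilterRecent` is part of Tendermint). Note that Go has no `time.Week`, so the window is spelled out in hours.

```go
package main

import (
	"fmt"
	"time"
)

// Tx is a hypothetical indexed transaction record, not a Tendermint type.
type Tx struct {
	Account   string
	BlockTime time.Time
}

// TwoWeeks is the example window; Go has no time.Week constant.
const TwoWeeks = 14 * 24 * time.Hour

// DemoNow is a fixed "current time" so the example is deterministic.
var DemoNow = time.Date(2017, 5, 1, 0, 0, 0, 0, time.UTC)

// DemoTxs returns a few sample transactions relative to DemoNow.
func DemoTxs() []Tx {
	return []Tx{
		{Account: "Igor", BlockTime: DemoNow.Add(-24 * time.Hour)},
		{Account: "Igor", BlockTime: DemoNow.Add(-30 * 24 * time.Hour)},
		{Account: "Ivan", BlockTime: DemoNow.Add(-time.Hour)},
	}
}

// FilterRecent returns the transactions of the given account whose block
// time is at or after now minus window.
func FilterRecent(txs []Tx, account string, now time.Time, window time.Duration) []Tx {
	cutoff := now.Add(-window)
	var out []Tx
	for _, tx := range txs {
		if tx.Account == account && !tx.BlockTime.Before(cutoff) {
			out = append(out, tx)
		}
	}
	return out
}

func main() {
	recent := FilterRecent(DemoTxs(), "Igor", DemoNow, TwoWeeks)
	fmt.Println(len(recent)) // only the one-day-old Igor tx is inside the window
}
```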
The goal is a simple and easy to use API for doing that.
Tx Send Flow Diagram - https://www.dropbox.com/s/x45qaj5fq04qo1x/tags1.png?dl=0
1. Design question: in the above diagram, why does the light client connect to Tendermint directly rather than through the ABCI app? @srm's current architecture looks more like https://www.dropbox.com/s/c4r99a7a5p1tpz9/tags2.png?dl=0 where the client is part of the ABCI app (he is not using the light-client library).
2. Won't we end up in a place where Tendermint and the app both have their own indexers (Tendermint storing tx results, the app storing domain-specific details)? If so, maybe we should let our users do the indexing. Yes, it means that in Basecoin we will have to use a KV indexer, http://www.blevesearch.com/ or something else to index accounts. The problem with the tags approach below (see Proposal) is that it doesn't allow complex types (http://www.blevesearch.com/docs/Getting%20Started/). What if the user wants to index some complex struct? How will we encode it and transfer it to Tendermint? go-wire? (That would mean requiring custom encoding.)
3. There is a question of who should manage tx indexing keys - the app or Tendermint. We've discussed it already. But my question is (maybe it is silly): why do we need a hash in the first place? Is the tuple `{height, index}` not enough? It uniquely identifies a transaction (?). So, instead of letting the app manage the keys or saying that your data should not be malleable, we could send an `index` field with a `DeliverTx` request and let the app do the indexing (it can add some domain-specific details or something else - we cannot really predict).
# {block height, tx index} => ...
{123, 10} => [{account_holder, "Joe"}, {account_desc, "Private account"}]
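To make the `{block height, tx index}` idea concrete, here is a rough sketch of an app-side index keyed by that tuple. The `TxKey`, `Tag` and `TxIndex` names are invented for illustration and do not exist in Tendermint or ABCI.

```go
package main

import "fmt"

// TxKey identifies a transaction by block height and position in the block.
// Hypothetical type, used here instead of a tx hash.
type TxKey struct {
	Height int64
	Index  uint32
}

// Tag is a single key-value pair attached to a transaction.
type Tag struct {
	Key   string
	Value string
}

// TxIndex maps {height, index} to the tags the app chose to store.
type TxIndex map[TxKey][]Tag

// Lookup returns the tags stored for the transaction at (height, index).
func (idx TxIndex) Lookup(height int64, index uint32) ([]Tag, bool) {
	tags, ok := idx[TxKey{Height: height, Index: index}]
	return tags, ok
}

func main() {
	idx := TxIndex{}
	idx[TxKey{Height: 123, Index: 10}] = []Tag{
		{Key: "account_holder", Value: "Joe"},
		{Key: "account_desc", Value: "Private account"},
	}
	tags, ok := idx.Lookup(123, 10)
	fmt.Println(ok, tags)
}
```

Because a struct of comparable fields is itself comparable in Go, the tuple can be used as a map key directly, with no hashing on the app side.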
The ABCI app returns tags with a `DeliverTx` response inside the `data` field (for now; later we may create a separate field). Tags are a list of key-value pairs.
Example data:
{
"channels": ["abci.account_owner.Igor", "abci.account_number.333222111"],
"work": 10,
"priority": 5,
"account.owner": "Igor"
}
Tendermint will most likely have some reserved tags - e.g. "channels" (see below).
If the user wants to receive only a subset of events of type X, he/she must return a `channels` tag with the `DeliverTx` response. For every channel in that list, Tendermint will notify the subscribers.
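The notification step could look roughly like the sketch below. The `Bus` and `Event` types are assumptions made up for this example (this is not Tendermint's event switch), and the code is not safe for concurrent use without extra locking.

```go
package main

import "fmt"

// Event is a hypothetical event carrying its type and the channels
// the app listed in the reserved "channels" tag.
type Event struct {
	Type     string
	Channels []string
}

// Bus keeps the subscribers of each channel name.
type Bus struct {
	subs map[string][]chan Event
}

// NewBus creates an empty bus.
func NewBus() *Bus {
	return &Bus{subs: make(map[string][]chan Event)}
}

// Subscribe registers a buffered receiver for one channel name.
func (b *Bus) Subscribe(channel string) <-chan Event {
	ch := make(chan Event, 16)
	b.subs[channel] = append(b.subs[channel], ch)
	return ch
}

// Publish delivers ev to the subscribers of every channel it is tagged
// with and returns the number of deliveries. Events for subscribers whose
// buffer is full are dropped rather than blocking the publisher.
func (b *Bus) Publish(ev Event) int {
	delivered := 0
	for _, name := range ev.Channels {
		for _, ch := range b.subs[name] {
			select {
			case ch <- ev:
				delivered++
			default: // slow subscriber, drop the event
			}
		}
	}
	return delivered
}

func main() {
	bus := NewBus()
	igor := bus.Subscribe("abci.account_owner.Igor")
	n := bus.Publish(Event{
		Type:     "Tx",
		Channels: []string{"abci.account_owner.Igor", "abci.account_number.333222111"},
	})
	fmt.Println(n, (<-igor).Type)
}
```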
We will need to add an optional `channels` field to the subscribe endpoint:
/subscribe?event=X - events of type X
/subscribe?event=X&channels="abci.account_owner.Igor" - events of type X tagged `abci.account_owner.Igor`
/subscribe?channels="abci.account_owner.Igor"&channels="abci.account_owner.Ivan" - all events tagged `abci.account_owner.Igor` OR `abci.account_owner.Ivan`
/subscribe?channels="abci.account_owner.Igor" - all events tagged `abci.account_owner.Igor`
/subscribe?event=X&channel_patterns="abci.account_owner.*" - events of type X tagged `abci.account_owner.*` (e.g. `abci.account_owner.Igor`, `abci.account_owner.Ivan`)
/subscribe?channel_patterns="abci.account_owner.*" - all events tagged `abci.account_owner.*` (e.g. `abci.account_owner.Igor`, `abci.account_owner.Ivan`)
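One possible reading of the `event` and `channels` parameters above, sketched with the standard `net/url` package: an empty `event` means any event type, and several `channels` values are OR-ed. `Subscription` and `ParseSubscription` are hypothetical names, and stripping the surrounding double quotes is an assumption based on the example URLs. Pattern parameters are left out here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Subscription is a hypothetical parsed /subscribe query.
type Subscription struct {
	Event    string   // empty means any event type
	Channels []string // empty means no channel restriction
}

// ParseSubscription parses the raw query string of a /subscribe request.
func ParseSubscription(rawQuery string) (Subscription, error) {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return Subscription{}, err
	}
	sub := Subscription{Event: values.Get("event")}
	for _, c := range values["channels"] {
		// The examples quote channel names, so strip the quotes.
		sub.Channels = append(sub.Channels, strings.Trim(c, `"`))
	}
	return sub, nil
}

// Matches reports whether an event of the given type, tagged with the
// given channels, should be delivered to this subscription.
func (s Subscription) Matches(eventType string, tagged []string) bool {
	if s.Event != "" && s.Event != eventType {
		return false
	}
	if len(s.Channels) == 0 {
		return true
	}
	for _, want := range s.Channels {
		for _, have := range tagged {
			if want == have {
				return true // any one matching channel is enough (OR)
			}
		}
	}
	return false
}

func main() {
	sub, err := ParseSubscription(`event=X&channels="abci.account_owner.Igor"`)
	if err != nil {
		panic(err)
	}
	fmt.Println(sub.Matches("X", []string{"abci.account_owner.Igor"}))
	fmt.Println(sub.Matches("Y", []string{"abci.account_owner.Igor"}))
}
```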
Frey suggested adding wildcard routes to allow clients to subscribe using regexps (e.g. `abci.accounts.*`). Do we need full regexp syntax or just a strict subset - `*?[`? Glob-style pattern matching should be enough (https://github.com/antirez/redis/blob/d680eb6dbdf2d2030cb96edfb089be1e2a775ac1/src/util.c#L46).
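For what it's worth, Go's standard `path.Match` already implements a `*?[` glob subset, so a first cut of `channel_patterns` might not need a custom matcher. A small sketch, where `MatchAny` is a hypothetical helper; `*` in `path.Match` does not cross `/`, which does not matter for dot-separated channel names:

```go
package main

import (
	"fmt"
	"path"
)

// MatchAny reports whether channel matches at least one glob pattern.
// It returns an error if a pattern is malformed.
func MatchAny(patterns []string, channel string) (bool, error) {
	for _, p := range patterns {
		ok, err := path.Match(p, channel)
		if err != nil {
			return false, fmt.Errorf("bad pattern %q: %w", p, err)
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	patterns := []string{"abci.account_owner.*"}
	for _, ch := range []string{"abci.account_owner.Igor", "abci.account_number.333222111"} {
		ok, err := MatchAny(patterns, ch)
		fmt.Println(ch, ok, err)
	}
}
```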
This is a bit tricky because a) we want to support a number of indexers, all of which have a different API; b) we don't know whether tags will be sufficient for most apps (I guess we'll see); c) I am still not convinced this should be on the Tendermint side and not on the ABCI side.
Some indexers (Elasticsearch) require a schema (they call it a "mapping") to be able to index the data. Where should this schema be defined? And when?
Schema for the `account` field:
{
"account" : {
"properties" : {
"owner" : {
"type" : "string"
},
"ID" : {
"type" : "integer"
}
}
}
}
Data:
{
"channels": ["abci.account_owner.Igor", "abci.account_number.333222111"],
"work": 10,
"priority": 5,
"account": {
"owner": "Igor Black",
"ID": 333222111
}
}
Or am I digging too deep? What was the original plan? Tags as a list of strings (`["work:5","account_owner.Igor"]`), which we index and allow match queries on?
Besides indexing, every indexer has its own query syntax (e.g. http://okfnlabs.org/blog/2013/07/01/elasticsearch-query-tutorial.html).
Based on the feedback from our users (@srm), we can assume that queries could be arbitrary: tendermint/tendermint#525 (comment).
{
"query": {
"term" : { "account": { "owner": "Igor" } },
"constant_score": {
"filter": {
"range": {
"tx_commited_at": {
"from": "2017-01-01",
"to": "2017-05-01"
}
}
}
}
}
}
Notes:
- Trusted vs Untrusted index, that we verify after
- app duplicating hash implementation
Originally I was thinking this was just for tx stuff, but I suppose it might be useful to have it on non-tx events too. I still think we need to do work on RPC authentication to limit how much load a client can put on a public server, or how many clients can even connect, and so on.
Full regex is probably overkill. Just `*` might even be sufficient for now.
Still? Open to your thoughts here. Remember clients need to talk to Tendermint for proofs, unless we burden all app devs with exposing Tendermint proof stuff. What do you think is better?
These are good questions. I'm not sure exactly. Maybe we're trying to do too much with all this super general-purpose abstraction (arbitrary tags, arbitrary indexers). The original thinking was that tags would just be a set of key-value pairs.
I think we can avoid more complex data structure values for now with a convention over keys (i.e. `account.name` and `account.address` instead of `account` being some struct).
The channels thing is interesting. But why not allow any tag to be a channel? Is it because it makes it more complicated to subscribe based on the value? Again we could have some standard, like subscribing to `account.name=Igor` means subscribing to `account.name` where the value is Igor. I could see folks wanting more complex things too, like `priority>7` for txs with priority above 7.
So at what point would we just decide to go with a particular indexing solution so we can support some of that sort of complexity, or is that too dangerous a rabbit hole?
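As a rough illustration of how far one can get without a full indexer, conditions such as `account.name=Igor` and `priority>7` can be evaluated against flat string tags with a few lines of code. The `Condition` type and the two-operator grammar below are assumptions for the sake of the example, not a proposed syntax.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Condition is a hypothetical parsed subscription condition.
type Condition struct {
	Key   string
	Op    string // "=" or ">"
	Value string
}

// ParseCondition splits s at its first "=" or ">" operator.
func ParseCondition(s string) (Condition, error) {
	i := strings.IndexAny(s, "=>")
	if i <= 0 || i == len(s)-1 {
		return Condition{}, fmt.Errorf("unsupported condition %q", s)
	}
	return Condition{Key: s[:i], Op: s[i : i+1], Value: s[i+1:]}, nil
}

// Matches evaluates the condition against flat key-value tags.
// "=" compares strings; ">" compares both sides as numbers.
func (c Condition) Matches(tags map[string]string) bool {
	got, ok := tags[c.Key]
	if !ok {
		return false
	}
	switch c.Op {
	case "=":
		return got == c.Value
	case ">":
		have, err1 := strconv.ParseFloat(got, 64)
		want, err2 := strconv.ParseFloat(c.Value, 64)
		return err1 == nil && err2 == nil && have > want
	}
	return false
}

func main() {
	tags := map[string]string{"account.name": "Igor", "priority": "10"}
	for _, raw := range []string{"account.name=Igor", "priority>7", "priority>12"} {
		cond, err := ParseCondition(raw)
		if err != nil {
			panic(err)
		}
		fmt.Println(raw, cond.Matches(tags))
	}
}
```

Anything beyond this (AND/OR, ranges, full-text) is where a dedicated indexer starts to pay off.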
For now, the main priorities are to subscribe to particular classes of app events (I think the channels and `*` is pretty good for this) and to have some other tags to support mempool behaviour (`work`, `priority`). I don't think we need to worry about more data structures yet. For instance, if we return `account.name.Igor` as a channel, I don't think we need to return `"account": {"name": "Igor"}` as a tag too.
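If an app does start from a nested value, the dotted-key convention and the channel names can be derived mechanically, which is one way to see why returning both is redundant. A sketch under that assumption; `Flatten` and `Channels` are hypothetical helpers, and the `<key>.<value>` channel naming simply follows the `account.name.Igor` example:

```go
package main

import (
	"fmt"
	"sort"
)

// Flatten converts nested tag values into flat dotted keys,
// e.g. {"account": {"name": "Igor"}} becomes {"account.name": "Igor"}.
func Flatten(nested map[string]interface{}) map[string]string {
	flat := make(map[string]string)
	flattenInto(flat, "", nested)
	return flat
}

func flattenInto(flat map[string]string, prefix string, nested map[string]interface{}) {
	for k, v := range nested {
		key := k
		if prefix != "" {
			key = prefix + "." + k
		}
		if child, ok := v.(map[string]interface{}); ok {
			flattenInto(flat, key, child)
			continue
		}
		flat[key] = fmt.Sprint(v) // scalars are stored as strings
	}
}

// Channels derives sorted channel names of the form "<key>.<value>".
func Channels(flat map[string]string) []string {
	names := make([]string, 0, len(flat))
	for k, v := range flat {
		names = append(names, k+"."+v)
	}
	sort.Strings(names)
	return names
}

func main() {
	nested := map[string]interface{}{
		"account":  map[string]interface{}{"name": "Igor"},
		"priority": 5,
	}
	fmt.Println(Channels(Flatten(nested)))
}
```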