###
# The result below shows the sum of the population of all cities with a population greater than 1,000,000.
###
{
  Local {
    Get(where:{
      operands: [{
        path: ["Things", "City", "population"],
        operator: GreaterThan
        valueInt: 1000000
      }]
    },{
      group:{
        operands: [{
          path: ["Things", "City", "population"],
          aggregate: SUM # other options: COUNT, MAX, MIN, AVG
        }]
      }
    }) {
      Things {
        City {
          population
        }
      }
    }
  }
}
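For readers unsure what the query above is meant to return, here is a minimal Python sketch of the intended semantics, assuming the `where` filter is applied first and the `SUM` aggregate second. The city names and population figures are made-up sample data, and `sum_population_above` is a hypothetical helper, not part of Weaviate.

```python
# Made-up sample data standing in for the City things in a local Weaviate.
cities = [
    {"name": "A", "population": 1_800_000},
    {"name": "B", "population": 800_000},
    {"name": "C", "population": 3_500_000},
]


def sum_population_above(things, threshold):
    """Apply the `where` filter (GreaterThan), then the SUM aggregate."""
    matching = [t["population"] for t in things if t["population"] > threshold]
    return sum(matching)


# Only cities A and C pass the filter, so only their populations are summed.
total = sum_population_above(cities, 1_000_000)
```

Under this reading, a city that fails the filter contributes nothing to the sum, and an empty match yields 0.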
An extra advantage of doing this is that you'll be able to clearly defend that these are different functions, with different pricing, than just slurping the data out of Weaviate or a network.
That indeed sounds reasonable. I'm not necessarily in favor of one or the other, but syntactically it might well be preferable to introduce a Stats{} function. Naming-wise, maybe Aggregate{} would suit better. Any thoughts, @laura-ham and @moretea?
It would also be possible to add all aggregation functions as GQL-functions.
{
  Local {
    Aggregate(where: { ... }) { # or Stats...
      Sum{}
      Percentile{}
      Count{}
      Average{}
      Maximum{}
      Median{}
      Minimum{}
      Mode{}
      GroupBy() {} # Would be used for more complex group-by functions.
    }
  }
}
@moretea would it be fair to say that splitting these aggregate functions (except for GroupBy()) would be relatively easier to implement?
Maybe they are simple to implement. I would expect so, based on my experience with SQL. However, Gremlin is not as well rounded, and I have not researched this for Gremlin yet.
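As a rough illustration of why the individual functions could be simple on their own, the sketch below maps each proposed field name to a small function over a list of numbers, using only Python's standard library. It says nothing about how hard they are in Gremlin. The `AGGREGATIONS` table, the sample values, and the nearest-rank percentile definition are assumptions made for illustration.

```python
import math
import statistics


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("percentile of empty data")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


# Hypothetical mapping from the proposed field names to plain functions.
AGGREGATIONS = {
    "Sum": sum,
    "Count": len,
    "Average": statistics.mean,
    "Maximum": max,
    "Minimum": min,
    "Median": statistics.median,
    "Mode": statistics.mode,
}

populations = [100, 200, 200, 400, 600]  # made-up sample values
results = {name: fn(populations) for name, fn in AGGREGATIONS.items()}
results["Percentile"] = percentile(populations, 95)
```

Each entry takes the already-filtered values and returns a single number, which is why GroupBy() is left out: it returns one result per group rather than one result overall.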
To me these feel like 'statistical' functions, and not 'Get' functions.
I'd argue that a separate 'Aggregated' or 'Statistics' field under Local would do wonders for keeping the 'Get' function simple.
I imagine that such a field could be translated to Network queries too.
I find it hard to understand what these different aggregations are supposed to do, based on the GraphQL query.
I believe that we should distinguish between simple counts with conditions and more complex operations like groupBy. Each different function/operation should ideally correspond to a field below 'Aggregated' or 'Statistics'.
This will make it very simple for end users to start to do some operations.
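To illustrate the distinction drawn above, here is a small Python sketch contrasting a simple count with a condition against a group-by. The sample records, the `country` property, and both helper names are hypothetical and only serve the example.

```python
from collections import defaultdict

# Made-up sample data; `country` is an assumed property for the example.
cities = [
    {"name": "A", "country": "X", "population": 1_800_000},
    {"name": "B", "country": "X", "population": 800_000},
    {"name": "C", "country": "Y", "population": 3_500_000},
]


def count_where(things, predicate):
    """Simple count with a condition: a single number comes back."""
    return sum(1 for t in things if predicate(t))


def group_by_sum(things, key, value):
    """Group-by: one summed number per distinct key comes back."""
    totals = defaultdict(int)
    for t in things:
        totals[t[key]] += t[value]
    return dict(totals)


big_cities = count_where(cities, lambda c: c["population"] > 1_000_000)
population_per_country = group_by_sum(cities, "country", "population")
```

The first call returns one number while the second returns a mapping from group to number, which is one reason the two may deserve separate fields.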
Initial impressions count, and an initial exposure to simple statistics is very good for the demo-ability of Weaviate.
Simple sum
Output
95% percentile
Output
Group By