Casbah DSL for MongoDB Aggregation

I'm finally finishing support for MongoDB Aggregation in Casbah's Query DSL, and I'd like community input.

The trick is to prevent accidental use of the aggregation syntax in ordinary queries, while still validating statements as cleanly as possible.

There are a few ways to go about this, and some concerns with each. My current favorite looks a bit like this:

(scroll over to see the whole thing)

val agg = | $group { ("lastAuthor" $last "$author") ++ ("firstAuthor" $first "$author")  ++ ("_id" -> "$foo") } $unwind "$tags" $sort ( "foo" -> 1, "bar" -> -1 ) $skip 5 $limit 10 $match { "score" $gt 50 $lte 90 }

In fact, that's a completely valid statement which compiles down to a proper aggregation pipeline. It does a fair amount of syntax checking, validates arguments, etc.

The upside is that it doesn't require any globally exported "Bareword Operations", i.e. $group is only exposed off of the | operator. The composite result of a | operation can be passed into an aggregate function and executed on a collection.
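To make the scoping idea concrete, here is a minimal sketch of how operators can hang off a single entry point. It is not Casbah's actual implementation: `Stage`, `Skip`, `Limit`, `Unwind`, `PipelineOps` and `Pipeline` are hypothetical stand-ins, and only three stages are modelled. It assumes top-level definitions are allowed (Scala 3, or a Scala 2 script or worksheet).

```scala
// Hypothetical stand-in types; the real DSL would build MongoDB objects instead.
sealed trait Stage
final case class Skip(n: Int) extends Stage
final case class Limit(n: Int) extends Stage
final case class Unwind(field: String) extends Stage

// Pipeline operators are defined only on this trait, so they are never
// in scope as barewords inside an ordinary query.
trait PipelineOps {
  def stages: Vector[Stage]

  def $skip(n: Int): Pipeline = {
    require(n >= 0, "$skip needs a non-negative count")
    new Pipeline(stages :+ Skip(n))
  }

  def $limit(n: Int): Pipeline = {
    require(n > 0, "$limit needs a positive count")
    new Pipeline(stages :+ Limit(n))
  }

  def $unwind(field: String): Pipeline = {
    require(field.startsWith("$"), "$unwind needs a \"$field\" reference")
    new Pipeline(stages :+ Unwind(field))
  }
}

// Every operator returns a Pipeline, which exposes the operators again.
final class Pipeline(val stages: Vector[Stage]) extends PipelineOps

// The empty pipeline: the only way to reach the operators.
object | extends PipelineOps {
  val stages: Vector[Stage] = Vector.empty
}

// Infix chaining on one line, as in the statement above.
val agg = | $unwind "$tags" $skip 5 $limit 10
```

Because each operator returns a `Pipeline`, the chain can keep going, and the argument checks run as the pipeline is built rather than when it is sent to the server.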

The downside is that it isn't possible to spread a single statement, captured in one variable, across multiple lines: it won't compile properly once it is broken up that way. You can, however, build your statement up across multiple lines:

val x = | $sort ( "foo" -> 1, "bar" -> -1 ) $skip 5 $limit 10 

val y = x $match { "score" $gt 50 $lte 90 }

val z = y $group  { ("lastAuthor" $last "$author") ++ ("firstAuthor" $first "$author")  ++ ("_id" -> "$foo") }

This isn't too bad, and still keeps things self contained.
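For what it's worth, the line-break problem looks like Scala's newline handling for infix chains rather than anything specific to the DSL, so explicit dot calls are one possible workaround. The sketch below assumes a hypothetical `Pipeline` class that just records `(operator, argument)` pairs; it is not Casbah code. Leading-dot continuation lines are meant for a compiled source file or script, not for line-by-line REPL entry.

```scala
// Hypothetical stand-in: each stage is recorded as an (operator, argument) pair.
class Pipeline(val stages: Vector[(String, Int)]) {
  def $skip(n: Int): Pipeline = new Pipeline(stages :+ ("$skip" -> n))
  def $limit(n: Int): Pipeline = new Pipeline(stages :+ ("$limit" -> n))
}

// The empty pipeline that starts every chain.
object | extends Pipeline(Vector.empty)

// A line that starts with a dot continues the previous expression,
// so the chain can be split without intermediate vals.
val paged =
  |.$skip(5)
    .$limit(10)

// The incremental style builds the same pipeline one val at a time.
val x = | $skip 5
val y = x $limit 10
```

The trade-off is that the dotted form gives up the space-separated look that makes the one-line version read like the MongoDB shell.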

However, I'm also toying slightly with the idea of making the operators be barewords, and having something more suited to multiline statements.

A rough sketch might be:

val agg = |($group { ("lastAuthor" $last "$author") ++ ("firstAuthor" $first "$author")  ++ ("_id" -> "$foo") },
           $unwind("$tags"),
           $sort ( "foo" -> 1, "bar" -> -1 ),
           $skip(5),
           $limit(10),
           $match { "score" $gt 50 $lte 90 })

I don't know that I like this syntax, but it might be a bit more "fluid" for writing the pipelines.
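As a rough illustration of what the bareword variant would need, the operators become plain functions that must be imported, and `|` becomes an object with a varargs `apply`. Everything here (`Stage`, `Skip`, `Limit`, `Unwind`, the `Barewords` object) is a hypothetical stand-in, not Casbah's API, and only three stages are modelled.

```scala
// Hypothetical stand-in types for pipeline stages.
sealed trait Stage
final case class Skip(n: Int) extends Stage
final case class Limit(n: Int) extends Stage
final case class Unwind(field: String) extends Stage

object Barewords {
  // Each operator is a free-standing function producing one stage.
  def $skip(n: Int): Stage = Skip(n)
  def $limit(n: Int): Stage = Limit(n)
  def $unwind(field: String): Stage = Unwind(field)

  // |(...) collects the stages in order.
  object | {
    def apply(stages: Stage*): Vector[Stage] = stages.toVector
  }
}

// The import is what puts the barewords in scope everywhere in the file.
import Barewords._

// Newlines inside the parentheses are fine, so this formats across lines.
val agg = |($unwind("$tags"),
           $skip(5),
           $limit(10))
```

The import is the cost: once `$skip` and friends are in scope as barewords, nothing in this sketch stops them from being written inside an ordinary query, which is the accidental usage the first syntax avoids.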

Anyone have opinions, thoughts or comments?
