Skip to content

Instantly share code, notes, and snippets.

@amedeo
Created February 10, 2011 12:00
Show Gist options
  • Star 15 You must be signed in to star a gist
  • Fork 3 You must be signed in to fork a gist
  • Save amedeo/820412 to your computer and use it in GitHub Desktop.
Save amedeo/820412 to your computer and use it in GitHub Desktop.
CouchDB and multiple tags

CouchDB and multiple tags

The problem

Given a set of documents, each associated with multiple tags, how can I retrieve the documents tagged with an arbitrary set of tags?

My solution

In order to simplify the problem I have assumed that the array of tags in each document is ordered and that the key to search for documents will be an ordered array of tags as well.

Create the db:

$ DB="http://localhost:5984/db"
$ curl -XPUT $DB

Create some documents with tags:

$ curl -XPOST $DB -d '{"_id":"One", "tags":["A", "B", "C"]}' -H "Content-Type: application/json"
$ curl -XPOST $DB -d '{"_id":"Two", "tags":["B", "C", "D"]}' -H "Content-Type: application/json"
$ curl -XPOST $DB -d '{"_id":"Six", "tags":["C", "D"]}' -H "Content-Type: application/json"

Now create the view:

$ echo '
{
  "_id":"_design/all",
  "language":"javascript",
  "views":{
    "by_tags":{
      "map":"function(doc) {\n
          if (doc.tags == []) return;\n\n

          var emit_sequence = function(base, disp) {\n
            if (disp.length > 1) {\n
              emit(base.concat(disp[0]), 1);\n
              emit_sequence(base.concat(disp[0]), disp.slice(1, disp.length));\n
              emit_sequence(base, disp.slice(1, disp.length));\n
            } else if (disp.length == 1) {\n
              emit(base.concat(disp[0]), 1);\n
            }\n
          }\n\n

          emit_sequence([], doc.tags);\n
        }"
    }
  }
}
' > view.json

$ curl -XPOST $DB -d @view.json -H "Content-Type: application/json"

This view use a recursive function so that will emit all the possible combination of tags required to identify the document with any of the possible ordered subset of tags.

So, if the only document "One" was inserted, the list of keys will be:

$ curl -XGET $DB/_design/all/_view/by_tags

{"total_rows":7,"offset":0,"rows":[
{"id":"One","key":["A"],"value":1},
{"id":"One","key":["A","B"],"value":1},
{"id":"One","key":["A","B","C"],"value":1},
{"id":"One","key":["A","C"],"value":1},
{"id":"One","key":["B"],"value":1},
{"id":"One","key":["B","C"],"value":1},
{"id":"One","key":["C"],"value":1}
]}

With the three given sample documents, if we want now retrieve the list of document tagged with "B" and "C", the request will be:

$ curl -XGET $DB/_design/all/_view/by_tags?key='\["B","C"\]'

{"total_rows":17,"offset":6,"rows":[
{"id":"One","key":["B","C"],"value":1},
{"id":"Two","key":["B","C"],"value":1}
]}

Or, if we are looking for all the documents with tag "C":

$ curl -XGET $DB/_design/all/_view/by_tags?key='\["C"\]'

{"total_rows":17,"offset":10,"rows":[
{"id":"One","key":["C"],"value":1},
{"id":"Six","key":["C"],"value":1},
{"id":"Two","key":["C"],"value":1}
]}

It is worth to note that the view computed in this way will be quite big, the number of rows for each document is related to the number of tags. Given t as the tags size for a document, the rows count can be computed with the following formula:

rows = 2^t - 1

I am sure this is not the most efficient way to solve the original problem, but it works for me. Thanks to Claudio for his help.

@akocmark
Copy link

Hi,

Nice work. I already got it running but the thing is, How do I query the keys in url format? thanks.. I'm using couchdb-odm by the way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment