Storing Hierarchical Data in CouchDB
« Thursday, July 18, 2013 »

couchdb javascript tree

In an article of the same name located here, we can read how to build a tree in couchdb using a traditional method. I think this solution is good in some ways but for my own needs, it isn't a usable solution.

From what I believe here is a list of Pros and Cons:

Pros

  • One request to load the tree
  • Quite easy to implement
  • Editing the tree doesn't require to edit more than one object per node

Cons

  • Hard or impossible to sort objects in the tree (they come as they are by the map function)
  • Editing/Moving nodes in the three requires “n/2” to “n” requests to the database per change
  • Data in the node is poluted by the tree, mixing data and structure
  • Building the “path” object might be complicated if one parent changes you might have to change the path for all children or they will remain on a virtual node
  • Impossible to have two identical node in the tree at different spaces (unless you emit multiple path for a node, but the data in the tree might get duplicated)

I guess most of the cons I found could be fixed, but the resulting algorithm wouldn't end up much more complicated than it was. The solution might also get much more dependent on the language used to post process the tree (building the tree).

For that reason, I came up with a different solution that gives me almost absolute control on what can be contained in my tree. Couchdb is an interesting database, to get something out right. You pretty much have to smash your head somewhere pretty hard. Couchdb isn't a relational database and shouldn't be used as such. In couchdb we can store objects and this is what we're going to do.

Let say we had a tree like this:

"Vehicles"
|  |_ "Car"
|  |   |_ "Mitsubishi"
|  |   \_ "Toyota"   
|  |_ "Moto"
|      |_ "Kawasaki"
|      \_ "Honda"
\_"Tools"      
   \_ "Gardening"

Then we would store something like this in our database:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
 {
     _id: "tree_id",
     type: "category_tree"
     children: [{
         value: "Vehicles",
         children: [{
             value: "Car",
             children: [{
                 value: "Mitsubishi",
                 children: []
             },
             {
                 value: "Toyota",
                 children: []
             }]
         },
         {
             value: "Moto",
             children: [{
                 value: "Kawasaki",
                 children: []
             },
             {
                 value: "Honda",
                 children: []
             }]
         }]
     },
     {
         value: "Tools",
         children: [{
             value: "Gardening",
             children: []
         }]
     }]
 }

The structure of the tree is quite simple, each node has a value and children. When there is no children, we could simply not add it but for the sake of simplicity, we'll just leave empty list in there. In the exemple above, the value being stored is a string. But in a real situation, we could probably insert objects instead. In my case, we can see that each node is a category. The saved tree is saved with type: category_tree.

So we have a tree where each node contains a value. It's not very handy if you want to edit individual nodes. For that reason, we will add as values references. Instead of:

1
value: "Honda"

We will add:

1
value: {id: "Honda", type: "category"}

With this, we can add our Honda object anywhere in the tree and it should always reference only one object. Each category can be changed, renamed or whatever withough having to change the tree.

Here is the map reduce I'm currently using:

1
2
3
4
5
6
7
8
9
function (doc) {
    if (doc.type === "category_tree") {
        emit([1, doc._id, doc.type, doc._id], doc);
    }

    if (doc.type === "category") {
        emit([2, doc.category, doc.type, doc._id], doc);
    }
}

The first emitted key could be removed, it is there to set my tree object first in the list. The second key is the root object. Since all objects are owned by the tree object, it is the id we're going to use. The third key is the type and the last key is the id.

So when doing a query to this view, we will receive something like this:

1
2
3
"key":[1,"52a4d504968cc470ba21793e86000c06","category_tree","52a4d504968cc470ba21793e86000c06"], "value": "..."
"key":[2,"52a4d504968cc470ba21793e86000c06","category","52a4d504968cc470ba21793e8600041c"], "value": ".."
"key":[2,"52a4d504968cc470ba21793e86000c10","category","52a4d504968cc470ba21793e8600041c"], "value": "..."

If we use the startkey and endkey, we can filter the result of the map call to something like this:

1
2
startkey=[1,"52a4d504968cc470ba21793e86000c06"]
endkey=[2,"52a4d504968cc470ba21793e86000c06",{},{}]

And the result will be:

1
2
"key":[1,"52a4d504968cc470ba21793e86000c06","category_tree","52a4d504968cc470ba21793e86000c06"], "value": "..."
"key":[2,"52a4d504968cc470ba21793e86000c06","category","52a4d504968cc470ba21793e8600041c"], "value": ".."

Good enough:

Now we'll have to build our tree:

1
2
3
4
5
6
7
8
9
nodesMap = {};
// Transform all our category as node and skip the tree
nodes = results.slice(1).map(function (obj) {
    return nodesMap[obj._id] = new Category(obj);
});

tree = results[0];

rebuildTree(tree.children, nodes);

The rebuildTree function should look like this:

1
2
3
4
5
6
7
8
    function rebuildTree(children) {
        children = children || [];

        children.forEach(function (obj, index, list) {
            obj.value = nodesMap[obj.value.id];
            rebuildTree(obj.children);
        });
    }

It will replace all values in the tree by the coresponding object that has been parsed above. When editing the tree, you can move any node anywhere you want. When you're ready. All you have to do is to serialize back all Category objects, to {id: type:}. Save back the tree into couchdb and all is fine.

Since objects are stored individually of the object tree, we can store the values in any way we want. We can possibly also add them to multiple tree structure. In one request, we're able to fetch the tree and all it's sub objects. If we wanted, we could also load only the tree. It's not possible to fetch some part of the tree.

The big plus of this solution is that editing the tree can be performed locally with your favorite language. You can move/add nodes and save everything in only one request. If we wanted to add revisions to the tree, we could possibly save add an history property that saves all past states of the tree. Since the tree is an object by itself. It is much more easier to manage than having to do some weird construction using map reduce. The code here is quite straight forward and what we have in the database can almost map 1:1 with our application

comments powered by Disqus

Copyright © 2015 Loïc Faure-Lacroix