developerworks/closure_table.md

## closure_table.md

      
    Raw
  

              closure_table.md
            
          
    使用closure表在MySQL中管理分层关系

创建表

创建一个表示所有树节点的表.
CREATE TABLE `tree_node` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `data_body` text,
    `node_deleted` datetime DEFAULT NULL,
    PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

创建一个closure表来管理树的结构
CREATE TABLE `tree_hierarchy` (
    `ancestor` int(10) unsigned NOT NULL,
    `descendant` int(10) unsigned NOT NULL,
    PRIMARY KEY (`ancestor`,`descendant`),
    KEY `descendant` (`descendant`),
    CONSTRAINT `tree_hierarchy_ibfk_1` FOREIGN KEY (`ancestor`) REFERENCES `tree_node` (`id`),
    CONSTRAINT `tree_hierarchy_ibfk_2` FOREIGN KEY (`descendant`) REFERENCES `tree_node` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Multiple trees can be stored in these two tables, effectively creating a relational persistence container for a forest of trees.  The closure table behaves as loosely coupled overlay, and additional tables could be used to represent alternative structure pertaining to the data.
在这两个表中,可以存储多个树,这是为由不同的树组成的森林创建持久的关系容器非常高效的方式.
添加/删除枝叶节点

要INSERT一个叶子节点到树结构中,需要执行两次数据库操作: 向tree_node表中插入一个节点,并获取其节点ID作为<node_id>, 它是<parent_node_id>的子节点.
INSERT INTO tree_node (data_body) VALUES ("Demo node data body.");

INSERT INTO tree_hierarchy (ancestor, descendant) SELECT ancestor, <node_id> FROM tree_hierarchy WHERE descendant=<parent_node_id> UNION ALL SELECT <node_id>, <node_id>;

删除一个叶子节点:
UPDATE tree_node SET node_deleted=NOW() WHERE id=<node_id>;

DELETE FROM tree_hierarchy WHERE descendant=<node_id>;

获取一颗完整的树

We can query the tree_node table itself to perform a key-value look up an individual tree node or we can do a more complex query like the following to get an entire tree of elements.
SELECT
    tree_node.id,
    tree_node.node_deleted,
    tree_node.data_body,
    tree_hierarchy.ancestor AS parent_id
FROM tree_node
    JOIN tree_hierarchy ON tree_node.id=tree_hierarchy.ancestor
WHERE tree_hierarchy.descendant=<node_id>
    AND tree_node.node_deleted IS NULL
UNION
SELECT
    tree_node.id,
    tree_node.node_deleted,
    tree_node.data_body,
    tree_hierarchy.ancestor AS parent_id
FROM tree_node
    JOIN tree_hierarchy ON tree_node.id=tree_hierarchy.descendant
WHERE tree_hierarchy.ancestor=<node_id>
    AND tree_node.node_deleted IS NULL;

不需要使用递归逻辑来从数据库中获得一个树的表示. SELECT查询的作用只是查找整个树包含了那些节点,并返回所有树中的节点. 把查询结果转换为应用程序能够使用的数据结构表示则是程序逻辑的责任,不是数据的责任.
Note an interesting quirk: this technique enables getting an entire tree by specifying any node ID within the tree.
TODO: Other common tree operations

TODO: Take this further to explore the limitations or potential variations on use of this technique.  Make note of how this solution scales or what it could be paired with for scaling (i.e. gizzard, etc.?).
What we have is essentially a key-value table paired with a mapping table that can be queried.  I'm interested in seeing how this could be represented using distributed data stores, especially using data stores that emphasize key-value data and pairing various data store technologies to manage different types of data (i.e. store nodes in something like riak while storing the ancestry details as secondary indexes for the data stored in riak).
#references Slides 40-68 in http://www.slideshare.net/billkarwin/models-for-hierarchical-data slide deck.  This technique is commonly used for modeling comment systems that support nested comments.