创建一个表示所有树节点的表.
CREATE TABLE `tree_node` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`data_body` text,
`node_deleted` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
创建一个closure表来管理树的结构
CREATE TABLE `tree_hierarchy` (
`ancestor` int(10) unsigned NOT NULL,
`descendant` int(10) unsigned NOT NULL,
PRIMARY KEY (`ancestor`,`descendant`),
KEY `descendant` (`descendant`),
CONSTRAINT `tree_hierarchy_ibfk_1` FOREIGN KEY (`ancestor`) REFERENCES `tree_node` (`id`),
CONSTRAINT `tree_hierarchy_ibfk_2` FOREIGN KEY (`descendant`) REFERENCES `tree_node` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Multiple trees can be stored in these two tables, effectively creating a relational persistence container for a forest of trees. The closure table behaves as loosely coupled overlay, and additional tables could be used to represent alternative structure pertaining to the data.
在这两个表中,可以存储多个树,这是为由不同的树组成的森林创建持久的关系容器非常高效的方式.
要INSERT
一个叶子节点到树结构中,需要执行两次数据库操作: 向tree_node
表中插入一个节点,并获取其节点ID作为<node_id>
, 它是<parent_node_id>
的子节点.
INSERT INTO tree_node (data_body) VALUES ("Demo node data body.");
INSERT INTO tree_hierarchy (ancestor, descendant) SELECT ancestor, <node_id> FROM tree_hierarchy WHERE descendant=<parent_node_id> UNION ALL SELECT <node_id>, <node_id>;
删除一个叶子节点:
UPDATE tree_node SET node_deleted=NOW() WHERE id=<node_id>;
DELETE FROM tree_hierarchy WHERE descendant=<node_id>;
We can query the tree_node
table itself to perform a key-value look up an individual tree node or we can do a more complex query like the following to get an entire tree of elements.
SELECT
tree_node.id,
tree_node.node_deleted,
tree_node.data_body,
tree_hierarchy.ancestor AS parent_id
FROM tree_node
JOIN tree_hierarchy ON tree_node.id=tree_hierarchy.ancestor
WHERE tree_hierarchy.descendant=<node_id>
AND tree_node.node_deleted IS NULL
UNION
SELECT
tree_node.id,
tree_node.node_deleted,
tree_node.data_body,
tree_hierarchy.ancestor AS parent_id
FROM tree_node
JOIN tree_hierarchy ON tree_node.id=tree_hierarchy.descendant
WHERE tree_hierarchy.ancestor=<node_id>
AND tree_node.node_deleted IS NULL;
不需要使用递归逻辑来从数据库中获得一个树的表示. SELECT
查询的作用只是查找整个树包含了那些节点,并返回所有树中的节点. 把查询结果转换为应用程序能够使用的数据结构表示则是程序逻辑的责任,不是数据的责任.
Note an interesting quirk: this technique enables getting an entire tree by specifying any node ID within the tree.
TODO: Take this further to explore the limitations or potential variations on use of this technique. Make note of how this solution scales or what it could be paired with for scaling (i.e. gizzard, etc.?).
What we have is essentially a key-value table paired with a mapping table that can be queried. I'm interested in seeing how this could be represented using distributed data stores, especially using data stores that emphasize key-value data and pairing various data store technologies to manage different types of data (i.e. store nodes in something like riak while storing the ancestry details as secondary indexes for the data stored in riak).
#references Slides 40-68 in http://www.slideshare.net/billkarwin/models-for-hierarchical-data slide deck. This technique is commonly used for modeling comment systems that support nested comments.