ekmett/mihaescu.doc

## mihaescu.doc
COS 528			Notes on Catenable Deques
Fall 03				         in Pure LISP
HO1

Data structure devised by Radu Mihaesau and Robert Tarjan 8/2003. See also:

H. Kaplan and R.E. Tarjan, “Purely functional, real-time deques with catenation,” J. Assoc. Comput. Mach. 46 (1999), 577-603.

H. Kaplan, C. Okasaki and R.E. Tarjan, “Simple confluently persistent catenable lists,” SIAM J. Comput. 30 (2000), 965-977.

Non catenable deques

A non catenable deque (n-deque) over a set A is empty, or consists of a single element of A, or consists of two or more elements of A grouped into a prefix of 0-2 elements of A, a child deque of pairs of elements of A, and a suffix of 0-2 elements of A.  We denote an empty deque by "-", a one element deque containing x by [x], and a deque with prefix p, child d, and suffix s by [p,c,s]; if d=[p,c,s] we denote p,c,s, by p(d),c(d),s(d), respectively.  A prefix is a 0-, 1-, or 2- prefix depending on its number of elements; similarly for suffixes.

We define the sequence of prefixes p0,p1, p2,... of a deque d by pi= the prefix of c' (d), and similarly define the sequence of suffixes.  A deque is regular if 0- and 2-prefixes alternate in the sequence of prefixes, ignoring 1-prefixes  (the first non-1-prefix can be either a 0 or a 2), the corresponding invariant holds for the sequence of suffixes, and the bottom deque is not of the form [2,0,0] or [0,0,2].  We maintain regularity of deques.

We will describe push and pop using certain auxiliary operations; inject and eject are symmetric.  A deque is 0-exposed (2-exposed) if its first non-1-prefix is a 0 (2, respectively).

npush(x, d) (naïve push) applies and preserves regularity if d is not 2-exposed.

npush(x, d) =
     if d= - then [x]
     else if d=[y] then [[x],-,[y]]
     else if d=[-,c,s] then [[x],c,s]
     else [[x,y],c,s] where [[y],c,s]=d

2fix(d) applies and preserves regularity if d is 2-exposed.  It eliminates the 2-exposure.
Let d' be the uppermost descendant deque in d with a 2-prefix.  Then 2fix(d) is d with d' replaced by [-,npush(p', c'),s')], where [p',c',s']=d'.


push(x,d) = if d is 2-exposed then npush(x,2fix(d)) else npush(x, d)

npop(d) (naïve pop) applies and preserves regularity if d is not 0-exposed.

npop(d)=
     if d= - then error
     else if d =[x] then x,-
     else if d=[[x],-,[y]] then x, [y]
     else if d=[[x],-,[y,z]] then x, [[y],-,[z]]
     else if d=[[x],c,s] then x,[-,c,s]
     else x, [[y],c,s] where [[x,y],c,s]=d

0fix(d) applies and preserves regularity if d is 0-exposed.  It eliminates the 0-exposure.
Let d' be the uppermost descendant deque in d with a 0-prefix.  Then 2fix(d) is d with d' replaced by [p'',c'',s'], where [-,c',s']=d' and p'', c'' = npop(c').  Note that c' must be nonempty, since [0,0,1] and [0,0,2] are disallowed states.  Regularity implies that c' is not 0-exposed, which means that npop applies to c'.

pop(d)=if d is 0-exposed then npop(0fix(d)) else npop(d)

We need to verify that regularity is preserved in all cases.  This is straightforward.  Also, we need the right data structure to support constant-time operations.  A stack of stacks of stacks works. Each innermost stack is a maximal block of prefix-suffix pairs without a 0 or 2 except on top. A second-level stack is a maximal block of innermost stacks without both a 0 or 2-prefix and a 0 or 2-suffix.

Catenable deques

A binary catenable deque is a buffer of 0 to 7 elements (bottom deque), or a 3-tuple consisting of a prefix of 3-6 elements, a child deque of entries, and a suffix of 3-6 elements (unary deque), or a 5-tuple consisting of a prefix of 3-6 elements, a nonempty left child deque of entries, a middle of 2 elements, a nonempty right child deque of entries, and a suffix of 3-6 elements (binary deque). An entry is either a buffer of 2 or 3 elements, or a triple consisting of a left buffer of 2 or 3 elements, a nonempty binary catenable deque, and a right buffer of 2 or 3 elements. We define prefix sequences through left or only children and suffix sequences through right or only children. The regularity condition is that along any such sequence, 3-buffers alternate with 5-or-6-buffers, ignoring 4 buffers. We say a deque is prefix(suffix) 3-exposed (5,6-exposed) if the first non-4 prefix(suffix) in its sequence is a 3 (5 or 6). We deal with push, pop, catenate; inject and eject are symmetric to push and pop.

We can do a naïve push while preserving normality if the deque is not a bottom deque and not 5,6-exposed. To do a push on a bottom deque, we merely push the new element on the buffer, unless this would make the buffer of size 8, in which case we group the elements into a prefix and suffix each of size 4 and construct a unary deque with this prefix and suffix and an empty child deque. To do a push on a 5,6-exposed deque, we first do a 5,6-fix to make it not 5,6-exposed, and then we do a naïve push. To do a 5,6-fix on a 5,6-exposed deque, we replace the topmost deque d in the deque stack such that d has a 5-or-6-prefix by the deque d' constructed as follows. Let p be the prefix of d; let c be the child deque of d if d is unary or the left child deque of d if d is binary. Contruct  p' from the first three elements of p; let c' be formed from c by pushing onto c a buffer containing the remaining two or three elements of p. Let d' be d with p replaced by p' and c replaced by c'.  By regularity, c is not 5,6-exposed, which means that the prefix of c' is a 4, 5, or 6-buffer, and c' is regular. (If the prefix of c' is 4, the child or left child of c' is not 3-exposed; if the prefix of c' is 5 or 6, the child or left child of c' is not 5, 6-exposed.) Observe that a push always results not only in a regular deque but in one that is not 3-exposed.

To do a catenate of d1 and d2 we must consider several cases. If d1 is a bottom deque we merely push the elements of d1 one-at-a-time onto d2; symmetrically if d2 is a bottom deque.

In the remaining cases (d1 and d2 each a unary or binary deque), we convert d1 to a half 5-tuple h1, consisting of a prefix, a child deque, and a single element, symmetrically convert d2 to a half 5-tuple h2 consisting of a single element, a child deque, and a suffix, and let the result deque have as prefix the prefix of h1, as left child the child deque of h1, as middle the pair consisting of the last element of h1 and the first element of h2, as right child the child of h2, and as suffix the suffix of h2.

To form h1, we use the prefix of d1 as prefix. If d1 is unary, we use the last element of the suffix s1 of d1 as the last element of d1, we group the remaining elements of s1 into a pair, a triple, or two pairs (If there are 2,3, or 4 remaining elements, respectively), and form the child deque of h1 from the child deque of d1 by injecting the pair, triple, or two pairs (two injects) into the child deque of d1. If d1 is binary, we use the last element of the suffix s1 of d1 as the last element of h1, we group the remaining elements of s1 into a pair, a triple, or two pairs, and form the child deque of h1 from the left child deque of d1 by doing one or two injects. The first inject is of a triple consisting of the middle pair of d1, the right child of d1, and the first pair or triple formed from s1. The second inject is of the second pair formed from s1, if there is one.

The deque d formed by catenating d1 and d2 is regular, since its prefix is the prefix of d1, its left child deque is formed from the child or left child deque of d1 by doing injects, which do not harm the regularity condition on prefixes, and the symmetric argument applies to the suffixes. Furthermore, if d1 is not a bottom deque, the left(prefix) exposure of d is the same as that of d1; symmetrically for d2.

The remaining operation is pop. We can do a naïve pop on a deque d if the deque is not 3-exposed. If d is 3-exposed, we first do a 3-fix to make it not 3-exposed, and then we do a naïve pop. To do a 3-fix, we replace the topmost deque d in the deque stack such that d has a 3-prefix by the deque d' constructed as follows. If d is unary with an empty child, we form d' by grouping the 6-9 elements of d into a single buffer if d is of size 6, or 7, and into a prefix of size 4 and a suffix of size 4 or 5 if d is of size 8 or 9, respectively.
Otherwise let c be the child of d if d is unary or its left child if it is binary. If the first entry on c is a buffer b, we pop this buffer from c to form c'. We form a prefix p' by injecting the two or three elements of b into the prefix of d. If d is unary, we form d' from this p', c', and the suffix of d. If c' is empty and d is binary, we push the middle buffer of d into the right child of d to form a child c”, and form d' from p', c”, and the suffix of d. Finally, if c' is nonempty and d is binary, we form d' from p', c', and the middle, right child, and suffix of d.

The remaining case of a fix occurs when the buffer b is a triple [b1,d2,b3]. We pop this triple from c to form c'. We form the prefix of d' by injecting the elements of b1 into the prefix of d. If d is unary, we form a deque d3 by injecting b3 into d2 and catenating the resulting deque with c'. Deque d3 is regular, but it may be 5-exposed on the left. If so, we do a 5-fix on the left. If c' is not a bottom deque, then d3 has the same exposure on the right as c and c', which means that the overall result is a regular deque. If c' is a bottom deque, d3 is not 3-exposed on the right. We may need to make it 3-exposed, which we do by 5-fix on the right. (In this case we need to know the type of the last non-4 suffix on the sequence to d.) The resulting deque becomes the child of d'; the suffix of d' is the suffix of d. Finally, if d is binary, we form a deque d3 by injecting b3 into d2 and catenating the resulting deque with the left child of d (or pushing b3 onto the left child of d and catenating d2 with this deque). If d3 is 5-exposed on the left, we do a 5-fix on the left. The resulting deque becomes the left child of d'. The middle, right child, and suffix of d' are the same as those of d, respectively.

To implement this method, we need a way to access the top non-4 prefix and the top non-4 suffix in constant time. We use a representation like that of noncatentable deques, but generalized to handle binary branching. We decompose the binary tree corresponding to the nested deque structure by breaking every link from a binary deque that is a left (right) child or the root to its right(left) child. This breaks the tree into a set of paths, which we represent using the stack of stacks of stacks structure used for noncatenable deques, except that  for a left(right) child deque, only its prefix(suffix) counts as non-4. We also maintain the bottommost deque in the stack of stacks separately.  In addition, for each stack of blocks, we keep two bits that indicate whether its bottommost non-4 prefix and suffix are 3, or 5-or-6. (This information is needed for one case of 5, 6-fix, when a triple gets popped from the child deque of a unary deque, making the child deque empty.) Finally, each binary deque points to each stack of stacks of stacks whose top deque is one of its children. Every binary node points to one such stack of stacks of stacks, except for the root, which points to two.


Handout 1
	COS 528 Notes on Catenable Deques
	Fall 03 in Pure LISP
	HO1

	Data structure devised by Radu Mihaesau and Robert Tarjan 8/2003. See also:

	H. Kaplan and R.E. Tarjan, “Purely functional, real-time deques with catenation,” J. Assoc. Comput. Mach. 46 (1999), 577-603.

	H. Kaplan, C. Okasaki and R.E. Tarjan, “Simple confluently persistent catenable lists,” SIAM J. Comput. 30 (2000), 965-977.

	Non catenable deques

	A non catenable deque (n-deque) over a set A is empty, or consists of a single element of A, or consists of two or more elements of A grouped into a prefix of 0-2 elements of A, a child deque of pairs of elements of A, and a suffix of 0-2 elements of A. We denote an empty deque by "-", a one element deque containing x by [x], and a deque with prefix p, child d, and suffix s by [p,c,s]; if d=[p,c,s] we denote p,c,s, by p(d),c(d),s(d), respectively. A prefix is a 0-, 1-, or 2- prefix depending on its number of elements; similarly for suffixes.

	We define the sequence of prefixes p0,p1, p2,... of a deque d by pi= the prefix of c' (d), and similarly define the sequence of suffixes. A deque is regular if 0- and 2-prefixes alternate in the sequence of prefixes, ignoring 1-prefixes (the first non-1-prefix can be either a 0 or a 2), the corresponding invariant holds for the sequence of suffixes, and the bottom deque is not of the form [2,0,0] or [0,0,2]. We maintain regularity of deques.

	We will describe push and pop using certain auxiliary operations; inject and eject are symmetric. A deque is 0-exposed (2-exposed) if its first non-1-prefix is a 0 (2, respectively).

	npush(x, d) (naïve push) applies and preserves regularity if d is not 2-exposed.

	npush(x, d) =
	if d= - then [x]
	else if d=[y] then [[x],-,[y]]
	else if d=[-,c,s] then [[x],c,s]
	else [[x,y],c,s] where [[y],c,s]=d

	2fix(d) applies and preserves regularity if d is 2-exposed. It eliminates the 2-exposure.
	Let d' be the uppermost descendant deque in d with a 2-prefix. Then 2fix(d) is d with d' replaced by [-,npush(p', c'),s')], where [p',c',s']=d'.





	push(x,d) = if d is 2-exposed then npush(x,2fix(d)) else npush(x, d)

	npop(d) (naïve pop) applies and preserves regularity if d is not 0-exposed.

	npop(d)=
	if d= - then error
	else if d =[x] then x,-
	else if d=[[x],-,[y]] then x, [y]
	else if d=[[x],-,[y,z]] then x, [[y],-,[z]]
	else if d=[[x],c,s] then x,[-,c,s]
	else x, [[y],c,s] where [[x,y],c,s]=d

	0fix(d) applies and preserves regularity if d is 0-exposed. It eliminates the 0-exposure.
	Let d' be the uppermost descendant deque in d with a 0-prefix. Then 2fix(d) is d with d' replaced by [p'',c'',s'], where [-,c',s']=d' and p'', c'' = npop(c'). Note that c' must be nonempty, since [0,0,1] and [0,0,2] are disallowed states. Regularity implies that c' is not 0-exposed, which means that npop applies to c'.

	pop(d)=if d is 0-exposed then npop(0fix(d)) else npop(d)

	We need to verify that regularity is preserved in all cases. This is straightforward. Also, we need the right data structure to support constant-time operations. A stack of stacks of stacks works. Each innermost stack is a maximal block of prefix-suffix pairs without a 0 or 2 except on top. A second-level stack is a maximal block of innermost stacks without both a 0 or 2-prefix and a 0 or 2-suffix.

	Catenable deques

	A binary catenable deque is a buffer of 0 to 7 elements (bottom deque), or a 3-tuple consisting of a prefix of 3-6 elements, a child deque of entries, and a suffix of 3-6 elements (unary deque), or a 5-tuple consisting of a prefix of 3-6 elements, a nonempty left child deque of entries, a middle of 2 elements, a nonempty right child deque of entries, and a suffix of 3-6 elements (binary deque). An entry is either a buffer of 2 or 3 elements, or a triple consisting of a left buffer of 2 or 3 elements, a nonempty binary catenable deque, and a right buffer of 2 or 3 elements. We define prefix sequences through left or only children and suffix sequences through right or only children. The regularity condition is that along any such sequence, 3-buffers alternate with 5-or-6-buffers, ignoring 4 buffers. We say a deque is prefix(suffix) 3-exposed (5,6-exposed) if the first non-4 prefix(suffix) in its sequence is a 3 (5 or 6). We deal with push, pop, catenate; inject and eject are symmetric to push and pop.

	We can do a naïve push while preserving normality if the deque is not a bottom deque and not 5,6-exposed. To do a push on a bottom deque, we merely push the new element on the buffer, unless this would make the buffer of size 8, in which case we group the elements into a prefix and suffix each of size 4 and construct a unary deque with this prefix and suffix and an empty child deque. To do a push on a 5,6-exposed deque, we first do a 5,6-fix to make it not 5,6-exposed, and then we do a naïve push. To do a 5,6-fix on a 5,6-exposed deque, we replace the topmost deque d in the deque stack such that d has a 5-or-6-prefix by the deque d' constructed as follows. Let p be the prefix of d; let c be the child deque of d if d is unary or the left child deque of d if d is binary. Contruct p' from the first three elements of p; let c' be formed from c by pushing onto c a buffer containing the remaining two or three elements of p. Let d' be d with p replaced by p' and c replaced by c'. By regularity, c is not 5,6-exposed, which means that the prefix of c' is a 4, 5, or 6-buffer, and c' is regular. (If the prefix of c' is 4, the child or left child of c' is not 3-exposed; if the prefix of c' is 5 or 6, the child or left child of c' is not 5, 6-exposed.) Observe that a push always results not only in a regular deque but in one that is not 3-exposed.

	To do a catenate of d1 and d2 we must consider several cases. If d1 is a bottom deque we merely push the elements of d1 one-at-a-time onto d2; symmetrically if d2 is a bottom deque.

	In the remaining cases (d1 and d2 each a unary or binary deque), we convert d1 to a half 5-tuple h1, consisting of a prefix, a child deque, and a single element, symmetrically convert d2 to a half 5-tuple h2 consisting of a single element, a child deque, and a suffix, and let the result deque have as prefix the prefix of h1, as left child the child deque of h1, as middle the pair consisting of the last element of h1 and the first element of h2, as right child the child of h2, and as suffix the suffix of h2.

	To form h1, we use the prefix of d1 as prefix. If d1 is unary, we use the last element of the suffix s1 of d1 as the last element of d1, we group the remaining elements of s1 into a pair, a triple, or two pairs (If there are 2,3, or 4 remaining elements, respectively), and form the child deque of h1 from the child deque of d1 by injecting the pair, triple, or two pairs (two injects) into the child deque of d1. If d1 is binary, we use the last element of the suffix s1 of d1 as the last element of h1, we group the remaining elements of s1 into a pair, a triple, or two pairs, and form the child deque of h1 from the left child deque of d1 by doing one or two injects. The first inject is of a triple consisting of the middle pair of d1, the right child of d1, and the first pair or triple formed from s1. The second inject is of the second pair formed from s1, if there is one.

	The deque d formed by catenating d1 and d2 is regular, since its prefix is the prefix of d1, its left child deque is formed from the child or left child deque of d1 by doing injects, which do not harm the regularity condition on prefixes, and the symmetric argument applies to the suffixes. Furthermore, if d1 is not a bottom deque, the left(prefix) exposure of d is the same as that of d1; symmetrically for d2.

	The remaining operation is pop. We can do a naïve pop on a deque d if the deque is not 3-exposed. If d is 3-exposed, we first do a 3-fix to make it not 3-exposed, and then we do a naïve pop. To do a 3-fix, we replace the topmost deque d in the deque stack such that d has a 3-prefix by the deque d' constructed as follows. If d is unary with an empty child, we form d' by grouping the 6-9 elements of d into a single buffer if d is of size 6, or 7, and into a prefix of size 4 and a suffix of size 4 or 5 if d is of size 8 or 9, respectively.
	Otherwise let c be the child of d if d is unary or its left child if it is binary. If the first entry on c is a buffer b, we pop this buffer from c to form c'. We form a prefix p' by injecting the two or three elements of b into the prefix of d. If d is unary, we form d' from this p', c', and the suffix of d. If c' is empty and d is binary, we push the middle buffer of d into the right child of d to form a child c”, and form d' from p', c”, and the suffix of d. Finally, if c' is nonempty and d is binary, we form d' from p', c', and the middle, right child, and suffix of d.

	The remaining case of a fix occurs when the buffer b is a triple [b1,d2,b3]. We pop this triple from c to form c'. We form the prefix of d' by injecting the elements of b1 into the prefix of d. If d is unary, we form a deque d3 by injecting b3 into d2 and catenating the resulting deque with c'. Deque d3 is regular, but it may be 5-exposed on the left. If so, we do a 5-fix on the left. If c' is not a bottom deque, then d3 has the same exposure on the right as c and c', which means that the overall result is a regular deque. If c' is a bottom deque, d3 is not 3-exposed on the right. We may need to make it 3-exposed, which we do by 5-fix on the right. (In this case we need to know the type of the last non-4 suffix on the sequence to d.) The resulting deque becomes the child of d'; the suffix of d' is the suffix of d. Finally, if d is binary, we form a deque d3 by injecting b3 into d2 and catenating the resulting deque with the left child of d (or pushing b3 onto the left child of d and catenating d2 with this deque). If d3 is 5-exposed on the left, we do a 5-fix on the left. The resulting deque becomes the left child of d'. The middle, right child, and suffix of d' are the same as those of d, respectively.

	To implement this method, we need a way to access the top non-4 prefix and the top non-4 suffix in constant time. We use a representation like that of noncatentable deques, but generalized to handle binary branching. We decompose the binary tree corresponding to the nested deque structure by breaking every link from a binary deque that is a left (right) child or the root to its right(left) child. This breaks the tree into a set of paths, which we represent using the stack of stacks of stacks structure used for noncatenable deques, except that for a left(right) child deque, only its prefix(suffix) counts as non-4. We also maintain the bottommost deque in the stack of stacks separately. In addition, for each stack of blocks, we keep two bits that indicate whether its bottommost non-4 prefix and suffix are 3, or 5-or-6. (This information is needed for one case of 5, 6-fix, when a triple gets popped from the child deque of a unary deque, making the child deque empty.) Finally, each binary deque points to each stack of stacks of stacks whose top deque is one of its children. Every binary node points to one such stack of stacks of stacks, except for the root, which points to two.







	Handout 1