Skip to content

Instantly share code, notes, and snippets.

@mapio
Last active December 19, 2023 04:17
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mapio/c5fe89f55cb6499a743f912b5e53198b to your computer and use it in GitHub Desktop.
Save mapio/c5fe89f55cb6499a743f912b5e53198b to your computer and use it in GitHub Desktop.
A simple monotonic grammar for the repeated word language
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A simple monotonic grammar for the repeated word language"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is well known that the language $L = \\left\\{ww | w \\in \\{a, b\\}^+ \\right\\}$ is *context-sensitive*, yet many examples of grammars for $L$ (provided in textbooks and on-line) are actually *unrestricted* grammars containing $\\varepsilon$-rules instead of *context-sensitive* grammars.\n",
"\n",
"Leveraging the ideas behind such examples, here I design a simple *monotonic* grammar for $L$ in [Kuroda normal form](https://en.wikipedia.org/wiki/Kuroda_normal_form) that can be easily translated to a *context-sensitive* grammar using the György Révész transformation.\n",
"\n",
"One of the simplest and more common approach to obtain $L$ is to generate $ww^R$ and then reverse the second half of the word. More precisely, the reversed part is generated as a non-terminal word $W \\in \\{A, B\\}^+$; from $wW^R$ a set of \"swapping\" productions of the kind $xX \\rightarrow Xx$ are then used to reverse the non termnial part (and $A$ and $B$ are turned to $a$ and $b$). So far, so good: the grammar is in Kuroda normal form and it would be a straightforward exercise to turn it to a *context-free* grammar.\n",
"\n",
"But terminal symbols from the second half of the word should not be mixed with ones from the first half by the swapping productions: a nonterminal *marker* is usually added (to generate $w\\$W^R$) so that swapping happens just to the right of it. Such marker is eventually disposed with a $\\varepsilon$-rule. Eliminating such rule (that makes the grammar *unrestricted* and not *context-free*, nor *monotone*) requires a bit more ingenuity, since there is no well know textbook transformation for the general case.\n",
"\n",
"Observe on passing that similar approaches put the marker at the and and reverse the swapping productions, or use two markers moving towards each other (initially at the middle and end of the reversed nonterminal word) and dispose of them when they get to touch. But shrinking the marker (or markers) seems to be ubiquitous in all suggested solutions.\n",
" \n",
"To cut a long story short, one of the most suggested and simple *unrestricted* grammars $G$ for $L$ is given by the set of productions\n",
"$$\n",
"\\begin{align*} \n",
"S &\\rightarrow a S A | b S B | \\$ \\\\\n",
"\\$ &\\rightarrow \\varepsilon \\\\\n",
"\\$ A &\\rightarrow \\$ a \\\\\n",
"\\$ B &\\rightarrow \\$ b \\\\\n",
"a A &\\rightarrow A a \\\\\n",
"a B &\\rightarrow B a \\\\\n",
"b A &\\rightarrow A b \\\\\n",
"b B &\\rightarrow B b \\\\\n",
"\\end{align*}\n",
"$$\n",
"\n",
"It is easy to check that this works, informally: \n",
"\n",
"* the productions involving $S$ generate $w\\$W^R$,\n",
"* the marker can only be disposed as the last nonterminal (otherwise $A$\n",
" and $B$ will never be removed from the sentential form),\n",
"* only the leftmost $A$ or $B$ can be converted to a teminal (when it is\n",
" next to the marker), such terminal can then only travel to the right,\n",
" possibly pushing other nonterminal to the left, but never corss any\n",
" other terminal.\n",
"\n",
"So how to dispose of the marker without shrinking?\n",
"\n",
"The idea is to turn the $\\varepsilon$-rule to $\\$\\rightarrow a|b$. To keep the final word of even length, one can use two markers starting the swap process from $w\\$W^R\\$$, but turning *both* the occurences of the marker to the *same* terminal seems too much for a *context-sensitive* grammar. \n",
"\n",
"On the other hand, if one substitutes the first two lines of $G$ with\n",
"\n",
"$$\n",
"\\begin{align*} \n",
"S &\\rightarrow R $ \\\\\n",
"R &\\rightarrow a R A | b R B | \\$ \\\\\n",
"\\$ &\\rightarrow a\n",
"\\end{align*}\n",
"$$\n",
"\n",
"obtains a grammar $G_a$ that, as it is easy to check, generates $L_a = \\left\\{wawa | w \\in \\{a, b\\}^+ \\right\\}$. Similarly one can obtain a grammar $G_b$ for $L_b = \\left\\{wbwb | w \\in \\{a, b\\}^+ \\right\\}$. Since $L = L_a \\cup L_b$ deriving a *context-free* grammar for $L$ using $G_a$ and $G_b$ is again a straightforward exercise.\n",
"\n",
"Putting everithing together:\n",
"\n",
"$$\n",
"\\begin{align*} \n",
"S &\\rightarrow R_a\\$_a | R_b\\$_b \\\\\n",
"R_a &\\rightarrow a R_a A | b R_a B | \\$_a \\\\\n",
"R_b &\\rightarrow a R_b A | b R_b B | \\$_b \\\\\n",
"\\$_a &\\rightarrow a \\\\\n",
"\\$_b &\\rightarrow b \\\\\n",
"\\$_a A &\\rightarrow \\$_a a \\\\\n",
"\\$_a B &\\rightarrow \\$_a b \\\\\n",
"\\$_b A &\\rightarrow \\$_b a \\\\\n",
"\\$_b B &\\rightarrow \\$_b b \\\\\n",
"a A &\\rightarrow A a \\\\\n",
"a B &\\rightarrow B a \\\\\n",
"b A &\\rightarrow A b \\\\\n",
"b B &\\rightarrow B b \\\\\n",
"\\end{align*}\n",
"$$\n",
"\n",
"This grammar generates $L = L_a \\cup L_b$ and is in Kuroda normal form."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
@matejsubrt
Copy link

Hi, idk if you'll read this, but thanks man, this just saved my ass in my today's Automata and Grammars exam in Prague :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment