Skip to content

Instantly share code, notes, and snippets.

@mwhittaker
Last active September 3, 2019 00:55
Show Gist options
  • Save mwhittaker/8cef5715db869a400fb0aad2685cd0b7 to your computer and use it in GitHub Desktop.
Save mwhittaker/8cef5715db869a400fb0aad2685cd0b7 to your computer and use it in GitHub Desktop.
Provenance Practice Prelim
\NeedsTeXFormat{LaTeX2e}
\ProvidesClass{mwhittaker}
\LoadClass[12pt]{article}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Imports
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RequirePackage[compact]{titlesec}
\RequirePackage[letterpaper,margin=1in]{geometry}
\RequirePackage{fancyhdr}
\RequirePackage{lastpage}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Header and Footer
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Title
\newcommand{\TITLE}{}
\renewcommand{\title}[1]{\renewcommand{\TITLE}{#1}}
% Date
\newcommand{\DATE}{\today}
\renewcommand{\date}[1]{\renewcommand{\DATE}{#1}}
% Author
\newcommand{\AUTHOR}{ }
\renewcommand{\author}[1]{\renewcommand{\AUTHOR}{#1}}
% Format header and Footer
\renewcommand{\lhead}[2][L]{\fancyhead[#1]{\footnotesize{#2}}}
\renewcommand{\chead}[2][C]{\fancyhead[#1]{\footnotesize{#2}}}
\renewcommand{\rhead}[2][R]{\fancyhead[#1]{\footnotesize{#2}}}
\renewcommand{\lfoot}[2][L]{\fancyfoot[#1]{\footnotesize{#2}}}
\renewcommand{\cfoot}[2][C]{\fancyfoot[#1]{\footnotesize{#2}}}
\renewcommand{\rfoot}[2][R]{\fancyfoot[#1]{\footnotesize{#2}}}
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}
\pagestyle{fancy}
\lhead{}
\chead{}
\rhead{}
\lfoot{}
\cfoot{\thepage{} of \pageref{LastPage}}
\rfoot{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Title
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\renewcommand\maketitle{
\begin{center}
{\Large \textbf{\TITLE}}\\
\textit{\DATE}\\
\end{center}
}
\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{etoolbox}
\usepackage{mathrsfs}
\usepackage{mathtools}
\usepackage{tikz}
\usepackage{xcolor}
% https://tex.stackexchange.com/a/43837
\let\proof\relax
\let\endproof\relax
\usepackage{amsthm}
% A pretty color palette taken from https://flatuicolors.com/palette/defo. To
% see a preview of the different colors, use the \showcolors command below.
\definecolor{flatdarkgray}{HTML}{7F8C8D}
\definecolor{flatgray}{HTML}{BDC3C7}
\definecolor{flatred}{HTML}{C0392B}
\definecolor{flatorange}{HTML}{D35400}
\definecolor{flatyellow}{HTML}{F39C12}
\definecolor{flatdenim}{HTML}{2C3E50}
\definecolor{flatpurple}{HTML}{8E44AD}
\definecolor{flatblue}{HTML}{2980B9}
\definecolor{flatgreen}{HTML}{27AE60}
\definecolor{flatcyan}{HTML}{16A085}
\definecolor{flatdarkgrayalt}{HTML}{95A5A6}
\definecolor{flatgrayalt}{HTML}{ECF0F1}
\definecolor{flatredalt}{HTML}{E74C3C}
\definecolor{flatorangealt}{HTML}{E67E22}
\definecolor{flatyellowalt}{HTML}{F1C40F}
\definecolor{flatdenimalt}{HTML}{34495E}
\definecolor{flatpurplealt}{HTML}{9B59B6}
\definecolor{flatbluealt}{HTML}{3498DB}
\definecolor{flatgreenalt}{HTML}{2ECC71}
\definecolor{flatcyanalt}{HTML}{1ABC9C}
\newcommand{\showcolors}{%
\begin{center}
\begin{tikzpicture}
\tikzstyle{swatch}=[minimum width=3cm, minimum height=0.5cm, y=0.5cm]
\node[swatch, fill=flatdarkgray] at (0, 9) {\texttt{flatdarkgray}};
\node[swatch, fill=flatgray] at (0, 8) {\texttt{flatgray}};
\node[swatch, fill=flatred] at (0, 7) {\texttt{flatred}};
\node[swatch, fill=flatorange] at (0, 6) {\texttt{flatorange}};
\node[swatch, fill=flatyellow] at (0, 5) {\texttt{flatyellow}};
\node[swatch, fill=flatdenim] at (0, 4) {\texttt{flatdenim}};
\node[swatch, fill=flatpurple] at (0, 3) {\texttt{flatpurple}};
\node[swatch, fill=flatblue] at (0, 2) {\texttt{flatblue}};
\node[swatch, fill=flatgreen] at (0, 1) {\texttt{flatgreen}};
\node[swatch, fill=flatcyan] at (0, 0) {\texttt{flatcyan}};
\node[swatch, fill=flatdarkgrayalt] at (3, 9) {\texttt{flatdarkgrayalt}};
\node[swatch, fill=flatgrayalt] at (3, 8) {\texttt{flatgrayalt}};
\node[swatch, fill=flatredalt] at (3, 7) {\texttt{flatredalt}};
\node[swatch, fill=flatorangealt] at (3, 6) {\texttt{flatorangealt}};
\node[swatch, fill=flatyellowalt] at (3, 5) {\texttt{flatyellowalt}};
\node[swatch, fill=flatdenimalt] at (3, 4) {\texttt{flatdenimalt}};
\node[swatch, fill=flatpurplealt] at (3, 3) {\texttt{flatpurplealt}};
\node[swatch, fill=flatbluealt] at (3, 2) {\texttt{flatbluealt}};
\node[swatch, fill=flatgreenalt] at (3, 1) {\texttt{flatgreenalt}};
\node[swatch, fill=flatcyanalt] at (3, 0) {\texttt{flatcyanalt}};
\end{tikzpicture}
\end{center}
}
% New theorem environments.
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\theoremstyle{definition}
\newtheorem{definition}{Definition}
\newtheorem{benchmark}{Benchmark}
\newtheorem{example}{Example}
% Toggleable TODOs.
\newtoggle{showtodos}
\toggletrue{showtodos}
%\togglefalse{showtodos}
\newcommand{\TODO}[2][]{%
\iftoggle{showtodos}{{\textcolor{blue}{\textbf{TODO(#1): #2}}}}{}%
}
% Labels and references. To label a figure use the \figlabel command, to label
% a lemma, use the \lemlabel command, etc. Similarly, use the \figref, lemref,
% etc. commands to reference these labels. For example:
%
% \begin{figure}
% % ...
% \caption{A nice figure}\figlabel{MyNiceFigure}
% \end{figure}
%
% Refer to \figref{MyNiceFigure} for a nice figure.
%
% Toggle showlabels to show or hide all labels.
\newtoggle{showlabels}
%\toggletrue{showlabels}
\togglefalse{showlabels}
\newcommand{\genericlabel}[2]{%
\label{#1:#2}%
\iftoggle{showlabels}{\textcolor{flatdarkgray}{\texttt{[#2]}}}{}%
}
\newcommand{\algolabel}[1]{\genericlabel{alg}{#1}}
\newcommand{\algoref}[1]{Algorithm~\ref{alg:#1}}
\newcommand{\applabel}[1]{\genericlabel{app}{#1}}
\newcommand{\appref}[1]{Appendix~\ref{app:#1}}
\newcommand{\benchlabel}[1]{\genericlabel{bench}{#1}}
\newcommand{\benchref}[1]{Benchmark~\ref{bench:#1}}
\newcommand{\clmlabel}[1]{\genericlabel{clm}{#1}}
\newcommand{\clmref}[1]{Claim~\ref{clm:#1}}
\newcommand{\eqnlabel}[1]{\genericlabel{eqn}{#1}}
\newcommand{\eqnref}[1]{\eqref{eqn:#1}}
\newcommand{\invlabel}[1]{\genericlabel{inv}{#1}}
\newcommand{\invref}[1]{Invariant~\ref{inv:#1}}
\newcommand{\examplelabel}[1]{\genericlabel{exa}{#1}}
\newcommand{\exampleref}[1]{Example~\ref{exa:#1}}
\newcommand{\figlabel}[1]{\genericlabel{fig}{#1}}
\newcommand{\figref}[1]{Figure~\ref{fig:#1}}
\newcommand{\lemlabel}[1]{\genericlabel{lem}{#1}}
\newcommand{\lemref}[1]{Lemma~\ref{lem:#1}}
\newcommand{\linelabel}[1]{\genericlabel{line}{#1}}
\newcommand{\lineref}[1]{Line~\ref{line:#1}}
\newcommand{\lstlabel}[1]{\genericlabel{lst}{#1}}
\newcommand{\lstref}[1]{Listing~\ref{lst:#1}}
\newcommand{\seclabel}[1]{\genericlabel{sec}{#1}}
\newcommand{\secref}[1]{Section~\ref{sec:#1}}
\newcommand{\tablabel}[1]{\genericlabel{tab}{#1}}
\newcommand{\tabref}[1]{Table~\ref{tab:#1}}
\newcommand{\thmlabel}[1]{\genericlabel{thm}{#1}}
\newcommand{\thmref}[1]{Theorem~\ref{thm:#1}}
% Surrounding symbols.
\DeclarePairedDelimiter{\parens}{(}{)}
\DeclarePairedDelimiter{\set}{\{}{\}}
\DeclarePairedDelimiterX{\setst}[2]{\{}{\}}{#1 \,\delimsize|\, #2}
\DeclarePairedDelimiter{\brackets}{[}{]}
\DeclarePairedDelimiterX{\pfrac}[2]{(}{)}{\frac{#1}{#2}}
\DeclarePairedDelimiter{\ceil}{\lceil}{\rceil}
\DeclarePairedDelimiter{\floor}{\lfloor}{\rfloor}
% Symbols and abbreviations.
\newcommand{\nats}{\mathbb{N}}
\newcommand{\ints}{\mathbb{Z}}
\newcommand{\rats}{\mathbb{Q}}
\newcommand{\reals}{\mathbb{R}}
\newcommand{\complexes}{\mathbb{C}}
% Misc.
\newcommand{\defword}[1]{\textbf{\textcolor{flatdenim}{#1}}}
Display the source blob
Display the rendered blob
Raw
\documentclass{mwhittaker}
\title{Provenance Practice Prelim}
\date{}
\usepackage{pervasives}
\newenvironment{answer}{\begingroup \color{flatred}}{\endgroup}
\newcommand{\join}{\bowtie}
\begin{document}
\maketitle
\section{Lineage}
What is the grammar of relational algebra?
\begin{answer}
\[
Q ::= R
| \set{t}
| \sigma_\theta(Q)
| \pi_U(Q)
| Q_1 \join Q_2
| Q_1 \cup Q_2
| \rho_{A \mapsto B}(Q)
\]
\end{answer}
But let's ignore renaming. How does Cui define the lineage of a tuple?
\begin{answer}
Given $Q$, $I=R_1, \ldots, R_n$, $t$, the lineage of $t$ is $R_1', \ldots,
R_n'$ such that
\begin{enumerate}
\item $Q(R_1', \ldots, R_n') = \set{t}$,
\item $Q(R_1', \ldots, {t_i}, \ldots, R_n') \neq \emptyset$, and
\item $R_1', \ldots, R_n'$ is maximal.
\end{enumerate}
\end{answer}
How can we compute the lineage of a query?
\newcommand{\Lin}{\textsf{Lin}}
\newcommand{\tif}{~\text{if}~}
\newcommand{\telse}{~\text{else}~}
\begin{answer}
\begin{align*}
\Lin(\set{t}, I, t) &= \emptyset \\
\Lin(\set{u}, I, t) &= \bot \\
\Lin(R, I, t) &= R(t) \tif t \in R \telse \bot \\
\Lin(\sigma_\theta(Q), I, t) &= \Lin(Q, I, t) \tif \theta(t) \telse \bot \\
\Lin(\pi_U(Q), I, t) &= \bigcup_L \setst{\Lin(Q, I, u)}{u \in Q(I), u[U] = t} \\
\Lin(Q_1 \join Q_2, I, t) &= \Lin(Q_1, I, t[U_1]) \cup_S \Lin(Q_2, I, t[U_2]) \\
\Lin(Q_1 \cup Q_2, I, t) &= \Lin(Q_1, I, t) \cup_L \Lin(Q_2, I, t)
\end{align*}
\end{answer}
Why do we need Cui's point (2) from above? What if we get rid of it?
\begin{answer}
We can add a bunch of extra junk tuples. For example, consider
$\sigma_{=t}(R)$. We could output $R$ as the lineage.
\end{answer}
Why do we need Cui's point (3) from above? What if we get rid of it?
\begin{answer}
We can return only some explaining tuples. For example, $\pi_a(R)$ could
return only one tuple as lineage.
\end{answer}
What is $\Lin$ for set difference?
\begin{answer}
\[
\Lin(Q_1 - Q_2, I, t)
= \Lin(Q_1, I, t) \cup_S (\cup \setst{\Lin(Q_2, I, u)}{u \in Q_2(I)})
\]
\end{answer}
\section{Why-Provenance}
What is a disadvantage of lineage?
\begin{answer}
Lineage shows tuples that contribute, but not the relationship at all between
them. For example, if the lineage is $\set{t, u}$, do both $t$ and $u$ need
to appear or just one?
\end{answer}
What is a witness?
\begin{answer}
A witness $J \subseteq I$ where $t \in Q(J)$.
\end{answer}
Define Why-Provenance.
\newcommand{\Why}{\textsf{Why}}
\begin{answer}
\begin{align*}
\Why(\set{t}, I, t) &= \set{\emptyset} \\
\Why(\set{u}, I, t) &= \emptyset \\
\Why(R, I, t) &= \set{\set{t}} \tif t \in R \telse \emptyset \\
\Why(\sigma_\theta(Q), I, t) &= \Why(Q, I, t) \tif \theta(t) \telse \emptyset \\
\Why(\pi_U(Q), I, t) &= \bigcup \setst{\Why(Q, I, u)}{u \in Q(I), u[U] = t} \\
\Why(Q_1 \join Q_2, I, t) &= \Why(Q_1, I, t[U_1]) \Cup \Why(Q_2, I, t[U_2]) \\
\Why(Q_1 \cup Q_2, I, t) &= \Why(Q_1, I, t) \cup \Why(Q_2, I, t)
\end{align*}
\end{answer}
Relationship between why-provenance and witnesses?
\begin{answer}
Every witness is a superset of some element in the why provenance. Minimal
why provenance is equal to minimal witnesses.
\end{answer}
Case when why provenance doesn't return minimal why?
\begin{answer}
$R \join \pi_{x, y}(R \join S)$ with $R(x, y) = \set{(1, 1)}$ and $S(y, z) =
\set{(1, 2), (1, 3)}$.
\end{answer}
\section{How-Provenance}
Disadvantage of why?
\begin{answer}
Doesn't tell you \emph{how} tuples are put together.
\end{answer}
What is a $K$-relation?
\begin{answer}
A $K$-relation is a function from tuples to elements of a commutative
semiring with finite basis.
\end{answer}
What $K$ gives us set semantics?
\begin{answer}
$(\set{0, 1}, 0, 1, \lor, \land)$.
\end{answer}
What $K$ gives us bag semantics?
\begin{answer}
$(\nats, 0, 1, +, \cdot)$.
\end{answer}
How do we evaluate queries on $K$-relations?
\begin{answer}
\begin{align*}
\Lin(\set{u}, I, t) &= 0 \\
\Lin(\set{t}, I, t) &= 1 \\
\Lin(R, I, t) &= I(R)(t) \\
\Lin(\sigma_\theta(Q), I, t) &= \theta(t) \cdot Q(I)t \\
\Lin(\pi_U(Q), I, t) &= \sum_{u \in supp(Q(I)), u[V] = t} Q(I)u \\
\Lin(Q_1 \join Q_2, I, t) &= Q_1(I)(t[U_1]) \cdot Q_2(I, t[U_2]) \\
\Lin(Q_1 \cup Q_2, I, t) &= Q_1(I)(t) + Q_2(I)t
\end{align*}
\end{answer}
What is how-provenance?
\begin{answer}
It's just $K$-relational algebra with $K$ being polynomials over tuple
identifiers.
\end{answer}
So do $K$-relations give us a way to evaluate queries or compute provenance?
\begin{answer}
Both. Just pick the right $K$.
\end{answer}
How does how-provenance relate to lineage and why provenance?
\begin{answer}
It subsumes them.
$K_\Lin = (P(tuple id)_\bot, \bot, \emptyset, \cup_L, \cup_S)$.
$K_\Why = (P(P(tuple id)), \emptyset, \set{\emptyset}, \cup, \Cup)$.
\end{answer}
\end{document}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment