2017-10-12 Kasper
# 1) Simple Maximum Likelihood
F --> X
$$
p(F=1 | X=x) = \frac{p(X=x|F=1) p(F=1)}{p(X=x)} = \frac{p(X=x|F=1) p(F=1)}{p(X=x|F=0)p(F=0) + p(X=x|F=1)p(F=1)}
$$
Now, having learned $p(F)$ and $p(X|F)$ from the dataset, you can compute this directly.
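As a quick numerical sanity check, here is a minimal sketch of that posterior. All names are hypothetical: a learned prior `p_f1` for $p(F=1)$ and likelihood functions `lik_f0`, `lik_f1` for $p(X=x|F=0)$ and $p(X=x|F=1)$ (Gaussians here, purely for illustration):

```python
from scipy.stats import norm

def posterior_f1(x, lik_f0, lik_f1, p_f1):
    """Bayes rule: p(F=1|X=x) from learned likelihoods and prior.

    lik_f0, lik_f1: callables returning p(X=x|F=0) and p(X=x|F=1)
    p_f1: scalar prior p(F=1)
    """
    num = lik_f1(x) * p_f1
    return num / (lik_f0(x) * (1 - p_f1) + num)

# Illustrative 1-D Gaussian class-conditionals (made-up parameters):
print(posterior_f1(x=1.3, lik_f0=norm(0, 1).pdf, lik_f1=norm(2, 1).pdf, p_f1=0.05))
```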
# 2) C causes X and F: p(X, F | C) = p(X|C) p(F|C)
C --> F
'---> X
Train a naive-Bayes mixture model with EM by just concatenating F onto X, then marginalize out C at test time. After EM training you have $p(F|C)$, $p(C)$, and $p(X|C)$, and you can infer:
\begin{align}
p(F|X) &= \sum_C p(F, C| X) \\
&= \sum_C p(F, C, X)/p(X) \\
&= \sum_C p(F|C)\, p(X|C)\, p(C) / p(X) \quad \text{... because of the graph structure}\\
&= \sum_C p(F|C)\, p(C|X) \quad \text{... Bayes rule}\\
&= \frac{\sum_C p(F|C)\, p(X|C)\, p(C)}{\sum_C p(X|C)\, p(C)}
\end{align}
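A minimal sketch of that marginalization, assuming EM training has produced arrays of the learned quantities (all names here are hypothetical):

```python
import numpy as np

def posterior_f1_given_x(lik_x_given_c, p_f1_given_c, p_c):
    """p(F=1|x) = sum_c p(F=1|c) p(x|c) p(c) / sum_c p(x|c) p(c).

    lik_x_given_c: (K,) array of p(X=x|C=c) for each of K latent classes
    p_f1_given_c:  (K,) array of p(F=1|C=c)
    p_c:           (K,) array of mixture weights p(C=c)
    """
    joint = lik_x_given_c * p_c        # p(x, c) for each c
    p_c_given_x = joint / joint.sum()  # Bayes rule: p(c|x)
    return np.dot(p_f1_given_c, p_c_given_x)

# Illustrative call with K=3 made-up latent classes:
print(posterior_f1_given_x(
    lik_x_given_c=np.array([0.20, 0.05, 0.60]),
    p_f1_given_c=np.array([0.90, 0.10, 0.01]),
    p_c=np.array([0.1, 0.3, 0.6]),
))
```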
**Using past customer data**.
In your "averaging" assumption, you average out the latent distributions of transactions for a given customer.
\begin{align}
\hat p(C|X=x) &= \frac{1}{N} \sum_{x': x'_{id}=x_{id}} p(C|X=x') \\
&= \frac{1}{N} \sum_{x': x'_{id}=x_{id}} \frac{p(X=x'|C)p(C)}{p(X=x')}\\
&= \frac{1}{N} \sum_{x': x'_{id}=x_{id}} \frac{p(X=x'|C)p(C)}{\sum_C p(X=x'|C)p(C)}
\end{align}
where the sum runs over the $N$ past transactions $x'$ sharing the customer id of $x$.
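Concretely, a sketch of this averaged posterior (the array names, and the assumption that the customer's past per-class likelihoods are pre-gathered, are hypothetical):

```python
import numpy as np

def averaged_customer_posterior(past_liks, p_c):
    """hat p(C|x): mean of p(C|X=x') over the customer's N past transactions.

    past_liks: (N, K) array; row n holds p(X=x'_n|C=c) for each class c
    p_c:       (K,) array of mixture weights p(C=c)
    """
    joint = past_liks * p_c                           # p(x'_n, c)
    posts = joint / joint.sum(axis=1, keepdims=True)  # Bayes rule per row: p(c|x'_n)
    return posts.mean(axis=0)                         # average over the N transactions
```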
So you can plug this estimate into the equation above:
\begin{align}
p(F=1|X=x) &= \sum_{c} p(F=1|C=c)\, \hat p(C=c|X=x)
\end{align}
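With the hypothetical names from the two sketches above, that plug-in estimate is a single dot product:

```python
p_f1_given_x = np.dot(p_f1_given_c,
                      averaged_customer_posterior(past_liks, p_c))
```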
# 3) F and C cause X
C --> X <-- F
You now need to define $p(X|C,F)$, which we discussed before.
... TODO: Fill in.