hfukada/gist:26425b5b8d37b5a0279d94d03575a61e

## gistfile1.txt


Background

Simplifying somewhat, each word of English is composed of a string of sounds, called "phones"/"phonemes".  For example, "cat" has the phones /k/, /a/, /t/.

Some phones are more "confusable" with other phones--it's easier to mishear /m/ (as in "ram") as /n/ (as in "ran") than /t/ (as in "rat").  I happen to have a phone-to-phone confusion matrix, which tells me the probability of confusing any phone as any other phone.

If I want to calculate how easy it is to confuse one string of phones with another of equal length (e.g., hearing "ram" as "rat"), I simply multiply the probability of hearing each phone in the first string as the corresponding phone in the second string.  E.g., P(/r/|/r/) * P(/a/|/a/) * P(/t/|/m/).  Like I said, I get these probabilities from my confusion matrix.

Problem

That works fine and dandy for words of the same length, but I want to compare strings of phones of different lengths.  If I have the probability of inserting or deleting phones, I should be able to compare the probability of hearing "ram" and "rant".

Now Zach, you might say, that's as simple as calculating the weighted Levehnstein (edit) distance between two strings!  That's like Compsci 101!  But the problem is that Levehnstein distance is the minimum edit distance between two strings.

I would assume that the probability of hearing the sounds in "ram" as "rant" would be the marginalization over all possible ways of mishearing it.  For example, you could mishear "ram" as "rant" by mishearing the /m/ as /n/ and inserting /t/, but you could ALSO mishear it by mishearing /m/ as /t/ and inserting /n/.

The closest I've seen to something that does what I want was some people using a weighted finite-state automata as a kernel function for calculating slightly different stuff.  I think they ended up only calculating the k closet paths for each item, however, so what I'm asking about might be technically impossible.


	Background

	Simplifying somewhat, each word of English is composed of a string of sounds, called "phones"/"phonemes". For example, "cat" has the phones /k/, /a/, /t/.

	Some phones are more "confusable" with other phones--it's easier to mishear /m/ (as in "ram") as /n/ (as in "ran") than /t/ (as in "rat"). I happen to have a phone-to-phone confusion matrix, which tells me the probability of confusing any phone as any other phone.

	If I want to calculate how easy it is to confuse one string of phones with another of equal length (e.g., hearing "ram" as "rat"), I simply multiply the probability of hearing each phone in the first string as the corresponding phone in the second string. E.g., P(/r/\|/r/) * P(/a/\|/a/) * P(/t/\|/m/). Like I said, I get these probabilities from my confusion matrix.

	Problem

	That works fine and dandy for words of the same length, but I want to compare strings of phones of different lengths. If I have the probability of inserting or deleting phones, I should be able to compare the probability of hearing "ram" and "rant".

	Now Zach, you might say, that's as simple as calculating the weighted Levehnstein (edit) distance between two strings! That's like Compsci 101! But the problem is that Levehnstein distance is the minimum edit distance between two strings.

	I would assume that the probability of hearing the sounds in "ram" as "rant" would be the marginalization over all possible ways of mishearing it. For example, you could mishear "ram" as "rant" by mishearing the /m/ as /n/ and inserting /t/, but you could ALSO mishear it by mishearing /m/ as /t/ and inserting /n/.

	The closest I've seen to something that does what I want was some people using a weighted finite-state automata as a kernel function for calculating slightly different stuff. I think they ended up only calculating the k closet paths for each item, however, so what I'm asking about might be technically impossible.