# martijnvermaat/hgvs.md Last active May 30, 2016

# Operational semantics for HGVS

This formalisation is based on the idea that the meaning of a variant description is a set of fixed sequences projected onto the reference sequence. (We call them replacements, but they can also be identities.) We explicitly do not mean the result (sequence) of applying these replacements. At least in my understanding, that is not how HGVS is intended to be interpreted (it would also make it impossible to combine variants: how do you combine two plain sequences?).

## Allele descriptions

A replacement `r` is a triple `s @ n:m` where `s` is a sequence and `n<=m` are two integers.

With `s @ n:m` we mean replacing the interbase range `n:m` by sequence `s`.
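As an illustration, a replacement could be modelled as a small immutable record. This is only a sketch: the class name `Replacement` and its field names are hypothetical and not part of the formalisation. Note that when `n = m` the interbase range is empty, so nothing is removed and `s` is simply inserted at that point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replacement:
    """A replacement `s @ n:m` (hypothetical names, for illustration only)."""

    seq: str    # s: the sequence that is put in place
    start: int  # n: left interbase coordinate
    end: int    # m: right interbase coordinate

    def __post_init__(self):
        # The definition requires n <= m.
        if self.start > self.end:
            raise ValueError("expected n <= m")

    def __str__(self):
        return f"{self.seq!r} @ {self.start}:{self.end}"


deletion = Replacement("", 3, 4)     # replaces the range 3:4 by the empty sequence
insertion = Replacement("GA", 5, 5)  # empty range: inserts GA at interbase point 5
```

Making the record frozen keeps it hashable, so replacements can be collected in an ordinary Python `set`, which matches the set-based reading used in the rest of this note.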

We shall define a relation

```
s, d => s, R
```

which means that applying variant description `d` to sequence `s` results in the set of replacements `R` on `s`.

For example, reading positions as 0-based, as the rules defined below do:

```
ATCG, 3del        => ATCG, { '' @ 3:4 }
ATCG, 2dup        => ATCG, { C @ 3:3 }
ATCG, [3del;2dup] => ATCG, { '' @ 3:4 , C @ 3:3 }
```

We now define our relation:

```
s, iX>Y        => s, { Y @ i:i+1 }           if i < len(s) and s[i] = X

s, idel        => s, { '' @ i:i+1 }          if i < len(s)

s, i_jdel      => s, { '' @ i:j+1 }          if i < j < len(s)

s, idup        => s, { s[i] @ i+1:i+1 }      if i < len(s)

s, i_jdup      => s, { s[i:j+1] @ j+1:j+1 }  if i < j < len(s)

s, [d1;...;dn] => s, R1 + ... + Rn           if s, di => s, Ri for i = 1...n
```
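The rules above can be read almost directly as a recursive function. The following is a minimal sketch, not a reference implementation. It makes several assumptions: positions are 0-based (as the side conditions `i < len(s)` suggest), `+` is set union, replacements are represented as plain `(seq, n, m)` tuples, descriptions arrive as strings without nested brackets, and a failed side condition is reported as a `ValueError`. The function name `interpret` and the regular expressions are hypothetical.

```python
import re

# Hypothetical patterns for the description forms covered by the rules.
_SUBST = re.compile(r"(\d+)([A-Z])>([A-Z])")  # iX>Y
_RANGE = re.compile(r"(\d+)_(\d+)(del|dup)")  # i_jdel, i_jdup
_POINT = re.compile(r"(\d+)(del|dup)")        # idel, idup


def interpret(s, d):
    """Return the set R of (seq, n, m) replacements such that s, d => s, R."""
    if d.startswith("[") and d.endswith("]"):
        # s, [d1;...;dn] => s, R1 + ... + Rn
        replacements = set()
        for part in d[1:-1].split(";"):
            replacements |= interpret(s, part)
        return replacements

    m = _SUBST.fullmatch(d)
    if m:
        i, x, y = int(m.group(1)), m.group(2), m.group(3)
        if not (i < len(s) and s[i] == x):
            raise ValueError(f"side condition fails for {d!r}")
        return {(y, i, i + 1)}

    m = _RANGE.fullmatch(d)
    if m:
        i, j, op = int(m.group(1)), int(m.group(2)), m.group(3)
        if not i < j < len(s):
            raise ValueError(f"side condition fails for {d!r}")
        if op == "del":
            return {("", i, j + 1)}
        return {(s[i:j + 1], j + 1, j + 1)}

    m = _POINT.fullmatch(d)
    if m:
        i, op = int(m.group(1)), m.group(2)
        if not i < len(s):
            raise ValueError(f"side condition fails for {d!r}")
        if op == "del":
            return {("", i, i + 1)}
        return {(s[i], i + 1, i + 1)}

    raise ValueError(f"unsupported description {d!r}")


print(interpret("ATCG", "1T>A"))  # {('A', 1, 2)}
print(interpret("ATCG", "[3del;1_2dup]") == {("", 3, 4), ("TC", 3, 3)})  # True
```

Because the result is a set of replacements rather than a mutated sequence, combining variants in an allele is just a union of the individual results, which is the point made in the introduction.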