@pmelsted
Created February 6, 2013 11:21
De Bruijn graph toy example in FASTG format
Genome:
ATGAAGTGGGTAAC
Reads (reads 2 and 3 each differ from the genome by one base):
ATGAAGTGGG
TAGTGGGTAA
AGTGCGTAAC
K-mers (k = 4):
1 ATGA
2 TGAA
3 GAAG
4 AAGT
5 TAGT
6 AGTG
7 GTGG
8 TGGG
9 GGGT
10 GGTA
11 GTGC
12 TGCG
13 GCGT
14 CGTA
15 GTAA
16 TAAC
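The k-mer list above can be reproduced with a few lines of Python. This is a minimal sketch, not the author's code: the helper name `kmers` is made up for illustration, and the k-mers come out in order of first appearance in the reads, which is not the numbering used above.

```python
def kmers(read, k):
    """Return all substrings of length k in read, left to right."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

reads = ["ATGAAGTGGG", "TAGTGGGTAA", "AGTGCGTAAC"]
k = 4

# Collect distinct k-mers; a dict keeps them in first-seen order.
seen = {}
for read in reads:
    for kmer in kmers(read, k):
        seen.setdefault(kmer, None)
distinct = list(seen)

print(len(distinct))  # 16
```

Each read of length 10 contributes 7 k-mers; 16 of the 21 are distinct.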
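De Bruijn graph in FASTG format, one record per k-mer (km1 and km5 have no predecessor and carry the full k-mer; every other record carries only the base it appends):
>km1:km2;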
ATGA
>km2:km3;
A
>km3:km4;
G
>km4:km6;
T
>km5:km6;
TAGT
>km6:km7,km11;
G
>km7:km8;
G
>km8:km9;
G
>km9:km10;
T
>km10:km15;
A
>km11:km12;
C
>km12:km13;
G
>km13:km14;
T
>km14:km15;
A
>km15:km16;
A
>km16;
C
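As a rough sketch of how records in this style could be generated, the snippet below links two k-mers whenever the last k-1 bases of one equal the first k-1 bases of the other. The function name `fastg_records` is hypothetical, and the quadratic scan over all k-mers is only reasonable for a toy input like this one.

```python
# K-mers in the order they are numbered above (km1 .. km16).
KMERS = ["ATGA", "TGAA", "GAAG", "AAGT", "TAGT", "AGTG", "GTGG", "TGGG",
         "GGGT", "GGTA", "GTGC", "TGCG", "GCGT", "CGTA", "GTAA", "TAAC"]

def fastg_records(kmers):
    """Return (header, sequence) pairs, one per k-mer node."""
    name = {kmer: f"km{i}" for i, kmer in enumerate(kmers, start=1)}
    records = []
    for kmer in kmers:
        # Successors overlap this k-mer's suffix with their prefix.
        successors = [name[other] for other in kmers if other[:-1] == kmer[1:]]
        has_predecessor = any(other[1:] == kmer[:-1] for other in kmers)
        header = ">" + name[kmer]
        if successors:
            header += ":" + ",".join(successors)
        header += ";"
        # Source nodes spell the whole k-mer, the rest only the new base.
        sequence = kmer[-1] if has_predecessor else kmer
        records.append((header, sequence))
    return records

for header, sequence in fastg_records(KMERS):
    print(header)
    print(sequence)
```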
If we merge each unbranched path of the de Bruijn graph into a contig, we can compress this into
>c1:c3;
ATGAAGT
>c2:c3;
TAGT
>c3:c4,c5;
G
>c4:c6;
GGTA
>c5:c6;
CGTA
>c6;
AC
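The compression step can be sketched as follows, assuming the graph is acyclic as it is here (an isolated cycle would never be visited). The names `compact` and `spell` are illustrative only. A node opens a new contig unless it has exactly one predecessor and that predecessor has exactly one successor.

```python
KMERS = ["ATGA", "TGAA", "GAAG", "AAGT", "TAGT", "AGTG", "GTGG", "TGGG",
         "GGGT", "GGTA", "GTGC", "TGCG", "GCGT", "CGTA", "GTAA", "TAAC"]

def compact(kmers):
    """Group k-mers into maximal unbranched paths (contigs)."""
    succ = {a: [b for b in kmers if b[:-1] == a[1:]] for a in kmers}
    pred = {a: [b for b in kmers if b[1:] == a[:-1]] for a in kmers}

    def starts_contig(node):
        # False only when the node simply continues its single predecessor.
        return not (len(pred[node]) == 1 and len(succ[pred[node][0]]) == 1)

    paths = []
    for node in kmers:
        if not starts_contig(node):
            continue
        path = [node]
        # Extend while there is one way forward and no other way in.
        while len(succ[path[-1]]) == 1 and not starts_contig(succ[path[-1]][0]):
            path.append(succ[path[-1]][0])
        paths.append(path)
    return paths, pred

def spell(path, pred):
    """Full k-mer for a source node, otherwise one new base per node."""
    first = path[0] if not pred[path[0]] else path[0][-1]
    return first + "".join(node[-1] for node in path[1:])

paths, pred = compact(KMERS)
contigs = [spell(path, pred) for path in paths]
print(contigs)  # ['ATGAAGT', 'TAGT', 'G', 'GGTA', 'CGTA', 'AC']
```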
Recognizing the bubble in the middle (c4 and c5 both lead from c3 to c6), this can be reduced to
>c1:c3;
ATGAAGT
>c2:c3;
TAGT
>c3;
G[1:alt|G,C]GGTAAC
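One simple way to spot such a bubble in the contig graph is to look for a node whose branches each have a single way in, a single way out, and a common endpoint. The sketch below handles only this simplest case (single-node branches); `simple_bubbles` is a made-up name and the graph is typed in by hand from the contig records.

```python
# Contig graph before the bubble is collapsed: node -> successors.
contig_succ = {
    "c1": ["c3"], "c2": ["c3"], "c3": ["c4", "c5"],
    "c4": ["c6"], "c5": ["c6"], "c6": [],
}

def simple_bubbles(succ):
    """Return (source, branches, sink) for single-node-branch bubbles."""
    pred_count = {node: 0 for node in succ}
    for targets in succ.values():
        for target in targets:
            pred_count[target] += 1

    bubbles = []
    for source, branches in succ.items():
        if len(branches) < 2:
            continue
        # Every branch needs exactly one way in and one way out.
        if any(pred_count[b] != 1 or len(succ[b]) != 1 for b in branches):
            continue
        sinks = {succ[b][0] for b in branches}
        if len(sinks) == 1:
            bubbles.append((source, branches, sinks.pop()))
    return bubbles

print(simple_bubbles(contig_succ))  # [('c3', ['c4', 'c5'], 'c6')]
```

The pair c1/c2 is not reported because the two contigs converge on c3 without a shared starting node.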
And finally, treating the difference between c1 and c2 as a second alt
>c1;
ATGAA[1:alt|A,T]GTG[1:alt|G,C]GGTAAC
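A quick sanity check on the final record: deleting the bracketed constructs should leave the original genome. The sketch assumes that everything between `[` and `]` is annotation and that the remaining bases are the default sequence, which is how the records above appear to be written; `strip_constructs` is a hypothetical helper, not part of any FASTG tool.

```python
import re

def strip_constructs(fastg_sequence):
    """Remove every [...] construct, leaving only the plain bases."""
    return re.sub(r"\[[^\]]*\]", "", fastg_sequence)

genome = "ATGAAGTGGGTAAC"
final = "ATGAA[1:alt|A,T]GTG[1:alt|G,C]GGTAAC"

print(strip_constructs(final))            # ATGAAGTGGGTAAC
print(strip_constructs(final) == genome)  # True
```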