@tomsing1
Last active November 24, 2016 15:32
Generating sequence logo from position frequency matrices with Weblogo / RWeblogo

WebLogo3

The weblogo3 website accepts motifs in many different formats, most of which represent the aligned motif sequences themselves (e.g. FASTA format). It is also possible to submit a position-specific score matrix (PSSM) in Transfac format. This format is also accepted by the RWebLogo::weblogo R function.

For conversion between many different motif formats, the convert-matrix tool is helpful.

Example

Transfac format is specified as:

ID <motif name>
BF <species name>
P0 <a1> ... <an>
01 <c1,1> ... <c1,n> <consensus letter>
02 <c2,1> ... <c2,n> <consensus letter>
...
nn <cnn,1> ... <cnn,n> <consensus letter>
XX
//

  • The P0 row labels the columns with the letters of the sequence alphabet.
  • The numbered rows (01...nn) contain counts for each letter in the sequence alphabet for that position in the motif. These counts should give the number of times each letter appears in known examples of the motif at the given position.
  • The counts in each numbered row should add up to the same total count.
  • The last column in each count row gives the consensus letter for that position in the motif, and is ignored by weblogo (but has to be present).
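
The rules above can be checked with a few lines of base R. The following is a minimal sketch, not part of RWebLogo: `parse_transfac` is a hypothetical helper that assumes whitespace-separated fields and a single motif per input. It reads the P0 row as column labels, keeps the counts from the numbered rows, and drops the trailing consensus letter.

```r
# Hypothetical helper (not part of RWebLogo): parse Transfac text into a count matrix.
parse_transfac <- function(lines) {
  lines <- trimws(lines)
  # The P0 row names the letters of the alphabet (the column labels).
  p0 <- lines[grepl("^P0[[:space:]]", lines)]
  alphabet <- strsplit(p0, "[[:space:]]+")[[1]][-1]
  n <- length(alphabet)
  # Numbered rows (01 ... nn) start with digits.
  rows <- strsplit(lines[grepl("^[0-9]+[[:space:]]", lines)], "[[:space:]]+")
  # Skip the position label, keep the n counts, ignore the consensus letter.
  counts <- t(vapply(rows, function(x) as.numeric(x[2:(n + 1)]), numeric(n)))
  dimnames(counts) <- list(vapply(rows, `[`, character(1), 1), alphabet)
  counts
}

example <- c(
  "ID any_old_name_for_motif_1",
  "BF species_name_for_motif_1",
  "P0 A C G T",
  "01 1 2 2 0 S",
  "02 2 1 2 0 R",
  "03 3 0 1 1 A",
  "XX"
)
counts <- parse_transfac(example)

# The counts in each numbered row should add up to the same total.
stopifnot(length(unique(rowSums(counts))) == 1)
```

Running a check like this before submitting a matrix catches rows with a missing or extra count.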

The following example shows a PSSM in Transfac format.

ID any_old_name_for_motif_1
BF species_name_for_motif_1
P0 A C G T
01 1 2 2 0 S
02 2 1 2 0 R
03 3 0 1 1 A
04 0 5 0 0 C
05 5 0 0 0 A
06 0 0 4 1 G
07 0 1 4 0 G
08 0 0 0 5 T
09 0 0 5 0 G
10 0 1 2 2 K
11 0 2 0 3 Y
12 1 0 3 1 G
XX

To generate a sequence logo using RWebLogo::weblogo, save this motif to a file and provide the file name via the file.in argument:

# Load the RWebLogo package
library(RWebLogo)
# Read the Transfac matrix from a file and draw the logo
weblogo(file.in = "transfac_matrix.txt")
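
If the counts already live in an R matrix, the input file can be produced from R as well. This is a sketch under stated assumptions: `format_transfac` is a hypothetical helper, not an RWebLogo function, and it expects one row per motif position and one named column per letter. The consensus column is filled with the most frequent letter as a placeholder rather than the IUPAC ambiguity codes used in the example above, since weblogo ignores that column.

```r
# Hypothetical helper (not part of RWebLogo): format a count matrix as Transfac text.
# `counts` has one row per motif position and one named column per alphabet letter.
format_transfac <- function(counts, id = "motif_1", species = "unknown") {
  alphabet <- colnames(counts)
  # Placeholder consensus: the most frequent letter at each position
  # (ties go to the first letter; weblogo ignores this column).
  consensus <- alphabet[max.col(counts, ties.method = "first")]
  position <- sprintf("%02d", seq_len(nrow(counts)))
  rows <- paste(position, apply(counts, 1, paste, collapse = " "), consensus)
  c(paste("ID", id),
    paste("BF", species),
    paste("P0", paste(alphabet, collapse = " ")),
    rows,
    "XX")
}

counts <- matrix(c(1, 2, 2, 0,
                   2, 1, 2, 0,
                   3, 0, 1, 1),
                 ncol = 4, byrow = TRUE,
                 dimnames = list(NULL, c("A", "C", "G", "T")))

transfac_lines <- format_transfac(counts,
                                  id = "any_old_name_for_motif_1",
                                  species = "species_name_for_motif_1")

# Written to a temporary directory here; adjust the path as needed.
transfac_file <- file.path(tempdir(), "transfac_matrix.txt")
writeLines(transfac_lines, transfac_file)
```

The path stored in `transfac_file` can then be passed to weblogo through the file.in argument, as in the call above.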

This motif, which uses a custom alphabet, will work as well:

ID any_old_name_for_motif_1
BF species_name_for_motif_1
P0 T H O M A S
01 5 1 2 0 0 0 T
02 0 6 1 2 0 0 H
03 1 0 7 1 0 0 O
04 0 1 1 5 0 0 M
05 1 0 0 0 6 0 A
06 0 0 4 1 0 7 S
XX

Alternatives