The weblogo3 website accepts motifs in many different formats, most of which represent the aligned motifs themselves (e.g. FASTA format).
It is also possible to submit a position-specific score matrix (PSSM) in Transfac format.
This format is also accepted by the RWebLogo::weblogo R function.
For conversion between many different motif formats, the convert-matrix tool is helpful.
Transfac format is specified as:
ID <motif name>
BF <species name>
P0 <a1> ... <an>
01 <c1,1> ... <c1,n> <consensus letter>
02 <c2,1> ... <c2,n> <consensus letter>
...
nn <cnn,1> ... <cnn,n> <consensus letter>
XX
//
- The P0 row labels the columns with the letters of the sequence alphabet.
- The numbered rows (01...nn) contain counts for each letter in the sequence alphabet for that position in the motif. These counts should give the number of times each letter appears in known examples of the motif at the given position.
- The counts in each numbered row should add up to the same total count.
- The last column in each count row gives the consensus letter for that position in the motif, and is ignored by weblogo (but has to be present).
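The layout described above can be sketched as a small base-R writer. This is only an illustration under stated assumptions: `write_transfac` and its arguments are hypothetical names, not part of RWebLogo or weblogo, and the consensus column is filled with the most frequent letter as a placeholder (no IUPAC ambiguity codes are derived), which is acceptable only because weblogo ignores that column.

```r
# Hypothetical helper (not part of any package): write a count matrix
# (rows = motif positions, columns = alphabet letters) in Transfac format.
write_transfac <- function(counts, file, id = "motif_1", species = "unknown") {
  stopifnot(is.matrix(counts), !is.null(colnames(counts)))
  # Every numbered row should add up to the same total count.
  stopifnot(length(unique(rowSums(counts))) == 1)

  alphabet <- colnames(counts)
  # Placeholder consensus: the most frequent letter at each position.
  consensus <- alphabet[max.col(counts, ties.method = "first")]
  row_ids <- sprintf("%02d", seq_len(nrow(counts)))
  count_rows <- paste(row_ids, apply(counts, 1, paste, collapse = " "), consensus)

  txt <- c(
    paste("ID", id),
    paste("BF", species),
    paste("P0", paste(alphabet, collapse = " ")),
    count_rows,
    "XX",
    "//"
  )
  writeLines(txt, file)
  invisible(txt)
}

# Illustrative three-position DNA motif; each row sums to 5.
counts <- matrix(
  c(0, 5, 0, 0,
    5, 0, 0, 0,
    0, 0, 4, 1),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("A", "C", "G", "T"))
)
out <- write_transfac(counts, tempfile(fileext = ".txt"))
```

The function returns the written lines invisibly, so `out` can be printed to inspect the result before passing the file on.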
The following example shows a PSSM in Transfac format.
ID any_old_name_for_motif_1
BF species_name_for_motif_1
P0 A C G T
01 1 2 2 0 S
02 2 1 2 0 R
03 3 0 1 1 A
04 0 5 0 0 C
05 5 0 0 0 A
06 0 0 4 1 G
07 0 1 4 0 G
08 0 0 0 5 T
09 0 0 5 0 G
10 0 1 2 2 K
11 0 2 0 3 Y
12 1 0 3 1 G
XX
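As a sanity check, the example can be parsed with base R to confirm that every numbered row adds up to the same total. This is a minimal sketch; the variable names are arbitrary, and the parsing assumes single-space-separated fields and two-digit row labels as in the example.

```r
transfac_txt <- c(
  "ID any_old_name_for_motif_1",
  "BF species_name_for_motif_1",
  "P0 A C G T",
  "01 1 2 2 0 S",
  "02 2 1 2 0 R",
  "03 3 0 1 1 A",
  "04 0 5 0 0 C",
  "05 5 0 0 0 A",
  "06 0 0 4 1 G",
  "07 0 1 4 0 G",
  "08 0 0 0 5 T",
  "09 0 0 5 0 G",
  "10 0 1 2 2 K",
  "11 0 2 0 3 Y",
  "12 1 0 3 1 G",
  "XX"
)

# Column labels come from the P0 row (drop the "P0" field itself).
alphabet <- strsplit(grep("^P0 ", transfac_txt, value = TRUE), " +")[[1]][-1]

# Numbered count rows start with a two-digit position label.
count_rows <- grep("^[0-9]{2} ", transfac_txt, value = TRUE)
fields <- strsplit(count_rows, " +")

# Keep only the count fields; skip the row label and the consensus letter.
counts <- t(sapply(fields, function(f) as.numeric(f[2:(length(alphabet) + 1)])))
colnames(counts) <- alphabet

totals <- rowSums(counts)
# Stops with an error if the rows do not share one total.
stopifnot(length(unique(totals)) == 1)
```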
To generate a logo for this motif using RWebLogo::weblogo, the motif should be saved to a file, and the file name should be provided via the file.in argument:
library(RWebLogo)
weblogo(file.in= "transfac_matrix.txt")
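The whole round trip, from text to file to logo, might look like the sketch below. The motif lines are a shortened illustrative motif, the temporary path is an arbitrary choice, and the call is guarded so that the snippet still runs when RWebLogo is not installed. Only the file.in argument shown above is used.

```r
# Shortened illustrative motif in Transfac format.
motif_lines <- c(
  "ID example_motif",
  "BF example_species",
  "P0 A C G T",
  "01 0 5 0 0 C",
  "02 5 0 0 0 A",
  "03 0 0 4 1 G",
  "XX",
  "//"
)

# Save the motif to a file (location chosen here for illustration only).
path <- file.path(tempdir(), "transfac_matrix.txt")
writeLines(motif_lines, path)

# Draw the logo only if RWebLogo is available.
if (requireNamespace("RWebLogo", quietly = TRUE)) {
  RWebLogo::weblogo(file.in = path)
}
```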
A motif over a custom alphabet will work as well:
ID any_old_name_for_motif_1
BF species_name_for_motif_1
P0 T H O M A S
01 5 1 2 0 0 0 T
02 0 6 1 2 0 0 H
03 1 0 7 1 0 0 O
04 0 1 1 5 0 0 M
05 1 0 0 0 6 0 A
06 0 0 4 1 0 7 S
XX
- The motifStack Bioconductor package.