@rtraborn
Last active November 7, 2023 06:55
Converting Homer motif files to MEME format
### Running motif2meme.R
> source("/path/to/motif2meme.R")
> motif2meme("/path/to/motif/file/")
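# Generate an adjacency-list file from motifsim_parsed.txt (in the ../data directory).
# Each input line is tab-separated: column 1 is a node, column 2 is a comma-separated list.
with open("motifsim_parsed.txt", "r") as fin, open("motifsim_parsed_adj.txt", "w") as fout:
    for line in fin:
        # Strip the line ending so it is not written twice when column 2 is the last column
        column = line.rstrip("\n").split("\t")
        adjlist = column[0] + " " + column[1].replace(",", " ") + "\n"
        fout.write(adjlist)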
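The file written by the script above holds one node per line: the node ID followed by its space-separated neighbours. A minimal sketch for reading such lines back into a dictionary, assuming whitespace-separated tokens; the function name `read_adjlist` and the sample lines are hypothetical and not part of the gist:

```python
def read_adjlist(lines):
    """Map each node ID (first token) to the list of its neighbours (remaining tokens)."""
    graph = {}
    for line in lines:
        tokens = line.split()
        if not tokens:  # skip blank lines
            continue
        node, neighbours = tokens[0], tokens[1:]
        # Merge neighbours if the same node ID appears on more than one line
        graph.setdefault(node, []).extend(neighbours)
    return graph


# Made-up input for illustration only
example = ["1 2 3\n", "\n", "2 1\n"]
graph = read_adjlist(example)
print(graph)
```

To read the real output, pass an open file handle instead of the list, for example `read_adjlist(open("motifsim_parsed_adj.txt"))`.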
MEME version 4
ALPHABET= ACGT
strands: + -
0.0415 0.9212 0.0217 0.0156
0.0835 0.0119 0.8417 0.0629
0.0358 0.8354 0.0771 0.0517
0.029 0.123 0.019 0.829
0.9652 0.011 0.0188 0.005
0.0154 0.0141 0.9332 0.0373
0.8045 0.0253 0.1091 0.061
MEME version 4
ALPHABET= ACGT
MOTIF motif_Train8_9 10-CGCTAGA
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-43
0.103 0.786 0.043 0.068
0.16 0.001 0.793 0.046
0.107 0.73 0.063 0.1
0.109 0.317 0.045 0.529
0.941 0.011 0.047 0.001
0.053 0.014 0.869 0.064
0.674 0.101 0.124 0.101
MOTIF motif_Train1_10 10-CGCTAGA
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-39
0.089 0.821 0.055 0.035
0.097 0.018 0.844 0.041
0.057 0.811 0.056 0.076
0.074 0.142 0.062 0.722
0.914 0.019 0.048 0.019
0.026 0.03 0.859 0.085
0.812 0.066 0.063 0.059
MOTIF motif_Train4_16 14-CGCTAGA
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-33
0.07 0.8 0.084 0.046
0.06 0.054 0.855 0.031
0.056 0.818 0.072 0.054
0.086 0.106 0.043 0.765
0.865 0.073 0.039 0.023
0.028 0.054 0.823 0.095
0.802 0.05 0.098 0.05
MOTIF motif_Train9_12 12-CGCTAGA
letter-probability matrix: alength= 4 w= 8 nsites= 20 E= 1e-40
0.014 0.984 0.001 0.001
0.135 0.001 0.819 0.045
0.001 0.786 0.061 0.152
0.015 0.214 0.001 0.77
0.984 0.001 0.014 0.001
0.001 0.001 0.997 0.001
0.653 0.03 0.271 0.046
0.06 0.151 0.366 0.424
MOTIF motif_Train6_17 17-CGCTAGA
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-24
0.001 0.997 0.001 0.001
0.001 0.001 0.864 0.134
0.001 0.997 0.001 0.001
0.001 0.001 0.001 0.997
0.997 0.001 0.001 0.001
0.001 0.001 0.997 0.001
0.837 0.001 0.161 0.001
MOTIF motif_Train3_12 12-CGCTAGA
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-35
0.001 0.984 0.014 0.001
0.014 0.001 0.984 0.001
0.015 0.896 0.074 0.015
0.001 0.001 0.001 0.997
0.997 0.001 0.001 0.001
0.014 0.001 0.984 0.001
0.969 0.001 0.029 0.001
MOTIF motif_Train7_6 7-CGCYAGW
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-50
0.134 0.864 0.001 0.001
0.287 0.001 0.711 0.001
0.035 0.736 0.191 0.038
0.001 0.446 0.034 0.519
0.997 0.001 0.001 0.001
0.001 0.001 0.875 0.123
0.449 0.001 0.2 0.349
MOTIF motif_10_14 15-CGCTAGA
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-33
0.001 0.997 0.001 0.001
0.001 0.04 0.856 0.103
0.001 0.912 0.086 0.001
0.001 0.001 0.001 0.997
0.997 0.001 0.001 0.001
0.001 0.037 0.961 0.001
0.863 0.001 0.135 0.001
MOTIF motif_Train2_13 16-CGCTAGA
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-30
0.001 0.982 0.016 0.001
0.017 0.001 0.965 0.017
0.022 0.839 0.122 0.017
0.001 0.001 0.001 0.997
0.982 0.001 0.016 0.001
0.009 0.001 0.989 0.001
0.989 0.001 0.009 0.001
MOTIF motif_Train5_11 10-CGCTAGA
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-39
0.001 0.997 0.001 0.001
0.063 0.001 0.726 0.21
0.063 0.829 0.045 0.063
0.001 0.001 0.001 0.997
0.978 0.001 0.02 0.001
0.02 0.001 0.978 0.001
0.997 0.001 0.001 0.001
MEME version 4
ALPHABET= ACGT
strands: + -
0.0773 0.0194 0.8625 0.0408
0.0432 0.0481 0.8997 0.009
0.1013 0.0168 0.8627 0.0192
0.0247 0.0433 0.0309 0.9011
0.1207 0.0727 0.6892 0.1174
0.902 0.031 0.0484 0.0186
0.024 0.0765 0.4884 0.4112
0.6538 0.2663 0.0433 0.0366
0.9476 0.0082 0.0258 0.0184
0.9382 0.0204 0.0254 0.016
MEME version 4
ALPHABET= ACGT
MOTIF motif_Train9_8 10-GGGTGATAAAAA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-46
0.052 0.017 0.897 0.034
0.184 0.017 0.782 0.017
0.121 0.052 0.793 0.034
0.119 0.001 0.086 0.794
0.051 0.001 0.828 0.12
0.88 0.069 0.017 0.034
0.121 0.052 0.26 0.567
0.889 0.038 0.056 0.017
0.896 0.001 0.086 0.017
0.88 0.034 0.052 0.034
MOTIF motif_Train2_6 7-GGGTGATAAA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-54
0.126 0.001 0.872 0.001
0.001 0.126 0.872 0.001
0.126 0.001 0.872 0.001
0.001 0.001 0.001 0.997
0.185 0.038 0.776 0.001
0.997 0.001 0.001 0.001
0.001 0.007 0.405 0.587
0.617 0.128 0.254 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
MOTIF motif_Train5_8 9-GGGTGAGAAA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-47
0.01 0.001 0.988 0.001
0.001 0.001 0.997 0.001
0.01 0.001 0.988 0.001
0.001 0.001 0.001 0.997
0.001 0.001 0.997 0.001
0.997 0.001 0.001 0.001
0.001 0.01 0.709 0.28
0.783 0.215 0.001 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
MOTIF motif_Train1_6 8-GGGTGATHAA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-45
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.001 0.38 0.001 0.618
0.262 0.127 0.61 0.001
0.872 0.126 0.001 0.001
0.001 0.001 0.392 0.606
0.384 0.361 0.001 0.254
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
MOTIF motif_Train3_6 8-GGGTDAGAAA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-52
0.001 0.001 0.921 0.077
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.001 0.001 0.001 0.997
0.311 0.078 0.3 0.311
0.997 0.001 0.001 0.001
0.001 0.077 0.594 0.328
0.672 0.326 0.001 0.001
0.997 0.001 0.001 0.001
0.921 0.001 0.077 0.001
MOTIF motif_10_8 9-GGGTGAGAAA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-46
0.01 0.001 0.988 0.001
0.001 0.01 0.988 0.001
0.01 0.001 0.988 0.001
0.001 0.001 0.001 0.997
0.001 0.01 0.988 0.001
0.997 0.001 0.001 0.001
0.001 0.01 0.68 0.309
0.775 0.223 0.001 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
MOTIF motif_Train7_5 8-GGGTBRBMAA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-50
0.084 0.042 0.725 0.149
0.075 0.074 0.809 0.042
0.32 0.021 0.648 0.011
0.042 0.001 0.053 0.904
0.117 0.276 0.3 0.307
0.521 0.011 0.414 0.054
0.043 0.234 0.363 0.36
0.459 0.467 0.042 0.032
0.863 0.021 0.074 0.042
0.926 0.021 0.032 0.021
MOTIF motif_Train4_7 9-GGGTGATAAA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-46
0.088 0.022 0.8 0.09
0.157 0.191 0.628 0.024
0.022 0.088 0.802 0.088
0.023 0.045 0.111 0.821
0.091 0.106 0.736 0.067
0.821 0.045 0.046 0.088
0.067 0.257 0.236 0.441
0.71 0.287 0.001 0.002
0.845 0.001 0.088 0.066
0.864 0.089 0.025 0.022
MOTIF motif_Train6_7 9-GGGTGAGAAAAA
letter-probability matrix: alength= 4 w= 12 nsites= 20 E= 1e-49
0.077 0.001 0.921 0.001
0.001 0.001 0.997 0.001
0.232 0.001 0.766 0.001
0.001 0.001 0.001 0.997
0.019 0.078 0.592 0.311
0.997 0.001 0.001 0.001
0.001 0.011 0.732 0.256
0.947 0.029 0.023 0.001
0.997 0.001 0.001 0.001
0.911 0.001 0.011 0.077
0.997 0.001 0.001 0.001
0.845 0.077 0.077 0.001
MOTIF motif_Train8_7 9-GGGTGAKCAA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-51
0.324 0.107 0.516 0.053
0.01 0.059 0.93 0.001
0.17 0.001 0.776 0.053
0.057 0.001 0.053 0.889
0.169 0.012 0.765 0.054
0.941 0.054 0.001 0.004
0.003 0.106 0.513 0.378
0.302 0.589 0.053 0.056
0.89 0.053 0.004 0.053
0.892 0.054 0.053 0.001
MEME version 4
ALPHABET= ACGT
strands: + -
0.0186 0.001 0.9794 0.001
0.9604 0.014 0.0246 0.001
0.0191 0.0151 0.9485 0.0173
0.9578 0.0043 0.0369 0.001
0.988 0.001 0.001 0.01
0.9646 0.007 0.001 0.0274
0.9733 0.0081 0.001 0.0176
0.0244 0.0164 0.9495 0.0097
0.02 0.001 0.6795 0.2995
0.0245 0.8681 0.001 0.1064
0.0183 0.115 0.7831 0.0835
MEME version 4
ALPHABET= ACGT
MOTIF motif_Train4_11 12-GAGAAAAGGCG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-40
0.001 0.001 0.997 0.001
0.985 0.001 0.013 0.001
0.001 0.001 0.997 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
0.973 0.013 0.001 0.013
0.997 0.001 0.001 0.001
0.014 0.014 0.958 0.014
0.001 0.001 0.774 0.224
0.001 0.958 0.001 0.04
0.014 0.122 0.823 0.041
MOTIF motif_Train1_11 13-GAGAAAAGGCG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-38
0.001 0.001 0.997 0.001
0.986 0.001 0.012 0.001
0.001 0.001 0.997 0.001
0.986 0.012 0.001 0.001
0.997 0.001 0.001 0.001
0.973 0.013 0.001 0.013
0.997 0.001 0.001 0.001
0.013 0.013 0.961 0.013
0.001 0.001 0.718 0.28
0.001 0.96 0.001 0.038
0.013 0.131 0.83 0.026
MOTIF motif_Train9_10 11-GAGAAAAGGCG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-40
0.001 0.001 0.997 0.001
0.987 0.011 0.001 0.001
0.012 0.012 0.975 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
0.987 0.001 0.001 0.011
0.997 0.001 0.001 0.001
0.024 0.001 0.974 0.001
0.001 0.001 0.704 0.294
0.001 0.974 0.001 0.024
0.037 0.111 0.803 0.049
MOTIF motif_Train7_10 11-GAGAAAAGTCG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-33
0.001 0.001 0.997 0.001
0.81 0.005 0.184 0.001
0.047 0.001 0.86 0.092
0.997 0.001 0.001 0.001
0.907 0.001 0.001 0.091
0.902 0.001 0.001 0.096
0.907 0.001 0.001 0.091
0.001 0.092 0.904 0.003
0.092 0.001 0.301 0.606
0.092 0.712 0.001 0.195
0.001 0.137 0.676 0.186
MOTIF motif_Train6_11 12-GAGAAAAGGYKA
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-38
0.071 0.001 0.927 0.001
0.985 0.005 0.009 0.001
0.01 0.005 0.912 0.073
0.638 0.001 0.36 0.001
0.997 0.001 0.001 0.001
0.984 0.014 0.001 0.001
0.85 0.072 0.001 0.077
0.001 0.015 0.979 0.005
0.077 0.001 0.67 0.252
0.145 0.483 0.001 0.371
0.001 0.226 0.33 0.442
MOTIF motif_Train3_11 13-GAGAAAAGGCG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-36
0.006 0.001 0.992 0.001
0.986 0.001 0.012 0.001
0.001 0.012 0.986 0.001
0.986 0.012 0.001 0.001
0.997 0.001 0.001 0.001
0.975 0.012 0.001 0.012
0.997 0.001 0.001 0.001
0.013 0.013 0.961 0.013
0.012 0.001 0.729 0.258
0.001 0.961 0.001 0.037
0.013 0.103 0.871 0.013
MOTIF motif_10_11 11-GAGAAAAGGCG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-36
0.092 0.001 0.906 0.001
0.906 0.092 0.001 0.001
0.093 0.093 0.813 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
0.906 0.001 0.001 0.092
0.997 0.001 0.001 0.001
0.114 0.001 0.875 0.01
0.001 0.001 0.731 0.267
0.001 0.712 0.001 0.286
0.021 0.095 0.883 0.001
MOTIF motif_Train8_10 11-GAGAAAAGGCG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-36
0.001 0.001 0.997 0.001
0.986 0.001 0.012 0.001
0.013 0.013 0.973 0.001
0.986 0.012 0.001 0.001
0.997 0.001 0.001 0.001
0.973 0.013 0.001 0.013
0.997 0.001 0.001 0.001
0.013 0.001 0.973 0.013
0.001 0.001 0.695 0.303
0.001 0.986 0.001 0.012
0.007 0.067 0.9 0.026
MOTIF motif_Train5_10 11-GAGAAAAGGCG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-41
0.011 0.001 0.987 0.001
0.987 0.011 0.001 0.001
0.012 0.012 0.975 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
0.987 0.001 0.001 0.011
0.997 0.001 0.001 0.001
0.025 0.001 0.962 0.012
0.001 0.001 0.754 0.244
0.001 0.962 0.001 0.036
0.037 0.091 0.822 0.05
MOTIF motif_Train2_10 12-GAGAAAAGGCG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-33
0.001 0.001 0.997 0.001
0.986 0.012 0.001 0.001
0.001 0.001 0.997 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
0.986 0.001 0.001 0.012
0.997 0.001 0.001 0.001
0.026 0.013 0.948 0.013
0.013 0.001 0.719 0.267
0.001 0.973 0.001 0.025
0.039 0.067 0.893 0.001
MEME version 4
ALPHABET= ACGT
strands: + -
0.001 0.13 0.01075 0.85825
0.001 0.83 0.05875 0.11025
0.009 0.001 0.016125 0.973875
0.031125 0.131875 0.009 0.828
0.1105 0.8635 0.0055 0.0205
0.817125 0.041125 0.1255 0.01625
0.06725 0.025375 0.8865 0.020875
0.077875 0.02125 0.874125 0.02675
0.1235 0.016125 0.846875 0.0135
0.72225 0.131375 0.076 0.070375
MEME version 4
ALPHABET= ACGT
MOTIF motif_Train7_24 23-TCTTCAGGGA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-14
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.001 0.001 0.001 0.997
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.001 0.034 0.964 0.001
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.964 0.034 0.001 0.001
MOTIF motif_10_25 22-TCTTCAGGGA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-17
0.001 0.001 0.001 0.997
0.001 0.965 0.001 0.033
0.001 0.001 0.001 0.997
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.001 0.001 0.997 0.001
0.033 0.001 0.965 0.001
0.033 0.001 0.965 0.001
0.965 0.001 0.033 0.001
MOTIF motif_Train1_19 22-TCTTCAGGGA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-15
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.001 0.001 0.001 0.997
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.203 0.001 0.795 0.001
0.011 0.001 0.987 0.001
0.001 0.001 0.997 0.001
0.591 0.204 0.204 0.001
MOTIF motif_Train5_21 20-TCTTCAGGGA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-19
0.001 0.037 0.037 0.925
0.001 0.924 0.037 0.038
0.001 0.001 0.036 0.962
0.037 0.074 0.001 0.888
0.001 0.925 0.037 0.037
0.776 0.186 0.001 0.037
0.149 0.074 0.703 0.074
0.15 0.074 0.702 0.074
0.001 0.001 0.962 0.036
0.738 0.149 0.076 0.037
MOTIF motif_Train6_26 22-TCTTCAGGGA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-14
0.001 0.001 0.001 0.997
0.001 0.963 0.001 0.035
0.001 0.001 0.001 0.997
0.035 0.001 0.001 0.963
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.001 0.001 0.997 0.001
0.035 0.001 0.963 0.001
0.035 0.001 0.963 0.001
0.963 0.001 0.035 0.001
MOTIF motif_Train4_22 20-TCTTCAGGGA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-19
0.001 0.001 0.043 0.955
0.001 0.866 0.043 0.09
0.001 0.001 0.087 0.911
0.003 0.043 0.001 0.953
0.001 0.997 0.001 0.001
0.775 0.137 0.001 0.087
0.181 0.09 0.642 0.087
0.222 0.09 0.554 0.134
0.001 0.001 0.997 0.001
0.728 0.14 0.088 0.044
MOTIF motif_Train8_23 23-CKTCAGGGAY
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-17
0.001 0.997 0.001 0.001
0.001 0.1 0.385 0.514
0.065 0.001 0.001 0.933
0.001 0.933 0.065 0.001
0.877 0.001 0.001 0.121
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.746 0.122 0.066 0.066
0.001 0.521 0.001 0.477
MOTIF motif_Train3_18 20-TCTTCAGGGA
letter-probability matrix: alength= 4 w= 10 nsites= 20 E= 1e-19
0.001 0.001 0.001 0.997
0.001 0.828 0.001 0.17
0.001 0.001 0.001 0.997
0.17 0.001 0.001 0.828
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.001 0.001 0.997 0.001
0.17 0.001 0.828 0.001
0.17 0.001 0.828 0.001
0.828 0.001 0.17 0.001
MEME version 4
ALPHABET= ACGT
strands: + -
0.2605 0.190333333333333 0.273 0.276
0.0806666666666667 0.107333333333333 0.291166666666667 0.520666666666667
0.0321666666666667 0.898 0.001 0.0688333333333333
0.970833333333333 0.001 0.0206666666666667 0.0075
0.004 0.0405 0.8745 0.081
0.296166666666667 0.0181666666666667 0.01 0.675666666666667
0.117666666666667 0.569166666666667 0.001 0.312166666666667
MEME version 4
ALPHABET= ACGT
MOTIF motif_Train9_2 1-KTCAGTCG
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-262
0.189 0.177 0.268 0.366
0.027 0.064 0.291 0.618
0.001 0.881 0.001 0.117
0.963 0.001 0.017 0.019
0.001 0.001 0.976 0.022
0.246 0.028 0.055 0.671
0.034 0.666 0.001 0.299
MOTIF motif_Train1_2 1-NKCAGTY
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-283
0.288 0.199 0.281 0.232
0.192 0.162 0.284 0.361
0.001 0.997 0.001 0.001
0.976 0.001 0.001 0.022
0.001 0.001 0.826 0.172
0.398 0.001 0.001 0.6
0.267 0.434 0.001 0.298
MOTIF motif_Train2_2 1-DTCAGTC
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-293
0.224 0.17 0.351 0.254
0.001 0.001 0.257 0.741
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.001 0.001 0.997 0.001
0.257 0.001 0.001 0.741
0.037 0.589 0.001 0.373
MOTIF motif_Train7_2 1-TCAGTC
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-298
0.25 0.25 0.25 0.25
0.068 0.11 0.295 0.527
0.001 0.765 0.001 0.233
0.916 0.001 0.082 0.001
0.001 0.109 0.703 0.187
0.313 0.001 0.001 0.685
0.169 0.621 0.001 0.209
MOTIF motif_Train8_1 1-NKCAGTY
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-308
0.303 0.173 0.244 0.28
0.164 0.211 0.242 0.382
0.188 0.81 0.001 0.001
0.997 0.001 0.001 0.001
0.019 0.13 0.82 0.031
0.334 0.077 0.001 0.588
0.137 0.49 0.001 0.372
MOTIF motif_10_2 1-NKCAGTC
letter-probability matrix: alength= 4 w= 7 nsites= 20 E= 1e-283
0.309 0.173 0.244 0.274
0.032 0.096 0.378 0.495
0.001 0.938 0.001 0.06
0.976 0.001 0.022 0.001
0.001 0.001 0.925 0.073
0.229 0.001 0.001 0.769
0.062 0.615 0.001 0.322
MEME version 4
ALPHABET= ACGT
strands: + -
0.0448888888888889 0.168111111111111 0.0201111111111111 0.766888888888889
0.0103333333333333 0.00688888888888889 0.886666666666667 0.0961111111111111
0.123444444444444 0.001 0.845 0.0305555555555556
0.00866666666666667 0.987222222222222 0.00311111111111111 0.001
0.983777777777778 0.001 0.0127777777777778 0.00244444444444444
0.828777777777778 0.0307777777777778 0.0594444444444444 0.081
0.0131111111111111 0.806 0.00166666666666667 0.179222222222222
0.203444444444444 0.106111111111111 0.587111111111111 0.103444444444444
0.0657777777777778 0.398555555555556 0.118333333333333 0.417444444444444
0.133555555555556 0.253 0.0583333333333333 0.555111111111111
0.0416666666666667 0.08 0.707777777777778 0.170555555555556
MEME version 4
ALPHABET= ACGT
MOTIF motif_Train9_3 4-TGGCAACGYTG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-143
0.034 0.234 0.021 0.711
0.013 0.001 0.917 0.069
0.145 0.001 0.806 0.048
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.813 0.001 0.062 0.124
0.001 0.805 0.001 0.193
0.242 0.152 0.474 0.133
0.104 0.382 0.145 0.369
0.186 0.279 0.057 0.478
0.058 0.048 0.632 0.262
MOTIF motif_Train5_3 4-TGGCAACGYTG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-170
0.041 0.136 0.027 0.796
0.006 0.001 0.898 0.095
0.157 0.001 0.822 0.02
0.001 0.997 0.001 0.001
0.979 0.001 0.019 0.001
0.836 0.014 0.075 0.075
0.013 0.816 0.001 0.17
0.28 0.116 0.547 0.057
0.102 0.398 0.103 0.398
0.123 0.294 0.05 0.533
0.055 0.048 0.719 0.178
MOTIF motif_Train6_3 4-TGGCAACGYTG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-136
0.052 0.149 0.021 0.778
0.01 0.001 0.87 0.119
0.132 0.001 0.797 0.07
0.004 0.994 0.001 0.001
0.968 0.001 0.03 0.001
0.844 0.01 0.067 0.079
0.067 0.765 0.001 0.167
0.187 0.107 0.589 0.117
0.091 0.394 0.143 0.372
0.133 0.232 0.047 0.588
0.064 0.13 0.644 0.162
MOTIF motif_Train1_3 5-TGGCAACGYTG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-161
0.063 0.174 0.014 0.749
0.019 0.001 0.807 0.173
0.135 0.001 0.828 0.036
0.004 0.981 0.014 0.001
0.997 0.001 0.001 0.001
0.778 0.053 0.072 0.097
0.001 0.765 0.001 0.233
0.191 0.097 0.572 0.14
0.072 0.428 0.133 0.367
0.123 0.217 0.044 0.616
0.034 0.098 0.696 0.172
MOTIF motif_Train2_3 4-TGGCAACGTTG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-134
0.018 0.061 0.024 0.897
0.012 0.012 0.947 0.029
0.059 0.001 0.928 0.012
0.011 0.987 0.001 0.001
0.987 0.001 0.011 0.001
0.926 0.001 0.05 0.023
0.011 0.831 0.001 0.157
0.106 0.059 0.788 0.047
0.035 0.27 0.031 0.664
0.111 0.226 0.071 0.592
0.047 0.047 0.835 0.071
MOTIF motif_Train7_3 5-TGGCAACGYTG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-148
0.066 0.251 0.014 0.669
0.001 0.007 0.892 0.1
0.129 0.001 0.849 0.021
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.791 0.061 0.059 0.089
0.001 0.781 0.001 0.217
0.229 0.143 0.52 0.108
0.072 0.377 0.136 0.416
0.186 0.273 0.1 0.441
0.043 0.107 0.578 0.272
MOTIF motif_Train8_3 5-TGGCAACGYTG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-138
0.043 0.172 0.014 0.771
0.014 0.014 0.85 0.122
0.093 0.001 0.878 0.028
0.028 0.964 0.007 0.001
0.95 0.001 0.035 0.014
0.859 0.036 0.054 0.051
0.022 0.815 0.007 0.156
0.187 0.062 0.586 0.165
0.014 0.486 0.124 0.376
0.083 0.256 0.065 0.596
0.022 0.079 0.789 0.11
MOTIF motif_Train4_3 4-TGGCAACGYTG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-162
0.039 0.168 0.022 0.771
0.017 0.017 0.848 0.118
0.118 0.001 0.842 0.039
0.027 0.971 0.001 0.001
0.982 0.001 0.016 0.001
0.82 0.045 0.056 0.079
0.001 0.87 0.001 0.128
0.264 0.123 0.517 0.096
0.078 0.382 0.185 0.354
0.157 0.275 0.051 0.517
0.051 0.107 0.645 0.197
MOTIF motif_10_3 4-TGGCAACGYTG
letter-probability matrix: alength= 4 w= 11 nsites= 20 E= 1e-147
0.048 0.168 0.024 0.76
0.001 0.008 0.951 0.04
0.143 0.001 0.855 0.001
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.792 0.056 0.04 0.112
0.001 0.806 0.001 0.192
0.145 0.096 0.691 0.068
0.024 0.47 0.065 0.441
0.1 0.225 0.04 0.635
0.001 0.056 0.832 0.111
MEME version 4
ALPHABET= ACGT
strands: + -
0.2445 0.239 0.442 0.0745
0.019 0.02275 0.033 0.92525
0.91125 0.008 0.02525 0.0555
0.0145 0.008 0.027 0.9505
0.97825 0.0035 0.0055 0.01275
0.6815 0.003 0.03175 0.28375
0.97575 0.00425 0.0175 0.0025
0.70425 0.01 0.0685 0.21725
MEME version 4
ALPHABET= ACGT
MOTIF motif_Train6_1 1-GTATAAAA
letter-probability matrix: alength= 4 w= 8 nsites= 20 E= 1e-299
0.211 0.278 0.441 0.07
0.001 0.001 0.001 0.997
0.997 0.001 0.001 0.001
0.001 0.001 0.001 0.997
0.997 0.001 0.001 0.001
0.624 0.001 0.001 0.374
0.997 0.001 0.001 0.001
0.657 0.001 0.001 0.341
MOTIF motif_Train7_1 1-GTATAAAA
letter-probability matrix: alength= 4 w= 8 nsites= 20 E= 1e-308
0.253 0.257 0.424 0.066
0.033 0.055 0.052 0.86
0.919 0.004 0.004 0.073
0.018 0.004 0.037 0.941
0.952 0.007 0.001 0.04
0.651 0.001 0.047 0.301
0.989 0.001 0.007 0.003
0.69 0.015 0.102 0.193
MOTIF motif_Train8_2 1-GTATAAAA
letter-probability matrix: alength= 4 w= 8 nsites= 20 E= 1e-300
0.238 0.188 0.489 0.085
0.023 0.001 0.033 0.943
0.883 0.019 0.042 0.056
0.012 0.019 0.028 0.941
0.967 0.005 0.019 0.009
0.732 0.009 0.033 0.226
0.929 0.014 0.052 0.005
0.822 0.009 0.075 0.094
MOTIF motif_10_1 1-RTATAAAA
letter-probability matrix: alength= 4 w= 8 nsites= 20 E= 1e-301
0.276 0.233 0.414 0.077
0.019 0.034 0.046 0.901
0.846 0.008 0.054 0.092
0.027 0.008 0.042 0.923
0.997 0.001 0.001 0.001
0.719 0.001 0.046 0.234
0.988 0.001 0.01 0.001
0.648 0.015 0.096 0.241
MEME version 4
ALPHABET= ACGT
strands: + -
0.5102 0.1866 0.0398 0.2636
0.7931 0.0516 0.1074 0.0479
0.2635 0.0973 0.5882 0.051
0.6342 0.0085 0.1842 0.1731
0.0297 0.0061 0.2514 0.7128
0.0565 0.0155 0.8807 0.0473
0.0735 0.179 0.6126 0.1349
0.0968 0.6525 0.0852 0.1653
MEME version 4
ALPHABET= ACGT
MOTIF motif_Train5_4 5-AAGATGGCG
letter-probability matrix: alength= 4 w= 9 nsites= 20 E= 1e-88
0.612 0.132 0.024 0.232
0.969 0.018 0.001 0.012
0.163 0.154 0.623 0.06
0.896 0.012 0.091 0.001
0.012 0.01 0.071 0.907
0.064 0.047 0.853 0.036
0.106 0.007 0.791 0.096
0.127 0.739 0.024 0.11
0.067 0.156 0.567 0.21
MOTIF motif_Train9_5 7-WAGATGGCK
letter-probability matrix: alength= 4 w= 9 nsites= 20 E= 1e-89
0.403 0.204 0.138 0.254
0.758 0.085 0.001 0.156
0.241 0.19 0.475 0.094
0.711 0.001 0.169 0.119
0.038 0.006 0.005 0.951
0.055 0.031 0.913 0.001
0.004 0.001 0.721 0.274
0.152 0.693 0.002 0.153
0.164 0.105 0.424 0.307
MOTIF motif_Train2_5 8-WAGATGGCG
letter-probability matrix: alength= 4 w= 9 nsites= 20 E= 1e-84
0.425 0.239 0.009 0.328
0.901 0.06 0.014 0.025
0.174 0.1 0.652 0.074
0.78 0.009 0.168 0.043
0.074 0.014 0.095 0.817
0.046 0.006 0.926 0.022
0.087 0.032 0.749 0.132
0.098 0.796 0.025 0.081
0.109 0.079 0.615 0.197
MOTIF motif_Train1_4 6-WAGATGGCK
letter-probability matrix: alength= 4 w= 9 nsites= 20 E= 1e-89
0.452 0.14 0.019 0.389
0.997 0.001 0.001 0.001
0.156 0.145 0.622 0.077
0.841 0.001 0.157 0.001
0.018 0.001 0.042 0.939
0.001 0.001 0.997 0.001
0.021 0.001 0.866 0.112
0.023 0.852 0.009 0.116
0.096 0.055 0.448 0.401
MOTIF motif_Train6_4 6-YAGATGGCG
letter-probability matrix: alength= 4 w= 9 nsites= 20 E= 1e-82
0.204 0.421 0.007 0.369
0.997 0.001 0.001 0.001
0.001 0.001 0.997 0.001
0.676 0.001 0.322 0.001
0.001 0.001 0.178 0.82
0.126 0.001 0.872 0.001
0.217 0.001 0.611 0.171
0.071 0.81 0.001 0.118
0.208 0.043 0.517 0.232
MOTIF motif_Train3_4 6-AGATGGCK
letter-probability matrix: alength= 4 w= 8 nsites= 20 E= 1e-86
0.957 0.017 0.01 0.016
0.202 0.167 0.492 0.139
0.622 0.029 0.321 0.028
0.012 0.023 0.237 0.728
0.013 0.025 0.941 0.021
0.046 0.025 0.673 0.256
0.208 0.73 0.012 0.05
0.153 0.167 0.352 0.327
MOTIF motif_Train7_4 7-AGATGGCK
letter-probability matrix: alength= 4 w= 8 nsites= 20 E= 1e-104
0.942 0.037 0.001 0.02
0.276 0.123 0.558 0.043
0.887 0.001 0.07 0.042
0.038 0.018 0.147 0.797
0.061 0.001 0.905 0.033
0.001 0.006 0.863 0.13
0.007 0.949 0.007 0.037
0.182 0.076 0.427 0.314
MOTIF motif_Train4_5 8-MAGATGGCG
letter-probability matrix: alength= 4 w= 9 nsites= 20 E= 1e-95
0.433 0.306 0.001 0.261
0.997 0.001 0.001 0.001
0.167 0.242 0.508 0.083
0.844 0.001 0.154 0.001
0.01 0.001 0.153 0.836
0.087 0.001 0.911 0.001
0.001 0.001 0.845 0.153
0.066 0.719 0.001 0.214
0.121 0.118 0.47 0.29
MOTIF motif_10_5 5-TAGATGGCG
letter-probability matrix: alength= 4 w= 9 nsites= 20 E= 1e-96
0.16 0.298 0.044 0.498
0.97 0.028 0.001 0.001
0.014 0.014 0.971 0.001
0.673 0.015 0.276 0.036
0.029 0.001 0.065 0.905
0.115 0.001 0.87 0.014
0.066 0.029 0.721 0.184
0.043 0.862 0.001 0.094
0.161 0.029 0.686 0.124
MOTIF motif_Train8_4 6-AAGATGGCG
letter-probability matrix: alength= 4 w= 9 nsites= 20 E= 1e-92
0.514 0.072 0.145 0.269
0.864 0.032 0.004 0.1
0.21 0.097 0.643 0.05
0.871 0.004 0.121 0.004
0.041 0.001 0.059 0.899
0.024 0.036 0.929 0.011
0.018 0.039 0.803 0.14
0.053 0.811 0.01 0.126
0.075 0.098 0.591 0.236
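In files like those above, each `MOTIF` block declares its width in `w=` and then lists one row of four probabilities per position. A minimal sketch of a consistency check, assuming that the declared width should equal the number of rows and that each row should sum to roughly 1; the function name `check_meme_motifs`, the tolerance, and the sample block are hypothetical and not part of the gist:

```python
def check_meme_motifs(lines, tol=0.01):
    """Return (motif_name, problem) tuples for MOTIF blocks in MEME-format text lines."""
    problems = []
    name, declared_w, rows = None, None, []

    def flush():
        # Report problems for the motif block collected so far, if any
        if name is None:
            return
        if declared_w is not None and declared_w != len(rows):
            problems.append((name, f"w= {declared_w} but {len(rows)} rows"))
        for row in rows:
            if abs(sum(row) - 1.0) > tol:
                problems.append((name, f"row sums to {sum(row):.3f}"))

    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "MEME":  # a new file starts; close the current block
            flush()
            name, declared_w, rows = None, None, []
        elif tokens[0] == "MOTIF":
            flush()
            name, declared_w, rows = tokens[1], None, []
        elif tokens[0] == "letter-probability":
            declared_w = int(tokens[tokens.index("w=") + 1])
        elif name is not None and len(tokens) == 4:
            try:
                rows.append([float(t) for t in tokens])
            except ValueError:
                pass  # not a matrix row
    flush()
    return problems


# Made-up block for illustration: declares three positions but lists two
example = """MEME version 4
ALPHABET= ACGT
MOTIF demo_1 hypothetical
letter-probability matrix: alength= 4 w= 3 nsites= 20 E= 1e-5
0.25 0.25 0.25 0.25
0.997 0.001 0.001 0.001
""".splitlines()
problems = check_meme_motifs(example)
print(problems)
```

Matrices that appear without a `MOTIF` line are skipped by this sketch, since there is no declared width to compare against.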
# Converts a Homer .motif PWM/PFM file into MEME format.
# The output is written next to the input as "<inFile>.meme".
motif2meme <- function(inFile) {
  library(tools)  # for file_path_sans_ext()
  stopifnot(is.character(inFile), length(inFile) == 1)
  outFile <- paste(inFile, "meme", sep = ".")
  # Use the base name so that directories in the path do not leak into motif names
  fileName <- basename(file_path_sans_ext(inFile))

  # Read the input file; every motif starts with a header line beginning with ">"
  motif.file <- scan(file = inFile, what = character(0), sep = "\n", quote = NULL)
  motif.index <- grep(pattern = "^>", motif.file)
  n.motifs <- length(motif.index)
  stopifnot(n.motifs > 0)
  total.len <- length(motif.file)

  # Motif label: characters 6-7 of the file name prefix before the first "_"
  # (the prefix is used unchanged when it is shorter than seven characters)
  nameSplit <- strsplit(fileName, "_")
  nameNum <- nameSplit[[1]][1]
  motifNum <- sub("^.....(..).*", "\\1", nameNum)

  thisFile <- file(outFile, open = "wt")
  on.exit(close(thisFile))
  cat("MEME version 4\n\n", file = thisFile)
  cat("ALPHABET= ACGT\n\n", file = thisFile)
  cat("strands: + -\n\n", file = thisFile)

  # One pass over all motifs, including the last one, so that every motif
  # gets its own header and index
  for (i in seq_len(n.motifs)) {
    # Matrix rows run from the line after this header up to the next header (or end of file)
    index.start <- motif.index[i] + 1
    index.end <- if (i < n.motifs) motif.index[i + 1] - 1 else total.len
    this_motif <- motif.file[index.start:index.end]
    motif_split <- strsplit(this_motif, split = "\t")
    motif_row <- length(this_motif)
    motif_array <- array(NA, c(motif_row, 4))
    for (n in seq_len(motif_row)) {
      motif_array[n, ] <- as.numeric(motif_split[[n]])
    }
    motif_array <- as.data.frame(motif_array)

    # Header fields: column 2 holds the motif name, column 6 the colon-separated
    # statistics whose fourth piece is the p-value (reused below as the E= value)
    header.string <- motif.file[motif.index[i]]
    prob.string <- strsplit(header.string, split = "[\t]")
    prob.string2 <- prob.string[[1]][6]
    prob.string3 <- strsplit(prob.string2, "[:]")
    this.p.val <- prob.string3[[1]][4]
    name.string <- prob.string[[1]][2]
    name.string2 <- strsplit(name.string, split = ",")
    name.string3 <- name.string2[[1]][1]
    motif_name <- paste("motif", motifNum, i, sep = "_")

    cat("MOTIF", motif_name, name.string3, "\n", file = thisFile)
    cat("letter-probability matrix: ", file = thisFile)
    cat("alength= 4 w=", motif_row, file = thisFile)
    cat(" nsites= 20 ", file = thisFile)
    cat("E= ", file = thisFile)
    cat(this.p.val, "\n", file = thisFile)
    write.table(motif_array, file = thisFile, col.names = FALSE, row.names = FALSE, sep = "\t")
    cat("\n", file = thisFile)
  }
  print("matrix has been converted to MEME")
}
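The header handling in the R function above can be sketched in Python for a single motif. This assumes the layout the R code indexes into: a tab-separated header line whose second field starts with the motif name (up to the first comma) and whose sixth field is a colon-separated statistics string with the p-value as its fourth piece. The function name `meme_header_from_homer` and the sample header line are hypothetical, written only to match that layout:

```python
def meme_header_from_homer(header, motif_name, width, nsites=20):
    """Build the two MEME header lines for one motif from a Homer-style header line."""
    fields = header.rstrip("\n").split("\t")
    short_name = fields[1].split(",")[0]   # name up to the first comma
    p_value = fields[5].split(":")[3]      # fourth colon-separated piece
    return (
        f"MOTIF {motif_name} {short_name}\n"
        f"letter-probability matrix: alength= 4 w= {width} nsites= {nsites} E= {p_value}"
    )


# Hypothetical header line, constructed for illustration
header = ">CGCTAGA\t10-CGCTAGA,BestGuess:example\t6.0\t-99.0\t0\tT:20.0(40.00%),B:5.0(0.10%),P:1e-43"
result = meme_header_from_homer(header, "motif_Train8_1", 7)
print(result)
```

As in the R function, `nsites` is a fixed placeholder of 20 and the `E=` field carries the p-value from the header rather than a computed E-value.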
267 55 194 168 108 33 156 211 221 235 210
167 120 87 219 155 15 208 30
28 217 132 242 105 50 189 5 76 162 88 173 112
99 203 17 263 258 37 151 148 221 119 178
69 250 35 12 112 195 183 139 110 196 41
76 50 132 242 5 189 217 28 181 249 10 199
217 28 242 132 105 50 5 76 249 88 53
233 179 151 21 264 246 117 41 193
172 253 143 42 50 196 209 109 34 76 87
89 140 196 154 114 61 171 260 69 206 180 228 142
162 50 242 105 5 189 76 28 15 109 196 34
139 226 86 196 171 110 11 61 34 250 257 181 249
229 151 92 201 16 38 264 143 133
179 233 64 95 206 21 118 246
140 89 196 86 61 171 260 69 206 180 228 112
167 120 219 109 155 15 208 30
267 55 194 168 108 33 211 246 221
28 217 132 242 105 50 189 5 76 162 88 173 112
99 203 17 263 258 37 151 148 221 119 178
69 250 116 35 12 112 195 183 206 139 110 196
76 50 132 242 5 189 217 28 181 249 10 162 199
217 28 242 132 105 50 5 76 181 88
162 50 242 5 189 217 28 15 109 196 34
89 140 196 86 154 114 61 171 260 69 206 180 228
233 179 151 21 246 117 193
229 151 92 231 201 16 264 40 143
172 253 143 42 180 50 209 109 34 76 183
140 89 196 86 154 61 171 260 206 180 228 112
139 226 86 196 171 110 11 61 34 250 257 181 249
179 233 95 206 21 118 264 246 41 236
4 216 131 104 75 161 49 241 190 27
5 105 50 132 162 216 10 4
6 133 77 191 163 243 218 29 51 80 138 109 43
7 79 107 31 245 134 220 164 142 116 146
8 221 55 33 168 80 262 258
9 54 248 111 137 195 223 165 141 59 198 14 85
10 165 58 218 77 133 241 131
11 86 196 171 61 110 139 226 250 30 203 208 244 219
12 35 228 173 204
13 90 197 227 166 245 39 60 114 31 107 147 52
14 85 198 141 36 251 170 248 223 32
15 222 169 138 58 252 83 35 162 200 34 228
16 264 257 151 231 123 256 27
17 145 176 37 263 42 196
18 148 263 178 94 121 67 230 143 204
19 92 174 144 40 260 256 67 232 80 207 175
20 155 173 30 112 210
21 119 152 98 182 205 232
22 265 156 66 96 211 235 42 236 225 9
27 75 104 49 161 241 131 4 216 16 88
28 217 132 242 105 50 189 5 76 162 88 173 112
29 243 191 106 133 6 77 218 163 51 80 261
30 199 219 208 65 70 147 11 91 110 202
31 107 79 7 134 192 227 164 197 116 90 13 146
32 248 111 54 137 223 195 165 141 59 198 14 85 170
33 168 8 221 136 146 91 144 62
34 196 109 139 226 15 203 17 145 249 149 162 258
35 112 228 193 12 173 138 56 15 80 169 135 83
36 113 225 170 59 14 85 141 111 248 137 44 54
37 258 117 224 203 176 145 99 8 178 169
38 257 151 183 138 216 51 258
39 60 114 227 13 197 254 164 65
40 19 92 62 144 174 43 229 98
41 118 264 151 233 173 69
42 180 143 156 22 64 17
43 261 183 257 68 204 263
44 126 266 97 113 36 150 84 65 208 121 207
49 131 190 4 216 27 86 10 145
50 242 105 132 5 189 162 58 172 263
51 218 77 106 133 6 243 191 29 109 205 175 63 255
52 164 134 197 116 90 13
53 259 84 98 219 217 251 76 36 225 113
54 248 111 137 32 195 223 81 165 9 225 113 251 36
55 168 33 108 146 91 144 136 247
56 80 35 221 17 173 8 250
57 86 140 171 125 139 12
58 252 169 202 15 222 138 155 10 50 109 42
59 85 141 36 251 170 111 223 32
60 39 227 197 166 254 164 219 65 244
61 171 86 110 139 226 266 169 208 89 140
62 144 174 92 201 243 108 106 33 40 260
63 205 232 175 255 40 91 235 221 51 228 180 42
64 151 95 171 42 180 94 153
65 97 244 219 120 30 164 195 171 44 167 126 249
66 265 22 96 235 180 110 11
67 207 121 178 18 206 230 182 19 42 175 205
68 153 261 43 12 193 173 204 151 28 222 255
69 250 116 35 12 112 195 183 206 139 110 196
70 120 20 155 91 208 219 65 33
75 104 131 190 161 4 241 216 27 227
76 50 132 242 5 189 217 28 181 249 10 162 199
77 106 133 218 163 6 191 243 29 51 80 135
78 219 244 138 83 91 192 39
79 7 31 107 134 245 192 220 164 116 197
80 56 246 193 8 221 112 228 29 218 163 243 77 191 106
81 223 195 165 54 248 137 111 183 256 203
82 136 108 209 125 8 59
83 252 169 122 222 15 235 138 155 84 65
84 249 125 83 44 219 198 14 59 85
85 14 59 198 141 36 251 170 248 223 32
86 11 61 250 122 190 241 49
87 221 108 194 168 55 167 135 259
88 152 28 217 173
89 140 196 86 154 114 61 171 260 69 206 180 228
90 13 197 227 166 245 39 31 107 192 52
91 20 255 232 63 205 175 120 168 55 108 33 30
92 257 151 38 201 109 40 253 171 31
93 255 63 175 205 232 40 142 195 207 137 54 125 210
94 178 121 67 207 233 179 17
95 64 206 118 41 200
96 124 156 22 66 211 42 11 259
97 65 208 44 120 164 59 198 126 141 36
98 152 21 204 257 182 53 178
99 203 17 263 258 37 151 148 221 119 178
104 75 131 4 216 190 161 27 86 227
105 5 242 132 189 50 53 263
106 77 163 191 243 29 51 80 62 261
107 31 79 7 192 220 164 116 13 90 197 146
108 194 168 82 136 247 246 146 135 91 144 62
109 34 51 263 92 162 43 228 6
110 86 61 171 11 226 139 250 66 244 208 69
111 248 137 54 32 223 195 81 165 9 225 113 36 251 198 59
112 35 173 80 222 28 88
113 225 36 170 59 141 111 248 137 44 54
114 39 227 13 197 254 147 164
115 181 221 42 53
116 31 107 79 7 52 146 92 80
117 203 17 224 37 145 233 99 41 196
118 264 41 151 179 233 171 266 12 211
119 152 21 257 182 145 224 203 148 263 178
120 208 91 244 97 260 123
121 178 230 94 97 175 205 232
122 83 58 169 222 235 155 13 20 177
123 150 256 142 210 155
124 96 156 180 42 11 117
125 247 82 136 184 14 59 85 198 200 244
126 44 266 97 65 208 200 250 206 179
131 4 75 216 104 49 161 190 27 227 163 218
132 242 105 50 189 5 53 87 64 106
133 6 77 218 163 243 29 51 80
134 245 227 31 79 7 197 166 52
135 246 80 8 221 55 194
136 82 247 194 108 33 55 168 209 210 8 221 184
137 111 248 54 32 223 195 81 165 9 251 113 36 225 59
138 15 169 58 35 83 191 6 78
139 226 86 196 171 110 11 61 34 250 257 181 249
140 89 196 86 154 61 171 260 206 180 228 112
141 14 85 59 198 225 113 36 251 170 32 195 223
142 175 40 123 7 93 146 266 165 44
143 253 249 172 174 18 196 19
144 174 62 92 33 108 168 55 201 243 21 40
145 203 17 224 258 119 176 37 117 252 263 94 257
146 192 33 108 55 36 113 7 107 123 31 79
147 39 114 227 13 90 245 30
148 263 18 178 67 152
149 181 260 257
150 123 256 44 210 219 199 120 19 61 30 155
151 64 264 118 257 41 38 95 183 231 260
152 98 119 21 204 257 182 88 148 178 175 171 61
153 261 68 183 204 29 64
154 254 259 182 199 181 168
155 70 58 83 20 202 123 150
156 265 22 124 96 180 42 113
161 75 131 190 4 104 216 27 86 10
162 50 242 5 189 217 28 15 109 196 34
163 218 77 106 133 6 243 191 29 51 80 135 261 43
164 256 192 220 52 7 31 107 245 134 65 97 90
165 81 54 223 137 32 248 111 9 10 203
166 197 227 13 90 245 134 60 39 180 135 216
167 120 219 109 155 15 208 30
168 108 55 194 135 91 144 136 247
169 222 15 83 58 252 138 202 122 61 173 35 86 228 112
170 225 251 113 36 14 85 59 198 141 32 53
171 61 110 196 226 139 64 169 266 118 257 65 182
172 253 143 42 180 50 209 109 34 76 183
173 35 112 193 222 153 28 70 80 169 88 56
174 144 62 92 201 143 40 260
175 232 205 63 255 40 91 142 235 51 152 180 42 207
176 224 258 203 145 117 37 221 41 235 226
177 120 254 173 30 266
178 263 18 121 67 207 206 119 98 152
179 233 95 206 21 118 264 246 41 236
180 42 235 253 154 257 124 96 66 22 156 209
181 249 149 115 84 139 196 42 260 180 82 136 182 247
182 21 119 152 98 61 121 204 250
183 81 257 43 151 38 153 40 205 63
184 265 156 22 66 96 235 180 42 170
189 242 105 132 50 162 261 151 216 163 10 263
190 75 241 49 161 104 131 4 216 27 86 227
191 243 6 106 77 163 218 29 51 80 138
192 220 31 79 245 146 90 191 106
193 112 35 204 135 233 171 222 28
194 108 33 136 82 247 8 221 259 70 53
195 32 81 223 54 248 137 111 9 141 65 256 107 204 31
196 34 226 171 139 110 250 17 162 181
197 13 90 227 166 245 134 60 39 31 107 79 52
198 14 85 141 113 251 170 111 223 32
199 244 254 227
200 15 252 56 125 44 257 183 165 171
201 92 38 16 62 144 174 229 119 231 15
202 58 252 169 222 155 30 20 256 70 91 110 232
203 145 119 37 263 196
204 98 152 228 12 193 43
205 232 255 63 175 40 91 235 221 51 207
206 264 95 207 233 179 178 18 148 263 113 251 225
207 121 67 178 18 206 230 94 182 21 44 175 205 232
208 97 244 120 30 61 54 44 126
209 143 42 180 136 82 125 219 235 196 66
210 123 150 267 82 247 208 97 116 93
211 22 124 184 235 259 233 180
216 4 131 104 75 241 161 49 190 27 163 77 38 166
217 28 242 132 105 50 5 76 181 88
218 163 77 133 6 243 191 29 51 80 135 10
219 199 244 65 122 30 84 234
220 192 79 31 164 7 106 218 77 6 87
221 8 246 135 108 33 55 87 80 205 63 56
222 15 169 83 252 58 35 202 122 193
223 81 137 32 248 111 195 54 165 59 198 14 85 141
224 258 145 119 37 221 257 266
225 113 36 251 170 85 141 111 248 137 54 32
226 139 86 196 110 171 11 61 34 173
227 245 134 114 60 39 199 167 256 104
228 35 173 204 80 138 222 63
229 151 92 231 201 16 264 40 143
230 121 97 207 178 67 18 198
231 98 260 152 119 204 151 16
232 255 205 175 63 40 91 235 51 42 183 207 19
233 179 151 21 246 117 193
234 219 121 205 173 261 175
235 66 265 22 42 184 83 205 232 175
236 259 233 179 22 265
241 190 75 4 216 27 10 202
242 132 105 189 50 162 67 64 261 141
243 191 6 106 133 29 163 218 77 51 80 62 144
244 208 65 219 199 120 85 14 59 53 198
245 134 227 7 107 79 197 13 90 166 254
246 135 221 33 87 179 262 233 267
247 136 194 33 108 210 59 85
248 111 54 137 32 223 195 81 165 9 113 225 36 251 198
249 181 84 143 139 65 257 260 180
250 86 61 171 226 110 139 11 196 69 266 32 223
251 225 170 14 198 141 137 111 248 54 223
252 58 83 169 15 222 122 155 145 200 42 167
253 143 180 172 92 263 196 229 244
254 60 39 114 227 13 197 199 245 154 177 30
255 232 205 175 63 40 91 51 180 183 207 19
256 123 150 164 116 54 248 111 195
257 151 92 16 98 183 119 152 43 180 171
258 224 145 176 117 37 196 8 119 221
259 53 198 59 14 85 225 36 113
260 152 98 231 21 151 19 16 120 181 149
261 43 153 68 204 29 243 106 138 133 191 189 77
262 246 8 221 16 196 139 226 229 109 96 124
263 148 18 178 109 87 17 203 98
264 118 151 41 206 16 179 51 232
265 156 66 22 180 42 225
266 44 126 61 150 123 250 17 203 208 65 142 175
267 55 194 168 108 33 211 246 221
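Each line of the adjacency list above starts with a motif ID, followed by the IDs of the motifs it was matched to. The following is a minimal sketch of loading such a file and grouping the motifs into connected components. It assumes whitespace-separated fields as shown and treats every link as undirected, which is an assumption because the listed matches are not always reciprocal. The function names `read_adjlist` and `connected_components` are hypothetical and are not part of the gist.

```python
from collections import defaultdict


def read_adjlist(lines):
    """Parse 'node neighbour neighbour ...' lines into an undirected graph."""
    graph = defaultdict(set)
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        node, neighbours = fields[0], fields[1:]
        graph[node]  # touch the key so nodes without neighbours are kept
        for other in neighbours:
            # Assumption: a match in either direction links both motifs.
            graph[node].add(other)
            graph[other].add(node)
    return graph


def connected_components(graph):
    """Return a list of sets, one per connected component."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components


# Two lines copied from the adjacency list above.
example = [
    "12 35 228 173 204",
    "149 181 260 257",
]
graph = read_adjlist(example)
components = connected_components(graph)
```

To run it on the whole file, pass an open file handle to `read_adjlist` instead of the `example` list.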
#!/usr/bin/env python
from __future__ import print_function
import argparse
import re
import sys


def parse_motif(infile):
    """Yield (motif ID, list of matching motif IDs) from MOTIFSIM output."""
    motifid = None
    matches = list()
    for line in infile:
        motif = re.search(r'Best Matches for Significant Motif ID (\d+)', line)
        topmotif = re.search(r'Best Matches for Top Significant Motif ID (\d+)',
                             line)
        matchdesc = re.search(r'^(\d+)\s+(\d+)\s+Motif (\d+)\s+(\S+ \S+)', line)
        if motif or topmotif:
            # A new block starts: emit the previous top motif, if any
            if motifid and len(matches) > 0:
                yield motifid, matches
            motifid = None
            matches = list()
            # Only blocks for top significant motifs are collected
            if topmotif:
                motifid = topmotif.group(1)
        elif matchdesc:
            if motifid is None:
                continue
            matchid = matchdesc.group(2)
            matchname = matchdesc.group(3)
            matchformat = matchdesc.group(4)
            assert matchname == matchid
            assert matchformat in ['Original Motif', 'Reverse Complement']
            # Keep only matches reported in the original orientation
            if matchformat == 'Original Motif':
                matches.append(matchid)
    # Emit the last block in the file
    if motifid and len(matches) > 0:
        yield motifid, matches


def get_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('-o', '--outfile', metavar='OUT', default=sys.stdout,
                        type=argparse.FileType('w'))
    parser.add_argument('infile', type=argparse.FileType('r'))
    return parser


def main(args):
    # One line per motif: ID, comma-separated matches, number of matches
    for motifid, matches in parse_motif(args.infile):
        matchstring = ','.join(matches)
        nummatches = len(matches)
        print(motifid, matchstring, nummatches, sep='\t', file=args.outfile)


if __name__ == '__main__':
    main(get_parser().parse_args())
267 55,194,168,108,33,156,211,221,235,210 10
167 120,87,219,155,15,208,30 7
28 217,132,242,105,50,189,5,76,162,88,173,112 12
99 203,17,263,258,37,151,148,221,119,178 10
69 250,35,12,112,195,183,139,110,196,41 10
76 50,132,242,5,189,217,28,181,249,10,199 11
217 28,242,132,105,50,5,76,249,88,53 10
233 179,151,21,264,246,117,41,193 8
172 253,143,42,50,196,209,109,34,76,87 10
89 140,196,154,114,61,171,260,69,206,180,228,142 12
162 50,242,105,5,189,76,28,15,109,196,34 11
139 226,86,196,171,110,11,61,34,250,257,181,249 12
229 151,92,201,16,38,264,143,133 8
179 233,64,95,206,21,118,246 7
140 89,196,86,61,171,260,69,206,180,228,112 11
**************************************************************************************************************
MOTIFSIM - Motif Similarity Detection Tool
Version 1.0
**************************************************************************************************************
Please, consult user manual for using this tool.
**************************************************************************************************************
INPUT
**************************************************************************************************************
Number of files: 10
Number of best matches: 15
Similarity cutoff: >= 0.85
Number of threads: 4
Input Files and Motif Counts
File Name File Type Count of Motifs Dataset #
Train1_motifs.txt.meme 5 23 1
Train2_motifs.txt.meme 5 22 2
Train3_motifs.txt.meme 5 26 3
Train4_motifs.txt.meme 5 29 4
Train5_motifs.txt.meme 5 27 5
Train6_motifs.txt.meme 5 30 6
Train7_motifs.txt.meme 5 28 7
Train8_motifs.txt.meme 5 27 8
Train9_motifs.txt.meme 5 25 9
Train10_motifs.txt.meme 5 31 10
**************************************************************************************************************
RESULTS
**************************************************************************************************************
********** Top 15 Significant Motifs - Global Matching (Highest to Lowest) **********
Dataset # 10 Motif ID 267 Motif name Motif 267
******* Best Matches for Top Significant Motif ID 267 (Highest to Lowest) *******
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
3 55 Motif 55 Original Motif Original Motif Backward 3 6 0.0152917
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
8 194 Motif 194 Original Motif Original Motif Forward 6 6 0.016125
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
7 168 Motif 168 Original Motif Original Motif Forward 3 6 0.0165833
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
5 108 Motif 108 Original Motif Original Motif Backward 3 6 0.016625
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
2 33 Motif 33 Original Motif Original Motif Backward 3 6 0.0170417
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
1 8 Motif 8 Reverse Complement Reverse Complement Forward 6 6 0.0190417
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
6 156 Motif 156 Original Motif Reverse Complement Forward 5 6 0.0355
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
7 184 Motif 184 Reverse Complement Original Motif Backward 4 6 0.0357083
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
8 211 Motif 211 Original Motif Reverse Complement Backward 3 6 0.037
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
3 66 Motif 66 Reverse Complement Original Motif Backward 5 6 0.041125
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
9 221 Motif 221 Original Motif Original Motif Forward 2 6 0.0412917
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
5 124 Motif 124 Reverse Complement Original Motif Backward 5 6 0.0425417
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
1 22 Motif 22 Reverse Complement Original Motif Forward 3 6 0.04275
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
9 235 Motif 235 Original Motif Reverse Complement Forward 7 6 0.043125
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
8 210 Motif 210 Original Motif Original Motif Backward 1 6 0.043375
-----------------------------------------------------------------------
Dataset # 7 Motif ID 167 Motif name Motif 167
******* Best Matches for Top Significant Motif ID 167 (Highest to Lowest) *******
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
8 199 Motif 199 Reverse Complement Reverse Complement Backward 3 6 0
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
5 120 Motif 120 Original Motif Original Motif Forward 6 6 0.00520833
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
3 65 Motif 65 Reverse Complement Reverse Complement Forward 7 6 0.0148333
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
4 87 Motif 87 Original Motif Reverse Complement Forward 1 6 0.0187917
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
9 219 Motif 219 Original Motif Reverse Complement Forward 1 6 0.0212917
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
5 109 Motif 109 Reverse Complement Original Motif Backward 4 6 0.0276667
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
3 58 Motif 58 Reverse Complement Reverse Complement Backward 2 6 0.0280983
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
6 155 Motif 155 Original Motif Original Motif Backward 2 6 0.0325417
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
1 15 Motif 15 Original Motif Original Motif Forward 5 6 0.0344167
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
8 208 Motif 208 Original Motif Original Motif Backward 7 6 0.0387083
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
2 30 Motif 30 Original Motif Reverse Complement Backward 2 6 0.0389167
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
10 244 Motif 244 Reverse Complement Reverse Complement Forward 3 6 0.0430417
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
10 263 Motif 263 Reverse Complement Reverse Complement Backward 6 6 0.0437917
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
4 78 Motif 78 Reverse Complement Original Motif Backward 2 5 0.51807
-----------------------------------------------------------------------
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction Position # # of Overlap Similarity Score
2 34 Motif 34 Reverse Complement Original Motif Forward 6 5 0.522566
-----------------------------------------------------------------------
Dataset # 2 Motif ID 28 Motif name Motif 28
******* Best Matches for Top Significant Motif ID 28 (Highest to Lowest) *******
Dataset # Motif ID Motif Name Matching Format in Dataset Matching Format in Comparing Dataset Direction
### This function generates a non-redundant set of motifs from the TomTom output during 10-fold cross-validation
motifsReduce <- function(x) {
    # Read the TomTom results table, skipping its header line
    myTable <- read.table(file=x, header=FALSE, skip=1)
    colnames(myTable) <- c("QueryID","TargetID","Optimal_offset","p_value","E_value","q_value","Overlap","Query_consensus","Target_consensus","Orientation")
    myTable <- as.data.frame(myTable)
    QueryList <- myTable$QueryID
    uniqueQuery <- unique(QueryList)
    for (i in seq_along(uniqueQuery)) {
        myQuery <- uniqueQuery[i]
        queryIndex <- which(myTable$QueryID == myQuery)
        # Every hit after the first row for this query is treated as redundant;
        # queryIndex[-1] is empty (not reversed) when a query has a single row
        redundantMotifs <- myTable$TargetID[queryIndex[-1]]
    }
}
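The loop in `motifsReduce` takes, for each query motif, every target listed after the query's first row. The same step can be sketched in Python as below. This is a hedged illustration only: it assumes the rows have already been reduced to `(query ID, target ID)` pairs in file order, the function name `redundant_targets` is hypothetical, and the IDs in the example are made up for illustration rather than taken from real TomTom output.

```python
def redundant_targets(rows):
    """Map each query ID to the target IDs listed after its first row."""
    by_query = {}
    for query_id, target_id in rows:
        by_query.setdefault(query_id, []).append(target_id)
    # Drop the first (best) hit of each query; the rest count as redundant.
    return {query: targets[1:] for query, targets in by_query.items()}


# Hypothetical IDs, used only to show the shape of the input and output.
rows = [("m1", "m1"), ("m1", "m7"), ("m1", "m9"), ("m2", "m2")]
result = redundant_targets(rows)
```

How the redundant IDs are then removed from the motif set is not shown in the source, so the sketch stops at the same point as the R loop.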
>VTATAAAARNNN 1-VTATAAAARNNN 8.274038 -655.070354 0 T:1116.0(7.30%),B:329.9(1.00%),P:1e-284 Tpos:25.7,Tstd:9.3,Bpos:51.7,Bstd:27.2,StrandBias:10.0,Multiplicity:1.00
0.333 0.302 0.311 0.054
0.001 0.001 0.001 0.997
0.965 0.001 0.001 0.033
0.003 0.001 0.001 0.995
0.995 0.001 0.001 0.003
0.624 0.001 0.001 0.374
0.997 0.001 0.001 0.001
0.590 0.001 0.186 0.223
0.371 0.122 0.347 0.160
0.240 0.290 0.290 0.181
0.268 0.279 0.277 0.175
0.281 0.260 0.277 0.182
>NNKCAGTYDN 1-NNKCAGTYDN 5.053783 -652.158774 0 T:5657.0(36.99%),B:6968.1(21.22%),P:1e-283 Tpos:51.8,Tstd:20.1,Bpos:50.6,Bstd:35.5,StrandBias:10.0,Multiplicity:1.18
0.245 0.261 0.226 0.268
0.288 0.199 0.281 0.232
0.192 0.162 0.284 0.361
0.001 0.997 0.001 0.001
0.976 0.001 0.001 0.022
0.001 0.001 0.826 0.172
0.398 0.001 0.001 0.600
0.267 0.434 0.001 0.298
0.257 0.155 0.332 0.255
0.266 0.227 0.259 0.248
>TGGCAACGYTGC 5-TGGCAACGYTGC 7.779931 -370.948254 0 T:705.0(4.61%),B:245.3(0.75%),P:1e-161 Tpos:29.0,Tstd:18.2,Bpos:48.1,Bstd:29.5,StrandBias:10.0,Multiplicity:1.02
0.063 0.174 0.014 0.749
0.019 0.001 0.807 0.173
0.135 0.001 0.828 0.036
0.004 0.981 0.014 0.001
0.997 0.001 0.001 0.001
0.778 0.053 0.072 0.097
0.001 0.765 0.001 0.233
0.191 0.097 0.572 0.140
0.072 0.428 0.133 0.367
0.123 0.217 0.044 0.616
0.034 0.098 0.696 0.172
0.170 0.430 0.123 0.276
>MWAGATGGCK 6-MWAGATGGCK,BestGuess:MA0531.1_CTCF/Jaspar(0.815) 6.965604 -205.696207 0 T:798.0(5.22%),B:586.3(1.79%),P:1e-89
0.484 0.404 0.068 0.043
0.452 0.140 0.019 0.389
0.997 0.001 0.001 0.001
0.156 0.145 0.622 0.077
0.841 0.001 0.157 0.001
0.018 0.001 0.042 0.939
0.001 0.001 0.997 0.001
0.021 0.001 0.866 0.112
0.023 0.852 0.009 0.116
0.096 0.055 0.448 0.401
>TCTCCGACACGG 9-TCTCCGACACGG,BestGuess:prd/dmmpmm(Noyes)/fly(0.573) 8.566248 -134.767919 0 T:140.0(0.92%),B:8.3(0.03%),P:1e-58
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.276 0.001 0.092 0.631
0.001 0.997 0.001 0.001
0.001 0.815 0.001 0.183
0.001 0.001 0.997 0.001
0.997 0.001 0.001 0.001
0.001 0.815 0.001 0.183
0.815 0.183 0.001 0.001
0.001 0.997 0.001 0.001
0.001 0.001 0.997 0.001
0.065 0.001 0.933 0.001
>GGGTGATHAA 8-GGGTGATHAA,BestGuess:CG12361/dmmpmm(Noyes_hd)/fly(0.674) 10.791086 -105.472496 0 T:111.0(0.73%),B:8.0(0.02%),P:1e-45
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.001 0.380 0.001 0.618
0.262 0.127 0.610 0.001
0.872 0.126 0.001 0.001
0.001 0.001 0.392 0.606
0.384 0.361 0.001 0.254
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
>WKYCGAGTGAGC 11-WKYCGAGTGAGC,BestGuess:z/dmmpmm(SeSiMCMC)/fly(0.665) 8.163639 -100.240415 0 T:104.0(0.68%),B:6.1(0.02%),P:1e-43
0.446 0.022 0.174 0.357
0.043 0.036 0.528 0.393
0.011 0.453 0.010 0.526
0.028 0.624 0.029 0.319
0.065 0.045 0.852 0.038
0.660 0.022 0.284 0.034
0.249 0.228 0.472 0.050
0.031 0.227 0.164 0.578
0.031 0.249 0.491 0.228
0.521 0.198 0.240 0.041
0.188 0.194 0.397 0.221
0.041 0.487 0.206 0.267
>CTGTTTAGTCYC 12-CTGTTTAGTCYC,BestGuess:MF0005.1_Forkhead_class/Jaspar(0.707) 9.909245 -95.729655 0 T:110.0(0.72%),B:11.1(0.03%),P:1e-41
0.001 0.997 0.001 0.001
0.065 0.001 0.001 0.933
0.001 0.019 0.901 0.079
0.001 0.019 0.001 0.979
0.001 0.001 0.001 0.997
0.216 0.020 0.020 0.744
0.997 0.001 0.001 0.001
0.020 0.099 0.821 0.060
0.040 0.262 0.236 0.462
0.374 0.585 0.001 0.040
0.178 0.425 0.020 0.376
0.059 0.698 0.001 0.242
>GGCAACAACT 9-GGCAACAACT,BestGuess:Aef1/dmmpmm(Pollard)/fly(0.796) 8.681100 -93.044559 0 T:152.0(0.99%),B:41.1(0.13%),P:1e-40
0.046 0.030 0.923 0.001
0.001 0.001 0.997 0.001
0.029 0.808 0.041 0.122
0.968 0.001 0.001 0.030
0.997 0.001 0.001 0.001
0.001 0.989 0.001 0.009
0.997 0.001 0.001 0.001
0.931 0.012 0.001 0.056
0.001 0.969 0.001 0.029
0.001 0.001 0.029 0.969
>CGCTAGAGGG 10-CGCTAGAGGG,BestGuess:MA0531.1_CTCF/Jaspar(0.599) 7.470390 -91.114645 0 T:268.0(1.75%),B:157.8(0.48%),P:1e-39
0.089 0.821 0.055 0.035
0.097 0.018 0.844 0.041
0.057 0.811 0.056 0.076
0.074 0.142 0.062 0.722
0.914 0.019 0.048 0.019
0.026 0.030 0.859 0.085
0.812 0.066 0.063 0.059
0.023 0.014 0.794 0.169
0.111 0.115 0.689 0.085
0.148 0.029 0.740 0.083
>GAGAAAAGGCGA 13-GAGAAAAGGCGA,BestGuess:MA0459.1_tll/Jaspar(0.628) 11.520154 -88.555803 0 T:105.0(0.69%),B:12.5(0.04%),P:1e-38
0.001 0.001 0.997 0.001
0.986 0.001 0.012 0.001
0.001 0.001 0.997 0.001
0.986 0.012 0.001 0.001
0.997 0.001 0.001 0.001
0.973 0.013 0.001 0.013
0.997 0.001 0.001 0.001
0.013 0.013 0.961 0.013
0.001 0.001 0.718 0.280
0.001 0.960 0.001 0.038
0.013 0.131 0.830 0.026
0.960 0.001 0.001 0.038
>CAACTGCAGTGC 14-CAACTGCAGTGC,BestGuess:Aef1/dmmpmm(Pollard)/fly(0.656) 9.351577 -79.984730 0 T:95.0(0.62%),B:11.7(0.04%),P:1e-34
0.100 0.620 0.171 0.109
0.639 0.094 0.150 0.117
0.624 0.007 0.241 0.128
0.012 0.739 0.077 0.172
0.204 0.118 0.151 0.527
0.099 0.102 0.512 0.287
0.055 0.838 0.036 0.071
0.710 0.040 0.108 0.142
0.125 0.081 0.709 0.085
0.074 0.261 0.068 0.597
0.157 0.098 0.480 0.265
0.186 0.576 0.102 0.136
>TTGTGAATTT 12-TTGTGAATTT,BestGuess:tll/dmmpmm(Papatsenko)/fly(0.784) 6.499862 -66.785012 0 T:1129.0(7.38%),B:1571.7(4.79%),P:1e-29
0.001 0.089 0.015 0.895
0.024 0.074 0.001 0.901
0.088 0.125 0.656 0.131
0.049 0.047 0.001 0.903
0.078 0.176 0.647 0.099
0.951 0.035 0.001 0.013
0.606 0.168 0.113 0.113
0.054 0.060 0.095 0.791
0.001 0.016 0.001 0.982
0.102 0.060 0.170 0.668
>ACACGTGTTT 15-ACACGTGTTT,BestGuess:MF0007.1_bHLH(zip)_class/Jaspar(0.805) 7.090316 -60.925226 0 T:380.0(2.48%),B:371.4(1.13%),P:1e-26
0.770 0.075 0.118 0.037
0.104 0.760 0.081 0.055
0.862 0.046 0.029 0.063
0.049 0.737 0.077 0.137
0.024 0.045 0.781 0.150
0.041 0.044 0.036 0.879
0.055 0.034 0.853 0.058
0.033 0.065 0.071 0.831
0.067 0.095 0.077 0.761
0.090 0.060 0.079 0.771
>TCGTGAAAMGAG 17-TCGTGAAAMGAG,BestGuess:Abd-B/dmmpmm(Noyes_hd)/fly(0.542) 11.786647 -58.525977 0 T:51.0(0.33%),B:0.5(0.00%),P:1e-25
0.001 0.001 0.001 0.997
0.001 0.796 0.116 0.087
0.019 0.001 0.896 0.084
0.168 0.001 0.001 0.830
0.086 0.001 0.912 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
0.575 0.001 0.001 0.423
0.430 0.568 0.001 0.001
0.171 0.001 0.825 0.003
0.659 0.001 0.084 0.256
0.001 0.001 0.997 0.001
>TGTCAAAA 16-TGTCAAAA,BestGuess:MA0222.1_exd/Jaspar(0.917) 7.031695 -50.285219 0 T:562.0(3.68%),B:694.6(2.12%),P:1e-21
0.001 0.001 0.001 0.997
0.001 0.001 0.997 0.001
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.886 0.001 0.112 0.001
0.775 0.104 0.001 0.120
0.796 0.001 0.001 0.202
>CCWAGACGCAGC 19-CCWAGACGCAGC,BestGuess:POL010.1_DCE_S_III/Jaspar(0.570) 12.190905 -48.189409 0 T:42.0(0.27%),B:0.4(0.00%),P:1e-20
0.112 0.886 0.001 0.001
0.001 0.997 0.001 0.001
0.512 0.001 0.001 0.486
0.744 0.001 0.001 0.254
0.001 0.001 0.997 0.001
0.886 0.001 0.112 0.001
0.001 0.773 0.001 0.225
0.112 0.001 0.886 0.001
0.225 0.773 0.001 0.001
0.997 0.001 0.001 0.001
0.001 0.001 0.997 0.001
0.001 0.886 0.001 0.112
>TTCAAACCTCAT 20-TTCAAACCTCAT,BestGuess:pan/dmmpmm(SeSiMCMC)/fly(0.601) 12.947963 -45.892886 0 T:40.0(0.26%),B:0.8(0.00%),P:1e-19
0.001 0.001 0.001 0.997
0.145 0.001 0.001 0.853
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.853 0.001 0.001 0.145
0.853 0.001 0.001 0.145
0.202 0.729 0.068 0.001
0.001 0.997 0.001 0.001
0.001 0.001 0.290 0.708
0.001 0.997 0.001 0.001
0.853 0.001 0.001 0.145
0.001 0.001 0.001 0.997
>GTCTTCAGGGAC 22-GTCTTCAGGGAC,BestGuess:POL008.1_DCE_S_I/Jaspar(0.591) 8.381307 -36.506036 0 T:39.0(0.26%),B:3.8(0.01%),P:1e-15
0.001 0.001 0.997 0.001
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.001 0.001 0.001 0.997
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.203 0.001 0.795 0.001
0.011 0.001 0.987 0.001
0.001 0.001 0.997 0.001
0.591 0.204 0.204 0.001
0.001 0.585 0.001 0.413
>CACCAC 15-CACCAC,BestGuess:Run/dmmpmm(Papatsenko)/fly(0.872) 1.656420 -30.697006 0 T:2526.0(16.52%),B:4566.7(13.91%),P:1e-13
0.082 0.916 0.001 0.001
0.984 0.001 0.014 0.001
0.001 0.997 0.001 0.001
0.016 0.982 0.001 0.001
0.997 0.001 0.001 0.001
0.001 0.997 0.001 0.001
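The Homer motifs above each consist of a `>` header line followed by one row of A, C, G, T probabilities per position, which is the same column order that `ALPHABET= ACGT` declares in MEME format. The following Python sketch shows one way the conversion could be done; it is not the gist's `motif2meme.R`. It assumes the Homer header is tab-delimited, takes the `E=` value from the `P:` field of the header and fixes `nsites= 20`, mirroring the converted output shown in this gist. The function name `homer_to_meme` and the `"0"` placeholder used when no `P:` field is found are assumptions.

```python
import re


def homer_to_meme(lines, nsites=20):
    """Convert Homer motif lines into minimal MEME-format text."""
    motifs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Homer header: consensus, name, log-odds threshold, log p, ...
            fields = line[1:].split("\t")
            pval = re.search(r",P:([^\s,]+)", line)
            motifs.append({
                "id": fields[0],
                "name": fields[1] if len(fields) > 1 else fields[0],
                # Placeholder "0" is an assumption for headers without P:
                "evalue": pval.group(1) if pval else "0",
                "rows": [],
            })
        elif motifs:
            motifs[-1]["rows"].append([float(x) for x in line.split()])

    out = ["MEME version 4", "", "ALPHABET= ACGT", "", "strands: + -", ""]
    for motif in motifs:
        out.append("MOTIF {} {}".format(motif["id"], motif["name"]))
        out.append(
            "letter-probability matrix: alength= 4 w= {} nsites= {} E= {}".format(
                len(motif["rows"]), nsites, motif["evalue"]))
        for row in motif["rows"]:
            out.append(" ".join("{:.3f}".format(p) for p in row))
        out.append("")
    return "\n".join(out)


# The last motif from the Homer file above, with tabs restored in the header.
homer_lines = [
    ">CACCAC\t15-CACCAC,BestGuess:Run/dmmpmm(Papatsenko)/fly(0.872)\t1.656420"
    "\t-30.697006\t0\tT:2526.0(16.52%),B:4566.7(13.91%),P:1e-13",
    "0.082 0.916 0.001 0.001",
    "0.984 0.001 0.014 0.001",
    "0.001 0.997 0.001 0.001",
    "0.016 0.982 0.001 0.001",
    "0.997 0.001 0.001 0.001",
    "0.001 0.997 0.001 0.001",
]
meme_text = homer_to_meme(homer_lines)
```

The motif width `w=` is taken from the number of matrix rows rather than from the consensus string, so the header stays consistent with the matrix even if the two differ.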
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-284
0.333 0.302 0.311 0.054
0.001 0.001 0.001 0.997
0.965 0.001 0.001 0.033
0.003 0.001 0.001 0.995
0.995 0.001 0.001 0.003
0.624 0.001 0.001 0.374
0.997 0.001 0.001 0.001
0.59 0.001 0.186 0.223
0.371 0.122 0.347 0.16
0.24 0.29 0.29 0.181
0.268 0.279 0.277 0.175
0.281 0.26 0.277 0.182
=============================
letter probability matrix:
=============================
alength= 4 w= 10 nsites= 20 E= 1e-283
0.245 0.261 0.226 0.268
0.288 0.199 0.281 0.232
0.192 0.162 0.284 0.361
0.001 0.997 0.001 0.001
0.976 0.001 0.001 0.022
0.001 0.001 0.826 0.172
0.398 0.001 0.001 0.6
0.267 0.434 0.001 0.298
0.257 0.155 0.332 0.255
0.266 0.227 0.259 0.248
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-161
0.063 0.174 0.014 0.749
0.019 0.001 0.807 0.173
0.135 0.001 0.828 0.036
0.004 0.981 0.014 0.001
0.997 0.001 0.001 0.001
0.778 0.053 0.072 0.097
0.001 0.765 0.001 0.233
0.191 0.097 0.572 0.14
0.072 0.428 0.133 0.367
0.123 0.217 0.044 0.616
0.034 0.098 0.696 0.172
0.17 0.43 0.123 0.276
=============================
letter probability matrix:
=============================
alength= 4 w= 10 nsites= 20 E= 1e-89
0.484 0.404 0.068 0.043
0.452 0.14 0.019 0.389
0.997 0.001 0.001 0.001
0.156 0.145 0.622 0.077
0.841 0.001 0.157 0.001
0.018 0.001 0.042 0.939
0.001 0.001 0.997 0.001
0.021 0.001 0.866 0.112
0.023 0.852 0.009 0.116
0.096 0.055 0.448 0.401
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-58
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.276 0.001 0.092 0.631
0.001 0.997 0.001 0.001
0.001 0.815 0.001 0.183
0.001 0.001 0.997 0.001
0.997 0.001 0.001 0.001
0.001 0.815 0.001 0.183
0.815 0.183 0.001 0.001
0.001 0.997 0.001 0.001
0.001 0.001 0.997 0.001
0.065 0.001 0.933 0.001
=============================
letter probability matrix:
=============================
alength= 4 w= 10 nsites= 20 E= 1e-45
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.001 0.001 0.997 0.001
0.001 0.38 0.001 0.618
0.262 0.127 0.61 0.001
0.872 0.126 0.001 0.001
0.001 0.001 0.392 0.606
0.384 0.361 0.001 0.254
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-43
0.446 0.022 0.174 0.357
0.043 0.036 0.528 0.393
0.011 0.453 0.01 0.526
0.028 0.624 0.029 0.319
0.065 0.045 0.852 0.038
0.66 0.022 0.284 0.034
0.249 0.228 0.472 0.05
0.031 0.227 0.164 0.578
0.031 0.249 0.491 0.228
0.521 0.198 0.24 0.041
0.188 0.194 0.397 0.221
0.041 0.487 0.206 0.267
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-41
0.001 0.997 0.001 0.001
0.065 0.001 0.001 0.933
0.001 0.019 0.901 0.079
0.001 0.019 0.001 0.979
0.001 0.001 0.001 0.997
0.216 0.02 0.02 0.744
0.997 0.001 0.001 0.001
0.02 0.099 0.821 0.06
0.04 0.262 0.236 0.462
0.374 0.585 0.001 0.04
0.178 0.425 0.02 0.376
0.059 0.698 0.001 0.242
=============================
letter probability matrix:
=============================
alength= 4 w= 10 nsites= 20 E= 1e-40
0.046 0.03 0.923 0.001
0.001 0.001 0.997 0.001
0.029 0.808 0.041 0.122
0.968 0.001 0.001 0.03
0.997 0.001 0.001 0.001
0.001 0.989 0.001 0.009
0.997 0.001 0.001 0.001
0.931 0.012 0.001 0.056
0.001 0.969 0.001 0.029
0.001 0.001 0.029 0.969
=============================
letter probability matrix:
=============================
alength= 4 w= 10 nsites= 20 E= 1e-39
0.089 0.821 0.055 0.035
0.097 0.018 0.844 0.041
0.057 0.811 0.056 0.076
0.074 0.142 0.062 0.722
0.914 0.019 0.048 0.019
0.026 0.03 0.859 0.085
0.812 0.066 0.063 0.059
0.023 0.014 0.794 0.169
0.111 0.115 0.689 0.085
0.148 0.029 0.74 0.083
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-38
0.001 0.001 0.997 0.001
0.986 0.001 0.012 0.001
0.001 0.001 0.997 0.001
0.986 0.012 0.001 0.001
0.997 0.001 0.001 0.001
0.973 0.013 0.001 0.013
0.997 0.001 0.001 0.001
0.013 0.013 0.961 0.013
0.001 0.001 0.718 0.28
0.001 0.96 0.001 0.038
0.013 0.131 0.83 0.026
0.96 0.001 0.001 0.038
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-34
0.1 0.62 0.171 0.109
0.639 0.094 0.15 0.117
0.624 0.007 0.241 0.128
0.012 0.739 0.077 0.172
0.204 0.118 0.151 0.527
0.099 0.102 0.512 0.287
0.055 0.838 0.036 0.071
0.71 0.04 0.108 0.142
0.125 0.081 0.709 0.085
0.074 0.261 0.068 0.597
0.157 0.098 0.48 0.265
0.186 0.576 0.102 0.136
=============================
letter probability matrix:
=============================
alength= 4 w= 10 nsites= 20 E= 1e-29
0.001 0.089 0.015 0.895
0.024 0.074 0.001 0.901
0.088 0.125 0.656 0.131
0.049 0.047 0.001 0.903
0.078 0.176 0.647 0.099
0.951 0.035 0.001 0.013
0.606 0.168 0.113 0.113
0.054 0.06 0.095 0.791
0.001 0.016 0.001 0.982
0.102 0.06 0.17 0.668
=============================
letter probability matrix:
=============================
alength= 4 w= 10 nsites= 20 E= 1e-26
0.77 0.075 0.118 0.037
0.104 0.76 0.081 0.055
0.862 0.046 0.029 0.063
0.049 0.737 0.077 0.137
0.024 0.045 0.781 0.15
0.041 0.044 0.036 0.879
0.055 0.034 0.853 0.058
0.033 0.065 0.071 0.831
0.067 0.095 0.077 0.761
0.09 0.06 0.079 0.771
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-25
0.001 0.001 0.001 0.997
0.001 0.796 0.116 0.087
0.019 0.001 0.896 0.084
0.168 0.001 0.001 0.83
0.086 0.001 0.912 0.001
0.997 0.001 0.001 0.001
0.997 0.001 0.001 0.001
0.575 0.001 0.001 0.423
0.43 0.568 0.001 0.001
0.171 0.001 0.825 0.003
0.659 0.001 0.084 0.256
0.001 0.001 0.997 0.001
=============================
letter probability matrix:
=============================
alength= 4 w= 8 nsites= 20 E= 1e-21
0.001 0.001 0.001 0.997
0.001 0.001 0.997 0.001
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.886 0.001 0.112 0.001
0.775 0.104 0.001 0.12
0.796 0.001 0.001 0.202
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-20
0.112 0.886 0.001 0.001
0.001 0.997 0.001 0.001
0.512 0.001 0.001 0.486
0.744 0.001 0.001 0.254
0.001 0.001 0.997 0.001
0.886 0.001 0.112 0.001
0.001 0.773 0.001 0.225
0.112 0.001 0.886 0.001
0.225 0.773 0.001 0.001
0.997 0.001 0.001 0.001
0.001 0.001 0.997 0.001
0.001 0.886 0.001 0.112
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-19
0.001 0.001 0.001 0.997
0.145 0.001 0.001 0.853
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.853 0.001 0.001 0.145
0.853 0.001 0.001 0.145
0.202 0.729 0.068 0.001
0.001 0.997 0.001 0.001
0.001 0.001 0.29 0.708
0.001 0.997 0.001 0.001
0.853 0.001 0.001 0.145
0.001 0.001 0.001 0.997
=============================
letter probability matrix:
=============================
alength= 4 w= 12 nsites= 20 E= 1e-15
0.001 0.001 0.997 0.001
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.001 0.001 0.001 0.997
0.001 0.001 0.001 0.997
0.001 0.997 0.001 0.001
0.997 0.001 0.001 0.001
0.203 0.001 0.795 0.001
0.011 0.001 0.987 0.001
0.001 0.001 0.997 0.001
0.591 0.204 0.204 0.001
0.001 0.585 0.001 0.413