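import re

DNA = 'GAGCGCTAGCCAAA'
# re.search returns a Match object for the first occurrence, or None if there is none
match = re.search(pattern='AAA', string=DNA)
# equivalent call with positional arguments:
# match = re.search('AAA', DNA)
print(match)
# console output
# <re.Match object; span=(11, 14), match='AAA'>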
import re

DNA = 'GAGCGCTAGCCAAA'
# a Match object is truthy and None is falsy, so the result can be tested directly
if re.search('AAA', DNA):
    print("Tri-nucleotide found!")
# console output
# Tri-nucleotide found!
import re

DNA = 'GAGCGCTAGCCAAA'
match = re.search('AAA', DNA)
print(match.start())  # 11, index of the first matched character
print(match.end())    # 14, index just past the last matched character
print(match.span())   # (11, 14)
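Since `re.search` returns `None` when the pattern is absent, calling `.start()` on the result would raise an `AttributeError` for a sequence without the motif. A minimal sketch of a guarded lookup follows; the helper name `locate_motif` is hypothetical and not part of the original snippets.

```python
import re

def locate_motif(pattern, sequence):
    """Return (start, end, matched_text) for the first hit, or None if absent."""
    match = re.search(pattern, sequence)
    if match is None:
        return None
    start, end = match.span()
    # slicing with the span recovers exactly the matched text
    return start, end, sequence[start:end]

DNA = 'GAGCGCTAGCCAAA'
print(locate_motif('AAA', DNA))  # (11, 14, 'AAA')
print(locate_motif('TTT', DNA))  # None
```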
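import re

DNA = 'ATCGACCGGGTTT'
# two separate searches, one per variant of the recognition site
if re.search('CCGGG', DNA) or re.search('CCCGG', DNA):
    print('Restriction site found!')
# the same test as a single pattern: (G|C) matches either G or C
if re.search('CC(G|C)GG', DNA):
    print('Restriction site found!')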
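The searches above only report whether a site exists. As a sketch of how every occurrence could be located, the example below uses `re.finditer` with the character class `[GC]`, which is equivalent to `(G|C)` for single characters. The longer sequence is made up for illustration so that both site variants appear.

```python
import re

# made-up sequence containing both CCGGG and CCCGG
DNA = 'ATCGACCGGGTTTCCCGGA'
site = re.compile('CC[GC]GG')

# finditer yields one Match object per non-overlapping occurrence
positions = [(hit.start(), hit.group()) for hit in site.finditer(DNA)]
print(positions)  # [(5, 'CCGGG'), (13, 'CCCGG')]
```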
# AUG start codon, any bases, then a stop codon (UAA, UAG or UGA)
open_reading_frame = 'AUG.*U(AA|AG|GA)'
# (...)* consumes bases three at a time, so the stop codon is in frame with AUG
inframe_open_reading_frame = 'AUG(...)*U(AA|AG|GA)'
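To see what the in-frame restriction changes, the sketch below runs an unrestricted pattern and the in-frame pattern on a short made-up mRNA, using the stop codons UAA, UAG and UGA. The sequence and the variable names `greedy_hit` and `inframe_hit` are assumptions chosen for illustration.

```python
import re

open_reading_frame = 'AUG.*U(AA|AG|GA)'
inframe_open_reading_frame = 'AUG(...)*U(AA|AG|GA)'

# made-up mRNA: AUG GCU UAA is in frame; the trailing UGA is one base out of frame
mRNA = 'AUGGCUUAACUGA'

# .* is greedy and ignores the reading frame, so it runs on to the out-of-frame UGA
greedy_hit = re.search(open_reading_frame, mRNA).group()
# (...)* only advances in whole codons, so the match ends at the in-frame UAA
inframe_hit = re.search(inframe_open_reading_frame, mRNA).group()

print(greedy_hit)   # AUGGCUUAACUGA
print(inframe_hit)  # AUGGCUUAA
```

Note that `(...)*` is also greedy, so on a longer sequence it extends to the last in-frame stop codon; the lazy form `(...)*?` would stop at the first one.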
# N, any residue except P, then S or T, then any residue except P
N_glycosylation_pattern = 'N[^P][ST][^P]'
import re

N_glycosylation_pattern = 'N[^P][ST][^P]'
# a caret ^ at the start of a character class negates it,
# so [^P] matches any character other than P
Protein_seq = 'YHWKYELIQNNSNEFC'
if re.search(N_glycosylation_pattern, Protein_seq):
    print("N-glycosylation site motif found")
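Motif sites can overlap, and `re.finditer` on the plain pattern skips a site that starts inside a previous match. One possible workaround, sketched here, wraps the pattern in a lookahead `(?=...)` so that no characters are consumed. The second sequence is made up to contain two overlapping sites; `find_sites` is a hypothetical helper name.

```python
import re

# the lookahead matches at a position without consuming it; group 1 holds the motif
overlapping_pattern = re.compile('(?=(N[^P][ST][^P]))')

def find_sites(protein):
    """Return (start_index, motif) for every site, including overlapping ones."""
    return [(m.start(), m.group(1)) for m in overlapping_pattern.finditer(protein)]

print(find_sites('YHWKYELIQNNSNEFC'))  # [(9, 'NNSN')]
print(find_sites('AKNNTSAG'))          # [(2, 'NNTS'), (3, 'NTSA')]
```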
htt_pattern = '(CAG|CAA){18,}'
# as with slice bounds, either limit of {m,n} can be left out;
# {18,} matches 18 or more consecutive repeats
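A quick way to check the `{18,}` threshold is to run the pattern on synthetic tracts of known length. The two test strings below are made up for illustration and are not taken from the htt gene.

```python
import re

htt_pattern = '(CAG|CAA){18,}'

short_tract = 'CAG' * 17                     # 17 repeats, below the threshold
long_tract = 'CAG' * 10 + 'CAA' + 'CAG' * 9  # 20 mixed CAG/CAA repeats

short_hit = re.search(htt_pattern, short_tract)
long_hit = re.search(htt_pattern, long_tract)

print(short_hit)        # None
print(long_hit.span())  # (0, 60)
```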
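import re

htt_pattern = '(CAG|CAA){18,}'
# note: read() keeps the FASTA header and line breaks, so a repeat run
# that continues across a line break is not matched as one run
with open('C:/Users/apsciuser/Downloads/htt_gene.fasta') as fasta_file:
    htt_mRNA = fasta_file.read()
# findall returns one entry per non-overlapping run of 18 or more repeats
matches = re.findall(htt_pattern, htt_mRNA)
print("The number of polyQ repeats found is: " + str(len(matches)))
# Console output
# The number of polyQ repeats found is: 1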
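When a pattern contains a capturing group, `re.findall` returns the text of the group's last capture rather than the whole tract, so only the count is meaningful. As a sketch, `re.finditer` with a non-capturing group `(?:...)` gives access to each full tract and its repeat count. The sequence below is made up and stands in for the contents of the FASTA file.

```python
import re

# made-up sequence: one tract of 19 CAG plus 1 CAA, flanked by other bases
sequence = 'GG' + 'CAG' * 19 + 'CAA' + 'TT'

# with a capturing group, findall reports only the last repeat of each tract
last_repeats = re.findall('(CAG|CAA){18,}', sequence)
print(last_repeats)  # ['CAA']

# finditer yields Match objects, so position and length of each tract are available
tracts = [(t.start(), t.end(), len(t.group()) // 3)
          for t in re.finditer('(?:CAG|CAA){18,}', sequence)]
print(tracts)  # [(2, 62, 20)]
```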