Created
April 25, 2014 17:27
Sequence example file
"""
Sequence
--------
String object representing biological sequences with alphabets.
"""
cdef class Sequence(object):
    '''
    The Sequence class is a general container for storing a long string
    (a long sequence of characters) and for making its manipulation easy
    and efficient.

    Like Python's standard strings, the basic Sequence object is immutable,
    so item assignment such as seq[5] = 'A' is not supported. In return,
    Sequence objects can be used as dictionary keys.

    The Sequence object provides a number of string-like methods (such as
    count, find, split and strip), which apply to any biological sequence.

    For the more biology-oriented purpose of storing DNA, RNA or a sequence
    of amino acids, see the subclasses DNASequence, RNASequence and
    AASequence. They all derive directly from the Sequence class.

    The Sequence object can store any single string based on a single-byte
    character set.

    Parameters
    ----------
    data : bytes
        The sequence of letters.
    alphabet : Alphabet, optional
        The expected symbols (letters). Defaults to None; for the base
        Sequence class the alphabet is None.
    '''

    cdef public bytes data
    cdef public object alphabet

    def __init__(self, bytes data, alphabet=None):
        self.data = data
        self.alphabet = alphabet

    def __repr__(self):
        '''
        Return a representation of the sequence for debugging.
        '''
        return '%d-letter "%s" instance\nseq: %s' % (
            len(self.data), self.__class__.__name__, repr(self.data))

    def __str__(self):
        '''
        Return the full sequence as a Python string; use str(sequence).
        '''
        # data is stored as bytes; on Python 3, __str__ must return str,
        # so the value would need decoding there.
        return self.data

    def __len__(self):
        '''
        Return the length of the sequence.
        '''
        return len(self.data)

    def __getitem__(self, index):
        '''
        Return a subsequence, or a single letter, as a new Sequence.
        '''
        return Sequence(self.data[index], self.alphabet)
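
A minimal usage sketch, for readers who want to try the interface without compiling the Cython file. `PySequence` is a hypothetical pure-Python stand-in written for this example, not part of the original file. It assumes ASCII data and adds two Python 3 adjustments that the class above does not have: decoding in `__str__`, and wrapping the integer that indexing `bytes` returns back into `bytes`.

```python
class PySequence(object):
    """Pure-Python stand-in for the Cython Sequence class (illustrative only)."""

    def __init__(self, data, alphabet=None):
        # Mirror the `bytes data` typing of the Cython signature.
        if not isinstance(data, bytes):
            raise TypeError("data must be bytes")
        self.data = data
        self.alphabet = alphabet

    def __repr__(self):
        return '%d-letter "%s" instance\nseq: %s' % (
            len(self.data), self.__class__.__name__, repr(self.data))

    def __str__(self):
        # Assumption: single-byte (ASCII) data, decoded so Python 3 gets a str.
        return self.data.decode("ascii")

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        piece = self.data[index]
        if isinstance(piece, int):
            # Python 3 yields an int when indexing bytes; wrap it back into bytes.
            piece = bytes([piece])
        return PySequence(piece, self.alphabet)


seq = PySequence(b"ACGTAC")
first = seq[0]    # single letter, still a PySequence
sub = seq[1:4]    # slice, also a PySequence carrying the same alphabet
print(len(seq), str(first), str(sub))
```

Both indexing and slicing return a new sequence object rather than a raw string, so the alphabet travels with every subsequence.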