bx.seq.seq module
Classes to support “biological sequence” files.
- Author: Bob Harris (rsharris@bx.psu.edu)
- class bx.seq.seq.SeqFile(file=None, revcomp=False, name='', gap=None)
Bases: object
A biological sequence is a sequence of bytes or characters. Usually these represent DNA (A,C,G,T), proteins, or some variation of those.
class attributes:
file: file object containing the sequence
revcomp: whether gets from this sequence should be reverse-complemented
    False => no reverse complement
    True => (same as “-5’”)
    “maf” => (same as “-5’”)
    “+5’” => minus strand is from plus strand’s 5’ end (same as “-3’”)
    “+3’” => minus strand is from plus strand’s 3’ end (same as “-5’”)
    “-5’” => minus strand is from its 5’ end (as per MAF file format)
    “-3’” => minus strand is from its 3’ end (as per genome browser, but with origin-zero)
name: usually a species and/or chromosome name (e.g. “mule.chr5”); if the file contains a name, that overrides this one
gap: gap character that aligners should use for gaps in this sequence
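The revcomp conventions above can be easier to follow with a small worked example. The following is a minimal sketch of one reading of those conventions, applied to an in-memory plus-strand string. It does not use bx.seq.seq at all: the helper names `reverse_complement` and `fetch` and the alias table are assumptions made for illustration, not the library's implementation.

```python
# Illustrative sketch only; this is NOT the bx.seq.seq implementation.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(text):
    """Complement each DNA base, then reverse; other characters pass through."""
    return text.translate(_COMPLEMENT)[::-1]


def fetch(text, start, length, revcomp=False):
    """Hypothetical helper: read `length` characters under a revcomp convention.

    `text` is assumed to be the plus strand, with origin-zero positions.
    """
    # Collapse the aliases listed in the attribute description.
    aliases = {True: "-5'", "maf": "-5'", "+3'": "-5'", "+5'": "-3'"}
    mode = aliases.get(revcomp, revcomp)
    if not mode:
        return text[start:start + length]
    if mode == "-3'":
        # Positions count from the minus strand's 3' end, which is the
        # plus strand's 5' end, so the plus-strand interval is unchanged.
        return reverse_complement(text[start:start + length])
    if mode == "-5'":
        # Positions count from the minus strand's 5' end (MAF style),
        # so the interval is mirrored onto the plus strand first.
        plus_start = len(text) - (start + length)
        return reverse_complement(text[plus_start:plus_start + length])
    raise ValueError("unrecognized revcomp value: %r" % (revcomp,))


plus = "AAACCGT"
print(fetch(plus, 0, 3))          # AAA
print(fetch(plus, 0, 3, "-5'"))   # ACG
print(fetch(plus, 0, 3, "-3'"))   # TTT
```

Under this reading, “-5’” returns the first characters of the reverse-complemented sequence, while “-3’” returns the reverse complement of the same plus-strand interval.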
- close()
- extract_name(line)
- get(start, length)
Fetch the subsequence starting at position start with length length. This method is picky about parameters: the requested interval must have non-negative length and fit entirely inside the NIB sequence. The returned string will contain exactly ‘length’ characters; otherwise an AssertionError is raised.
- raw_fetch(start, length)
- reverse_complement(text)
- set_text(text)
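The parameter contract described for get(start, length) can be sketched in a few lines. This is a hedged illustration on a plain Python string, not the library's code: the function name `checked_get` is hypothetical, and the exact checks and messages are assumptions based on the description above.

```python
def checked_get(text, start, length):
    """Hypothetical stand-in for the checks described for get()."""
    # The interval must have non-negative length...
    assert length >= 0, "length must be non-negative"
    # ...and fit entirely inside the sequence.
    assert start >= 0 and start + length <= len(text), \
        "interval must fit inside the sequence"
    result = text[start:start + length]
    # The caller always receives exactly `length` characters.
    assert len(result) == length
    return result


print(checked_get("ACGTACGT", 2, 4))   # GTAC

try:
    checked_get("ACGT", 2, 4)          # runs past the end of the sequence
except AssertionError as err:
    print("rejected:", err)
```

Note that Python skips assert statements when run with the -O option, so a contract enforced this way only holds in non-optimized runs.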