bx.gene_reader module

Readers extracting gene (exon and intron) information from bed / gtf / gff formats.

  • GeneReader: yields exons

  • CDSReader: yields cds_exons

  • FeatureReader: yields cds_exons, introns, exons

For gff/gtf input, start_codon and stop_codon lines are merged with the CDS features.
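As a rough illustration of what merging codon records into CDS blocks can look like, the sketch below combines touching or overlapping intervals. The helper name merge_intervals, the sample coordinates, and the half-open (start, end) convention are assumptions made for this example; it is not the module's own implementation.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (start, end) intervals.

    Hypothetical helper for illustration only. Intervals are assumed
    to be half-open, so (100, 103) and (103, 200) touch each other.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# A start codon directly preceding one CDS block, plus a separate CDS block.
start_codon = [(100, 103)]
cds = [(103, 200), (300, 350)]
cds_exons = merge_intervals(start_codon + cds)
```

Here the start codon is absorbed into the first CDS block, while the second block stays separate because a gap remains between them.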

bx.gene_reader.CDSReader(fh, format='gff')

yield chrom, strand, cds_exons, name

bx.gene_reader.FeatureReader(fh, format='gff', alt_introns_subtract='exons', gtf_parse=None)

yield chrom, strand, cds_exons, introns, exons, name

gtf_parse example:

    import sys
    from bx.gene_reader import FeatureReader

    # parse gene_id from: transcript_id "AC073130.2-001"; gene_id "TES";
    gene_name = lambda s: s.split(';')[1].split()[1].strip('"')

    for chrom, strand, cds_exons, introns, exons, name in FeatureReader(sys.stdin, format='gtf', gtf_parse=gene_name):
        print(chrom, strand, name)
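The gtf_parse callable can be checked on its own, without reading a file. The sketch below rewrites the lambda as a named function (gene_id_from_attributes is a hypothetical name) and applies it to the attribute string from the example. It assumes the callable receives the GTF attribute text and that gene_id is the second semicolon-separated attribute, which holds for the sample string but is not guaranteed for every GTF file.

```python
def gene_id_from_attributes(attributes):
    """Return the gene_id value from a GTF attribute string.

    Hypothetical stand-in for the lambda; assumes gene_id is the
    second ';'-separated attribute.
    """
    second_field = attributes.split(';')[1]   # ' gene_id "TES"'
    value = second_field.split()[1]           # '"TES"' (still quoted)
    return value.strip('"')


sample = 'transcript_id "AC073130.2-001"; gene_id "TES";'
gene = gene_id_from_attributes(sample)
```

A named function is easier to test and to extend (for example, to search for the gene_id key instead of relying on its position) than an inline lambda.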

bx.gene_reader.GeneReader(fh, format='gff')

yield chrom, strand, gene_exons, name