bx.intervals.intersection module
Data structure for performing intersect queries on a set of intervals which preserves all information about the intervals (unlike bitset projection methods).
- Authors:
James Taylor (james@jamestaylor.org), Ian Schenk (ian.schenck@gmail.com), Brent Pedersen (bpederse@gmail.com)
- bx.intervals.intersection.Intersecter
alias of IntervalTree
- class bx.intervals.intersection.Interval
Bases: object
Basic feature, with required integer start and end properties. Also accepts an optional strand as +1 or -1 (used for up/downstream queries), a name, and any arbitrary data passed in on the value keyword argument.
>>> from bx.intervals.intersection import Interval
>>> from collections import OrderedDict
>>> f1 = Interval(23, 36)
>>> f2 = Interval(34, 48, value=OrderedDict([('chr', 12), ('anno', 'transposon')]))
- chrom
- end
- start
- strand
- value
- class bx.intervals.intersection.IntervalNode
Bases: object
A single node of an IntervalTree.
- NOTE: Unless you really know what you are doing, you should probably use IntervalTree rather than using this class directly.
- end
- find(start, end, sort=True)
Given a start and an end, return a list of features falling within that range.
- insert(start, end, interval)
Insert a new IntervalNode into the tree of which this node is currently the root. The return value is the new root of the tree (which may or may not be this node!)
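Why callers must keep the return value of insert can be illustrated with a small sketch. This is not the bx-python implementation: SketchNode, its priority argument, and the rotation helpers are hypothetical names, and the randomized-priority (treap-style) rebalancing is an assumption made only to show how an insert can change the root.

```python
import random


class SketchNode:
    """Hypothetical treap-style node keyed on start (illustration only)."""

    def __init__(self, start, end, interval, priority=None):
        self.start, self.end, self.interval = start, end, interval
        # Assumption: a random priority decides which node sits nearer the root.
        self.priority = random.random() if priority is None else priority
        self.left_node = None
        self.right_node = None

    def insert(self, start, end, interval, priority=None):
        """Insert below this node and return the (possibly new) subtree root."""
        root = self
        if start > self.start:
            if self.right_node is None:
                self.right_node = SketchNode(start, end, interval, priority)
            else:
                self.right_node = self.right_node.insert(start, end, interval, priority)
            if self.right_node.priority > self.priority:
                root = self._rotate_left()
        else:
            if self.left_node is None:
                self.left_node = SketchNode(start, end, interval, priority)
            else:
                self.left_node = self.left_node.insert(start, end, interval, priority)
            if self.left_node.priority > self.priority:
                root = self._rotate_right()
        return root

    def _rotate_left(self):
        # The right child becomes the root of this subtree.
        new_root = self.right_node
        self.right_node = new_root.left_node
        new_root.left_node = self
        return new_root

    def _rotate_right(self):
        # The left child becomes the root of this subtree.
        new_root = self.left_node
        self.left_node = new_root.right_node
        new_root.right_node = self
        return new_root


# Always rebind the root: the node you inserted into may no longer be on top.
root = SketchNode(10, 20, "a", priority=0.5)
root = root.insert(30, 40, "b", priority=0.9)
```

In this sketch the second insert carries the higher priority, so "b" is rotated above "a" and becomes the root. Code that ignored the return value would keep a reference to a node that is now a child.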
- intersect(start, end, sort=True)
Given a start and an end, return a list of features falling within that range.
- interval
- left(position, n=1, max_dist=2500)
Find up to n features that lie to the left of position (features whose end is less than position).
position: the position to search from.
n: the number of features to return.
max_dist: the maximum distance to look before giving up.
- left_node
- right(position, n=1, max_dist=2500)
Find up to n features that lie to the right of position (features whose start is greater than position).
position: the position to search from.
n: the number of features to return.
max_dist: the maximum distance to look before giving up.
- right_node
- root_node
- start
- traverse(func)
- class bx.intervals.intersection.IntervalTree
Bases: object
Data structure for performing window intersect queries on a set of possibly overlapping 1d intervals.
Usage
Create an empty IntervalTree
>>> from bx.intervals.intersection import Interval, IntervalTree
>>> intersecter = IntervalTree()
An interval is a start and end position and a value (possibly None). You can add any object as an interval:
>>> intersecter.insert( 0, 10, "food" )
>>> intersecter.insert( 3, 7, dict(foo='bar') )
>>> intersecter.find( 2, 5 )
['food', {'foo': 'bar'}]
If the object has start and end attributes (like the Interval class), there are some shortcuts:
>>> intersecter = IntervalTree()
>>> intersecter.insert_interval( Interval( 0, 10 ) )
>>> intersecter.insert_interval( Interval( 3, 7 ) )
>>> intersecter.insert_interval( Interval( 3, 40 ) )
>>> intersecter.insert_interval( Interval( 13, 50 ) )
>>> intersecter.find( 30, 50 )
[Interval(3, 40), Interval(13, 50)]
>>> intersecter.find( 100, 200 )
[]
Before/after for intervals
>>> intersecter.before_interval( Interval( 10, 20 ) )
[Interval(3, 7)]
>>> intersecter.before_interval( Interval( 5, 20 ) )
[]
Upstream/downstream
>>> intersecter.upstream_of_interval(Interval(11, 12))
[Interval(0, 10)]
>>> intersecter.upstream_of_interval(Interval(11, 12, strand="-"))
[Interval(13, 50)]
>>> intersecter.upstream_of_interval(Interval(1, 2, strand="-"), num_intervals=3)
[Interval(3, 7), Interval(3, 40), Interval(13, 50)]
- add(start, end, value=None)
Insert the interval [start,end) associated with value value.
- add_interval(interval)
Insert an “interval” like object (one with at least start and end attributes)
- after(position, num_intervals=1, max_dist=2500)
Find num_intervals intervals that lie after position and are no further than max_dist positions away
- after_interval(interval, num_intervals=1, max_dist=2500)
Find num_intervals intervals that lie completely after interval and are no further than max_dist positions away
- before(position, num_intervals=1, max_dist=2500)
Find num_intervals intervals that lie before position and are no further than max_dist positions away
- before_interval(interval, num_intervals=1, max_dist=2500)
Find num_intervals intervals that lie completely before interval and are no further than max_dist positions away
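The before/after family can be pictured with a naive linear scan over plain (start, end) tuples. This is only a sketch of the documented behaviour, not how the tree searches. The function names, the nearest-first ordering, and the exact handling of the max_dist boundary are assumptions, chosen so that the results agree with the usage examples for this class.

```python
def before(intervals, position, num_intervals=1, max_dist=2500):
    """Nearest-first (start, end) pairs that end strictly before position."""
    hits = [iv for iv in intervals if 0 < position - iv[1] <= max_dist]
    hits.sort(key=lambda iv: iv[1], reverse=True)  # closest end first
    return hits[:num_intervals]


def after(intervals, position, num_intervals=1, max_dist=2500):
    """Nearest-first (start, end) pairs that start strictly after position."""
    hits = [iv for iv in intervals if 0 < iv[0] - position <= max_dist]
    hits.sort(key=lambda iv: iv[0])  # closest start first
    return hits[:num_intervals]


stored = [(0, 10), (3, 7), (3, 40), (13, 50)]

# Mirrors before_interval(Interval(10, 20)): search leftwards from start=10.
nearest_before_10 = before(stored, 10)
# Mirrors before_interval(Interval(5, 20)): nothing ends before 5.
nearest_before_5 = before(stored, 5)
```

In this sketch (0, 10) is not reported for position 10 because its end is not strictly less than the query position, which matches the example output [Interval(3, 7)].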
- downstream_of_interval(interval, num_intervals=1, max_dist=2500)
Find num_intervals intervals that lie completely downstream of interval and are no further than max_dist positions away
- find(start, end)
Return a sorted list of all intervals overlapping [start,end).
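The half-open convention [start,end) determines which intervals count as overlapping. A minimal reference sketch using a linear scan over (start, end) tuples is shown here. The name find_overlapping is hypothetical, and sorting by start is an assumption about what "sorted" means.

```python
def find_overlapping(intervals, start, end):
    """Return stored (start, end) pairs that overlap the half-open query."""
    # Two half-open intervals overlap when each one starts before the other ends.
    hits = [iv for iv in intervals if iv[0] < end and iv[1] > start]
    return sorted(hits, key=lambda iv: iv[0])


stored = [(13, 50), (3, 40), (0, 10), (3, 7)]

overlapping = find_overlapping(stored, 30, 50)
nothing = find_overlapping(stored, 100, 200)
# (0, 10) ends where the query begins, and (13, 50) begins where it ends,
# so under the half-open convention neither overlaps [10, 13).
touching = find_overlapping(stored, 10, 13)
```

A tree answers the same question without visiting every stored interval; the scan above only pins down the expected result.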
- insert(start, end, value=None)
Insert the interval [start,end) associated with value value.
- insert_interval(interval)
Insert an “interval” like object (one with at least start and end attributes)
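What "interval-like" means can be shown with a list-backed stand-in. This is a sketch under assumptions: ToyStore and Feature are hypothetical names, and storing the object itself as the value is inferred from the usage example, where find returns the inserted Interval objects.

```python
from collections import namedtuple

# Hypothetical interval-like type: anything exposing start and end would do.
Feature = namedtuple("Feature", ["start", "end", "name"])


class ToyStore:
    """Hypothetical list-backed stand-in, not an IntervalTree."""

    def __init__(self):
        self.items = []

    def insert(self, start, end, value=None):
        self.items.append((start, end, value))

    def insert_interval(self, interval):
        # Read the coordinates off the object and keep the object as the value.
        self.insert(interval.start, interval.end, interval)


store = ToyStore()
gene = Feature(3, 7, "geneA")
store.insert_interval(gene)
```

The shortcut saves the caller from repeating the coordinates, and any extra attributes on the object (here name) come back with query results.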
- traverse(fn)
Call fn for each element in the tree.
- upstream_of_interval(interval, num_intervals=1, max_dist=2500)
Find num_intervals intervals that lie completely upstream of interval and are no further than max_dist positions away
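Strand decides which side counts as upstream. The sketch below reproduces the upstream examples from the Usage section with plain (start, end) tuples. It is an illustration only: the helper names are hypothetical, and treating both "-" and -1 as the reverse strand is an assumption based on the two notations that appear in this documentation.

```python
def _before(intervals, position, n, max_dist):
    # Intervals ending strictly before position, nearest first.
    hits = sorted((iv for iv in intervals if 0 < position - iv[1] <= max_dist),
                  key=lambda iv: iv[1], reverse=True)
    return hits[:n]


def _after(intervals, position, n, max_dist):
    # Intervals starting strictly after position, nearest first.
    hits = sorted((iv for iv in intervals if 0 < iv[0] - position <= max_dist),
                  key=lambda iv: iv[0])
    return hits[:n]


def upstream_of(intervals, query, strand="+", num_intervals=1, max_dist=2500):
    start, end = query
    if strand in ("-", -1):
        # On the reverse strand, upstream means higher coordinates.
        return _after(intervals, end, num_intervals, max_dist)
    return _before(intervals, start, num_intervals, max_dist)


def downstream_of(intervals, query, strand="+", num_intervals=1, max_dist=2500):
    start, end = query
    if strand in ("-", -1):
        return _before(intervals, start, num_intervals, max_dist)
    return _after(intervals, end, num_intervals, max_dist)


stored = [(0, 10), (3, 7), (3, 40), (13, 50)]
forward_upstream = upstream_of(stored, (11, 12))
reverse_upstream = upstream_of(stored, (11, 12), strand="-")
```

With no strand, or a forward strand, upstream behaves like before_interval; on the reverse strand it behaves like after_interval, and downstream is the mirror image.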