Hi all, 

I think I must be doing something incredibly wrong, because the OGI subtract operation seems to have the mirror image of the intersect problem.  That is, instead of reporting that (N - 1105) of the N intervals in my file 1 do not overlap file 2, it reports that all N intervals do not overlap.  Do subtract and intersect use the same underlying intersection code?
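
To make the expectation concrete, here is a rough Python sketch of the bookkeeping I have in mind: every interval in file 1 either overlaps something in file 2 or it does not, so the two counts should sum to N. The intervals are made-up toy data and the function names are purely illustrative; none of this is taken from the OGI code, and it assumes subtract drops whole overlapping intervals rather than trimming them.

```python
# Toy data, invented for illustration: (chrom, start, end), 0-based half-open as in BED.
def overlaps(a, b, min_bp=1):
    """True if a and b are on the same chromosome and share at least min_bp bases."""
    return a[0] == b[0] and min(a[2], b[2]) - max(a[1], b[1]) >= min_bp

def split_by_overlap(file1, file2, min_bp=1):
    """Partition file1 into intervals that overlap file2 and those that do not."""
    hits, misses = [], []
    for a in file1:
        if any(overlaps(a, b, min_bp) for b in file2):
            hits.append(a)
        else:
            misses.append(a)
    return hits, misses

file1 = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
file2 = [("chr1", 150, 160), ("chr2", 500, 600)]
hits, misses = split_by_overlap(file1, file2)
print(len(file1), len(hits), len(misses))  # 3 1 2
```

With our real files I would expect the middle number to be 1105 and the last to be N - 1105, whereas OGI currently gives 0 and N.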
 
Best,
- Aaron
quinlanlab.org





On Feb 8, 2013, at 10:23 AM, Aaron Quinlan <aaronquinlan@gmail.com> wrote:

Dear list,

I have a student who found an unexplained discrepancy between the results produced by the "Operate on Genomic Intervals" (OGI) intersect operation and those produced by the OGI join operation.  In particular, we know for certain that there are exactly 1105 intersections of at least 1bp between the two files we are testing, as we have confirmed this with our own bedtools and the UCSC Table Browser.  An example intersection (intersecting positions: 10012008 - 10012013):

file 1:
chr1 10012008 10012021 5.6186

file 2:
chr1 10011813 10012013 5_Strong_Enhancer 0 + 10011813 10012013 250,202,0
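
As a sanity check on this pair, the overlap can be computed by hand. The sketch below is plain Python, assumes the usual BED convention of 0-based, half-open coordinates, and uses a helper name (overlap_bp) that is only for illustration and not part of any tool:

```python
def overlap_bp(start_a, end_a, start_b, end_b):
    """Number of bases shared by two half-open intervals on the same chromosome."""
    return max(0, min(end_a, end_b) - max(start_a, start_b))

# The two example records above (both on chr1).
file1_rec = ("chr1", 10012008, 10012021)
file2_rec = ("chr1", 10011813, 10012013)

bp = overlap_bp(file1_rec[1], file1_rec[2], file2_rec[1], file2_rec[2])
print(bp)  # 5, i.e. positions 10012008 - 10012013
```

That is 5bp of overlap, so this pair clearly satisfies a >= 1bp threshold.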

However, OGI intersect finds 0 intersections between the files (settings: return overlapping intervals, >= 1bp).  To make sure we hadn't goofed up on file formats (BED) or genome builds (hg19), we tested the exact same two files with the OGI join operation and found 1105 intersections, as expected.

I also tested the files with the bx-python bed_intersect.py and bed_intersect_basewise.py scripts and got the expected results.

Does anyone have a suggestion for how to resolve this?

Thanks for your help and for providing such a fantastic resource to the genomics community.

Best,