Cluster, using a maximum distance of 0, does exactly the same thing as Merge.  However, you can also specify a minimum number of intervals per cluster (setting it to 2 ensures you only get intervals that were actually merged with another one).  The maximum distance can be set to a negative number, which forces overlap (-1 forces 1 bp of overlap).  You can also choose the output: merge the clusters, group them (clustered intervals will be listed together), or preserve the original ordering of the file.

I think that is what you are trying to do.
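
To make those Cluster options concrete, here is a rough Python sketch of the behaviour as I described it.  It is not the actual Galaxy code: the function name `cluster` and its parameters `max_distance` and `min_size` are made up for illustration, and it assumes half-open (start, end) intervals that all sit on one chromosome.

```python
def cluster(intervals, max_distance=0, min_size=1):
    """Merge half-open (start, end) intervals whose gap is <= max_distance.

    Illustrative sketch only; names and semantics are assumptions,
    not the Galaxy tool's implementation.
    Returns (start, end, count) for clusters with at least min_size members.
    """
    clusters = []
    for start, end in sorted(intervals):
        # Gap between this interval and the current cluster:
        # 0 means touching, a negative value means overlap.
        if clusters and start - clusters[-1][1] <= max_distance:
            clusters[-1][1] = max(clusters[-1][1], end)
            clusters[-1][2] += 1
        else:
            clusters.append([start, end, 1])
    return [(s, e, n) for s, e, n in clusters if n >= min_size]


intervals = [(0, 10), (10, 20), (30, 40), (35, 50), (60, 70)]
merged_like = cluster(intervals, max_distance=0)               # same as a plain merge
only_merged = cluster(intervals, max_distance=0, min_size=2)   # drops the lone (60, 70)
only_overlap = cluster(intervals, max_distance=-1, min_size=2) # touching no longer counts
```

With a distance of -1, the touching pair (0, 10) and (10, 20) is no longer clustered, so only the truly overlapping pair survives the minimum size of 2.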

The other possibility is that you want to capture the overlapping regions of intervals within the same file.  When two intervals are merged, they might not actually have any overlap.  They only need to be touching: for example, [a,b) and [b,c) would be merged into [a,c).  The overlapping interval there is [b,b), which doesn't really make sense (the length of that interval is 0).
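
A small Python illustration of that point, assuming half-open (start, end) coordinates (the helper name `overlap` is just for this example):

```python
def overlap(a, b):
    """Return the shared region of two half-open intervals, or None if empty."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    # start == end would be the zero-length [b,b) case, so treat it as no overlap.
    return (start, end) if start < end else None


touching = overlap((0, 10), (10, 20))      # merged by Merge, but nothing is shared
overlapping = overlap((0, 10), (5, 20))    # bases 5 through 9 are shared
```

So the touching pair merges into (0, 20) even though there is no base pair you could report as "the overlap".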

I can easily write a tool to find regions that are referenced more than once (i.e. overlap with other intervals in the same file).  However, this will not include that one case where two intervals are merged because they are next to each other.
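
Roughly, such a tool could work like the Python sketch below.  This is only an illustration of the idea, not an existing Galaxy tool: the name `multiply_covered` and the `min_depth` parameter are invented here, and it again assumes half-open (start, end) intervals on a single chromosome.

```python
def multiply_covered(intervals, min_depth=2):
    """Return regions covered by at least min_depth intervals.

    Sweep over interval starts (+1) and ends (-1), tracking coverage depth.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    # At the same position, handle starts before ends so that adjacent
    # covered stretches are reported as one region.
    events.sort(key=lambda e: (e[0], -e[1]))

    regions = []
    depth = 0
    region_start = None
    for pos, delta in events:
        previous = depth
        depth += delta
        if previous < min_depth <= depth:
            region_start = pos
        elif depth < min_depth <= previous:
            # Skip zero-length regions, i.e. intervals that only touch.
            if pos > region_start:
                regions.append((region_start, pos))
    return regions


shared = multiply_covered([(0, 10), (5, 20), (30, 40)])
adjacent_only = multiply_covered([(0, 10), (10, 20)])
```

Here `shared` contains only the region referenced twice, and `adjacent_only` is empty, which is the touching case that Merge would still combine.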

I hope this helps,


Erika wrote:
Dear Galaxy Help,

I was wondering if it would be possible to get the coordinates that caused the merge as the output from "Tools: Operate on Genomic Intervals: Merge the overlapping intervals of a query", rather than the entire merged interval as the output.  Kind of like the output from the "Intersect: Overlapping Pieces of intervals" option, which returns the exact base pair overlap between two queries.  It might be helpful in some cases to see only the coordinates that caused the merge.  From my limited Galaxy knowledge, by using the "Intersect" option and comparing a file to itself, the output would also include the complete overlap of each interval (e.g. interval_1 in file1) with its own copy (interval_1 in file2).  If there is already a way to get just the coordinates that caused the merge, I would be interested to learn more.

Thanks again for your help!
- Erika 

E.M. Kvikstad
Academic Computing Fellow
IGDP Genetics
Center for Comparative Genomics and Bioinformatics 
The Pennsylvania State University
208 Mueller Lab
University Park, PA 16802
(814) 863-2185

_______________________________________________ Galaxy-user mailing list