Hi Andigoni,

I'm unable to reproduce the issue you reported. I fetched sequences for the BED file you sent as an attachment, and the sequences produced had the same lengths as the intervals in the BED file. 

Also, I computed summary statistics for the lengths of your BED intervals: the mean and median lengths were 19790 bp and 3466 bp, respectively. Since you mention that you expect the constitutive exons to be 100-200 bp long, my guess is that there is a problem in one of the steps prior to the subtraction step in your pipeline. 
If you can share your history with me (guru@psu.edu), I can take a look at what might be going on. Here's how to do it:
-go to the Options menu above your current history, select "Saved Histories"
-go to the pull-down menu for the problematic history and select "Share or Publish"
-share the history with me by clicking on "share with a user" and entering my email address: guru@psu.edu.
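
In case it helps with debugging on your side, here is a minimal Python sketch of how interval-length statistics like the ones mentioned above can be computed from BED lines. It is only an illustration and not necessarily how the numbers above were obtained: the function names are made up for this example, and the three intervals are invented sample data, not lines from your attachment.

```python
import statistics


def bed_interval_lengths(lines):
    """Return the length (end - start) of every interval in BED-formatted lines."""
    lengths = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional header lines a BED file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        # BED columns: chrom, start (0-based), end (exclusive), then optional fields.
        start, end = int(fields[1]), int(fields[2])
        lengths.append(end - start)
    return lengths


def summarize(lengths):
    """Collect a few summary statistics for a non-empty list of lengths."""
    return {
        "n": len(lengths),
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
        "max": max(lengths),
    }


# Invented example intervals (hypothetical data for illustration only).
example = [
    "chr1\t100\t250\texonA\t0\t+",
    "chr1\t1000\t1200\texonB\t0\t-",
    "chr2\t5000\t25000\texonC\t0\t+",
]

print(summarize(bed_interval_lengths(example)))
```

To run this on a real file, pass an open file handle instead of the example list, e.g. `bed_interval_lengths(open("constitutive.bed"))`. A mean far above the median, as in the invented data here, points to a small number of very long intervals.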

Thanks for using Galaxy,
Guru
Galaxy team.


On Mar 27, 2010, at 5:06 AM, Andigoni Malousi wrote:

Dear Galaxy member,

I'm sending you this e-mail because of a problem I have in fetching sequences. I used Galaxy to fetch sequences corresponding to alternatively spliced exons as well as constitutive exons.
Here are the steps I followed:
First, I extracted the coordinates of the RefSeq genes from the whole human genome:
1.     I selected Get Data -> UCSC table browser
2.     Genome: "Human", assembly: "Feb. 2009", Group: "Genes and Gene Prediction Tracks", Track:"RefSeq genes".
3.     Filter:  (+) region: Genome
4.     "get output" and "Exons plus 0 bases…"
Second, I extracted the coordinates of the alternative splicing events:
1.     I selected "Get Data" -> "UCSC Main table browser".
2.     Genome: "Human", assembly: "Feb. 2009", Group: "Genes and Gene Prediction Tracks", Track:"Alt Events".
3.     Filter:  (+) region: Genome

Then, I stored the outcome of these two processes in BED format. To extract the coordinates of the constitutive exons, I used the Subtract operation in "Operate on Genomic Intervals":
1. Subtract: Alternative splicing events  From: Ref genes
2. "Intervals with no overlap"
3. Stored the constitutive exon coordinates in BED format

Finally, using "Fetch sequences", I tried to extract the genomic sequences for the outcome of the subtraction (constitutive exons) in FASTA format. Please see the attached files for the extracted coordinates and sample sequences corresponding to constitutive exons.

The strange thing about the results is that, while constitutive exons are on average 100-200 nt long, the extracted sequences are much larger.

I was wondering whether there is something wrong with the procedure or whether this is a bug in Galaxy that should be reported.


Thank you very much in advance,
Andigoni


Andigoni Malousi
Post-doc in Bioinformatics
Aristotle University of Thessaloniki

<GalaxyHistoryItem-3-[Subtract_on_data_2_and_data_1]-constitutive.bed><constitutive.txt>
_______________________________________________
galaxy-user mailing list
galaxy-user@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-user

Guruprasad Ananda
Graduate Student
Bioinformatics and Genomics
The Pennsylvania State University