Rehman:
Here is one way of dealing with this:
1. Upload genes from UCSC in BED format.
2. Use "Operate on Genomic Intervals --> Cluster" with the "Return Type" option set to "Find largest interval in each cluster" (see attached image). This will select the longest transcript for each gene.
3. Use "Extract Features --> Gene BED To Exon/Intron/Codon BED expander" to convert gene coordinates to exon coordinates.
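If it helps to see the logic outside of Galaxy, here is a rough Python sketch of what the steps above amount to. It is not the code behind the Galaxy tools; the function names (parse_bed12, longest_per_cluster, coding_exons) are made up for illustration, and it assumes tab-separated 12-column BED input, treats any overlapping records on the same chromosome as one cluster regardless of strand, and clips exons to thickStart/thickEnd to get the coding part.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str
    strand: str
    thick_start: int  # start of coding region (BED column 7)
    thick_end: int    # end of coding region (BED column 8)
    block_sizes: Tuple[int, ...]
    block_starts: Tuple[int, ...]


def parse_bed12(line: str) -> Gene:
    """Parse one tab-separated BED12 line (hypothetical helper)."""
    f = line.rstrip("\n").split("\t")
    sizes = tuple(int(x) for x in f[10].rstrip(",").split(","))
    starts = tuple(int(x) for x in f[11].rstrip(",").split(","))
    return Gene(f[0], int(f[1]), int(f[2]), f[3], f[5],
                int(f[6]), int(f[7]), sizes, starts)


def longest_per_cluster(genes: List[Gene]) -> List[Gene]:
    """Group overlapping records per chromosome and keep the longest one.

    Assumption: strand is ignored and records must overlap by at least
    one base to be clustered together.
    """
    result: List[Gene] = []
    best = None
    chrom = None
    cluster_end = 0
    for g in sorted(genes, key=lambda g: (g.chrom, g.start, g.end)):
        if best is not None and g.chrom == chrom and g.start < cluster_end:
            # Same cluster: extend it and keep the longer record.
            cluster_end = max(cluster_end, g.end)
            if g.end - g.start > best.end - best.start:
                best = g
        else:
            # New cluster: emit the winner of the previous one.
            if best is not None:
                result.append(best)
            chrom, cluster_end, best = g.chrom, g.end, g
    if best is not None:
        result.append(best)
    return result


def coding_exons(g: Gene) -> List[Tuple[str, int, int, str, str]]:
    """Expand BED12 blocks into exons clipped to the coding region."""
    exons = []
    for size, offset in zip(g.block_sizes, g.block_starts):
        s = max(g.start + offset, g.thick_start)
        e = min(g.start + offset + size, g.thick_end)
        if s < e:  # skip exons that are entirely untranslated
            exons.append((g.chrom, s, e, g.name, g.strand))
    return exons
```

Calling longest_per_cluster on the parsed records and then coding_exons on each survivor gives one set of coding exons per gene. Note that exons found only in the shorter isoforms are dropped by this approach, since just one transcript per cluster is kept.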
Let me know if you run into any trouble with this.
Thanks for using Galaxy!
anton galaxy team
On Sep 30, 2009, at 12:31 PM, Rehman, Atteeq (NIH/NIDCD) [F] wrote:
Hey everyone, I am a new Galaxy user. I am trying to upload the coding exons of a specific region from the UCSC Genome Browser, but I end up loading the coding exons of all splice variants/isoforms. As a result, some exons are repeated many times (depending on how many isoforms the corresponding gene has), which I don’t want. Can somebody tell me how to get a list of coding exons in which each exon is represented only once, irrespective of how many splice variants/isoforms the gene has? Thanks a lot
Atteeq Ur Rehman Visiting Fellow National Institute on Deafness and Other Communication Disorders, National Institutes of Health, Room 2A-19, 5 Research court, Rockville, MD, USA, 20850. Lab Ph. No. 301-402-9059
galaxy-user mailing list galaxy-user@bx.psu.edu http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-user
Anton Nekrutenko http://nekrut.bx.psu.edu http://galaxyproject.org