Here is one way of dealing with this:
1. Upload genes from UCSC in BED format.
2. Use "Operate on Genomic Intervals --> Cluster" with the "Return"
option set to "Find largest interval in each cluster" (see attached
image). This will select the longest transcript for each gene.
3. Use "Extract Features --> Gene BED To Exon/Intron/Codon BED
expander" to convert gene coordinates to exon coordinates.
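
For anyone who wants to see the logic outside of Galaxy, here is a rough Python sketch of steps 2 and 3. It is not the code behind the Galaxy tools. It assumes the input is 12-column BED as exported by UCSC, that transcripts overlapping on the same chromosome form one cluster (strand is ignored), and that exons should be clipped to the coding region given by thickStart/thickEnd. The function names and the three sample records are made up for illustration.

```python
from itertools import groupby


def parse_bed12(line):
    """Parse one tab-separated BED12 line into a dict (coordinates are 0-based, half-open)."""
    f = line.rstrip("\n").split("\t")
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    starts = [int(x) for x in f[11].rstrip(",").split(",")]
    return {
        "chrom": f[0], "start": int(f[1]), "end": int(f[2]),
        "name": f[3], "strand": f[5],
        "cds_start": int(f[6]), "cds_end": int(f[7]),
        "blocks": list(zip(starts, sizes)),  # (offset from start, length)
    }


def longest_per_cluster(transcripts):
    """Cluster overlapping transcripts per chromosome; keep the longest in each cluster."""
    kept = []
    ordered = sorted(transcripts, key=lambda t: (t["chrom"], t["start"]))
    for _, group in groupby(ordered, key=lambda t: t["chrom"]):
        best, cluster_end = None, None
        for t in group:
            if best is not None and t["start"] < cluster_end:
                # Overlaps the current cluster: extend it, maybe replace the winner.
                cluster_end = max(cluster_end, t["end"])
                if t["end"] - t["start"] > best["end"] - best["start"]:
                    best = t
            else:
                # Starts a new cluster: flush the previous winner first.
                if best is not None:
                    kept.append(best)
                best, cluster_end = t, t["end"]
        if best is not None:
            kept.append(best)
    return kept


def coding_exons(t):
    """Expand BED12 blocks to exons and clip them to the coding region."""
    out = []
    for rel_start, size in t["blocks"]:
        s = max(t["start"] + rel_start, t["cds_start"])
        e = min(t["start"] + rel_start + size, t["cds_end"])
        if s < e:  # drop exons that are entirely UTR
            out.append((t["chrom"], s, e, t["name"], t["strand"]))
    return out


# Hypothetical sample: two isoforms of one gene plus a separate single-exon gene.
sample = [
    "chr1\t100\t500\tiso_a\t0\t+\t150\t450\t0\t3\t50,50,100,\t0,150,300,",
    "chr1\t100\t300\tiso_b\t0\t+\t150\t280\t0\t2\t50,50,\t0,150,",
    "chr1\t1000\t1200\tgene2\t0\t-\t1000\t1200\t0\t1\t200,\t0,",
]
kept = longest_per_cluster([parse_bed12(line) for line in sample])
exons = [exon for t in kept for exon in coding_exons(t)]
for exon in exons:
    print("\t".join(str(x) for x in exon))
```

Keep in mind that with this approach an exon found only in a shorter isoform will not appear in the output, since only the longest transcript of each cluster is expanded.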
Let me know if you run into any trouble with this.
Thanks for using Galaxy!
On Sep 30, 2009, at 12:31 PM, Rehman, Atteeq (NIH/NIDCD) [F] wrote:
I am a new Galaxy user. I am trying to upload the coding exons of a
specific region from the UCSC Genome Browser, but I end up loading the
coding exons of all splice variants/isoforms. As a result, some exons
are repeated many times (depending on how many isoforms the
corresponding gene has), which I don’t want. Can somebody help me get
a list of coding exons in which each exon is represented
only once, irrespective of how many splice variants/isoforms the gene
has? Thanks a lot.
Atteeq Ur Rehman
National Institute on Deafness and Other Communication Disorders,
National Institutes of Health,
Room 2A-19, 5 Research court, Rockville, MD, USA, 20850.
Lab Ph. No. 301-402-9059
galaxy-user mailing list