Hello Abdullah,

The geecee tool takes fasta sequence as input. I am not sure whether you have only the bed coordinates of the regions of interest, or whether you already have the coordinates of the genes contained within these regions.

If you need the genes, then one choice is to extract a track from the UCSC Table Browser with the tool "Get Data: UCSC Main" to obtain transcripts in bed12 format. Tracks in the group "Genes and Gene Predictions" are most likely what you will want. You can read about the choices at UCSC, but common selections include UCSC Genes, RefSeq Genes, etc. You can get them all, then use tools in the group "Operate on Genomic Intervals" to limit the set to just those that fit within the isochore coordinate bounds.
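
If you ever want to check the interval filtering outside of Galaxy, the containment test can be sketched in a few lines of Python. This is only an illustration of the logic, not what the "Operate on Genomic Intervals" tools run internally. The coordinates and the names iso1, txA, etc. are made up, and the helper functions parse_bed and contained_in are hypothetical.

```python
def parse_bed(lines):
    """Parse BED lines into (chrom, start, end, name) tuples."""
    records = []
    for line in lines:
        # Skip blank lines and the header lines UCSC may add to an extract.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        name = fields[3] if len(fields) > 3 else "."
        records.append((fields[0], int(fields[1]), int(fields[2]), name))
    return records


def contained_in(transcripts, regions):
    """Map each region name to the transcripts lying entirely inside it."""
    hits = {region[3]: [] for region in regions}
    for chrom, start, end, name in transcripts:
        for r_chrom, r_start, r_end, r_name in regions:
            if chrom == r_chrom and r_start <= start and end <= r_end:
                hits[r_name].append(name)
    return hits


# Made-up example data (hypothetical coordinates and names).
isochores = parse_bed(["chr1\t0\t1000\tiso1", "chr1\t1000\t2000\tiso2"])
transcripts = parse_bed([
    "chr1\t100\t500\ttxA",
    "chr1\t900\t1100\ttxB",    # spans the iso1/iso2 boundary, so it is dropped
    "chr1\t1200\t1800\ttxC",
    "chr2\t100\t500\ttxD",     # different chromosome, so it is dropped
])
genes_by_isochore = contained_in(transcripts, isochores)
```

The nested loop compares every transcript with every region, which is fine for a quick check but slow for whole-genome gene sets. Whether a transcript that spans a boundary should be dropped or counted is a decision you will need to make for your own analysis.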

For a list of associated gene identifiers, most gene tracks at UCSC have related tables that contain that sort of information. Do a separate extract operation to obtain a file that contains the gene and transcript identifiers, then join that file with the transcripts you obtain after performing the above filtering, to link in the gene name.
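
The join itself is just a lookup on the transcript identifier. A minimal Python sketch of the idea follows, assuming a two-column tab-separated identifier file (transcript ID, then gene name) and transcript IDs in column 4 of the bed file. The identifiers tx1, GeneA, etc., the header labels, and the helper functions are all made up for illustration; the real column layout depends on which related table you extract.

```python
def load_id_table(lines):
    """Build a transcript-ID -> gene-name lookup from tab-separated lines."""
    lookup = {}
    for line in lines:
        # Skip blank lines and a leading '#' header line, if present.
        if not line.strip() or line.startswith("#"):
            continue
        transcript_id, gene_name = line.rstrip("\n").split("\t")[:2]
        lookup[transcript_id] = gene_name
    return lookup


def add_gene_names(bed_lines, lookup, missing="NA"):
    """Append a gene-name column to each BED line, keyed on column 4."""
    joined_lines = []
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        fields.append(lookup.get(fields[3], missing))
        joined_lines.append("\t".join(fields))
    return joined_lines


# Made-up example data (hypothetical identifiers and header labels).
id_table = ["#name\tgeneSymbol", "tx1\tGeneA", "tx2\tGeneB"]
filtered = ["chr1\t100\t500\ttx1", "chr1\t1200\t1800\ttx3"]
joined = add_gene_names(filtered, load_id_table(id_table))
```

Transcripts with no match are kept and marked "NA" here, so that nothing disappears silently from the list.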

Once you have the transcript coordinates, fasta sequence can be obtained in two ways. If you want to calculate the GC counts from the mRNA, use the transcript identifiers in the UCSC Table Browser again, choose sequence output (not bed), and this time extract "mRNA" when prompted (not genomic). If genomic sequence is fine, the tool "Fetch Sequences -> Extract Genomic DNA" can be used.

Then use the fasta sequences as input to the "geecee" tool - the problems you were having were most likely caused by giving the tool the wrong type of input.

This is a lot of steps, and how you decide to organize the data before running geecee will affect how the summary stats are calculated. Really, any stretch of nucleotide fasta sequence can be used for input (I do not know of an upper length bound, but there probably is one, so just watch for that - if an error comes up, work with smaller regions). You could also just convert the fasta sequence to tabular, add up the total bases, count Gs, count Cs, etc., then perform the calculation on your own. See also "Regional Variation -> Feature coverage", "Graph/Display Data", and "BEDTools"; each may be helpful, for different reasons.
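
For the "calculate it on your own" route, the arithmetic is simply (G + C) / (A + C + G + T) * 100 for each sequence. Here is a rough Python sketch of that calculation, not of how geecee itself works. It assumes that Ns and other ambiguity codes should be left out of the total, which is a choice you may want to change. The sequences and the helper functions read_fasta and gc_percent are made up for the example.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from a fasta-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_percent(seq):
    """Return GC% over A/C/G/T bases only; N and other symbols are ignored."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


# Made-up example: two short records, the second one mostly Ns.
example = StringIO(">region1\nGGCC\nAATT\n>region2\nNNNNGC\n")
results = {name: gc_percent(seq) for name, seq in read_fasta(example)}
```

With real data, open the fasta file and pass the file handle to read_fasta in place of the StringIO example.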

There are several tutorials that cover many of these same basic operations as part of an analysis or tool demo. Reviewing them will help you learn how to structure inputs, use particular tools, etc., if you would like the guidance. Under "Shared Pages", please see Galaxy 101 and Using Galaxy 2012 for the introductory tutorials.
https://main.g2.bx.psu.edu/page/list_published

Best,

Jen
Galaxy project

On 6/8/13 6:17 PM, Abdullah Al Mahmud wrote:
Hi,

In my account I have uploaded a file named iso_mm10.bed. The bed file contains the coordinates of 6018 isochores of the mouse genome mm10. I want to extract the GC% of each isochore, along with the list of genes present in each isochore.

I tried using extract features, geecee, and many other tools from Galaxy. But every time it reported either an error or no peak.

I will be grateful to you if you kindly give me an idea about how to solve this problem.

Abdullah

--
Abdullah Al Mahmud, PhD
Postdoctoral fellow,
University of Montreal,
Lab. of Dr. Jacques Michaud
CHU Sainte-Justine Research Center,
Montreal, Quebec, Canada.
abdullah.al.mahmud@umontreal.ca



-- 
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org