Hi Jen,

Many thanks for the reply. Sadly my programming is not up to anything like a gbk to gtf converter! The main reason I want one is that as a virologist this would be very useful since many viruses do not have a gtf file but do have genbank submissions. I know of a site that has some viruses listed together with GFF files but alas I cannot find a GFF to GTF converter - nightmare!! I'll keep looking for one and if I find it I'll let you know.

Cheers
David

On 23 Mar 2011, at 18:02, Jennifer Jackson wrote:
Hello David,
This is a great idea that the team has been considering adding, but nothing immediate is planned. Some external teams are working on outside development, and this is on their list, too.
If interested in what that project is doing, please see this thread: http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-March/004692.html
For now, if the data resides in a track at UCSC (much of it does, especially for vertebrate genomes, and it is updated daily), you can use the Table Browser to export the data in GTF and push it to Galaxy with the "Get Data" tool. Since some of the data can be large, using BX Main (our local UCSC mirror) may be the best source.
To do this, navigate to the target genome and track (RefSeq under Gene Predictions, others under mRNA & EST), and choose output format "GTF - gene transfer format". Please note that the "gene_id" attribute in the 9th field will not be populated with the gene name (it will be the same as transcript_id). This is just how UCSC does it right now (full GTF output in the Table Browser is on their list, as far as we know). To get that information now, go back in and re-export the same table data into Galaxy as "all fields from selected table"; the gene name will be in the data field named "name2". The text manipulation tools can help to format the data.
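For anyone who would rather script the gene_id fix outside Galaxy, here is a minimal sketch of the same idea in Python. It assumes the "all fields from selected table" export is tab-separated with a header line that starts with "#" and contains columns called "name" (the transcript) and "name2" (the gene), and that the GTF carries gene_id and transcript_id attributes in the 9th field. The function names load_name_map and fix_gene_ids are made up for illustration and are not part of Galaxy or UCSC.

```python
import re

GENE_ID_RE = re.compile(r'gene_id "[^"]*"')
TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]*)"')


def load_name_map(table_lines):
    """Build {transcript name: gene name} from an 'all fields' table export.

    Assumption: the first line is a tab-separated header (optionally
    prefixed with '#') that includes 'name' and 'name2' columns.
    """
    lines = iter(table_lines)
    header = next(lines).rstrip("\n").lstrip("#").split("\t")
    name_col = header.index("name")
    name2_col = header.index("name2")
    name_map = {}
    for line in lines:
        row = line.rstrip("\n").split("\t")
        if len(row) > max(name_col, name2_col) and row[name2_col]:
            name_map[row[name_col]] = row[name2_col]
    return name_map


def fix_gene_ids(gtf_lines, name_map):
    """Yield GTF lines with gene_id replaced by the mapped gene name.

    Lines without a transcript_id, or whose transcript is not in the
    map, are passed through unchanged.
    """
    for line in gtf_lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        if line.startswith("#") or len(fields) < 9:
            yield line
            continue
        match = TRANSCRIPT_ID_RE.search(fields[8])
        gene = name_map.get(match.group(1)) if match else None
        if gene:
            # A function replacement avoids treating the gene name as a regex template.
            fields[8] = GENE_ID_RE.sub(
                lambda m: 'gene_id "%s"' % gene, fields[8], count=1
            )
        yield "\t".join(fields)
```

To use it, open the two exported files, pass the table to load_name_map, pass the GTF and the resulting map to fix_gene_ids, and write the yielded lines to a new file. Inside Galaxy, the join and text manipulation tools should be able to do the equivalent.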
A workflow would be a good option once you have the tool path worked out, so that it can be reused for similar genbank datasets in the future without having to do it all again. You may even want to publish the workflow for others to use, as this is a very popular request, and perhaps add a published page to explain how to use it and how to prep data for input.
Apologies for the current inconvenience, but hopefully this can get you going until a more direct method is implemented in Galaxy main.
This is a great idea that many other users are also very interested in. Any contributions (page, workflow) would be most welcome. A tool that does the extraction directly from Genbank would also be welcome in the Tool Shed, if you want to contribute. http://community.g2.bx.psu.edu/
Best,
Jen
Galaxy team
On 3/14/11 1:15 PM, David Matthews wrote:
Hi again,
Does anyone know of a genbank to gtf converter? I have heard such things exist but never found one...
Cheers
David
--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org