Getting data from NCBI
The Get Data | Get Microbial Data tool is nice, but is limited to a subset of the older genomes among those that are currently available. It should surely be possible to get any accession directly from NCBI, but I don't see a tool to do that. The best I can do is to download a file to my computer and then upload it again to Galaxy. Am I missing something obvious?

On a related point, I don't see GenBank listed as a format for conversion.

Thanks
Peter

--
Prof J P W Young
Department of Biology (Area 3)
University of York
Wentworth Way
Heslington
York YO10 5DD
jpy1@york.ac.uk
tel: +44 1904 328630 (direct), 328500 (department)
fax: +44 1904 328505 (department)
http://bioltfws1.york.ac.uk/biostaff/staffdetail.php?id=jpwy
http://www.york.ac.uk/depts/biol/staff/jpwy/jpwy.htm
Email disclaimer: http://www.york.ac.uk/docs/disclaimer/email.htm
In reply to Peter Young's message of 9/17/10 6:08 AM:

Hi Peter,

We agree that it would be useful to have a tool designed to extract sequence data directly from NCBI/GenBank, but there are some practical reasons why we are currently unable to design one. We can offer two workaround suggestions:

1 - For files available via HTTP/FTP, the most direct option is to paste the URL into the second box of the tool "Get Data -> Upload File from your computer".
2 - For "data sheet" document-style GenBank format, it should be possible to design a workflow using the text manipulation tools to parse out the sequence (and/or other data).

This is an important data source, and we are sorry that we cannot help in a more direct way right now.

Best,
Jen
Galaxy team

--
Jennifer Jackson
http://usegalaxy.org
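As an illustration of the first workaround, here is a minimal Python sketch that builds a URL for a single accession, which could then be pasted into the URL box of the upload tool. It assumes NCBI's E-utilities `efetch` service, which this thread does not mention; the helper name `efetch_url` and the example accession are hypothetical choices for illustration, and whether a given Galaxy server accepts such a URL is untested here.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities "efetch" service (an assumption of this
# sketch; the thread itself only speaks of files available via HTTP/FTP).
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="gb", db="nuccore"):
    """Return a URL that requests one record as plain text.

    rettype="gb" asks for the GenBank flat file, rettype="fasta" for FASTA.
    """
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return EFETCH_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # Example accession chosen purely for illustration.
    print(efetch_url("NC_000913.3"))
    print(efetch_url("NC_000913.3", rettype="fasta"))
```

Requesting `rettype="fasta"` directly may avoid the need for a GenBank conversion step when only the sequence is wanted.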
In reply to Jennifer Jackson's message of 10/01/2010 05:16 PM:

You can also use the tool 'seqret' (in the EMBOSS section), which converts GenBank format to FASTA, or vice versa, and handles many other format conversions.

Hans
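For readers who want to see what parsing the sequence out of a GenBank "data sheet" involves, the conversion to FASTA can be sketched in plain Python. This is not how seqret or any Galaxy tool is implemented; it is a minimal sketch that assumes a standard flat file in which the sequence sits between the `ORIGIN` line and the closing `//`, and the function name `genbank_to_fasta` is hypothetical.

```python
import re


def genbank_to_fasta(text, width=60):
    """Convert GenBank flat-file text (one or more records) to FASTA text."""
    out = []
    name, in_origin, chunks = None, False, []
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            # Use the LOCUS name unless a VERSION line replaces it later.
            parts = line.split()
            name = parts[1] if len(parts) > 1 else "unknown"
        elif line.startswith("VERSION"):
            parts = line.split()
            if len(parts) > 1:
                name = parts[1]
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            # End of record: emit the header and the wrapped sequence.
            seq = "".join(chunks)
            out.append(">" + (name or "unknown"))
            out.extend(seq[i:i + width] for i in range(0, len(seq), width))
            name, in_origin, chunks = None, False, []
        elif in_origin:
            # Sequence lines mix position numbers, spaces and bases;
            # keep only the letters.
            chunks.append(re.sub(r"[^A-Za-z]", "", line))
    return "\n".join(out) + "\n" if out else ""
```

Only the sequence and an identifier are kept; annotations in the FEATURES table are discarded, so a dedicated converter is preferable whenever one is available.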
participants (3)
- Hans-Rudolf Hotz
- Jennifer Jackson
- Peter Young