On Wed, Sep 14, 2011 at 5:21 PM, Hans-Rudolf Hotz <hrh@fmi.ch> wrote:
On 09/14/2011 10:39 AM, Timothy Wu wrote:
Alternatively, I can just ask users to download the files from the NCBI FTP site themselves, decompress them, and upload them to Galaxy.
What's the best approach here?
How about: you download the data once, and then offer it as a 'data library' to your users. This way you avoid data duplication.
I do not know how to prepare a "data library". However, I think this is less than optimal, as the data itself may be updated. And I don't think data duplication is really a problem if the users install their own version of Galaxy.

I think I need some kind of "data source" implementation that allows users to obtain the data themselves. However, with the current tool XML definition, I don't know how to have an FTP download tool download EST data from NCBI into Galaxy directly.

Oh well, I guess I'll resort to having users upload the zipped EST GenBank files to Galaxy via FTP themselves if all else fails. Or I'll just have the FTP tool also parse the downloaded GenBank files and merge all the data into a single file. But this really limits the flexibility of the FTP tool, which could be more generic.

Timothy
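
For what it's worth, the "decompress and merge into a single file" step could look roughly like the sketch below. It is only an illustration under some assumptions: the files are gzip-compressed and already on local disk (the download itself is not shown), the function name `merge_gzipped_files` and the file names in the usage note are made up, and the merge is plain concatenation with no GenBank parsing, so any per-file header lines are kept as-is and a real tool might need to strip them.

```python
import gzip
import shutil
from pathlib import Path


def merge_gzipped_files(gz_paths, merged_path):
    """Decompress each gzip file in order and append its contents to merged_path.

    Hypothetical helper: plain byte-level concatenation, no GenBank parsing.
    """
    merged_path = Path(merged_path)
    with merged_path.open("wb") as out:
        for gz_path in gz_paths:
            # gzip.open in binary mode streams the decompressed bytes,
            # so large files are never loaded into memory all at once.
            with gzip.open(gz_path, "rb") as src:
                shutil.copyfileobj(src, out)
    return merged_path
```

Usage would be something like `merge_gzipped_files(["est_part1.seq.gz", "est_part2.seq.gz"], "est_merged.gb")`, where the file names are placeholders. Keeping the merge separate from the download is one way to leave the FTP tool itself generic.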