On Thu, Feb 17, 2011 at 8:07 AM, Peter Cock
<p.j.a.cock@googlemail.com> wrote:
On Thu, Feb 17, 2011 at 12:37 PM, Sean Davis wrote:
>
> On Thu, Feb 17, 2011 at 5:48 AM, Peter wrote:
>>
>> Once in Galaxy all the data files have the extension .dat on disk, so
>> I would try using a wrapper script that creates a symbolic link from the
>> input.dat file to something like input.pdb or input.ent (and if that
>> doesn't work, copy the file) before running the compiled code and then
>> remove it afterwards.
>>
>
> Hi, Peter. I ended up doing just that. The hack in all its messiness is
> here:
>
> https://gist.github.com/831017
I would be wary of using ${input.name} like that - test with things
like renaming the dataset in Galaxy, and pasting in a PDB file
rather than uploading one. Also I suspect you can get filenames
with spaces in them, which will probably cause trouble. You'll
notice that Galaxy generates its own *.dat filename, which avoids
spaces.
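
If the dataset name does have to end up in the filename, one option is to clean it up first. A minimal sketch, assuming the name reaches the wrapper as a plain string; safe_stem and its default value are hypothetical names, not anything Galaxy provides:

```python
import re


def safe_stem(dataset_name, default="input"):
    """Reduce a user-editable dataset name to a filename-safe stem."""
    # Replace every run of characters outside a conservative whitelist
    # (letters, digits, underscore, dot, hyphen) with one underscore.
    stem = re.sub(r"[^A-Za-z0-9_.-]+", "_", dataset_name)
    # Drop leading/trailing dots and underscores so the result cannot
    # become a hidden file or a relative path such as "..".
    stem = stem.strip("._")
    # Names with nothing usable in them fall back to the default.
    return stem or default
```

This only removes spaces and path separators. It cannot recover a PDB ID from a dataset that was renamed or pasted in, so that case still needs testing.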
Personally I would generate the *.pdb or *.ent filename within
the wrapper script based on the input file name (*.dat). Try:
Unfortunately, the command-line executable assumes that the filename contains the ID of the PDB record, so I actually need this right now. I'm going to have a chat with the command-line tool developer about designing a more robust interface.
os.symlink(fname,fname+".pdb")
...
symdcmd = "SymD %s.pdb" % fname
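
Putting the pieces together (link, fall back to a copy, run, clean up), a wrapper along these lines might do. This is only a sketch: run_with_extension is a hypothetical helper, the temporary directory is my own choice rather than anything from the gist, and it assumes the tool takes the structure file as its last argument.

```python
import os
import shutil
import subprocess
import tempfile


def run_with_extension(dat_path, command, extension=".pdb"):
    """Run command on a temporary alias of dat_path ending in extension.

    command is a list such as ["SymD"]; the alias path is appended to it.
    Returns the exit status of the command.
    """
    # Work in a scratch directory so Galaxy's own data directory is untouched.
    work_dir = tempfile.mkdtemp()
    alias = os.path.join(work_dir, os.path.basename(dat_path) + extension)
    try:
        try:
            os.symlink(os.path.abspath(dat_path), alias)
        except (OSError, NotImplementedError, AttributeError):
            # Symlinks unavailable or not permitted: copy the file instead.
            shutil.copyfile(dat_path, alias)
        # An argument list (no shell) keeps paths with spaces intact.
        return subprocess.call(list(command) + [alias])
    finally:
        # Removes the link or copy; the original *.dat file is left alone.
        shutil.rmtree(work_dir, ignore_errors=True)
```

Something like run_with_extension("dataset_1.dat", ["SymD"]) would then correspond to the "SymD %s.pdb" line above, with the cleanup happening even if the tool fails.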
>>
>> Separately from this, you may need to extend Galaxy to define pdb
>> as a new file format (ideally with a data type sniffer).
>>
>> This kind of question is better asked on the dev list (CC'd).
>>
>
> Thanks. That is the next step.
I haven't done this myself yet (but I may well need to before long).
I extended Galaxy based on filename extension and added the datatype to data.py. This works like a charm, but it isn't foolproof, obviously (no sniffer yet). The PDB format isn't too complicated, but it is flexible, so I need to find out exactly what is required as opposed to "possible". I see that Biopython has a class and parser for it, so I might be able to use that rather directly.
Sean
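
For the sniffer mentioned above, a loose first pass could just check that the first lines all start with a known PDB record name in columns 1-6. This is a standalone heuristic sketch, not Galaxy's sniffer interface: looks_like_pdb and max_lines are hypothetical names, and which records are strictly required is still the open question.

```python
# Record names defined by the PDB file format (columns 1-6 of each line).
PDB_RECORD_NAMES = frozenset("""
    HEADER OBSLTE TITLE SPLIT CAVEAT COMPND SOURCE KEYWDS EXPDTA NUMMDL
    MDLTYP AUTHOR REVDAT SPRSDE JRNL REMARK DBREF DBREF1 DBREF2 SEQADV
    SEQRES MODRES HET HETNAM HETSYN FORMUL HELIX SHEET SSBOND LINK CISPEP
    SITE CRYST1 ORIGX1 ORIGX2 ORIGX3 SCALE1 SCALE2 SCALE3 MTRIX1 MTRIX2
    MTRIX3 MODEL ATOM ANISOU TER HETATM ENDMDL CONECT MASTER END
""".split())


def looks_like_pdb(path, max_lines=50):
    """Return True if the first non-blank lines all look like PDB records."""
    checked = 0
    # Binary mode so that arbitrary (non-text) uploads cannot raise
    # decoding errors; only the first six bytes of each line are decoded.
    with open(path, "rb") as handle:
        for line in handle:
            if not line.strip():
                continue
            record = line[:6].decode("ascii", "replace").strip()
            if record not in PDB_RECORD_NAMES:
                return False
            checked += 1
            if checked >= max_lines:
                break
    # An empty file is not accepted as PDB.
    return checked > 0
```

This accepts anything whose leading lines use PDB record names, so it is deliberately permissive. A stricter check could try parsing the file with Biopython instead.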