Marine,
It seems that the files your tool needs are in the extra_files_path and you just have to figure out how to get at them for processing. From my experience, it is trivial to make this work with a wrapper script in Python. Given a directory listing of the passed-in extra_files_path, you can easily figure out which input is which. It's probably possible with bash, but it would definitely not be my personal preference. A general-purpose scripting language like Python would probably make it easier, but if you insist on bash, I am sure you already have enough information to get it to work.
On Sat, Jun 16, 2012 at 2:08 AM, Marine Rohmer <marine.rohmer@yahoo.fr> wrote:
New try : ${os.path.join( $input.extra_files_path, 'first_component_file.xxx' )} ${os.path.join( $input.extra_files_path, 'second_component_file.yyy')}
With this, I can see that my tool receives these parameters: myTool path/to/first_component_file.xxx path/to/second_component_file.yyy This looks right, doesn't it?
But I now get a message quoting the "os.path.join" line and saying "mauvaise substitution", which means "bad substitution". (First, why isn't it in English? Does that mean it's not a Galaxy problem?)
So I generated a new composite file, made of 2 component files named exactly as "metadata.base_name" is set. I used these two component files as input to my other tool, and now I no longer get this message! That means Galaxy wants the component files' names to match "metadata.base_name", if I understood correctly... So if a user wants to change the name, it will surely fail... How can I fix this problem?
Furthermore, it doesn't completely work. I think my tool sees two parameters instead of only one when it is given the two paths for my component files. I don't know how to fix that either...
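One possible way around both problems, following the directory-listing approach suggested in the reply at the top of this thread, is to pass only the extra_files_path directory as a single parameter and let a Python wrapper pick the components by extension rather than by base name. The sketch below is only an illustration: the function name find_components is hypothetical, and the ".xxx" and ".yyy" extensions are the placeholders used in this thread.

```python
import os


def find_components(extra_files_path, extensions=(".xxx", ".yyy")):
    """Map each expected extension to the one file in the directory that has it.

    Hypothetical helper: components are matched by extension, so the
    result does not depend on the base name of the files.
    """
    found = {}
    for name in sorted(os.listdir(extra_files_path)):
        ext = os.path.splitext(name)[1]
        if ext in extensions:
            if ext in found:
                raise ValueError("more than one %s file in %s" % (ext, extra_files_path))
            found[ext] = os.path.join(extra_files_path, name)
    missing = [ext for ext in extensions if ext not in found]
    if missing:
        raise ValueError("missing component(s): %s" % ", ".join(missing))
    return found
```

Because the lookup goes by extension, a renamed dataset should still resolve, provided the directory holds exactly one file per extension.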
Any ideas ?
Thank you for your attention,
Marine
------------------------------ *From:* Ross <ross.lazarus@gmail.com> *To:* Marine Rohmer <marine.rohmer@yahoo.fr> *Cc:* "galaxy-dev@lists.bx.psu.edu" <galaxy-dev@lists.bx.psu.edu> *Sent:* Friday, 15 June 2012, 12:40 *Subject:* Re: Re : Composite output with self-declarated datatypes
Hi Marine,
Other people may have better ideas, but the way I've always done it is to ensure that the tool knows how to find the input files inside the extra_files_path because that's easy to pass.
If $i is the name of data parameter = composite file chosen from the user history (ie a data input on your form), then you can pass it to the script as (eg) --extra_files_path "$i.extra_files_path"
You might also find --base_name "$i.metadata.base_name" handy sometimes for naming outputs
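As a rough sketch of the receiving side, a Python wrapper could accept those two options with argparse and rebuild the component paths from them. The function names parse_args and component_paths are hypothetical, the "xxx" and "yyy" extensions are the placeholders from this thread, and the default base name simply mirrors the "your_index" default declared in composite.py later in the thread.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the two options the tool XML would pass on the command line."""
    parser = argparse.ArgumentParser(description="Hypothetical wrapper for a composite input")
    parser.add_argument("--extra_files_path", required=True,
                        help="directory holding the component files")
    parser.add_argument("--base_name", default="your_index",
                        help="base name shared by the component files")
    return parser.parse_args(argv)


def component_paths(args, extensions=("xxx", "yyy")):
    """Build <extra_files_path>/<base_name>.<ext> for each component."""
    return [os.path.join(args.extra_files_path, "%s.%s" % (args.base_name, ext))
            for ext in extensions]
```

This variant relies on the components being named after base_name, so it only fits when that naming convention actually holds.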
I hope this helps.
On Fri, Jun 15, 2012 at 7:29 PM, Marine Rohmer <marine.rohmer@yahoo.fr> wrote:
Update : Now my tool creates a html composite output, made of 2 outputs .xxx and .yyy. I've added the "def get_mime(self)" function in the python file describing all my formats, and now it works. When I run my tool and click on the eye symbol, I can see a html page with links to download the two component files.
I thought everything was now going perfect, but when I try to use this composite output as an input in another tool I've added, this other tool can't open it.... Which makes sense to me, since the other tool needs both of the component files, and not an html input.
So I wondered how to change the html composite output into the two component files ? I've tried to retrieve them in a bash wrapper with :
component=""
HTML_FILE="$1"
for i in HTML_FILE
do
    component="$i"
done
But with a simple "echo $component" test, I get "HTML_FILE" as a result, and not one of the component files.
So is there any specific thing to do to recover the component files of a composite file?
Best regards,
Marine
------------------------------ *From:* Ross <ross.lazarus@gmail.com> *To:* Marine Rohmer <marine.rohmer@yahoo.fr> *Sent:* Wednesday, 13 June 2012, 10:50 *Subject:* Re: [galaxy-dev] Re : Composite output with self-declarated datatypes
Look at your xml. Output_name is a text parameter - it doesn't have any paths. It's certainly not a new output file Galaxy will create or an existing composite object - so Galaxy quite correctly complains about not having a files_path or extra_files_path.
On Wed, Jun 13, 2012 at 6:34 PM, Marine Rohmer <marine.rohmer@yahoo.fr> wrote:
Hi Ross,
Thank you so much for your answer !
I've changed my command line in myTool.xml as follows :
<command> path/to/myTool-wrapper.sh '$output_name.files_path/$output_name.metadata.base_name' $input_file </command>
But I still have the same error message, with "files_path" instead of "extra_files_path" :
NotFound: cannot find 'files_path' while searching for 'output_name.files_path'
Well I'm going to grep those files as you said, and see if it can help me...
Thank you again,
Marine
------------------------------ *From:* Ross <ross.lazarus@gmail.com> *To:* Marine Rohmer <marine.rohmer@yahoo.fr> *Sent:* Wednesday, 13 June 2012, 6:44 *Subject:* Re: [galaxy-dev] Re : Composite output with self-declarated datatypes
Marine, Sorry to hear you're having problems - composite objects definitely do work but they are definitely not simple or properly documented.
I don't really have time to figure out exactly what the problem is but one very obvious error message
NotFound: cannot find 'extra_files_path' while searching for 'output_name.extra_files_path'
is telling you that a new output file does not have an extra_files_path at job submission.
Try output_name.files_path - why it differs is something I do not understand but I've learned to live with....
If you grep for files_path in your tool/*/*.xml files, you'll find lots of examples of tools using files_path and extra_files_path (mostly html files) and studying those working examples might be useful in getting your code to work?
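For anyone who prefers to do that search from Python, a minimal sketch could look like the following. The function name find_files_path_examples is hypothetical, and the tool directory is passed in as a parameter rather than assumed. Searching for "files_path" also matches "extra_files_path", since it is a substring.

```python
import glob
import os


def find_files_path_examples(tool_root, needle="files_path"):
    """Return (path, line_number, line) for each tool XML line containing needle."""
    hits = []
    # Same shape as the <tool_root>/*/*.xml pattern mentioned above.
    for path in sorted(glob.glob(os.path.join(tool_root, "*", "*.xml"))):
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, 1):
                if needle in line:
                    hits.append((path, number, line.strip()))
    return hits
```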
cheers...
Hi,
Maybe my message was not clear enough. I really need your help, so I'll try to be more concise:
How do I make a composite output from 2 datatypes that I have declared myself? I've followed the "Composite Datatypes" wiki but it seems that I've missed something... My composite datatype appears correctly in "file format" in Get Data's upload file section, but when I run my tool, I get 2 outputs, which are the components of my primary datatype, instead of only one output.
Best regards,
Marine
________________________________ From: Marine Rohmer <marine.rohmer@yahoo.fr> To: "galaxy-dev@lists.bx.psu.edu" <galaxy-dev@lists.bx.psu.edu> Sent: Friday, 8 June 2012, 15:15 Subject: Composite output with self-declarated datatype
Hi everyone,
I'm trying to add a tool which generates 2 files, which I will call ".xxx" (a text file) and ".yyy" (a binary file). Both files are needed to use the result of my tool with another tool I've added. So I wanted to create a composite datatype, which I will call ".composite", whose components are ".xxx" and ".yyy".
I've declared the datatypes ".xxx", ".yyy" and ".composite" in the datatypes_conf.xml file, and written the required Python files. Now, ".xxx", ".yyy" and ".composite" appear in Get Data's "file format".
These are my files :
In datatypes_conf.xml :
<datatype extension="xxx" type="galaxy.datatypes.xxx:xxx" mimetype="text/html" display_in_upload="True" subclass="True"/>
<datatype extension="yyy" type="galaxy.datatypes.yyy:yyy" mimetype="application/octet-stream" display_in_upload="True" subclass="True"/>
<datatype extension="composite" type="galaxy.datatypes.composite:Composite" mimetype="text/html" display_in_upload="True"/>
xxx.py (summarized) :
import logging
from metadata import MetadataElement
from data import Text

log = logging.getLogger(__name__)

class xxx(Text):
    file_ext = "xxx"

    def __init__( self, **kwd ):
        Text.__init__( self, **kwd )
yyy.py (summarized) :
import logging
from metadata import MetadataElement
from data import Text

log = logging.getLogger(__name__)

# yyy is a binary file, don't know what to put instead of "Text". "Binary" and "Bin" don't work.
class yyy(Text):
    file_ext = "yyy"

    def __init__( self, **kwd ):
        Text.__init__( self, **kwd )
composite.py (summarized) :
import logging
from metadata import MetadataElement
from data import Text

log = logging.getLogger(__name__)

class Composite(Text):
    composite_type = 'auto_primary_file'
    MetadataElement( name="base_name", desc="base name for all versions of this index dataset", default="your_index", readonly=True, set_in_upload=True)
    file_ext = 'composite'

    def __init__( self, **kwd ):
        Text.__init__( self, **kwd )
        self.add_composite_file( '%s.xxx', description = "XXX file", substitute_name_with_metadata = 'base_name')
        self.add_composite_file( '%s.yyy', description = "YYY file", substitute_name_with_metadata = 'base_name', is_binary = True )
On Tue, Jun 12, 2012 at 6:24 PM, Marine Rohmer <marine.rohmer@yahoo.fr> wrote:
After having read Composite Datatypes in the wiki, I transformed my myTool.xml to look like this:
<tool id="my tool">
  <command>
    path/to/crac-index-wrapper.sh ${os.path.join( $output_name_yyy.extra_files_path, '%s.yyy')} ${os.path.join( $output_name_xxx.extra_files_path, '%s.xxx' )} $input_file
  </command>
  <inputs>
    <param name="output_name" type="text" value="IndexOutput" label="Output name"/>
    <param name="input_file" type="data" label="Source file" format="fasta"/>
  </inputs>
  <outputs>
    <data format="ssa" name="output_name_ssa" from_work_dir="crac-index_output.ssa" label="CRAC-index: ${output_name}.ssa">
    </data>
    <data format="conf" name="output_name_conf" from_work_dir="crac-index_output.conf" label="CRAC-index: ${output_name}.conf">
    </data>
  </outputs>
</tool>
I have 2 main problems :
When I upload an xxx file via "Get Data", there's no problem. However, when I upload a yyy file (the binary one), the history block stays blue ("uploading dataset") forever, even for a small file.
The second problem is that I want my tool to generate only the .composite file in the history, and not each of .xxx and .yyy. But when I run my tool I still have 2 outputs displayed in the history: one for xxx and one for yyy. Furthermore, neither of them works, and I get the following message:
path/to/myTool-wrapper.sh: 6: path/to/myTool-wrapper.sh: cannot create /home/myName/work/galaxy-dist/database/files/000/dataset_302_files/%s.yyy.xxx: Directory nonexistent
path/to/myTool-wrapper.sh: 6: path/to/myTool-wrapper.sh: cannot create /home/myName/work/galaxy-dist/database/files/000/dataset_302_files/%s.yyy.yyy: Directory nonexistent
path/to/myTool-wrapper.sh: 11: path/to/myTool-wrapper.sh: Syntax error: redirection unexpected
So I've checked manually in /home/myName/work/galaxy-dist/database/files/000/ and there's only "dataset_302.dat", an empty file. (And what's more, I don't understand why the message shows "%s.yyy.xxx" and "%s.yyy.yyy" instead of "%s.yyy" and "%s.xxx"...)
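A plausible explanation for those names, sketched in Python: os.path.join does not fill in a "%s" placeholder, so the literal "%s.yyy" reaches the wrapper. The wrapper's contents are not shown in this thread, so the idea that it treats its first argument as a prefix and appends ".xxx" and ".yyy" is an assumption, as is the "your_index" base name, which is just the default from composite.py above.

```python
import os

files_dir = "/home/myName/work/galaxy-dist/database/files/000/dataset_302_files"

# os.path.join only concatenates path parts; no % operator is applied,
# so the placeholder survives unchanged.
literal = os.path.join(files_dir, "%s.yyy")

# Assumption: the wrapper appends each extension to its first argument.
outputs = [literal + ".xxx", literal + ".yyy"]

# Substituting a base name before joining gives a real file name.
base_name = "your_index"
resolved = os.path.join(files_dir, "%s.yyy" % base_name)

print([os.path.basename(p) for p in outputs])
print(os.path.basename(resolved))
```

Under that assumption the two output names end in "%s.yyy.xxx" and "%s.yyy.yyy", which matches the error message quoted above.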
Then I looked at the example of rgenetics.xml, and tried to change the command line and the output:
<tool id="my tool">
  <command>
    path/to/myTool-wrapper.sh '$output_name.extra_files_path/$output_name.metadata.base_name' $input_file
  </command>
  <inputs>
    <param name="output_name" type="text" value="IndexOutput" label="Output name" />
    <param name="input_file" type="data" label="Source file" />
  </inputs>
  <outputs>
    <data format="html" name="output" label="myTool: ${output_name}.html" metadata_source="input_file"/>
  </outputs>
</tool>
This gave me :
Traceback (most recent call last):
  File "/home/myName/work/galaxy-dist/lib/galaxy/jobs/runners/local.py", line 59, in run_job
    job_wrapper.prepare()
  File "/home/myName/work/galaxy-dist/lib/galaxy/jobs/__init__.py", line 429, in prepare
    self.command_line = self.tool.build_command_line( param_dict )
  File "/home/myName/work/galaxy-dist/lib/galaxy/tools/__init__.py", line 1971, in build_command_line
    command_line = fill_template( self.command, context=param_dict )
  File "/home/myName/work/galaxy-dist/lib/galaxy/util/template.py", line 9, in fill_template
    return str( Template( source=template_text, searchList=[context] ) )
  File "/home/myName/work/galaxy-dist/eggs/Cheetah-2.2.2-py2.7-linux-x86_64-ucs4.egg/Cheetah/Template.py", line 1004, in __str__
    return getattr(self, mainMethName)()
  File "cheetah_DynamicallyCompiledCheetahTemplate_1339157051_58_87978.py", line 83, in respond
NotFound: cannot find 'extra_files_path' while searching for 'output_name.extra_files_path'
So now I don't know which way to follow: the first one, inspired by the example in the wiki, or the second one, inspired by rgenetics.xml. And I don't know what's wrong with either... I would really appreciate any suggestion!
-- Ross Lazarus MBBS MPH; Associate Professor, Harvard Medical School; Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;